Program Translation - Introductory Concepts

Key concepts

translator - allows the programmer to express an algorithm in a language other than the machine language for a specific machine. Why?

source language - the language in which the input to a translator is written

A translator can produce:
(a) host language - typically the machine language of the host computer
(b) target language - typically the machine language of a computer other than the host

Note1: Sometimes a nonmachine language is chosen as the target language. Examples:
Pascal source > translator > assembly language
Pascal source > translator > P-code
Pascal source > translator > C code

Note2: If the target language is the machine language of the host machine, then the generated code is often called "machine code".

Why generate nonmachine code?

Languages can be categorized as:
(1) general purpose - allows the solution of a wide variety of problems.
(2) special purpose - allows the solution of only a narrow variety of problems.

Name two of each. What is assembly?

Languages can also be categorized as:
(1) nonprocedural - includes two types:
    (a) functional (applicative) - these languages are based on the specification of value
    (b) logic - these languages are based on predicate logic, equational logic, or set theory
(2) procedural - includes two types:
    (a) machine dependent - designed to permit the programmer to control the machine in detail
    (b) machine independent - allow one to represent an algorithm without reference to a specific machine

Classify each of the following as 1a, 1b, 2a, or 2b:
Pascal, C, Assembly, LISP, FORTRAN, COBOL, SML, Prolog

The translation of an algorithm requires the translator to:
(1) analyze - determine what actions are to be performed
(2) synthesize - produce the desired machine/target language representation

Analysis involves three areas: (1) lexical (2) syntactic (3) semantic

Synthesis takes one of two forms: (1) interpretation (2) generation
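
Example: the analyze/synthesize split can be sketched for a toy language of integer expressions with +, * and parentheses. This is only an illustration under stated assumptions: the names lex, parse, interpret and generate, and the stack instructions PUSH/ADD/MUL, are hypothetical. Semantic analysis is omitted because this toy language has no types or declarations to check.

```python
import re

def lex(text):
    """Lexical analysis: group characters into (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if lexeme[0] in "0123456789":
            tokens.append(("num", int(lexeme)))
        elif lexeme in "+*()":
            tokens.append(("op", lexeme))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    return tokens

def parse(tokens):
    """Syntactic analysis: build a tree using the grammar
    expr := term ('+' term)*, term := factor ('*' factor)*,
    factor := number | '(' expr ')'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", None)

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        node = term()
        while peek() == ("op", "+"):
            advance()
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == ("op", "*"):
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        kind, value = peek()
        advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("op", "("):
            node = expr()
            if peek() != ("op", ")"):
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

def interpret(tree):
    """Synthesis by interpretation: perform the actions directly."""
    if tree[0] == "num":
        return tree[1]
    op, left, right = tree
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b

def generate(tree):
    """Synthesis by generation: emit code for a hypothetical stack machine."""
    if tree[0] == "num":
        return [f"PUSH {tree[1]}"]
    op, left, right = tree
    return generate(left) + generate(right) + ["ADD" if op == "+" else "MUL"]

tree = parse(lex("2 + 3 * (4 + 1)"))
print(interpret(tree))   # the value, computed now
print(generate(tree))    # a program that computes the value later
```

Both synthesis routines walk the same tree; the difference is whether the actions are performed at translation time or written out as target code.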

Other concepts of importance involving translation include:

macro processor - lets the programmer define a sequence of code once and then refer to it by name each time it is to be assembled
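
Example: a minimal macro expansion pass, assuming a hypothetical syntax in which "MACRO name" opens a definition, "ENDM" closes it, and a line holding only the macro name invokes it. Parameters and macros nested inside definitions are not handled in this sketch.

```python
def expand_macros(lines):
    """Replace each macro invocation with the body recorded at definition time."""
    macros = {}      # macro name -> list of body lines
    output = []
    current = None   # name of the macro being defined, if any
    for line in lines:
        words = line.split()
        if current is not None:
            if words == ["ENDM"]:
                current = None
            else:
                macros[current].append(line)
        elif len(words) == 2 and words[0] == "MACRO":
            current = words[1]
            macros[current] = []
        elif len(words) == 1 and words[0] in macros:
            output.extend(macros[words[0]])   # the invocation is replaced by the body
        else:
            output.append(line)
    return output

source = ["MACRO SAVE", "PUSH A", "PUSH B", "ENDM",
          "LOAD A", "SAVE", "ADD B", "SAVE"]
print(expand_macros(source))
```

The body is written once but appears in the output at every invocation, which is what distinguishes a macro from a subroutine call.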

conditional translation - provides a mechanism for the conditional performance of part of the translation
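
Example: a sketch of conditional translation, assuming hypothetical directives "IF symbol" and "ENDIF" and a set of symbols supplied to the translator. Lines inside an IF block are passed on for translation only when the symbol is defined.

```python
def conditional_pass(lines, defined):
    """Keep only the lines whose enclosing IF conditions all hold."""
    output = []
    active = [True]   # stack: is the current region being translated?
    for line in lines:
        words = line.split()
        if len(words) == 2 and words[0] == "IF":
            # A nested region is active only if its parent is active too.
            active.append(active[-1] and words[1] in defined)
        elif words == ["ENDIF"]:
            if len(active) == 1:
                raise SyntaxError("ENDIF without IF")
            active.pop()
        elif active[-1]:
            output.append(line)
    if len(active) != 1:
        raise SyntaxError("IF without ENDIF")
    return output

source = ["LOAD A", "IF DEBUG", "CALL TRACE", "ENDIF", "STORE B"]
print(conditional_pass(source, {"DEBUG"}))   # CALL TRACE is translated
print(conditional_pass(source, set()))       # CALL TRACE is skipped
```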

compiler - translates a programming language (the source) into a machine-oriented language (the target)

linker - also known as a binder, linkage editor, or consolidator. It takes independently translated programs, whose original source language representations may include symbolic references to one another, and produces a single program. Is this program directly ready for execution?
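
Example: a rough sketch of linking, under a hypothetical module layout: "code" is a list of words, "relocations" lists word positions holding addresses relative to the module start, "external_refs" lists (position, symbol) pairs to be filled in, and "exports" maps symbol names to offsets within the module. Real object formats differ in detail.

```python
def link(modules):
    """Combine modules into one program and patch the symbolic references."""
    # Decide where each module starts within the combined program.
    bases, total = [], 0
    for module in modules:
        bases.append(total)
        total += len(module["code"])

    # Build one symbol table from the symbols each module exports.
    table = {}
    for base, module in zip(bases, modules):
        for name, offset in module["exports"].items():
            if name in table:
                raise ValueError(f"duplicate symbol {name}")
            table[name] = base + offset

    # Copy the code, shifting relative addresses and filling in external ones.
    code, relocations = [], []
    for base, module in zip(bases, modules):
        words = list(module["code"])
        for index in module["relocations"]:
            words[index] += base
            relocations.append(base + index)
        for index, name in module["external_refs"]:
            if name not in table:
                raise ValueError(f"undefined symbol {name}")
            words[index] = table[name]
            relocations.append(base + index)
        code.extend(words)
    return {"code": code, "relocations": sorted(relocations)}

main = {"code": [4, 0, 5, 0], "relocations": [],
        "external_refs": [(1, "PRINT")], "exports": {"START": 0}}
lib = {"code": [1, 2, 7], "relocations": [1],
       "external_refs": [], "exports": {"PRINT": 0}}
print(link([main, lib]))
```

In this sketch every address in the output is still relative to the start of the combined program, so the list of positions to adjust is carried along with the code.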

loader - takes as input a program that was produced by an assembler, compiler, or linker and prepares the program to be executed in a physical set of memory locations

relocating loader - loads the program into a set of physical memory locations obtained from the OS and inserts the correct absolute addresses, computed from the starting location of that memory
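
Example: a sketch of relocation at load time. The representation is an assumption made for illustration: memory is a list of words, the program is a list of words, and "relocations" is the set of word positions that hold addresses relative to the start of the program.

```python
def load(code, relocations, memory, base):
    """Copy code into memory starting at base, adding base to each relative address."""
    for offset, word in enumerate(code):
        memory[base + offset] = word + base if offset in relocations else word
    return base   # entry point, assuming execution starts at the first word

memory = [0] * 32
program = [1, 4, 2, 5, 7, 0]    # words 1 and 3 are relative addresses
entry = load(program, {1, 3}, memory, base=16)
print(entry, memory[16:22])     # relative addresses 4 and 5 become 20 and 21
```

The same program could be loaded at any other base without retranslation; only the words named in the relocation set change.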

load-and-go assembler - the simplest kind of assembler (usually one pass). It accepts a source program and produces a machine language program in main memory that is ready to execute. The machine language program occupies memory locations that are fixed at the time of translation.

module assemblers - usually two-pass assemblers, where the first pass collects symbolic references into a symbol table and the second pass generates code that is not quite machine language. The generated code is referred to as "relocatable code" or "object code". Assembly language programs for module assemblers contain three types of information:
(1) Absolute - this information exists in the form of opcodes, string and numeric constants, and fixed (or absolute) addresses
(2) Relative - includes addresses of instructions and storage (e.g. the DATA segment). These addresses are fixed relative to the beginning of the program.
(3) External - references used within a module but not defined within that module. This information might be relative or absolute and is not generally known at the time of translation.

Therefore, the object module must include information about:
(1) which addresses are relative
(2) which symbols have been externally defined
(3) which internal symbols might be referenced externally
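
Example: a sketch of a two-pass module assembler that records the three items listed above. The assembly syntax (label:, WORD, EXTERN, GLOBAL), the opcode numbers, and the layout of the returned object module are all hypothetical, chosen only to keep the illustration short.

```python
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "JUMP": 4, "HALT": 5}   # hypothetical

def assemble(lines):
    """Translate source lines into an object module (a dict of four parts)."""
    # Pass 1: give every label an address relative to the start of the module.
    symbols, externals, exported = {}, set(), set()
    statements, location = [], 0
    for line in lines:
        words = line.split()
        if not words:
            continue
        if words[0] == "EXTERN":
            externals.add(words[1])
            continue
        if words[0] == "GLOBAL":
            exported.add(words[1])
            continue
        if words[0].endswith(":"):
            symbols[words[0][:-1]] = location
            words = words[1:]
            if not words:
                continue
        statements.append(words)
        location += 1 if words[0] == "WORD" else 2   # instruction = opcode + operand

    # Pass 2: generate code and note which words still need adjusting.
    code, relocations, external_refs = [], [], []
    for words in statements:
        if words[0] == "WORD":
            code.append(int(words[1]))                   # absolute: a constant
            continue
        code.append(OPCODES[words[0]])                   # absolute: the opcode
        operand = words[1] if len(words) > 1 else "0"
        if operand in symbols:
            relocations.append(len(code))                # relative address
            code.append(symbols[operand])
        elif operand in externals:
            external_refs.append((len(code), operand))   # value unknown for now
            code.append(0)
        else:
            code.append(int(operand))                    # absolute address
    return {"code": code,
            "relocations": relocations,                  # (1) relative addresses
            "external_refs": external_refs,              # (2) externally defined symbols
            "exports": {name: symbols[name] for name in sorted(exported)}}   # (3)

source = ["EXTERN PRINT", "GLOBAL START",
          "START: LOAD COUNT", "STORE 100", "JUMP PRINT", "COUNT: WORD 7"]
print(assemble(source))
```

Note that LOAD COUNT refers to a label defined later in the source; the second pass can translate it only because the first pass has already placed COUNT in the symbol table.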

Who resolves external references?
Who resolves relative references?

This brings us to our final "Program Translation Sequence" as follows:


© 1995 Douglas J. Ryan / ryandj@pacificu.edu