Chapter 2 - Symbol Tables

Symbol Tables

Reading: 60 - 62, 429 - 440

The symbol table is a major construct used by the compiler to collect information about symbols during the analysis portion. This information is then used during the synthesis portion to help generate code.

So what might be the contents of the symbol table?

1. the identifier itself (lexeme)

a. Typically two choices for storage exist

store the identifier in a fixed-size field in the table
store the identifier in a separate string table

Q#1: Discuss the advantages/disadvantages of 1. over 2.

Q#2: When is the identifier information obtained?

2. run-time storage location

a. we must record where the value of the id will be stored during execution.

b. when do we obtain this information?

if the value is kept in memory, we can determine this information when the compiler recognizes the declaration statement for that variable.

Example1: if all data is to be kept in a block of memory, (a) maintain a pointer to the next free space in the block (b) allocate space to the variable (c) move free space up by size of variable.
Example2: if all data is kept on a runtime stack, (a) globals go at the base of the stack, (b) run-time locations for locals are given as an offset from the associated stack frame of the respective procedure.)

if the value is to be kept in a register, wait until the code generation phase to fill in this area of the table.

3. type

a. types can be implicitly or explicitly declared
b. type information is used by whom?

run-time storage requirements
semantic analysis phase

Q#3: Give an example of how the compiler might use the type in the semantic analysis phase.

4. array dimension or number of parameters for a procedure

a. during the semantic analysis phase, these are used for consistency checks
b. this information is filled in by the parser (or semantic analysis phase) when the declaration for the array/procedure are seen.

5. level

a. this is used to maintain scope information in block-structured languages.

Example: (a) maintain a global variable for current level (b) globals are at level zero (c) each time a procedure or a new block is encountered, increment current level (d) all variables within current procedure/block are assigned current level (e) when the end of the block or procedure is encountered we delete all entries in the ST at the current level and then decrement the current level (f) when referencing a variable, we begin by looking from the current level down to zero.

6. cross-reference information

a. the line number of where the variable was declared
b. all line numbers of references to the variable
c. allowance for sorted cross-reference information to be printed

What operations are necessary for our ST?

Insertion
Deletion
Searching
Printing

Q#4: Which of the above operations will most likely be performed most often?

Symbol Table Data Structures

When discussing Symbol Table Data Structures, we need to look at both dynamic and static data structures. As with most data structures we discuss in Computer Science, we always seem to come up against the classic time-space tradeoff issue.

Q#5: If you had to make some general statements concerning time-space tradeoffs in regard to static vs dynamic data structures, what would they be?

Following is a list of possible ST data structures. Let's look at the advantages and disadvantages of each data structure using a worst case scenerio.

1. Unordered Linear List

Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)

2. Ordered Linear List

Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)

3. Binary Search Tree

Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)

4. AVL Tree

Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)