Symbol Tables
Reading: 60 - 62, 429 - 440
The symbol table is a major construct used by the compiler to collect information about symbols during the analysis portion. This information is then used during the synthesis portion to help generate code.
So what might be the contents of the symbol table?
1. the identifier itself (lexeme)
a. Typically two choices for storage exist
- store the identifier in a fixed-size field in the table
- store the identifier in a separate string table
Q#1: Discuss the advantages/disadvantages of 1. over 2.
Q#2: When is the identifier information obtained?
2. run-time storage location
a. we must record where the value of the id will be stored during
execution.
b. when do we obtain this information?
- if the value is kept in memory, we can determine this
information when the compiler recognizes the declaration
statement for that variable.
- Example1: if all data is to
be kept in a block of memory, (a) maintain a pointer to the
next free space in the block (b) allocate space to the
variable (c) move free space up by size of variable.
- Example2: if all data is kept on a runtime stack, (a)
globals go at the base of the stack, (b) run-time locations
for locals are given as an offset from the associated stack
frame of the respective procedure.)
- if the value is to be kept in a register, wait until the
code generation phase to fill in this area of the table.
3. type
a. types can be implicitly or explicitly declared
b. type information is used by whom?
- run-time storage requirements
- semantic analysis phase
Q#3: Give an example of how the compiler might use the type in the
semantic analysis phase.
4. array dimension or number of parameters for a procedure
a. during the semantic analysis phase, these are used for
consistency checks
b. this information is filled in by the parser (or semantic
analysis phase) when the declaration for the array/procedure
are seen.
5. level
a. this is used to maintain scope information in block-structured
languages.
- Example: (a) maintain a global variable for current
level (b) globals are at level zero (c) each time a procedure
or a new block is encountered, increment current level (d) all
variables within current procedure/block are assigned current
level (e) when the end of the block or procedure is encountered
we delete all entries in the ST at the current level and then
decrement the current level (f) when referencing a variable, we
begin by looking from the current level down to zero.
6. cross-reference information
a. the line number of where the variable was declared
b. all line numbers of references to the variable
c. allowance for sorted cross-reference information to be printed
What operations are necessary for our ST?
- Insertion
- Deletion
- Searching
- Printing
Q#4: Which of the above operations will most likely be performed most often?
Symbol Table Data Structures
When discussing Symbol Table Data Structures, we need to look at both dynamic and static data structures. As with most data structures we discuss in Computer Science, we always seem to come up against the classic time-space tradeoff issue.
Q#5: If you had to make some general statements concerning time-space tradeoffs in regard to static vs dynamic data structures, what would they be?
Following is a list of possible ST data structures. Let's look at the advantages and disadvantages of each data structure using a worst case scenerio.
1. Unordered Linear List
Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)
2. Ordered Linear List
Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)
3. Binary Search Tree
Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)
4. AVL Tree
Advantages?
Disadvantages?
Insertion O(?)
Deletion O(?)
Searching O(?)
©Douglas J. Ryan (ryandj@pacificu.edu)