Lexical Analyzer (Part 2 of 7)
Date assigned: 2/16/07
Date due: 2/28/07
Points: 50
Write the lexical analyzer for your compiler. Remember, the lexical analyzer returns a (token class, value) pair on demand to the parser. Your lexical analyzer will also produce a source listing where each line is preceded by a line number. There is no restriction on line length.
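A minimal sketch of the line-number prefix for the source listing, assuming the three-digit, zero-padded format shown in this handout's sample output. The name formatLinePrefix is hypothetical. Only the prefix is buffered, so the line text itself can be echoed character by character with no limit on its length.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the assignment): writes the listing
   prefix for lineNumber, e.g. "001 ", into out. Returns the number of
   characters snprintf would have written, excluding the terminator. */
int formatLinePrefix (char *out, size_t size, int lineNumber)
{
  return snprintf (out, size, "%03d ", lineNumber);
}
```

The lexer could print this prefix each time it starts reading a new source line, then echo the characters of the line as it consumes them.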
White space is defined as a blank, tab, newline, or comment delimited by /* and */. White space is skipped and is never returned as a token, but it can delimit a token.
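One possible way to skip white space is sketched below. It works on an in-memory string rather than a file so that it is easy to test. The name skipWhiteSpace and the handling of an unterminated comment (it runs to the end of the input) are assumptions, since the handout does not specify them.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the assignment): returns the index of
   the first character at or after pos that is not white space, where white
   space is a blank, tab, newline, or a comment delimited by slash-star and
   star-slash. */
size_t skipWhiteSpace (const char *src, size_t pos)
{
  for (;;)
  {
    if (src[pos] == ' ' || src[pos] == '\t' || src[pos] == '\n')
    {
      pos++;
    }
    else if (src[pos] == '/' && src[pos + 1] == '*')
    {
      pos += 2;
      /* Scan to the closing delimiter. Assumption: an unterminated
         comment simply runs to the end of the input. */
      while (src[pos] != '\0' && !(src[pos] == '*' && src[pos + 1] == '/'))
      {
        pos++;
      }
      if (src[pos] != '\0')
      {
        pos += 2;
      }
    }
    else
    {
      return pos;   /* start of a token, or the terminating '\0' */
    }
  }
}
```

A lone / is left in place, so the caller can still recognize it as a mulop.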
When your lexical analyzer encounters an identifier, look it up in the symbol table (ST), but do not add it, to determine whether it is a reserved word.
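The identifier check can be sketched as follows. The reserved words and their values come from the token table in this handout. The fixed array is only a stand-in for the symbol table lookup, and the names identifierLength and classifyWord, along with the use of 0 for a NULL value, are assumptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define CLASS_RESERVED   1
#define CLASS_IDENTIFIER 2
#define NUM_RESERVED     8

/* Stand-in for the symbol table: reserved words in "which one" order. */
static const char *gReservedWords[NUM_RESERVED] =
  { "main", "int", "if", "else", "return", "for", "input", "output" };

/* Length of the lexeme matching [A-Za-z][A-Za-z0-9]* at src, or 0. */
size_t identifierLength (const char *src)
{
  size_t len = 0;

  if (isalpha ((unsigned char) src[0]))
  {
    len = 1;
    while (isalnum ((unsigned char) src[len]))
    {
      len++;
    }
  }
  return len;
}

/* Classifies the len characters at src. Returns the token class and
   stores the value: 1..8 for a reserved word, 0 (standing in for NULL)
   for an ordinary identifier. Nothing is added to any table. */
int classifyWord (const char *src, size_t len, int *pValue)
{
  size_t i;

  for (i = 0; i < NUM_RESERVED; i++)
  {
    if (strlen (gReservedWords[i]) == len
        && strncmp (src, gReservedWords[i], len) == 0)
    {
      *pValue = (int) i + 1;
      return CLASS_RESERVED;
    }
  }
  *pValue = 0;
  return CLASS_IDENTIFIER;
}
```

Comparing lengths before comparing characters keeps a prefix such as in from matching int.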
When a constant is encountered, do the following:
#   Class           Lexemes                 Values
--  -----           -------                 ------
1   reserved words  given in assignment 1   which one
                                            main (1) int (2) if (3) else (4)
                                            return (5) for (6) input (7) output (8)
2   identifiers     [A-Za-z][A-Za-z0-9]*    NULL
3   constant        [0-9]+                  runtime stack address
4   relop           == != <= >= < >         which one (1, 2, 3, 4, 5, 6)
5   addop           + -                     which one (1, 2)
6   mulop           * / %                   which one (1, 2, 3)
7   autoop          ++ --                   which one (1, 2)
8   assignop        =                       NULL
9   addressof       &                       NULL
10  boolop          || &&                   which one (1, 2)
11  semicolon       ;                       NULL
12  comma           ,                       NULL
13  parenthesis     ( )                     which one (1, 2)
14  bracket         [ ]                     which one (1, 2)
15  brace           { }                     which one (1, 2)

Consider the following C program:
main ()
{
5
}

The results from Lex will be:
001 main ()
CLASS LEXEME VALUE
----- ------ -----
1     main   1
13    (      1
13    )      2
002 {
CLASS LEXEME VALUE
----- ------ -----
15    {      1
003 5
CLASS LEXEME VALUE
----- ------ -----
3     5      1
004 }
CLASS LEXEME VALUE
----- ------ -----
15    }      2

CONSTANT TABLE
CONSTANT VALUE  RUNTIME STACK ADDRESS
--------------  ---------------------
5               1
RUNTIME STACK
ADDRESS VALUE
------- -----
0       -1
1       5
2       -1

Test your lexical analyzer by constructing a driver that performs the following:
IMPORTANT: Lex returns a SINGLE token to your driver program on each call until the end of file is reached. Your driver program must call lexGetToken(), which is in the lex module and returns a single token to the caller. A global data structure can be used to hold the token information.
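The one-token-per-call protocol might look like the sketch below. Only lexGetToken() is named in this handout. The Token layout, gToken, lexInit, driverRun, the 0/1 return convention, and reading from a string instead of a file are all assumptions made for illustration. To stay short, this sketch recognizes only the single-character classes 11 through 15 and skips every other character, which a real lexer would not do.

```c
#include <stdio.h>
#include <string.h>

/* Token information shared between the lex module and its caller. */
typedef struct Token
{
  int  tokenClass;   /* class number from the token table */
  char lexeme[64];   /* text of the token (size is an assumption) */
  int  value;        /* "which one" value; 0 stands in for NULL */
} Token;

Token gToken;                     /* global holding the most recent token */
static const char *gpNext = "";   /* next unread character of the source */

/* Hypothetical setup routine: points the lexer at the source text. */
void lexInit (const char *source)
{
  gpNext = source;
}

/* Fills gToken with exactly ONE token and returns 1, or returns 0 at the
   end of the input. Sketch only: handles ; , ( ) [ ] { } and skips
   everything else. */
int lexGetToken (void)
{
  static const char symbols[] = ";,()[]{}";
  static const int  classes[] = { 11, 12, 13, 13, 14, 14, 15, 15 };
  static const int  values[]  = {  0,  0,  1,  2,  1,  2,  1,  2 };
  const char *pFound;

  while (*gpNext != '\0')
  {
    pFound = strchr (symbols, *gpNext);
    gpNext++;
    if (pFound != NULL)
    {
      gToken.tokenClass = classes[pFound - symbols];
      gToken.value      = values[pFound - symbols];
      gToken.lexeme[0]  = *pFound;
      gToken.lexeme[1]  = '\0';
      return 1;
    }
  }
  return 0;
}

/* Driver loop: one call, one token, until the end of the input.
   Returns the number of tokens printed. */
int driverRun (const char *source)
{
  int count = 0;

  lexInit (source);
  printf ("CLASS LEXEME VALUE\n----- ------ -----\n");
  while (lexGetToken ())
  {
    printf ("%-5d %-6s %d\n", gToken.tokenClass, gToken.lexeme, gToken.value);
    count++;
  }
  return count;
}
```

Keeping all scanning state inside the lex module means the driver, and later the parser, only ever sees the most recent token.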
Note1: Remember, your compiler is to be in a directory called yourlastname. In that directory there is to be a makefile that I simply use to compile your project, creating the executable pcc. Continue creating subdirectories within the yourlastname directory for each of the modules needed as we go through the compiler creation process; thus, for this second assignment, you will need to create at least a subdirectory called lex where all of the lexical analyzer routines will go. Also, continue to use Subversion as we build this project. You won't be disappointed.
Note2: Use the submit script to submit your solution:
zeus$ submit cs480s07 user.tar.gz
Note3: On the day the program is due, turn in a copy of the source code in this order:
Note4: Remember, the first location of the runtime stack is reserved for a function return value, thus begin saving your constants at location 1 on the runtime stack.
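A minimal sketch of storing constants on the runtime stack, assuming unused locations hold -1 as in the sample output. The names, the fixed capacity, and the choice to reuse the address of a constant that was already stored are assumptions. Scanning the stack here stands in for a separate constant table.

```c
#include <stdlib.h>

#define STACK_SIZE  256    /* assumed capacity */
#define UNUSED      (-1)   /* the sample output shows -1 in unused slots */
#define FIRST_CONST 1      /* location 0 is reserved for a return value */

static int gRuntimeStack[STACK_SIZE];
static int gNextFree = FIRST_CONST;

/* Marks every location unused and resets the next free address. */
void rtsInit (void)
{
  int i;

  for (i = 0; i < STACK_SIZE; i++)
  {
    gRuntimeStack[i] = UNUSED;
  }
  gNextFree = FIRST_CONST;
}

/* Stores the constant spelled by lexeme (digits only) and returns its
   runtime stack address, or -1 if the stack is full. */
int rtsAddConstant (const char *lexeme)
{
  int value = (int) strtol (lexeme, NULL, 10);
  int addr;

  /* Assumption: a constant seen before reuses its existing address. */
  for (addr = FIRST_CONST; addr < gNextFree; addr++)
  {
    if (gRuntimeStack[addr] == value)
    {
      return addr;
    }
  }
  if (gNextFree >= STACK_SIZE)
  {
    return -1;
  }
  gRuntimeStack[gNextFree] = value;
  return gNextFree++;
}

/* Returns the value stored at addr (no bounds check in this sketch). */
int rtsValueAt (int addr)
{
  return gRuntimeStack[addr];
}
```

With the sample program, the constant 5 lands at address 1 while locations 0 and 2 stay at -1, matching the RUNTIME STACK listing in this handout.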
Douglas J. Ryan / ryandj@pacificu.edu