


The first programming assignment for the CS 142 compilers course. Students are required to write a lexical analyzer, also known as a scanner or lexer, for the Cool programming language. The assignment covers writing the lexer "by hand" and includes instructions for setting up the development environment and testing the scanner.



Programming assignments I–IV (or V) will direct you to design and build a compiler for Cool. Each assignment will cover one component of the compiler: lexical analysis, parsing, semantic analysis, and code generation. Each assignment will ultimately result in a working compiler phase which can interface with other phases. You have the option of doing your projects in C++ or Java.
For this assignment, you are to write a lexical analyzer, also called a scanner or lexer. You will be writing the lexer “by hand” rather than using a lexical analyzer generator such as lex, flex, or jlex. You will write code that approximates the function of a DFA by scanning for Cool tokens and then outputs the results in the appropriate format. The description of the output and the list of tokens will be provided for you. On-line documentation for all the tools needed for the project will be made available on the ECS web site. You may work either individually or in pairs for this assignment. Pairs are encouraged.
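As a rough illustration of what approximating a DFA by hand can look like, the following C++ sketch scans integers and identifiers with an explicit state variable and a longest-match loop. The names Kind, State, Token, and scan are hypothetical and are not part of the provided files; the real token list and output format come from the course materials, and a full Cool lexer needs many more states (keywords, strings, comments, operators).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for illustration only; the real token list is
// supplied with the assignment.
enum class Kind { Integer, Identifier, Other };

struct Token {
    Kind kind;
    std::string lexeme;
};

// States of a tiny hand-coded automaton (assumed names).
enum class State { InInteger, InIdentifier };

std::vector<Token> scan(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {  // whitespace separates tokens
            ++i;
            continue;
        }
        State state;
        if (std::isdigit(c)) {
            state = State::InInteger;
        } else if (std::isalpha(c)) {
            state = State::InIdentifier;
        } else {
            // Anything else becomes a one-character token in this sketch.
            tokens.push_back({Kind::Other, std::string(1, input[i])});
            ++i;
            continue;
        }
        std::size_t start = i++;
        // Stay in the current state while the next character is accepted,
        // so each token is the longest possible match.
        while (i < input.size()) {
            unsigned char n = static_cast<unsigned char>(input[i]);
            bool stay = (state == State::InInteger)
                            ? (std::isdigit(n) != 0)
                            : (std::isalnum(n) != 0 || n == '_');
            if (!stay) break;
            ++i;
        }
        Kind kind = (state == State::InInteger) ? Kind::Integer : Kind::Identifier;
        tokens.push_back({kind, input.substr(start, i - start)});
    }
    return tokens;
}
```

Calling scan("x1 42 +") on this sketch produces three tokens: an identifier, an integer, and a one-character token. Each branch of the if chain plays the role of a transition out of the start state, and the inner loop plays the role of a self-loop on an accepting state.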
To get started, create a directory where you want to do the assignment and execute one of the following commands in that directory. For the C++ version of the assignment, you should type
gmake -f /home/cs142/s09/cool/assignments/PA2/Makefile
Note that even though this is the first programming assignment, the directory name is PA2. Future assignments will also have directory numbers that are one more than the assignment number, so please don’t get confused! This situation arises because we are skipping the usual first assignment in this offering of the course. For Java, type:
gmake -f /home/cs142/s09/cool/assignments/PA2J/Makefile
(notice the “J” in the path name). This command will copy a number of files to your directory. Some of the files will be copied read-only (using symbolic links). You should not edit these files. In fact, if you make and modify private copies of these files, you may find it impossible to complete the assignment. See the instructions in the README file. The files that you will need to modify are:
description, but it does not do much. Any auxiliary routines that you wish to write should be added directly to this file. Although you will be writing the lexer by hand, you may find the flex/jlex manual helpful for understanding some of the defined data structures and other assorted definitions.
Although these files are incomplete as given, the lexer does compile and run (gmake lexer).
You should follow the specification of the lexical structure of Cool given in Section 10 and Figure 1 of the Cool manual. Your scanner should be robust—it should work for any conceivable input. For example, you must handle errors such as an EOF occurring in the middle of a string or comment, as well as string constants that are too long. These are just some of the errors that can occur; see the manual for the rest. You must make some provision for graceful termination if a fatal error occurs. Core dumps or uncaught exceptions are unacceptable.
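To make the robustness requirement concrete, here is a minimal C++ sketch of scanning a string constant so that both an EOF inside the string and an over-long string come back as error results instead of crashing. Everything named here (Result, ScanResult, scan_string, the max_len parameter, and the message texts) is an assumption made for illustration; the actual length limit, error messages, and escape-sequence rules are defined in the Cool manual, and this sketch deliberately ignores escapes and embedded newlines.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result kinds; the real scanner returns tokens such as ERROR.
enum class Result { String, Error };

struct ScanResult {
    Result kind;
    std::string text;  // string contents on success, a message on error
    std::size_t next;  // index of the first character not yet consumed
};

// Scans a string constant whose opening quote is at input[pos].
// max_len is an assumed limit; take the real one from the manual.
ScanResult scan_string(const std::string& input, std::size_t pos,
                       std::size_t max_len) {
    std::string contents;
    bool too_long = false;
    std::size_t i = pos + 1;  // skip the opening quote
    while (i < input.size() && input[i] != '"') {
        if (contents.size() < max_len) {
            contents.push_back(input[i]);
        } else {
            too_long = true;  // keep scanning so we resume after the string
        }
        ++i;
    }
    if (i >= input.size()) {
        // Input ended before the closing quote was seen.
        return {Result::Error, "EOF in string constant", i};
    }
    ++i;  // consume the closing quote
    if (too_long) {
        return {Result::Error, "String constant too long", i};
    }
    return {Result::String, contents, i};
}
```

The bounded push_back avoids any fixed-size buffer overflow, and the function continues to the closing quote even after the limit is exceeded, so that scanning can resume at a sensible position after the error is reported.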
All errors should be passed along to the parser. Your lexer should not print anything. Errors are communicated to the parser by returning a special error token called ERROR. (Note, you should ignore the token called error [in lowercase] for this assignment; it is used by the parser in PA3.) There are several requirements for reporting and recovering from lexical errors:
is allowed but should be converted to the one character
Your scanner should maintain the variable curr_lineno that indicates which line in the source text is currently being scanned. This feature will aid the parser in printing useful error messages. You should ignore the token LET_STMT. It is used only by the parser (PA3).
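One simple way to keep the line counter accurate is to increment it at the single place where newline characters are consumed. The C++ sketch below shows this for whitespace skipping. The variable declared here is only a local stand-in for the counter described above, and skip_whitespace is a hypothetical helper, not a function from the provided files.

```cpp
#include <cstddef>
#include <string>

// Local stand-in for the scanner's line counter (an assumption for this
// sketch; the assignment framework supplies the real variable).
int curr_lineno = 1;

// Skips whitespace starting at pos, incrementing curr_lineno once per
// newline. Returns the index of the first non-whitespace character, or
// input.size() if only whitespace remains.
std::size_t skip_whitespace(const std::string& input, std::size_t pos) {
    while (pos < input.size()) {
        char c = input[pos];
        if (c == '\n') {
            ++curr_lineno;
        } else if (c != ' ' && c != '\t' && c != '\r' && c != '\f' && c != '\v') {
            break;  // start of the next token
        }
        ++pos;
    }
    return pos;
}
```

Newlines can also appear inside comments and string constants, so the code paths that consume those need the same increment for the count to stay correct.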
There are at least two ways that you can test your scanner. The first way is to generate sample inputs and run them using lexer, which prints out the line number and the lexeme of every token recognized by your scanner. The other way, when you think your scanner is working, is to try running mycoolc to invoke your lexer together with all other compiler phases (which we provide). This will be a complete Cool compiler that you can try on the sample programs and your program from Assignment I.
When you are ready to turn in the assignment, type gmake submit-clean in the directory where you have prepared your assignment. This action will remove all the unnecessary files, such as object files, class files, core dumps, Emacs autosave files, etc. Following gmake submit-clean, use the handin utility to submit each modified file, e.g.
handin cs142 PA2 cool-lex-hand.cc
or
handin cs142 PA2J CoolLexHand.java
Each submission overwrites the previous one, so the last submission you make will be the one graded. Remember that late assignments are not accepted. If in doubt, submit early and often.
The burden of convincing us that you understand the material is on you. Obtuse code, output, and write-ups will have a negative effect on your grade. Write safe code (e.g., don’t use strcpy), comment appropriately, and modularize well.