

This document describes the project requirements for generating SPARC assembly code from an augmented abstract syntax tree and symbol tables in a compiler for the PASC programming language. The project involves computing memory addresses for variables, handling structured data types, performing constant folding, generating code for simple control statements, and managing function calls, prologues, epilogues, and parameter passing. Predefined functions and the run-time environment must also be considered. The document gives guidance on various issues related to code generation for PASC and offers simplifying assumptions to help students complete the project.


Purpose:
This project is intended to give you experience in writing a code generator as well as bring together the various issues of code generation discussed in the text and in class.
Project Summary:
Your task is to write a code generator, the final phase of your compiler, that produces assembly code for the SPARC machine (oldiablo), given an augmented abstract syntax tree and symbol tables produced by the previous phases of your compiler. This phase consists of assigning a memory address to each variable used in the PASC program and translating subtrees of the abstract syntax tree (the intermediate language representation) into sequences of assembly instructions that perform the same task.
Predefined Functions in PASC:
You should write your own predefined procedures and functions (in C). The calls in your generated assembly program to these predefined functions should follow the calling convention of CC (or GCC).
Run-time environment:
Recursion in PASC prevents the use of a static storage allocation strategy. However, the PASC language has no features that prevent the deallocation of activation records in a last-in-first-out manner. That is, activation records containing data local to an execution of a procedure, actual parameters, saved machine status, and other information needed to manage the activation of the procedure can be stored on a run-time stack. The exact size of an activation record can be determined at compile time. Global variables and temporaries can be stored in the data segment (not in the stack).
Code generation:
This is the only phase of your compiler that is machine dependent. You will be generating SPARC assembly code. The important and interesting issues in generating code for PASC are discussed in the following paragraphs. For further details on these issues, you should consult Chapters 7 and 8 of the textbook. You can make the following simplifying assumptions:
Computing memory addresses: Since address information has not been computed and entered into the symbol table by earlier phases, the first task of the code generator is to compute the offset of each variable name (both global and local); that is, the address of each local data object and formal parameter within the activation record for the block in which it is declared. This can be done by initializing a variable offset at the start of each declaration section; as each declaration is processed, the current value of offset is entered as an attribute of that symbol, and offset is then incremented by the total width of that data object, which depends on its type. The offsets of fields within records should be computed and stored as an attribute of the field name, typically relative to the start of the record. Globals should be implemented using static storage.
Call-by-value parameters will have a width dependent on the type of the parameter, whereas call-by-reference parameters will have a width of one word, which holds an address. The total activation record size of each procedure should be computed at this time and entered in the symbol table as an attribute of the procedure name.
The machine architecture must be taken into account when computing these widths. Offsets of locals and parameters can be implemented as offsets from the frame pointer. Thus, the computation of offsets of arguments and local variables can be done independently. This information will be used upon every reference to the data object, in addition to being used in the allocation of storage for activation records.
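The offset computation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple-based type descriptors, the names `width_of`, `param_width`, and `assign_offsets`, and the 4-byte word size are hypothetical and not part of the project specification. Whether offsets grow upward or downward from the frame pointer is left to your implementation.

```python
WORD = 4  # assumed word size in bytes (32-bit SPARC)

# Assumed widths for basic types, in bytes (hypothetical choices).
TYPE_WIDTH = {"integer": 4, "boolean": 4, "char": 4}


def width_of(type_desc):
    """Return the storage width in bytes of a (hypothetical) type descriptor."""
    kind = type_desc[0]
    if kind == "basic":      # ("basic", type_name)
        return TYPE_WIDTH[type_desc[1]]
    if kind == "array":      # ("array", element_count, element_type)
        return type_desc[1] * width_of(type_desc[2])
    if kind == "record":     # ("record", [(field_name, field_type), ...])
        return sum(width_of(t) for _, t in type_desc[1])
    raise ValueError(f"unknown type kind: {kind}")


def param_width(mode, type_desc):
    """Reference parameters occupy one word; value parameters the full width."""
    return WORD if mode == "reference" else width_of(type_desc)


def assign_offsets(declarations):
    """Walk one declaration section; return ({name: offset}, total width)."""
    offset = 0
    table = {}
    for name, type_desc in declarations:
        table[name] = offset           # record the current offset as an attribute
        offset += width_of(type_desc)  # then advance by the object's width
    return table, offset
```

The same `assign_offsets` routine can be run over the field list of a record to obtain field offsets relative to the start of the record, and the total it returns for a procedure's locals feeds into the activation record size.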
Handling structured data types: Storage for an array, record, string, or more complex structured type built from these types and the basic types is allocated as a consecutive block of memory. Access to individual elements of an array or string is handled by generating an address calculation using the base address of the object, the index of the desired element, and the size of the elements. You are free to choose the layout of elements of an array in your implementation (e.g., row major or column major). Access to individual components of a record is done by performing an address calculation that adds the base address of the record and the offset of the desired component within the record. Access to more complex objects can be done by combining the above actions. See Chapter 8 for more details on these calculations.
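As a worked illustration of these address calculations, the sketch below computes addresses numerically, assuming a row-major layout. The function names and the example numbers are hypothetical; in the real code generator the same arithmetic would be emitted as SPARC instructions (or folded at compile time when the indices are constants).

```python
def array_element_address(base, indices, lower_bounds, dims, elem_size):
    """Row-major address of a[i1, ..., ik].

    base         -- address of the first element
    indices      -- the index values i1..ik
    lower_bounds -- declared lower bound of each dimension
    dims         -- number of elements in each dimension
    elem_size    -- width of one element in bytes
    """
    linear = 0
    for idx, low, dim in zip(indices, lower_bounds, dims):
        # Horner-style accumulation: shift by this dimension's extent,
        # then add the zero-based index within the dimension.
        linear = linear * dim + (idx - low)
    return base + linear * elem_size


def record_field_address(base, field_offset):
    """Address of a record component: record base plus the field's offset."""
    return base + field_offset
```

For example, with `a: array[1..3, 1..4] of integer` at base address 100 and 4-byte integers, `a[2, 3]` lies at `100 + ((2-1)*4 + (3-1)) * 4 = 124`. An array of records combines both steps: compute the element address first, then add the field offset.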
Constant folding: For efficiency at run time, all references to constant identifiers should be replaced with the corresponding constant. Also, when generating code for an expression in which all operands are constants known at compile time, the code generator should compute the value of the expression at compile time and avoid generating the expression evaluation. That is, all expressions with constant operands should be folded. You need not perform any global analysis to determine whether the value of an identifier is a constant. This constant folding need only be performed using local context.
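A minimal sketch of local constant folding over an expression tree follows. The tuple-shaped AST nodes (`"num"`, `"id"`, `"binop"`), the `constants` table, and the restriction to `+`, `-`, and `*` are assumptions made for illustration; your AST and operator set will differ.

```python
import operator

# Operators folded in this sketch; division is omitted to avoid
# committing to particular rounding semantics.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node, constants):
    """Return an equivalent tree with constant subexpressions evaluated.

    node      -- ("num", value) | ("id", name) | ("binop", op, left, right)
    constants -- maps constant identifiers to their compile-time values
    """
    kind = node[0]
    if kind == "num":
        return node
    if kind == "id":
        name = node[1]
        # Replace a reference to a constant identifier by its value.
        return ("num", constants[name]) if name in constants else node
    if kind == "binop":
        _, op, left, right = node
        left = fold(left, constants)
        right = fold(right, constants)
        if left[0] == "num" and right[0] == "num":
            return ("num", OPS[op](left[1], right[1]))
        return ("binop", op, left, right)
    raise ValueError(f"unknown node kind: {kind}")
```

Because the function only looks at the operands of each node, it uses purely local context: `x + 2 * 3` becomes `x + 6` when `x` is a variable, and a single literal when `x` is a declared constant.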
Simple Control Flow: Code for simple control statements, namely conditionals and loops in PASC, can be generated according to the semantics of Pascal using the compare, test, and branch instructions of the assembly language. Unique target labels will also have to be generated. Chapter 8 of the text gives examples of code generation for various simple control statements.
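One possible shape for loop code generation is sketched below. The helper names (`new_label`, `gen_while`), the label format, and the assumption that both operands of the comparison are already in registers are illustrative only. The branch mnemonics are the SPARC signed conditional branches, and each branch is followed by a `nop` to fill its delay slot.

```python
import itertools

_label_counter = itertools.count(1)


def new_label():
    """Return a fresh, unique target label."""
    return f".L{next(_label_counter)}"


# Branch to take when the loop condition is FALSE (signed comparisons).
INVERSE_BRANCH = {
    "<": "bge", "<=": "bg", ">": "ble", ">=": "bl", "=": "bne", "<>": "be",
}


def gen_while(left_reg, relop, right_reg, body_lines):
    """Emit code for: while left relop right do body."""
    top = new_label()
    done = new_label()
    code = [
        f"{top}:",
        f"\tcmp\t{left_reg}, {right_reg}",
        f"\t{INVERSE_BRANCH[relop]}\t{done}",  # leave the loop if the test fails
        "\tnop",                               # branch delay slot
    ]
    code += body_lines
    code += [
        f"\tba\t{top}",                        # jump back and re-test
        "\tnop",                               # branch delay slot
        f"{done}:",
    ]
    return code
```

A conditional statement follows the same pattern with a forward branch around the `then` part and, if present, a second label after the `else` part.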
Function calls, prologues and epilogues: Function calls result in the generation of a calling sequence. Upon a call, an activation record for the callee must be set up and control must be transferred to the callee after saving the appropriate information in the activation record. For each procedure, the generated code sequence will consist of a prologue, the code for the statements of the procedure, and the epilogue. Typically, the prologue indicates the registers that should be saved upon a call to this procedure and allocates space on the stack for local variables, whereas the epilogue consists of restoring the saved machine status and returning control to the point in the caller immediately after the point of call.
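The sketch below shows one way the prologue and epilogue might be emitted, assuming the 32-bit SPARC (V8) calling convention, in which `save` allocates the frame and shifts the register window, and `ret` with `restore` in its delay slot undoes both. The 92-byte minimum frame (64 bytes of register-window save area, 4 bytes for a structure-return pointer, 24 bytes for six outgoing argument words) and 8-byte stack alignment are taken from that convention; the function names are hypothetical.

```python
MIN_FRAME = 92  # assumed SPARC V8 minimum frame size in bytes


def frame_size(local_bytes):
    """Frame size for a procedure, rounded up to a multiple of 8 bytes."""
    size = MIN_FRAME + local_bytes
    return (size + 7) & ~7


def gen_procedure(name, local_bytes, body_lines):
    """Wrap a procedure body with a prologue and an epilogue."""
    size = frame_size(local_bytes)
    prologue = [
        f"\t.global\t{name}",
        f"{name}:",
        f"\tsave\t%sp, -{size}, %sp",  # new register window, allocate frame
    ]
    epilogue = [
        "\tret",                       # return to the caller
        "\trestore",                   # delay slot: pop window and frame
    ]
    return prologue + body_lines + epilogue
```

The `local_bytes` argument is where the activation record size stored in the symbol table would be used.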
Parameter passing: In order to correctly handle the formal parameters within the body of the callee, the symbol table entry for each formal parameter must include an attribute that indicates the parameter passing mode (value or reference). Structured objects passed by either call-by-value or call-by-reference are handled the same as simple objects.
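To illustrate why the passing mode must be recorded, the sketch below emits different code for reading and writing a one-word formal parameter depending on that attribute. The symbol table entry is modeled as a hypothetical dictionary with `offset` and `mode` keys, the offsets in the usage example are arbitrary, and `%l7` is an arbitrarily chosen scratch register.

```python
def _slot(entry):
    """Format the parameter's frame slot, e.g. [%fp+68] or [%fp-4]."""
    return f"[%fp{entry['offset']:+d}]"


def gen_load_formal(entry, dest_reg):
    """Load the value of a one-word formal parameter into dest_reg."""
    if entry["mode"] == "value":
        return [f"\tld\t{_slot(entry)}, {dest_reg}"]
    if entry["mode"] == "reference":
        return [
            f"\tld\t{_slot(entry)}, {dest_reg}",  # fetch the address
            f"\tld\t[{dest_reg}], {dest_reg}",    # dereference it
        ]
    raise ValueError(f"unknown mode: {entry['mode']}")


def gen_store_formal(entry, src_reg, addr_reg="%l7"):
    """Store src_reg into a one-word formal parameter."""
    if entry["mode"] == "value":
        return [f"\tst\t{src_reg}, {_slot(entry)}"]
    if entry["mode"] == "reference":
        return [
            f"\tld\t{_slot(entry)}, {addr_reg}",  # fetch the address
            f"\tst\t{src_reg}, [{addr_reg}]",     # write through it
        ]
    raise ValueError(f"unknown mode: {entry['mode']}")
```

For instance, `gen_load_formal({"offset": 68, "mode": "value"}, "%l0")` yields a single `ld`, whereas the same call with mode `"reference"` yields a load of the address followed by a load through it.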
On a procedure call, call-by-value parameters are handled by allocating local storage for the size of the object in the activation record of the callee, then evaluating the actual parameter and initializing that local storage with its value. All accesses to that formal parameter will refer to the local space, so changes to it have no effect on the caller. On a return, no values