Chapter 10/11/12/13
High Level Programming Languages
• Variables and Operators
• The runtime stack
• Emphasis on how C-like languages are converted to LC-3 assembly
2 Wright State University, College of Engineering CEG 320/
A High-Level Language
• Gives symbolic names to values
- don't need to know which register or memory location
• Provides abstraction of underlying hardware
- operations do not depend on instruction set
- example: can write "a = b * c", even though LC-3 doesn't have a multiply instruction
• Provides expressiveness
- use meaningful symbols that convey meaning
- simple expressions for common control patterns (if-then-else)
• Enhances code readability
• Safeguards against bugs
- can enforce rules or conditions at compile-time or run-time
• If it can be specified in a high-level language then it MUST be do-able in assembly!
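To make the "a = b * c" point concrete, here is a minimal sketch of a multiply written with nothing but addition and negation, the kind of sequence a compiler has to produce for a machine with no multiply instruction. The function name multiply_by_addition is hypothetical, and a real compiler may well use a faster shift-and-add scheme instead.

```c
/* Hypothetical helper: multiply using only addition and negation,
   operations that an LC-3-like machine provides directly.
   Assumes c is not the most negative int (negating it would overflow). */
int multiply_by_addition(int b, int c)
{
    int negative = 0;
    int result = 0;

    if (c < 0) {          /* make the loop count non-negative */
        c = -c;
        negative = 1;
    }
    while (c > 0) {       /* add b to the running total, c times */
        result = result + b;
        c = c - 1;
    }
    return negative ? -result : result;
}
```

Calling multiply_by_addition(b, c) stands in for the single C expression b * c; the programmer never has to see the loop.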
Compiling a C Program
• Entire mechanism is usually called the "compiler"
• Preprocessor
- macro substitution
- conditional compilation
- "source-level" transformations: output is still C
• Compiler
- generates object file: machine instructions
• Linker
- combine object files (including libraries) into executable image
[Figure: compilation flow. C Source and Header Files -> C Preprocessor -> Compiler (Source Code Analysis, Symbol Table, Target Code Synthesis) -> Linker (with Library Object Files) -> Executable Image]
Compiler
• Source Code Analysis
- "front end"
- parses a program to identify its pieces: variables, expressions, statements, functions, etc.
- depends on language (not on target machine)
• Code Generation
- "back end"
- generates machine code from analyzed source
- may optimize machine code to make it run more efficiently (consider automated HTML generation...)
- very dependent on target machine
• Symbol Table
- map between symbolic names and items
- like assembler, but more kinds of information
Preprocessor Directives
• #include <stdio.h>
- Before compiling, copy contents of header file (stdio.h) into source code.
- Header files typically contain descriptions of functions and variables needed by the program (no restrictions: could be any C source code).
• #define STOP 0
- Before compiling, replace every occurrence of the token STOP with 0 (text inside string literals is left untouched).
- Called a macro
- Used for values that won't change during execution, but might change if the program is reused. (Must recompile.)
• Every C program must have exactly one function called main().
- Be careful with what you #include!
- main() determines the initial PC.
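The two directives can be seen working together in a short sketch like the one below. The function name count_down is hypothetical; only #include <stdio.h> and #define STOP 0 come from the slide.

```c
#include <stdio.h>   /* the preprocessor copies this header's declarations in */

#define STOP 0       /* every later STOP token becomes 0 before compilation */

/* Hypothetical helper: prints start, start-1, ..., STOP and
   returns how many values were printed. */
int count_down(int start)
{
    int counter;
    int printed = 0;

    for (counter = start; counter >= STOP; counter--) {
        printf("%d\n", counter);
        printed++;
    }
    return printed;
}
```

Changing the stopping point means editing the one #define line and recompiling; the loop itself stays untouched.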
Output with printf
• Variety of I/O functions in C Standard Library.
• Must include <stdio.h> to use them.
• printf: Can print arbitrary expressions, including formatted variables
  printf("%d\n", startPoint - counter);
• Print multiple expressions with a single statement
  printf("%d %d\n", counter, startPoint - counter);
• Different formatting options:
- %d decimal integer
- %x hexadecimal integer
- %c ASCII character
- %f floating-point number
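The four format specifiers from the printf slide can be tried side by side with a sketch such as the following. It uses snprintf, the standard C99 relative of printf that writes into a character buffer instead of the screen, so the result is easy to inspect. The function name format_samples is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes value as decimal, hexadecimal, and a
   character, followed by ratio as a floating-point number, into buf.
   Returns the number of characters the full text needs. */
int format_samples(char *buf, size_t size, int value, double ratio)
{
    /* %x expects an unsigned int, hence the cast */
    return snprintf(buf, size, "%d %x %c %f",
                    value, (unsigned int)value, value, ratio);
}
```

With value 65 and ratio 0.5 the buffer holds "65 41 A 0.500000": the same integer shown as decimal, hexadecimal, and its ASCII character, followed by the default six decimal places of %f.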
Input with scanf
• Many of the same formatting characters are available for user input.
• scanf("%c", &nextChar);
- reads a single character and stores it in nextChar
• scanf("%f", &radius);
- reads a floating point number and stores it in radius
• scanf("%d %d", &length, &width);
- reads two decimal integers (separated by whitespace), stores the first one in length and the second in width
• Must use address-of operator (&) for variables being modified.
- We'll revisit pass by reference/value in a future lecture
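A small sketch of the two-integer case, assuming the input arrives as a string rather than from the keyboard: sscanf follows the same format rules as scanf but reads from text, which makes the behavior easy to check. The function name read_dimensions is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads two whitespace-separated decimal integers
   from text. Returns the number of items converted (2 on success). */
int read_dimensions(const char *text, int *length, int *width)
{
    /* length and width are already addresses, so no & is needed here;
       the caller supplies them with the address-of operator */
    return sscanf(text, "%d %d", length, width);
}
```

A caller would write read_dimensions("12 7", &length, &width), passing addresses so the function can modify the caller's variables. Checking that the return value equals 2 is a simple way to detect malformed input.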
Data Types
• Variables are used as names for data items.
• Each variable has a type, type qualifiers, and a storage class, which tell the compiler how the data is to be interpreted (and how much space it needs, etc.).
  int counter;
• Basic data types:
- Integral: int (at least 16 bits); qualifiers: signed, unsigned, long
- Floating-point: float (at least 32 bits), double
- Character: char (at least 8 bits)
- Enumerated: enum hobbits {bilbo, frodo, samwise, pippen, merry}
• Storage class: automatic, static, register
• Derived data types: pointers, arrays, structures
• Exact size can vary, depending on processor
- int is supposed to be "natural" integer size;
- for LC-3, that's 16 bits; for most modern processors, 32 bits
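Because the exact sizes depend on the processor and compiler, one way to see them on a given machine is to ask the compiler directly with sizeof and CHAR_BIT. This is only a sketch; the function names bits_in_int and print_type_sizes are hypothetical.

```c
#include <limits.h>   /* CHAR_BIT: number of bits in a char */
#include <stdio.h>

/* Hypothetical helper: number of bits an int occupies on this machine. */
int bits_in_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Hypothetical helper: prints the storage size of the basic types. */
void print_type_sizes(void)
{
    printf("char:   %d bits\n", CHAR_BIT);
    printf("int:    %d bits\n", bits_in_int());
    printf("long:   %d bits\n", (int)(sizeof(long) * CHAR_BIT));
    printf("float:  %d bits\n", (int)(sizeof(float) * CHAR_BIT));
    printf("double: %d bits\n", (int)(sizeof(double) * CHAR_BIT));
}
```

The printed numbers will differ between an LC-3 compiler and a desktop compiler, which is exactly the point of the last bullet.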
Variables and Scope
• Where are variables stored? Where can they be accessed? Why?
• All C variables are defined as being in one of two storage classes
- Automatic storage class (on the stack, uninitialized)
- Static storage class (in the global memory area, initialized to 0)
• Compiler infers scope from where variable is declared unless specified
- programmer doesn't have to explicitly state (but can!)
- auto int x;
- static int y;
• Global: accessed anywhere in program (default static)
- Global variable is declared outside all blocks
• Local: only accessible in a particular region (default automatic)
- Variable is local to the block in which it is declared
- block defined by open and closed braces { }
- can access variable declared in any "containing" block
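The difference between the two storage classes shows up most clearly when a function is called more than once. The sketch below is illustrative only; the names call_count_global and next_ticket are hypothetical.

```c
int call_count_global;        /* global: static storage, starts at 0 */

/* Hypothetical helper: hands out 1, 2, 3, ... on successive calls. */
int next_ticket(void)
{
    static int issued;        /* static storage: keeps its value between calls */
    int step = 1;             /* automatic storage: recreated on every call */

    issued = issued + step;
    call_count_global++;      /* visible here because it is global */
    return issued;
}
```

If issued were declared without static, it would live on the stack, start each call with an unspecified value, and the function could not count.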
Allocating Space for Variables
• Global data section
- All global variables stored here (actually all static variables)
- R4 points to beginning (global pointer)
• Run-time stack
- Used for local variables
- R6 points to top of stack (stack pointer)
- R5 points to top frame on stack (frame pointer)
- New frame for each block (goes away when block exited)
• Offset = distance from beginning of storage area
- Global: LDR R1, R4, #
- Local: LDR R2, R5, #-3
[Figure: memory map from 0x0000 to 0xFFFF with three regions: instructions (PC), global data (R4), and run-time stack (R6, R5)]
A software stack
• Implemented in memory
- The Top Of Stack moves as new data is entered
- Here R6 is the TOS register, a pointer to the Top Of Stack
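A software stack of this kind can be sketched in C with an array standing in for memory and an index standing in for R6. The names stack_memory, top_of_stack, push, and pop are hypothetical, and the stack is assumed to grow toward lower addresses, as the LC-3 run-time stack does.

```c
#define STACK_WORDS 16

int stack_memory[STACK_WORDS];
/* Plays the role of R6: index of the current top of stack.
   One past the end of the array means the stack is empty. */
int top_of_stack = STACK_WORDS;

/* Returns 1 on success, 0 if the stack is full (overflow). */
int push(int value)
{
    if (top_of_stack == 0)
        return 0;
    top_of_stack--;                        /* TOS moves toward lower addresses */
    stack_memory[top_of_stack] = value;
    return 1;
}

/* Returns 1 on success, 0 if the stack is empty (underflow). */
int pop(int *value)
{
    if (top_of_stack == STACK_WORDS)
        return 0;
    *value = stack_memory[top_of_stack];
    top_of_stack++;                        /* TOS moves back toward higher addresses */
    return 1;
}
```

The data itself never moves; only the top-of-stack pointer changes on each push and pop.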
Example
  #include <stdio.h>

  int itsGlobal = 0;

  int main(void)
  {
    int itsLocal = 1;     /* local to main */
    printf("Global %d Local %d\n", itsGlobal, itsLocal);
    {
      int itsLocal = 2;   /* local to this block */
      itsGlobal = 4;      /* change global variable */
      printf("Global %d Local %d\n", itsGlobal, itsLocal);
    }
    printf("Global %d Local %d\n", itsGlobal, itsLocal);
    return 0;
  }
• Output
  Global 0 Local 1
  Global 4 Local 2
  Global 4 Local 1
Example: Compiling to LC-3
  #include <stdio.h>

  int inGlobal;

  int main(void)
  {
    int inLocal;
    int outLocalA;
    int outLocalB;

    inLocal = 5;
    inGlobal = 3;

    /* perform calculations */
    outLocalA = inLocal++ & ~inGlobal;
    outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);

    /* print results */
    printf("The results are: outLocalA = %d, outLocalB = %d\n",
           outLocalA, outLocalB);
    return 0;
  }

Symbol table:
  Name       Type  Offset  Scope
  inGlobal   int    0      global
  inLocal    int    0      main
  outLocalA  int   -1      main
  outLocalB  int   -2      main
The stack frame
• Local variables are stored in a stack frame associated with the current scope
- As we change scope, we effectively change both the top and the bottom of the stack
- R6 is the stack pointer: holds the address of the top of the stack
- R5 is the frame pointer: holds the address of the base of the current frame
• Symbol table "offset" gives the distance from the base of the frame.
- A new frame is pushed on the run-time stack each time a block is entered.
- Because stack grows downward (towards memory address x0000) the base is the highest address of the frame, and variable offsets are zero or negative.
[Figure: stack frame for main, from lower to higher addresses: outLocalB, outLocalA, inLocal (R5 points here)]
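The frame-relative addressing described above can be imitated in C with an array for memory and two integers for R5 and R6. This is a simplified sketch under stated assumptions: the frame holds only the three locals of main from the example, and all names (memory, r5, r6, enter_main_frame, load_local, store_local) are hypothetical.

```c
#define MEMORY_WORDS 32

int memory[MEMORY_WORDS];   /* hypothetical stand-in for LC-3 memory */
int r5;                     /* frame pointer: base of the current frame */
int r6 = MEMORY_WORDS;      /* stack pointer: one past the end = empty stack */

/* Offsets of main's locals, as listed in the example's symbol table. */
enum { IN_LOCAL = 0, OUT_LOCAL_A = -1, OUT_LOCAL_B = -2 };

/* Make room for three locals. The stack grows downward, so the
   base of the frame is its highest address. */
void enter_main_frame(void)
{
    r5 = r6 - 1;            /* base of the new frame (holds inLocal) */
    r6 = r6 - 3;            /* top of stack now sits at outLocalB */
}

/* Comparable to LDR Rx, R5, #offset */
int load_local(int offset)
{
    return memory[r5 + offset];
}

/* Comparable to STR Rx, R5, #offset */
void store_local(int offset, int value)
{
    memory[r5 + offset] = value;
}
```

After enter_main_frame(), store_local(IN_LOCAL, 5) mirrors the assignment inLocal = 5, and the variable furthest from the base (outLocalB, offset -2) sits exactly where R6 points.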