Code Generation - Assignment 9 | CMSC 430, Assignments of Computer Science

lecture 9 Material Type: Assignment; Class: INTRO TO COMPILERS; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;

Uploaded on 02/13/2009

Code generation
source code → front end → IR → opt. → IR → back end → target code
Code generation steps
1. Source code → intermediate representation
  generate intermediate representation during parse based on syntax, symbol tables
2. Intermediate representation → target code
  instruction selection: choose instructions based on target instr set
  memory management: decide storage for variables; allocate registers
  linkage code: determine prolog/epilog code
  instruction scheduling: choose instruction execution order
CMSC 430 Lecture 9, Page 1
Abstract syntax trees
Abstract syntax tree (AST)
  stores syntactic structure of program
        +
       / \
    <x>   *
         / \
      <2>   <y>
Building the AST
construct node for LHS of production
fill in fields using RHS of production
Example
E0 ::= E1 '+' E2   { E0.val = node('+', E1.val, E2.val); }
     | E1 '*' E2   { E0.val = node('*', E1.val, E2.val); }
     | id          { E0.val = node(id.val); }
     | num         { E0.val = node(num.val); }
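The semantic actions above can be sketched in Python. This is a minimal illustration, not the course's implementation: the `Node` class, its field names, and the `show` helper are assumptions, and the calls are written out by hand in the bottom-up order a parser would run them for x + 2 * y.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                      # operator symbol, identifier name, or numeral
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def node(label, left=None, right=None):
    """Build one AST node, mirroring node(...) in the grammar actions."""
    return Node(str(label), left, right)


# Actions run bottom-up while parsing x + 2 * y
x = node("x")                 # E -> id
two = node(2)                 # E -> num
y = node("y")                 # E -> id
prod = node("*", two, y)      # E -> E '*' E
tree = node("+", x, prod)     # E -> E '+' E


def show(n):
    """Render the tree as a fully parenthesized expression."""
    if n.left is None and n.right is None:
        return n.label
    return f"({show(n.left)} {n.label} {show(n.right)})"


print(show(tree))
```

The grammar as written is ambiguous, so the shape of the tree depends on the parser applying the usual precedence of '*' over '+'.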
Instruction selection
Code templates
template for each language construct
ignore surrounding context
simple recursive approach
Language constructs
1. simple expressions
2. control structures
3. procedure calls
4. complex expressions
Applying templates
use syntax during parse
apply tree rewriting to AST
Intermediate representation
We’ll be targeting RISC-like processors
load-store architecture
register-transfer language
three-address code
explicit loads and stores
Examples
load  r1, <addr>      $ r1 ← value at <addr>
loadi r1, <const>     $ r1 ← value of <const>
store r1, <addr>      $ <addr> ← r1
move  r1, r2          $ r1 ← r2
add   r1, r2, r3      $ r1 ← r2 + r3
sub   r1, r2, r3      $ r1 ← r2 - r3
mult  r1, r2, r3      $ r1 ← r2 * r3
jmp   <addr>          $ jump to <addr>
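To make the meaning of these instructions concrete, here is a small interpreter sketch in Python. It is an illustration under assumptions: instructions are modeled as tuples, memory is a dictionary keyed by symbolic addresses, and `jmp` is left out because the sketch only handles straight-line code.

```python
def run(program, memory):
    """Execute a straight-line list of (op, *operands) tuples; return the registers."""
    regs = {}
    for op, *args in program:
        if op == "load":        # load r1, <addr>   : r1 <- value at <addr>
            regs[args[0]] = memory[args[1]]
        elif op == "loadi":     # loadi r1, <const> : r1 <- <const>
            regs[args[0]] = args[1]
        elif op == "store":     # store r1, <addr>  : <addr> <- r1
            memory[args[1]] = regs[args[0]]
        elif op == "move":      # move r1, r2       : r1 <- r2
            regs[args[0]] = regs[args[1]]
        elif op in ("add", "sub", "mult"):
            a, b = regs[args[1]], regs[args[2]]
            regs[args[0]] = {"add": a + b, "sub": a - b, "mult": a * b}[op]
        else:
            raise ValueError(f"unsupported op: {op}")
    return regs


# z <- x + 2 * y, with every operand loaded explicitly before use
program = [
    ("load", "r1", "x"),
    ("loadi", "r2", 2),
    ("load", "r3", "y"),
    ("mult", "r4", "r2", "r3"),
    ("add", "r5", "r1", "r4"),
    ("store", "r5", "z"),
]
memory = {"x": 10, "y": 4}
regs = run(program, memory)
print(memory["z"])
```

Every arithmetic instruction reads and writes registers only, which is the load-store discipline the slide describes.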
Simple expressions

Expression trees:
  adopt a simple treewalk scheme
  assign a virtual register to each operator
  emit code in postorder walk

Support routines:
  addr( str ): returns the name of a virtual register that contains the base address for str
  newtemp(): returns a new virtual register name

Assume:
  tree reflects precedence, associativity
  all operands are integers


Simple expressions

expr( node )
    int result, t1, t2, t3;
    switch( type of node )
        case PLUS:
            t1 = expr( left child of node );
            t2 = expr( right child of node );
            result = newtemp();
            emit( add, result, t1, t2 );
            break;
        case ID:
            t3 = addr( node.str );
            result = newtemp();
            emit( load, result, t3 );
            break;
        case NUM:
            result = newtemp();
            emit( loadi, result, node.val );
            break;
    return result
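The treewalk above translates almost directly into Python. The following is a sketch under assumptions: tree nodes are modeled as tuples, `addr` is assumed to hand out a fresh virtual register for each variable's address the first time it is asked, and instructions are collected as strings in a list named `code`.

```python
import itertools

code = []                          # emitted instructions, in order
_counter = itertools.count(1)
_addr_of = {}                      # variable name -> register holding its address


def newtemp():
    """Return a new virtual register name."""
    return f"r{next(_counter)}"


def addr(name):
    """Return a virtual register assumed to hold the base address of `name`."""
    if name not in _addr_of:
        _addr_of[name] = newtemp()
    return _addr_of[name]


def emit(op, *operands):
    code.append(f"{op} " + ", ".join(str(o) for o in operands))


def expr(node):
    """Postorder walk; node is ('PLUS', l, r), ('ID', name) or ('NUM', value)."""
    kind = node[0]
    if kind == "PLUS":
        t1 = expr(node[1])
        t2 = expr(node[2])
        result = newtemp()
        emit("add", result, t1, t2)
    elif kind == "ID":
        t3 = addr(node[1])
        result = newtemp()
        emit("load", result, t3)
    elif kind == "NUM":
        result = newtemp()
        emit("loadi", result, node[1])
    else:
        raise ValueError(f"unknown node type: {kind}")
    return result


# x + (4 + y)
tree = ("PLUS", ("ID", "x"), ("PLUS", ("NUM", 4), ("ID", "y")))
final = expr(tree)
print("\n".join(code))
```

Because operands are visited before their operator, each `add` is emitted only after the registers it reads have been defined.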


Simple expressions

          PLUS
         /    \
     ID x      PLUS
              /    \
         NUM 4      ID y

load  r2, r1          $ r1 ← addr(x)
loadi r3, 4           $ constant
load  r5, r4          $ r4 ← addr(y)
add   r6, r3, r5      $ r6 ← 4 + y
add   r7, r2, r6      $ r7 ← x + (4 + y)


Control structures

Assignment statement
  lhs ← rhs

Strategy
  evaluate rhs to a value (an rvalue)
  evaluate lhs to an address (an lvalue)
    i) lvalue is register ⇒ move it
    ii) lvalue is address ⇒ store it

Registers versus memory
  non-aliased scalars ⇒ can go in a register
  aggregate or potentially aliased ⇒ in memory
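The two cases of the strategy can be sketched as follows. This is an illustration, not the lecture's code: the `in_register` table, the `@name` notation for a symbolic memory address, and the register names are all hypothetical, and the right-hand side is assumed to have been evaluated into a register already.

```python
def gen_assign(lhs, rhs_reg, in_register, code):
    """Emit code for `lhs <- rhs`, given the register that holds the rvalue.

    in_register maps register-resident variables to their register names;
    every other variable is assumed to live in memory.
    """
    if lhs in in_register:
        # i) lvalue is a register: move the value into it
        code.append(f"move {in_register[lhs]}, {rhs_reg}")
    else:
        # ii) lvalue is an address: store the value to memory
        code.append(f"store {rhs_reg}, @{lhs}")
    return code


in_register = {"i": "r10"}                   # a non-aliased scalar kept in a register
code = []
gen_assign("i", "r3", in_register, code)     # i <- value in r3
gen_assign("a", "r4", in_register, code)     # a is potentially aliased, so it stays in memory
print("\n".join(code))
```

Deciding which variables may appear in `in_register` is the aliasing question raised under "Registers versus memory".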


Boolean expressions

Two schools of thought on representation:

Numerical Values
  assign numerical values to true and false
  evaluate booleans like arithmetic expressions

Control Flow
  represent boolean value by location in code
  convert to numerical value when stored

Neither representation dominates the other.


Boolean expressions

Numerical Values
  assign a value to true (say …)
  assign a value to false (say …)
  use hardware: and, or, not, xor

Choose values that make the hardware work.


Boolean expressions

Numerical Values

Source Expression          Generated Code
b or ( c and not d )       t ← not d
                           t ← c and t
                           t ← b or t

a < b                      if (a < b) br L1
                           t ← false
                           br L2
                           L1: t ← true
                           L2: nop

A numerical representation handles logic well.
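A generator for the numerical representation can be sketched as a postorder walk. The tuple encoding of the tree and the numbered temporaries `t1`, `t2`, ... are assumptions of this sketch; the instruction text follows the style of the generated code above.

```python
import itertools


def gen_numeric(tree):
    """Return (code, result_name) for a boolean tree evaluated to a value."""
    code = []
    temps = itertools.count(1)
    labels = itertools.count(1)

    def walk(n):
        kind = n[0]
        if kind == "var":
            return n[1]
        if kind == "not":
            operand = walk(n[1])
            t = f"t{next(temps)}"
            code.append(f"{t} <- not {operand}")
            return t
        if kind in ("and", "or"):
            left = walk(n[1])
            right = walk(n[2])
            t = f"t{next(temps)}"
            code.append(f"{t} <- {left} {kind} {right}")
            return t
        if kind == "lt":
            # A relation needs branches to turn the comparison into a value.
            t = f"t{next(temps)}"
            l_true = f"L{next(labels)}"
            l_done = f"L{next(labels)}"
            code.extend([
                f"if ({n[1]} < {n[2]}) br {l_true}",
                f"{t} <- false",
                f"br {l_done}",
                f"{l_true}: {t} <- true",
                f"{l_done}: nop",
            ])
            return t
        raise ValueError(f"unknown node type: {kind}")

    result = walk(tree)
    return code, result


logic, logic_result = gen_numeric(
    ("or", ("var", "b"), ("and", ("var", "c"), ("not", ("var", "d"))))
)
relation, relation_result = gen_numeric(("lt", "a", "b"))
print("\n".join(logic + relation))
```

The logical operators map one-to-one onto instructions, while the single relation costs five lines, which is the trade-off the slide points at.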


Boolean expressions

Control Flow
  use conditional branches and comparator
  chain of branches to evaluate expression
  code looks terrible

Clean up:
  branch to next statement
  branch to branch

Control flow representation works well for expressions in conditional statements.


Boolean expressions

Control Flow

Source Expression                 Generated Code
a < b or ( c < d and e < f )      if a < b br LT
                                  br L1
                                  L1: if c < d br L2
                                  br LF
                                  L2: if e < f br LT
                                  br LF
                                  LF: code under false, or t ← false
                                  br LEXIT
                                  LT: code under true, or t ← true
                                  br LEXIT
                                  LEXIT:

This works well when the expression's value is tested but not preserved for later reuse.
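The branch chain for the control-flow representation can be produced by passing a true target and a false target down the tree. This is a sketch under assumptions: the tuple encoding is hypothetical, every relation is taken to be `<`, and labels are emitted on lines of their own rather than attached to the next instruction.

```python
import itertools


def gen_flow(tree, l_true="LT", l_false="LF"):
    """Emit branches that reach l_true or l_false; no boolean value is computed."""
    code = []
    labels = itertools.count(1)

    def walk(n, t, f):
        kind = n[0]
        if kind == "lt":
            code.append(f"if {n[1]} < {n[2]} br {t}")
            code.append(f"br {f}")
        elif kind == "or":
            right = f"L{next(labels)}"
            walk(n[1], t, right)       # left true: the whole expression is true
            code.append(f"{right}:")   # left false: try the right operand
            walk(n[2], t, f)
        elif kind == "and":
            right = f"L{next(labels)}"
            walk(n[1], right, f)       # left false: the whole expression is false
            code.append(f"{right}:")   # left true: the right operand decides
            walk(n[2], t, f)
        else:
            raise ValueError(f"unknown node type: {kind}")

    walk(tree, l_true, l_false)
    return code


# a < b or (c < d and e < f)
tree = ("or", ("lt", "a", "b"), ("and", ("lt", "c", "d"), ("lt", "e", "f")))
branches = gen_flow(tree)
print("\n".join(branches))
```

The caller is expected to place the code for the false and true cases at `LF` and `LT`. Several of the emitted `br` instructions target the very next line, which is the "branch to next statement" clean-up opportunity.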


Boolean expressions

What about "short circuiting"?
Do the semantics require evaluating all terms of an expression?
  once value established, stop evaluating
  ( true or expr ) is true
  ( false and expr ) is false
  save cycles in evaluation

Order of evaluation
  if specified, must be observed
  if not, reorder by cost and short-circuit
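The difference between short-circuit and full evaluation is observable whenever a term has a side effect. A small Python demonstration, where the `term` helper and the `calls` log are hypothetical names used only to record which operands were evaluated:

```python
calls = []


def term(name, value):
    """A boolean term with a visible side effect: it records that it ran."""
    calls.append(name)
    return value


# Short-circuit: Python's `or` and `and` stop once the value is established.
calls.clear()
v_or = term("a", True) or term("b", False)
short_or = list(calls)

calls.clear()
v_and = term("a", False) and term("b", True)
short_and = list(calls)

# Full evaluation: the bitwise operator on bools evaluates both operands.
calls.clear()
v_full = term("a", True) | term("b", False)
full_or = list(calls)

print(short_or, short_and, full_or)
```

Python fixes left-to-right order and short-circuiting for `and` and `or`, so it is an example of a language where the order "must be observed".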


Boolean expressions

Reality
  either approach works fairly well
  numerical code reflects logical constructs
  control flow code works well for relations
  compiler can choose based on context

Control flow
  accounting nightmare: tracking labels
  backpatching is the right answer
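The slides name backpatching without showing it. The sketch below follows one common formulation and is not necessarily the one taught in the lecture: branches are emitted with an empty target, each subexpression returns the lists of branches still waiting for a true or false target, and the targets are filled in once known. All function names, the index-based targets, and the use of `<` for every relation are assumptions.

```python
code = []                           # each entry is [instruction text, target index or None]


def emit(op, target=None):
    """Append an instruction and return its index so it can be patched later."""
    code.append([op, target])
    return len(code) - 1


def backpatch(holes, target):
    """Fill every pending branch in `holes` with the now-known target index."""
    for i in holes:
        code[i][1] = target


def gen_rel(a, b):
    """Code for a < b; returns (truelist, falselist) of unpatched branches."""
    t = emit(f"if {a} < {b} br")
    f = emit("br")
    return [t], [f]


def gen_or(left, make_right):
    """`left` is already emitted; `make_right` emits the right operand when called."""
    ltrue, lfalse = left
    backpatch(lfalse, len(code))    # left false: fall into the right operand
    rtrue, rfalse = make_right()
    return ltrue + rtrue, rfalse


def gen_and(left, make_right):
    ltrue, lfalse = left
    backpatch(ltrue, len(code))     # left true: continue with the right operand
    rtrue, rfalse = make_right()
    return rtrue, lfalse + rfalse


# a < b or (c < d and e < f)
truelist, falselist = gen_or(
    gen_rel("a", "b"),
    lambda: gen_and(gen_rel("c", "d"), lambda: gen_rel("e", "f")),
)
false_at = emit("t <- false")
skip = emit("br")
true_at = emit("t <- true")
exit_at = emit("nop")
backpatch(falselist, false_at)
backpatch(truelist, true_at)
backpatch([skip], exit_at)

for i, (op, target) in enumerate(code):
    print(i, op, "" if target is None else target)
```

No label names are invented or tracked during the walk; only lists of instruction indices are carried around, which is what removes the label accounting.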
