Intermediate code for compiler design, Lecture notes of Compiler Design

Typology: Lecture notes

2017/2018

Uploaded on 04/20/2018

kamal-kishore-1

Dixita Kagathara


2014 | Sem-VII | Intermediate Code Generation

170701 – Compiler Design

1) What is intermediate code?

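Intermediate code is:
• The output of the front end (the parser, followed by the static type checker and the intermediate code generator) and the input to the code generator.
• Relatively machine-independent, which allows the compiler to be retargeted.
• Relatively easy to manipulate (optimize).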

Advantages of Using an Intermediate Language
• Retargeting: build a compiler for a new machine by attaching a new code generator to an existing front end.
• Optimization: reuse intermediate code optimizers in compilers for different languages and different machines.

2) Explain Different Intermediate forms

There are three types of intermediate representation:

  1. Abstract syntax tree
  2. Postfix notation
  3. Three address code

1. Abstract syntax tree
• A syntax tree depicts the natural hierarchical structure of a source program.
• A DAG (Directed Acyclic Graph) gives the same information in a more compact way, because common sub-expressions are identified and shared.
• The syntax tree and the DAG for the assignment statement x = -a*b + -a*b are given below.

[Figure: position of the intermediate code generator in the compiler: Parser → Static type checker → Intermediate code generator → Code generator]

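[Figure: syntax tree for x = -a*b + -a*b. The root Assign has the children x and +; each operand of + is a separate * node with the children Uminus(a) and b.]

[Figure: DAG for the same statement. The root Assign has the children x and +; both operands of + point to one shared * node with the children Uminus(a) and b.]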
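
The difference between the two figures can be sketched in code. The following is a minimal illustration, not part of the original notes: the class name `DagBuilder` and the tuple encoding of nodes are assumptions made for this example. A node is created only the first time a given operator and operand combination is requested, so the common sub-expression -a*b is stored once.

```python
class DagBuilder:
    """Hypothetical helper: builds a DAG by reusing structurally identical nodes."""

    def __init__(self):
        self.nodes = []   # node id -> (op, child ids...)
        self.index = {}   # (op, child ids...) -> node id

    def node(self, op, *children):
        key = (op, *children)
        if key not in self.index:          # create the node only the first time
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]


def build_assignment(builder):
    """Build x = -a*b + -a*b bottom-up and return the id of the root node."""
    left = builder.node("*", builder.node("uminus", builder.node("a")), builder.node("b"))
    right = builder.node("*", builder.node("uminus", builder.node("a")), builder.node("b"))
    return builder.node("assign", builder.node("x"), builder.node("+", left, right))


dag = DagBuilder()
root = build_assignment(dag)
for node_id, node in enumerate(dag.nodes):
    print(node_id, node)
```

The full syntax tree for this statement has 11 nodes, while the DAG built here has 7, because `left` and `right` resolve to the same node id.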

2. Postfix notation
• Postfix notation is a linearized representation of a syntax tree.
• It is a list of the nodes of the tree in which each node appears immediately after its children.
• The postfix notation of the above syntax tree is: x a uminus b * a uminus b * + = (uminus denotes the unary minus).

3. Three address code
• The general form of a three address statement is: a := b op c
• Here a, b and c are names, constants or compiler-generated temporaries, and op is an operator.
• For a statement like a = b + c + d the three address code is:

    t1 := b + c
    t2 := t1 + d
    a := t2

• Here t1 and t2 are temporary names generated by the compiler.
• At most three addresses are allowed in a statement (two for the operands and one for the result). Hence this representation is called three-address code.
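
As a rough illustration of the last two forms, the sketch below derives both the postfix string and the three address code from one syntax tree. The nested-tuple encoding of the tree and the function names are assumptions made for this example, not something the notes prescribe.

```python
import itertools

# Tree as nested tuples: (op, child, ...) for operators, plain strings for names.
TREE = ("=", "x", ("+", ("*", ("uminus", "a"), "b"),
                        ("*", ("uminus", "a"), "b")))


def postfix(node):
    """List the nodes so that each one appears right after its children."""
    if isinstance(node, str):
        return [node]
    op, *children = node
    out = []
    for child in children:
        out.extend(postfix(child))
    return out + [op]


def three_address(node, code, temps):
    """Append statements for node to code; return the name holding its value."""
    if isinstance(node, str):
        return node
    op, *children = node
    args = [three_address(child, code, temps) for child in children]
    if op == "=":
        code.append(f"{args[0]} := {args[1]}")
        return args[0]
    result = f"t{next(temps)}"              # a fresh temporary for every operator
    if len(args) == 1:
        code.append(f"{result} := {op} {args[0]}")
    else:
        code.append(f"{result} := {args[0]} {op} {args[1]}")
    return result


print(" ".join(postfix(TREE)))
code = []
three_address(TREE, code, itertools.count(1))
print("\n".join(code))
```

Both functions are post-order walks of the tree: postfix notation records the nodes themselves, while three address code names every interior node with a temporary.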

3) Implementation of three address code

• Three representations are used for three address code: quadruples, triples and indirect triples.
• Consider the input statement x := -a*b + -a*b
• The three address code for the above statement is given below:

    t1 := uminus a
    t2 := t1 * b
    t3 := uminus a
    t4 := t3 * b
    t5 := t2 + t4
    x := t5

Quadruple representation
• A quadruple is a structure with at most four fields: op, arg1, arg2 and result.
• The op field holds the internal code for the operator, arg1 and arg2 hold the two operands, and the result field stores the result of the expression.
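
The three representations can be compared on the statement x := -a*b + -a*b. The sketch below is only illustrative: the `Quad` record, the use of `None` for an unused field and the helper `to_triples` are assumptions made here. It follows the usual textbook convention that a triple has no result field and refers to an earlier result by the position of the triple that computes it.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")

# Quadruples for x := -a*b + -a*b (None marks an unused field).
quads = [
    Quad("uminus", "a", None, "t1"),
    Quad("*", "t1", "b", "t2"),
    Quad("uminus", "a", None, "t3"),
    Quad("*", "t3", "b", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad(":=", "t5", None, "x"),
]


def to_triples(quads):
    """Drop the result field; an integer argument refers to an earlier triple."""
    position = {}    # temporary name -> index of the triple that computes it
    triples = []
    for q in quads:
        arg1 = position.get(q.arg1, q.arg1)
        arg2 = position.get(q.arg2, q.arg2)
        if q.op == ":=":
            triples.append((":=", q.result, arg1))   # target first, then value
        else:
            position[q.result] = len(triples)
            triples.append((q.op, arg1, arg2))
    return triples


triples = to_triples(quads)

# Indirect triples: a separate list of pointers (here, indices) into the triples.
# Statements can be reordered by permuting this list, leaving the triples alone.
statement_order = list(range(len(triples)))

for i in statement_order:
    print(i, triples[i])
```

Quadruples name every result explicitly, which makes statements easy to move around; triples save the temporaries at the cost of position-dependent references, and the extra list in indirect triples restores the freedom to reorder.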

4) Syntax directed translation mechanisms

• To obtain three address code, a syntax directed definition (SDD) or translation scheme, that is, a set of semantic rules, must be written for each source language construct.
• Semantic rules can be defined for various programming constructs.
• Using these rules, the corresponding intermediate code can be generated in the form of three address code. The programming constructs are:

  1. Declarative statement
  2. Assignment statement
  3. Arrays
  4. Boolean expressions
  5. Control statement
  6. Switch case
  7. Procedure call

Declarative Statement
• In declarative statements, data items are declared along with their data types.
• Example:

    S -> D                  { offset := 0 }
    D -> id : T             { enter(id.name, T.type, offset); offset := offset + T.width }
    T -> integer            { T.type := integer; T.width := 4 }
    T -> real               { T.type := real; T.width := 8 }
    T -> array[num] of T1   { T.type := array(num.val, T1.type); T.width := num.val * T1.width }
    T -> *T1                { T.type := pointer(T1.type); T.width := 4 }

• Initially, the value of offset is set to zero. After each declaration the offset is advanced using the formula offset = offset + width.

• In the above translation scheme, T.type and T.width are synthesized attributes. The type indicates the data type of the corresponding identifier, and the width indicates the number of memory units occupied by an identifier of that type.

• The rule D -> id : T declares the identifier id. The enter function creates the symbol table entry for the identifier along with its type and offset.

• The width of an array is obtained by multiplying the width of each element by the number of elements in the array.

• The width of a pointer type is assumed to be 4.
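
The offset computation described above can be sketched as follows. This is an illustration under stated assumptions: types are modelled as tuples, the symbol table is a plain dictionary, and a sequence of declarations is processed in a loop, although the grammar shown covers a single declaration. The widths (4, 8, 4) are the ones used in the translation scheme.

```python
# Types are modelled as tuples (an assumption made for this sketch):
#   ("integer",), ("real",), ("array", n, element_type), ("pointer", target_type)

def width(t):
    """Return T.width for a type, using the widths from the translation scheme."""
    kind = t[0]
    if kind == "integer":
        return 4
    if kind == "real":
        return 8
    if kind == "array":
        return t[1] * width(t[2])        # num.val * T1.width
    if kind == "pointer":
        return 4
    raise ValueError(f"unknown type: {t!r}")


def declare(declarations):
    """Process (name, type) pairs in order; return (symbol_table, final_offset)."""
    symbol_table = {}
    offset = 0                                  # offset := 0
    for name, t in declarations:
        symbol_table[name] = (t, offset)        # enter(id.name, T.type, offset)
        offset += width(t)                      # offset := offset + T.width
    return symbol_table, offset


INTEGER, REAL = ("integer",), ("real",)
table, total = declare([
    ("i", INTEGER),
    ("x", REAL),
    ("a", ("array", 10, INTEGER)),
    ("p", ("pointer", REAL)),
])
for name, (t, off) in table.items():
    print(name, off)
print("total width:", total)
```

With these four declarations the offsets are 0, 4, 12 and 52, and the total width is 56.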

Assignment Statements
• An assignment statement mainly deals with expressions. The expressions can be of type integer, real, array and record.
• Consider the following grammar:

    S -> id := E
    E -> E1 + E2
    E -> E1 * E2
    E -> -E1
    E -> (E1)
    E -> id

• The translation scheme for the above grammar is given below:

    Production Rule    Semantic actions
    S -> id := E       { p := look_up(id.name); if p ≠ nil then Emit(p ':=' E.place) else Error }
    E -> E1 + E2       { E.place := newtemp(); Emit(E.place ':=' E1.place '+' E2.place) }
    E -> E1 * E2       { E.place := newtemp(); Emit(E.place ':=' E1.place '*' E2.place) }
    E -> -E1           { E.place := newtemp(); Emit(E.place ':=' 'uminus' E1.place) }
    E -> (E1)          { E.place := E1.place }
    E -> id            { p := look_up(id.name); if p ≠ nil then E.place := p else Error }

• look_up returns the entry for id.name in the symbol table if it exists there. Otherwise an error is reported.

• The function Emit appends a three address statement to the output file.

• newtemp() is the function that generates new temporary variables.

• E.place holds the name that will hold the value of E. Consider the assignment statement x:=(a+b)
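
The translation scheme for assignments can be tried out with a small program. The sketch below is an assumption-laden illustration, not the method of the notes: it uses a hand-written recursive-descent parser in which `*` binds tighter than `+` and unary minus binds tightest (the grammar above leaves precedence open), a Python set stands in for the symbol table, and the tokenizer silently skips characters it does not recognise. The methods `newtemp`, `emit` and `look_up` mirror the functions named in the scheme.

```python
import re


class Translator:
    def __init__(self, symbols):
        self.symbols = set(symbols)   # stands in for the symbol table
        self.code = []                # emitted three address statements
        self.temp_count = 0

    def newtemp(self):
        self.temp_count += 1
        return f"t{self.temp_count}"

    def emit(self, text):
        self.code.append(text)

    def look_up(self, name):
        if name not in self.symbols:
            raise NameError(f"undeclared identifier: {name}")
        return name

    def translate(self, source):
        self.tokens = re.findall(r"[A-Za-z_]\w*|:=|[-+*()]", source)
        self.pos = 0
        target = self.look_up(self.next())        # S -> id := E
        self.expect(":=")
        place = self.expr()
        self.emit(f"{target} := {place}")
        return self.code

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, token):
        if self.next() != token:
            raise SyntaxError(f"expected {token}")

    def binary(self, op, left, right):
        place = self.newtemp()                    # E.place := newtemp()
        self.emit(f"{place} := {left} {op} {right}")
        return place

    def expr(self):                               # E -> E1 + E2
        place = self.term()
        while self.peek() == "+":
            self.next()
            place = self.binary("+", place, self.term())
        return place

    def term(self):                               # E -> E1 * E2
        place = self.factor()
        while self.peek() == "*":
            self.next()
            place = self.binary("*", place, self.factor())
        return place

    def factor(self):
        token = self.next()
        if token == "-":                          # E -> -E1
            operand = self.factor()
            place = self.newtemp()
            self.emit(f"{place} := uminus {operand}")
            return place
        if token == "(":                          # E -> (E1)
            place = self.expr()
            self.expect(")")
            return place
        return self.look_up(token)                # E -> id


result = Translator({"x", "a", "b"}).translate("x := -a*b + -a*b")
print("\n".join(result))
```

Each parsing method returns the place of the sub-expression it recognised, which plays the role of the synthesized attribute E.place.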

    E -> NOT E1          { E.place := newtemp(); Emit(E.place ':=' 'NOT' E1.place) }
    E -> (E1)            { E.place := E1.place }
    E -> id1 relop id2   { E.place := newtemp();
                           Emit('if' id1.place relop.op id2.place 'goto' next_state + 3);
                           Emit(E.place ':=' '0');
                           Emit('goto' next_state + 2);
                           Emit(E.place ':=' '1') }
    E -> TRUE            { E.place := newtemp(); Emit(E.place ':=' '1') }
    E -> FALSE           { E.place := newtemp(); Emit(E.place ':=' '0') }

• The function Emit generates the three address code, and newtemp() generates temporary variables.
• The semantic action for the rule E -> id1 relop id2 uses next_state, which gives the index of the next three address statement in the output sequence.
• Let us take an example and generate the three address code using the above translation scheme: p > q AND r < s OR u > v

    100: if p > q goto 103
    101: t1 := 0
    102: goto 104
    103: t1 := 1
    104: if r < s goto 107
    105: t2 := 0
    106: goto 108
    107: t2 := 1
    108: if u > v goto 111
    109: t3 := 0
    110: goto 112
    111: t3 := 1
    112: t4 := t1 AND t2
    113: t5 := t4 OR t3
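
The role of next_state is easier to see in running code. The sketch below is a minimal illustration with assumed names (`BoolTranslator`, `relop`, `binary`). The `relop` method follows the semantic action for E -> id1 relop id2; the `binary` method for AND and OR is inferred from the last two statements of the example rather than taken from a rule shown in the notes, and the numbering is assumed to start at 100 as in the example.

```python
class BoolTranslator:
    def __init__(self, start=100):
        self.start = start
        self.code = []
        self.temp_count = 0

    @property
    def next_state(self):
        """Index of the next three address statement to be emitted."""
        return self.start + len(self.code)

    def emit(self, text):
        self.code.append(f"{self.next_state}: {text}")

    def newtemp(self):
        self.temp_count += 1
        return f"t{self.temp_count}"

    def relop(self, left, op, right):
        """E -> id1 relop id2: leaves 1 (true) or 0 (false) in a temporary."""
        place = self.newtemp()
        self.emit(f"if {left} {op} {right} goto {self.next_state + 3}")
        self.emit(f"{place} := 0")
        self.emit(f"goto {self.next_state + 2}")
        self.emit(f"{place} := 1")
        return place

    def binary(self, op, left, right):
        """Assumed action for AND / OR: combine two places into a new temporary."""
        place = self.newtemp()
        self.emit(f"{place} := {left} {op} {right}")
        return place


# p > q AND r < s OR u > v, with AND binding tighter than OR
bt = BoolTranslator()
t1 = bt.relop("p", ">", "q")
t2 = bt.relop("r", "<", "s")
t3 = bt.relop("u", ">", "v")
t4 = bt.binary("AND", t1, t2)
t5 = bt.binary("OR", t4, t3)
print("\n".join(bt.code))
```

Because next_state is read at the moment each statement is emitted, the same offsets (+3 and +2) produce the right jump targets wherever the comparison happens to land in the output.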

Flow of control statement
• The control statements are if-then, if-then-else and while-do.
• The grammar for such statements is given below:

    S -> if E then S1 | if E then S1 else S2 | while E do S1

• The translation scheme is given below:

    S -> if E then S1
        { E.true := new_label();
          E.false := S.next;
          S1.next := S.next;
          S.code := E.code || gen_code(E.true ':') || S1.code }

    S -> if E then S1 else S2
        { E.true := new_label();
          E.false := new_label();
          S1.next := S.next;
          S2.next := S.next;
          S.code := E.code || gen_code(E.true ':') || S1.code
                    || gen_code('goto' S.next)
                    || gen_code(E.false ':') || S2.code }

    S -> while E do S1
        { S.begin := new_label();
          E.true := new_label();
          E.false := S.next;
          S1.next := S.begin;
          S.code := gen_code(S.begin ':') || E.code
                    || gen_code(E.true ':') || S1.code
                    || gen_code('goto' S.begin) }

• Consider the statement: if a<b then a=a+5 else a=a+
• The three address code for the above statement using the semantic rules is given below. Statement 103 is the 'goto' S.next of the scheme, which skips the else part; 105 is the statement that follows the if statement.

    100: if a<b goto L1
    101: goto 104
    102: L1: a=a+5
    103: goto 105
    104: a=a+

(For array and case statements, refer to the textbook.)
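
The rules for if-then-else and while-do can be sketched as two functions that concatenate code lists. Several things here are assumptions made for illustration: E.code is taken to be the usual pair of jumps ("if cond goto E.true" followed by "goto E.false"), labels are symbolic names L1, L2, ... produced by a hypothetical `label_source`, and the sample statements are made up rather than taken from the notes.

```python
import itertools


def label_source():
    """Return a new_label() function that yields L1, L2, ... on successive calls."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"


def gen_if_else(cond, then_code, else_code, s_next, new_label):
    """S -> if E then S1 else S2"""
    e_true, e_false = new_label(), new_label()
    return ([f"if {cond} goto {e_true}", f"goto {e_false}"]   # E.code
            + [f"{e_true}:"] + then_code                      # S1.code
            + [f"goto {s_next}"]                              # skip the else part
            + [f"{e_false}:"] + else_code)                    # S2.code


def gen_while(cond, body_code, s_next, new_label):
    """S -> while E do S1"""
    begin, e_true = new_label(), new_label()
    e_false = s_next                                          # leave the loop
    return ([f"{begin}:"]
            + [f"if {cond} goto {e_true}", f"goto {e_false}"] # E.code
            + [f"{e_true}:"] + body_code                      # S1.code
            + [f"goto {begin}"])                              # re-test the condition


new_label = label_source()
if_code = gen_if_else("a < b", ["a := a + 5"], ["b := b + 1"], "Lnext", new_label)
loop_code = gen_while("i < n", ["i := i + 1"], "Lexit", new_label)
print("\n".join(if_code))
print("\n".join(loop_code))
```

Passing `s_next` in from the caller mirrors the inherited attribute S.next: the statement does not know where control goes afterwards, so the enclosing context supplies the label.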