IR Code Generation: Approach, Arithmetic Expressions, and Boolean Expressions

Part of the IR Code Generation I course notes by Jingke Li of Portland State University. The notes cover the approach to IR code generation, the handling of arithmetic expressions, and the challenges of handling Boolean expressions, with code examples and explanations.

Uploaded on 08/16/2009

IR Code Generation (Part I)
Jingke Li
Portland State University
Jingke Li (Portland State University) CS322 IR Code Generation I 1 / 23
IR Code Generation
Input: AST representation of a source language
Output: Three-address code or IR Tree code
Approach: Syntax-directed translation
A Generic Source Language Grammar:
E -> E1 arithop E2 | E1 relop E2 | E1 logicop E2
E -> ’-’ E1 | ’!’ E1
E -> ’newArray’ E1 // new (integer) array of size E1
E -> E1 ’[’ E2 ’]’ // array element
S -> E1 ’:=’ E2 ’;’
S -> ’if’ ’(’ E ’)’ ’then’ S1 ’else’ S2
S -> ’while’ ’(’ E ’)’ S1
S -> ’print’ E ’;’
S -> ’return’ E ’;’

IR Codegen Issue Overview

  • Arithmetic Expressions: simple to handle
  • Boolean Expressions: the IR version is much more restrictive; needs major changes
  • Array definitions and references: slightly complex
  • Statements: mostly straightforward


Arithmetic Expressions

  • Generating Three-Address Code: introduces a new temp for each operation and uses two attributes: E.s holds the statements that evaluate E; E.t is the temp that holds the value of E.

E -> E1 arithop E2
    t = new Temp();
    E.s := [ E1.s; E2.s; t := E1.t arithop E2.t; ]
    E.t := t;

E -> ’-’ E1
    t = new Temp();
    E.s := [ E1.s; t := - E1.t; ]
    E.t := t;
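The two arithmetic rules can be sketched in Python as follows. This is a minimal illustration rather than the course's implementation: the tuple-based AST, the `new_temp` helper, and the `gen` function are all names assumed for the example.

```python
import itertools

_temp_ids = itertools.count(1)

def new_temp():
    """Return a fresh temp name (hypothetical naming scheme: t1, t2, ...)."""
    return f"t{next(_temp_ids)}"

def gen(e):
    """Return (E.s, E.t) for an expression AST.

    Assumed AST shapes: a str or int leaf, ("neg", e1), or (op, e1, e2).
    """
    if isinstance(e, (str, int)):
        return [], str(e)              # a leaf needs no code; its name is its value
    if e[0] == "neg":
        s1, t1 = gen(e[1])
        t = new_temp()
        return s1 + [f"{t} := - {t1};"], t
    op, e1, e2 = e
    s1, t1 = gen(e1)
    s2, t2 = gen(e2)
    t = new_temp()                     # one new temp per operation
    return s1 + s2 + [f"{t} := {t1} {op} {t2};"], t

# a + b * (-c)
code, result = gen(("+", "a", ("*", "b", ("neg", "c"))))
```

Here `code` holds the statements in evaluation order (operands before the operation that uses them), and `result` names the temp that carries the value of the whole expression.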

Value Representation Approach

One approach for handling Boolean expressions is to encode true and false numerically into 1 and 0, and map Boolean expressions into conditional jump statements.

Example: a < 5 || b > 2

    t1 := 1;
    if (a < 5) goto L1;
    t1 := 0;
    L1:
    t2 := 1;
    if (b > 2) goto L2;
    t2 := 0;
    L2:
    t3 := 1;
    if (t1 == 1) goto L3;
    if (t2 == 1) goto L3;
    t3 := 0;
    L3:

    (ESEQ [ [MOVE t3 (CONST 1)]
            [CJUMP == (ESEQ [ [MOVE t1 (CONST 1)]
                              [CJUMP < (NAME a) (CONST 5) L1]
                              [MOVE t1 (CONST 0)]
                              [LABEL L1] ] t1)
                      (CONST 1) L3]
            [CJUMP == (ESEQ [ [MOVE t2 (CONST 1)]
                              [CJUMP > (NAME b) (CONST 2) L2]
                              [MOVE t2 (CONST 0)]
                              [LABEL L2] ] t2)
                      (CONST 1) L3]
            [MOVE t3 (CONST 0)]
            [LABEL L3] ]
          t3)


Better Handling for Logical Operations

Many architectures provide hardware support for bit-wise logical operations such as and, or, xor, and not. We also know that

    1 and 1 = 1;  1 and 0 = 0;  0 and 1 = 0;  0 and 0 = 0;
    1 or 1 = 1;   1 or 0 = 1;   0 or 1 = 1;   0 or 0 = 0;

Taking advantage of this, when using the value representation for Boolean expressions, logical operations can simply be translated into arithmetic operations with the corresponding bit-wise operators. For instance, the expression a<5 || b>2 can be translated to

    t1 := 1;
    if (a < 5) goto L1;
    t1 := 0;
    L1:
    t2 := 1;
    if (b > 2) goto L2;
    t2 := 0;
    L2:
    t3 := t1 or t2;

    (BINOP || (ESEQ [ [MOVE t1 (CONST 1)]
                      [CJUMP < (NAME a) (CONST 5) L1]
                      [MOVE t1 (CONST 0)]
                      [LABEL L1] ] t1)
              (ESEQ [ [MOVE t2 (CONST 1)]
                      [CJUMP > (NAME b) (CONST 2) L2]
                      [MOVE t2 (CONST 0)]
                      [LABEL L2] ] t2))
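The claim that bit-wise operators can stand in for logical ones on the 0/1 encoding is easy to check exhaustively. The Python sketch below does so; `bool_to_int` is a helper name assumed for the example. It also shows one pitfall: the bit-wise complement does not implement '!' on this encoding, which is consistent with the notes translating '!' as `1 - E1.t`.

```python
from itertools import product

def bool_to_int(b):
    """Value representation: true -> 1, false -> 0."""
    return 1 if b else 0

# Check every combination of truth values.
for a, b in product([False, True], repeat=2):
    x, y = bool_to_int(a), bool_to_int(b)
    assert (x | y) == bool_to_int(a or b)    # bit-wise or matches ||
    assert (x & y) == bool_to_int(a and b)   # bit-wise and matches &&
    assert (x ^ 1) == bool_to_int(not a)     # xor with 1 flips a 0/1 value
    assert (1 - x) == bool_to_int(not a)     # so does subtracting from 1

# Bit-wise complement flips every bit, so it leaves the 0/1 encoding.
complement_of_one = ~1
```

In Python, `~1` evaluates to `-2`, not `0`, so a complement instruction alone cannot negate a value-represented Boolean.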

Value Representation Approach (cont.)

E -> E1 relop E2

  • Three-Address Code:
        L = new Label(); t = new Temp();
        E.s := [ E1.s; E2.s;
                 t := 1;
                 if (E1.t relop E2.t) goto L;
                 t := 0;
                 L: ]
        E.t := t;
  • IR Tree Code:
        L = new NAME(); t = new TEMP();
        E.tr := (ESEQ [ [MOVE t (CONST 1)]
                        [CJUMP relop E1.tr E2.tr L]
                        [MOVE t (CONST 0)]
                        [LABEL L] ] t)
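As a rough illustration of the three-address rule for relational operators, the following Python sketch emits the four statements for a single comparison. The function name `gen_relop` and the `t`/`L` numbering scheme are assumptions made for the example, and the operands are taken to be already evaluated (their own code, E1.s and E2.s, would be emitted first).

```python
import itertools

_temps = itertools.count(1)
_labels = itertools.count(1)

def gen_relop(op, left, right):
    """Return (E.s, E.t) for `left op right`; left/right are temp or variable names."""
    label = f"L{next(_labels)}"
    t = f"t{next(_temps)}"
    stmts = [
        f"{t} := 1;",                               # assume the test is true
        f"if ({left} {op} {right}) goto {label};",  # keep the 1 if it holds
        f"{t} := 0;",                               # otherwise overwrite with 0
        f"{label}:",
    ]
    return stmts, t

code, temp = gen_relop("<", "a", "5")
```

The pattern "set to 1, conditionally skip the reset to 0" needs only one label and one conditional jump per comparison.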


Value Representation Approach (cont.)

More Three-Address Code:

E -> E1 ’||’ E2

    L = new Label(); t = new Temp();
    E.s := [ E1.s; E2.s;
             t := 1;
             if (E1.t==1) goto L;
             if (E2.t==1) goto L;
             t := 0;
             L: ]
    E.t := t;

E -> E1 ’&&’ E2

    L = new Label(); t = new Temp();
    E.s := [ E1.s; E2.s;
             t := 0;
             if (E1.t==0) goto L;
             if (E2.t==0) goto L;
             t := 1;
             L: ]
    E.t := t;

E -> ’!’ E1

    t = new Temp();
    E.s := [ E1.s; t := 1 - E1.t; ]
    E.t := t;
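To see that these jump sequences really compute the logical operations, the Python sketch below mirrors each one, with an early `return` playing the role of `goto L`. The function names are assumptions for the example; the inputs are the 0/1 values held in E1.t and E2.t.

```python
def or_value(e1_t, e2_t):
    t = 1
    if e1_t == 1:       # if (E1.t==1) goto L
        return t
    if e2_t == 1:       # if (E2.t==1) goto L
        return t
    t = 0
    return t            # L:

def and_value(e1_t, e2_t):
    t = 0
    if e1_t == 0:       # if (E1.t==0) goto L
        return t
    if e2_t == 0:       # if (E2.t==0) goto L
        return t
    t = 1
    return t            # L:

def not_value(e1_t):
    return 1 - e1_t     # t := 1 - E1.t

# Rows are (E1.t, E2.t, or result, and result).
table = [(x, y, or_value(x, y), and_value(x, y)) for x in (0, 1) for y in (0, 1)]
```

Note that in these rules both E1.s and E2.s are emitted before any test, so both operands are always evaluated.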

Control-Flow Representation (cont.)

[More Efficient Code:]

    if (a < 5) goto L4;
    if (b > 2) goto L4;
    goto L5;
    L4: [code for S1]     // then clause
    goto L6;
    L5: [code for S2]     // else clause
    L6:

In this new version, there is no need to create all those temps for holding 0s and 1s.

One Issue Remaining — The two labels, L4 and L5, are not available when the Boolean expression (a<5 || b>2) is being processed. How can the conditional jump statements be generated, then?

Answer: Use the idea of “back-patching” — Each block of code may contain jumps to unresolved labels; these labels will be patched when the environment of the block is processed.
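One way to picture back-patching is to emit each conditional jump with an empty target and keep a list of the jumps that still need one. The Python sketch below does this for comparisons and `||` only, with "false" always falling through; the class and function names are assumptions for the example, and it is a simplification rather than the scheme the notes develop in full.

```python
class CondJump:
    """'if (cond) goto target;' whose target may still be unresolved."""
    def __init__(self, cond):
        self.cond = cond
        self.target = None            # hole, filled in later by patch()

    def __str__(self):
        return f"if ({self.cond}) goto {self.target};"

def relop(cond):
    """A comparison: one jump with a hole; the false case falls through."""
    jump = CondJump(cond)
    return [jump], [jump]             # (code, jumps waiting for the true label)

def logical_or(left, right):
    """E1 || E2: no new instructions, just concatenate and merge the holes."""
    (code1, holes1), (code2, holes2) = left, right
    return code1 + code2, holes1 + holes2

def patch(holes, label):
    for jump in holes:
        jump.target = label

def if_else(cond, then_code, else_code, labels):
    """Lay out the statement; only now are the labels known, so patch here."""
    l_then, l_else, l_end = labels
    code, holes = cond
    patch(holes, l_then)
    lines = [str(jump) for jump in code]
    lines += [f"goto {l_else};", f"{l_then}:"] + then_code
    lines += [f"goto {l_end};", f"{l_else}:"] + else_code + [f"{l_end}:"]
    return lines

cond = logical_or(relop("a < 5"), relop("b > 2"))
program = if_else(cond, ["[code for S1]"], ["[code for S2]"], ("L4", "L5", "L6"))
```

The resulting `program` matches the efficient code for `if (a<5 || b>2) S1 else S2`: the expression is translated before L4 and L5 exist, and the enclosing statement fills them in.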


Back-Patching Example

if (a<5 || b>2) S1 else S2;

  • Handling a<5:
        if (a < 5) goto ;     // needs to be patched
  • Handling b>2:
        if (b > 2) goto ;     // needs to be patched
  • Handling ..||..:
        if (a < 5) goto ;     // .. else fall through
        if (b > 2) goto ;     // is patched to
        goto                  // needs to be patched
  • Handling if .. S1 else S2:
        if (a < 5) goto L4;   // is patched to L4
        if (b > 2) goto L4;
        goto L5;              // is patched to L5
        L4: [code for S1]     // then clause
        goto L6;
        L5: [code for S2]     // else clause
        L6:

Note that in this approach, the logical operations are implemented by properly patching the expressions’ labels; no actual new code is generated for them.

Back-Patching Jump Labels

  • Three-Address Code: add two attributes to each Boolean expression: E.true is the position to jump to when E evaluates to true; E.false is the position to jump to when E evaluates to false.

E -> E1 relop E2

    E.s := [ E1.s; E2.s;
             if (E1.t relop E2.t) goto E.true;
             goto E.false; ]

E -> E1 ’||’ E2

    E1.true := E.true;   E1.false := new Label();
    E2.true := E.true;   E2.false := E.false;
    E.s := [ E1.s; E1.false: E2.s; ]


Back-Patching: Three-Address Code (cont.)

E -> E1 ’&&’ E2

    E1.true := new Label();   E1.false := E.false;
    E2.true := E.true;        E2.false := E.false;
    E.s := [ E1.s; E1.true: E2.s; ]

E -> ’!’ E1

    E1.true := E.false;   E1.false := E.true;
    E.s := E1.s;
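The E.true / E.false rules can be sketched as a recursive Python function that receives the two target labels from its caller. The AST shape and the names `jump_code` and `new_label` are assumptions for the example. One further assumption is labeled in the code: a comparison ends with an explicit `goto` to its false label, so the result is correct no matter what code follows it.

```python
import itertools

_labels = itertools.count(1)

def new_label():
    return f"L{next(_labels)}"

def jump_code(e, true, false):
    """Return statements that jump to `true` or `false` depending on e.

    Assumed AST: ("rel", op, a, b), ("or", e1, e2), ("and", e1, e2), ("not", e1).
    """
    kind = e[0]
    if kind == "rel":
        _, op, a, b = e
        # Assumption: always emit the false jump rather than falling through.
        return [f"if ({a} {op} {b}) goto {true};", f"goto {false};"]
    if kind == "or":
        mid = new_label()             # E1.false: try E2 when E1 fails
        return jump_code(e[1], true, mid) + [f"{mid}:"] + jump_code(e[2], true, false)
    if kind == "and":
        mid = new_label()             # E1.true: check E2 when E1 holds
        return jump_code(e[1], mid, false) + [f"{mid}:"] + jump_code(e[2], true, false)
    if kind == "not":
        return jump_code(e[1], false, true)   # swap the targets, emit no code
    raise ValueError(f"unknown node kind: {kind}")

code = jump_code(("or", ("rel", "<", "a", 5), ("rel", ">", "b", 2)), "Ltrue", "Lfalse")
```

The output contains a `goto L1;` immediately followed by `L1:`. A jump to the very next statement is redundant, and removing such jumps would recover the shorter fall-through form.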

Converting Back to Value

What if we have boolean x = a<5 || b>2;

We still need to generate a value for the Boolean expression!

This can be implemented by patching the two labels E.true and E.false for the Boolean expression E with two assignment statements for assigning 1 and 0, respectively.

Boolean expression E

    t = new Temp();
    E.true := new Label(); E.false := new Label(); L := new Label();
    E.s := [ E.true: t := 1; goto L;
             E.false: t := 0;
             L: ]
    E.t := t;
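A small Python sketch of this conversion is shown below. It assumes the jump code for the Boolean expression is produced by a callable that takes the two target labels, and that this jump code is emitted first, followed by the two assignments; `as_value`, `a_less_5`, and the default temp and label names are all hypothetical.

```python
def as_value(gen_jumps, t="t1", labels=("L1", "L2", "L3")):
    """Wrap control-flow code so that it leaves 1 or 0 in temp `t`."""
    l_true, l_false, l_end = labels
    stmts = gen_jumps(l_true, l_false) + [
        f"{l_true}:", f"{t} := 1;", f"goto {l_end};",   # true exit
        f"{l_false}:", f"{t} := 0;",                    # false exit
        f"{l_end}:",
    ]
    return stmts, t

def a_less_5(true, false):
    """Hypothetical jump code for the comparison a < 5."""
    return [f"if (a < 5) goto {true};", f"goto {false};"]

code, temp = as_value(a_less_5)
```

After this code runs, `temp` holds the value of the expression and can be assigned to a variable such as `x` like any other temp.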


New Arrays

E -> ’newArray’ E1

  • Storage allocation: follow Java’s array storage convention. The length of the array is stored as the 0th element of the array, so the storage for a 10-element array actually has 11 cells.
  • Cell initialization: all elements are automatically initialized to 0.

Pseudo IR Code:

    L: new Label; t1,t2,t3: new Temps;
    E.s := [ E1.s;
             t1 := (E1.t + 1) * wdSize;    // storage size
             t2 := malloc(t1);             // t2 points to cell 0
             t2[0] := E1.t;                // store array length
             t3 := t2 + (E1.t * wdSize);   // t3 points to last cell
             L: t3[0] := 0;                // init a cell to 0
             t3 := t3 - wdSize;            // move down a cell
             if (t3 > t2) goto L; ]        // loop back
    E.t := t2;
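The allocation and initialization loop can be traced with a toy memory model in Python. Everything about the model is an assumption made for the example: a 4-byte word size, a list of words standing in for memory, and a bump allocator standing in for `malloc`. The body of `new_array` follows the pseudo IR code step by step.

```python
WD_SIZE = 4                      # assumed word size in bytes
memory = [-1] * 64               # backing store of words; -1 marks garbage
_next_free = 0

def malloc(nbytes):
    """Toy bump allocator returning a byte address (assumption for the demo)."""
    global _next_free
    addr = _next_free
    _next_free += nbytes
    return addr

def load(addr):
    return memory[addr // WD_SIZE]

def store(addr, value):
    memory[addr // WD_SIZE] = value

def new_array(n):
    t1 = (n + 1) * WD_SIZE       # storage size, including the length cell
    t2 = malloc(t1)              # t2 points to cell 0
    store(t2, n)                 # store array length
    t3 = t2 + n * WD_SIZE        # t3 points to last cell
    while True:                  # L:
        store(t3, 0)             #   init a cell to 0
        t3 -= WD_SIZE            #   move down a cell
        if not (t3 > t2):        #   if (t3 > t2) goto L
            break
    return t2

a = new_array(3)
```

After the call, the first four words hold the length 3 followed by three zeros. For a zero-length array the loop body still runs once and writes 0 into the length cell, which is harmless because the length is already 0.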


Array Elements

E -> E1 ’[’ E2 ’]’

  • Calculating address: addr(a[i]) = base(a) + (i + 1) × wdSize
  • Bounds-checking: Java checks every array index at run time to make sure it is within bounds.

    L1,L2: new Label; t1,t2,t3,t4: new Temps;
    E.s := [ E1.s; E2.s;
             t1 := E1.t[0];              // array length
             if (E2.t < 0) goto L1;      // index too small
             if (E2.t >= t1) goto L1;    // index too large
             t2 := E2.t + 1;             // skip the length cell
             t3 := t2 * wdSize;          // byte offset of the element
             t4 := E1.t[t3];
             goto L2;
             L1: param E1.t; param E2.t;
             call arrayError,2;
             L2: ]
    E.t := t4;
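The address formula and the bounds check can be exercised with a short Python sketch. The word size, the list-of-words memory, and the `ArrayError` exception (standing in for the `arrayError` runtime routine, whose real behaviour the notes do not specify) are assumptions for the example.

```python
WD_SIZE = 4                                    # assumed word size in bytes

class ArrayError(Exception):
    """Stands in for the arrayError runtime routine (hypothetical behaviour)."""

def element(memory, base, i):
    """Read a[i], where `base` is the byte address of the array's length cell."""
    length = memory[base // WD_SIZE]           # t1 := E1.t[0]
    if i < 0 or i >= length:                   # both tests jump to L1
        raise ArrayError(f"index {i} out of bounds for length {length}")
    offset = (i + 1) * WD_SIZE                 # t2, t3: skip the length cell
    return memory[(base + offset) // WD_SIZE]  # t4 := E1.t[t3]

memory = [3, 10, 20, 30]                       # length 3, then the elements
first = element(memory, 0, 0)
last = element(memory, 0, 2)
```

Because the length occupies cell 0, element `i` lives in cell `i + 1`, which is where the `+ 1` in the address formula comes from.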



Statements

  • Assignments: S -> E1 ’:=’ E2 ’;’

S.s := [ E1.s; E2.s; E1.t := E2.t; ]

  • If Statement: S -> ’if’ ’(’ E ’)’ ’then’ S1 ’else’ S2

    L1,L2,L3: new Labels;
    E.true := L1; E.false := L2;
    S.s := [ E.s; L1: S1.s; goto L3; L2: S2.s; L3: ]
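The if-statement rule can be sketched in Python as a function that creates the three labels, hands two of them to the condition, and lays out the branches. The names `gen_if` and `x_positive`, the fixed default labels, and the convention that the condition is a callable taking its true and false targets are assumptions for the example.

```python
def gen_if(cond_jumps, then_stmts, else_stmts, labels=("L1", "L2", "L3")):
    """Return S.s for 'if (E) then S1 else S2'."""
    l1, l2, l3 = labels
    return (
        cond_jumps(l1, l2)              # E.s with E.true = L1, E.false = L2
        + [f"{l1}:"] + then_stmts       # L1: S1.s
        + [f"goto {l3};"]               # skip the else clause
        + [f"{l2}:"] + else_stmts       # L2: S2.s
        + [f"{l3}:"]                    # L3:
    )

def x_positive(true, false):
    """Hypothetical jump code for the condition x > 0."""
    return [f"if (x > 0) goto {true};", f"goto {false};"]

code = gen_if(x_positive, ["y := 1;"], ["y := 0;"])
```

The statement, not the expression, decides where the true and false targets live, which is why the labels are created here and passed down.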