Compiler Construction: Two-pass Compiler, Cheat Sheet of Compiler Construction

This document gives an overview of the two-pass compiler architecture, a common approach in compiler design. A two-pass compiler consists of a front end and a back end: the front end maps legal source code into an intermediate representation (IR), and the back end maps the IR into target machine code. The document covers the key modules of the front end, the scanner and the parser, and how they work together to recognize legal and illegal programs, report errors, and produce the IR. It also discusses complexity: front-end work is O(n) or O(n log n), while the back end faces NP-complete (NPC) problems. Finally, it introduces the context-free grammars used to specify the syntax of a programming language and shows how the parser uses them to build a parse tree (syntax tree).

Typology: Cheat Sheet

2023/2024

Uploaded on 05/12/2024

muhammad-wasey

Compiler Construction

Lecture 2

Two-pass Compiler

[Figure: source code → Front End → IR → Back End → machine code; both the front end and the back end report errors]

  • Front end maps legal source code into IR
  • Back end maps IR into target machine code
  • Admits multiple front ends & multiple passes
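To make the "multiple front ends" point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration and not part of the lecture: the two toy source notations (infix and postfix arithmetic on integers), the stack-machine IR of tuples, and the made-up target mnemonics `PUSH`, `ADD`, `SUB`. The point is only that two different front ends can produce the same IR, which a single back end then translates.

```python
# Hypothetical IR: a list of stack-machine instructions such as ("push", 1) or ("add",).
OPS = {"+": "add", "-": "sub"}

def _number(word):
    """Convert a word to an int, reporting illegal input as a syntax error."""
    if not (word.isascii() and word.isdigit()):
        raise SyntaxError(f"expected a number, got {word!r}")
    return int(word)

def front_end_infix(source):
    """Toy front end for input like '1 + 2 - 3' (left-to-right, no precedence)."""
    words = source.split()
    if len(words) % 2 == 0:  # also catches empty input
        raise SyntaxError("expected: number (op number)*")
    ir = [("push", _number(words[0]))]
    for op, num in zip(words[1::2], words[2::2]):
        if op not in OPS:
            raise SyntaxError(f"unknown operator {op!r}")
        ir.append(("push", _number(num)))
        ir.append((OPS[op],))
    return ir

def front_end_postfix(source):
    """Toy front end for input like '1 2 + 3 -'."""
    ir, depth = [], 0
    for word in source.split():
        if word in OPS:
            if depth < 2:
                raise SyntaxError(f"operator {word!r} needs two operands")
            ir.append((OPS[word],))
            depth -= 1
        else:
            ir.append(("push", _number(word)))
            depth += 1
    if depth != 1:
        raise SyntaxError("expression does not reduce to a single value")
    return ir

def back_end(ir):
    """Single back end: map each IR instruction to a line of made-up target code."""
    lines = []
    for opcode, *args in ir:
        lines.append(" ".join([opcode.upper(), *map(str, args)]))
    return lines

if __name__ == "__main__":
    for front_end, text in [(front_end_infix, "1 + 2 - 3"), (front_end_postfix, "1 2 + 3 -")]:
        print(back_end(front_end(text)))
```

Both calls print the same instruction list, because the back end only ever sees the IR and never the source notation.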

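  • Front-end work is O(n) or O(n log n) in the size n of the source program
  • Back-end work is hard: many of its problems, e.g. optimal register allocation and instruction scheduling, are NP-Complete (NPC)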

The Front-End

Modules

  • Scanner
  • Parser

[Figure: source code → scanner → tokens → parser → IR; both modules report errors]

Scanner

  • Example: x = x + y becomes
    <id, x> <assign, => <id, x> <op, +> <id, y>
  • We call the pair "<token type, word>" a "token"; in <id, x>, id is the token type and x is the word
  • Typical tokens: number, identifier, +, -, new, while, if
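The scanner's job can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the token-type names follow the example above (`id`, `assign`, `op`, plus `number`), while the regular expressions and the `scan` function are assumptions. Keywords such as new, while and if are not handled here and would be scanned as `id`.

```python
import re

# Token types and their patterns (assumed for this sketch).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("assign", r"="),
    ("op", r"[+\-]"),
    ("skip", r"\s+"),  # whitespace separates words but produces no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return the list of (token type, word) pairs for source, or raise on an illegal character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

if __name__ == "__main__":
    print(scan("x = x + y"))
    # [('id', 'x'), ('assign', '='), ('id', 'x'), ('op', '+'), ('id', 'y')]
```

Each character is visited once, which is consistent with the linear cost quoted for the front end.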

Parser

  • Recognizes context-free syntax and reports errors
  • Guides context-sensitive (“semantic”) analysis
  • Builds IR for source program

Context-Free Grammars

  • Context-free syntax is specified with a grammar G = (S, N, T, P)
  • S is the start symbol
  • N is a set of non-terminal symbols
  • T is a set of terminal symbols or words
  • P is a set of productions or rewrite rules

The Front End

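  • For this CFG:

S = goal

T = { number, id, +, - }

N = { goal, expr, term, op }

P = { 1, 2, 3, 4, 5, 6, 7 }

  • The numbered productions, as they are applied in the derivation of x + 2 - y, are:

1. goal → expr
2. expr → expr op term
3. expr → term
4. term → number
5. term → id
6. op → +
7. op → -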
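One way to hold this grammar in a program is as plain data. The sketch below is an assumption-laden illustration: the slide lists the productions only by number, so the right-hand sides here are inferred from the derivation of x + 2 - y in the Parse table, and the Python names (`PRODUCTIONS`, `is_well_formed`) are hypothetical.

```python
# G = (S, N, T, P) as Python data.
START = "goal"
NONTERMINALS = {"goal", "expr", "term", "op"}
TERMINALS = {"number", "id", "+", "-"}

# Production number -> (left-hand side, right-hand side).
# Right-hand sides are inferred from the Parse table, not stated on this slide.
PRODUCTIONS = {
    1: ("goal", ["expr"]),
    2: ("expr", ["expr", "op", "term"]),
    3: ("expr", ["term"]),
    4: ("term", ["number"]),
    5: ("term", ["id"]),
    6: ("op", ["+"]),
    7: ("op", ["-"]),
}

def is_well_formed():
    """Check the basic consistency conditions of a context-free grammar."""
    symbols = TERMINALS | NONTERMINALS
    return (
        START in NONTERMINALS
        and not (TERMINALS & NONTERMINALS)  # the two symbol sets are disjoint
        and all(lhs in NONTERMINALS for lhs, _ in PRODUCTIONS.values())
        and all(sym in symbols for _, rhs in PRODUCTIONS.values() for sym in rhs)
    )

if __name__ == "__main__":
    print(is_well_formed())
```

Because every left-hand side is a single non-terminal, each rule can be applied regardless of the surrounding symbols, which is what "context-free" means.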

Context-Free Grammars

  • Given a CFG, we can derive sentences by repeated substitution
  • Consider the sentence (expression) x + 2 - y

The Front End

  • To recognize a valid sentence in some CFG, we reverse this process and build up a parse
  • A parse can be represented by a tree: parse tree or syntax tree
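As a rough illustration of building such a tree, the sketch below parses a token list into nested tuples. Several things are assumptions: the productions (inferred from the Parse table), the nested-tuple tree shape, and the technique itself. It is a hand-written top-down parser in which a loop stands in for the left-recursive rule expr → expr op term; it is not necessarily the procedure the lecture has in mind, but it yields the same left-leaning tree.

```python
def parse(tokens):
    """Parse a list of (token type, word) pairs; return a nested-tuple parse tree."""
    pos = 0

    def term():
        # term -> number | id
        nonlocal pos
        if pos < len(tokens) and tokens[pos][0] in ("number", "id"):
            node = ("term", tokens[pos])
            pos += 1
            return node
        raise SyntaxError(f"expected number or id at token index {pos}")

    # expr -> term
    tree = ("expr", term())
    # expr -> expr op term, applied once per remaining "op term" pair
    while pos < len(tokens):
        kind, word = tokens[pos]
        if kind != "op":
            raise SyntaxError(f"expected + or - at token index {pos}")
        pos += 1
        tree = ("expr", tree, ("op", word), term())
    # goal -> expr
    return ("goal", tree)

if __name__ == "__main__":
    tokens = [("id", "x"), ("op", "+"), ("number", "2"), ("op", "-"), ("id", "y")]
    print(parse(tokens))
```

For x + 2 - y the subtree for x + 2 ends up as the left child of the subtraction, and illegal input such as a leading operator is reported as a `SyntaxError`.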

Parse

Production    Result
              goal
1             expr
2             expr op term
5             expr op y
7             expr - y
2             expr op term - y
4             expr op 2 - y
6             expr + 2 - y
3             term + 2 - y
5             x + 2 - y
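The table can be replayed mechanically. The following sketch applies each listed production to the rightmost non-terminal of the current sentential form, which is the order the table follows. The production bodies are inferred from the table itself, and the names `derive` and `STEPS` are hypothetical. For productions 4 and 5 the step also carries the actual word (2, y, x) that stands in for the token class.

```python
NONTERMINALS = {"goal", "expr", "term", "op"}
PRODUCTIONS = {
    1: ("goal", ["expr"]),
    2: ("expr", ["expr", "op", "term"]),
    3: ("expr", ["term"]),
    4: ("term", ["number"]),
    5: ("term", ["id"]),
    6: ("op", ["+"]),
    7: ("op", ["-"]),
}

# (production number, word to write for a number/id, or None)
STEPS = [(1, None), (2, None), (5, "y"), (7, None), (2, None),
         (4, "2"), (6, None), (3, None), (5, "x")]

def derive(steps, start="goal"):
    """Return every sentential form produced by applying steps to the rightmost non-terminal."""
    form = [start]
    history = [" ".join(form)]
    for number, word in steps:
        lhs, rhs = PRODUCTIONS[number]
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions or form[positions[-1]] != lhs:
            raise ValueError(f"production {number} does not apply to {' '.join(form)!r}")
        index = positions[-1]
        form[index:index + 1] = [word] if word is not None else list(rhs)
        history.append(" ".join(form))
    return history

if __name__ == "__main__":
    for line in derive(STEPS):
        print(line)
```

Reading the printed forms from bottom to top gives the order in which a parser would recognize the sentence, which is the "reverse this process" idea from the previous slide.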