Introduction to Parsing - Compiler Construction - Lecture Slides, Slides for Compiler Construction. University of Lucknow
rajnikanth
Description
Main points of this lecture are: Introduction Parsing, Programming Assignment, Required Readings, Lex Manual, Red Dragon Book Chapter, Free Grammars, Parser Overview, Languages Revisited, Derivations, Formal Languages
Introduction to Programming Languages and Compilers

Introduction to Parsing

Docsity.com

Administrivia

• Programming Assignment 2 is out this week

– Due October 1st

– Work in teams begins

• Required Readings

– Lex Manual

– Red Dragon Book Chapter 4

Outline

• Regular languages revisited

• Parser overview

• Context-free grammars (CFGs)

• Derivations

Languages and Automata

• Formal languages are very important in CS

– Especially in programming languages

• Regular languages

– The weakest formal languages widely used

– Many applications

• We will also study context-free languages

Limitations of Regular Languages

• Intuition: A finite automaton that runs long enough must repeat states

• A finite automaton can’t remember the number of times it has visited a particular state

• A finite automaton has finite memory

– Only enough to store which state it is in

– Cannot count, except up to a finite limit

• E.g., the language of balanced parentheses is not regular: { (^i )^i | i ≥ 0 }
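
The counting argument can be made concrete. The sketch below is illustrative only (the function name `is_nested_parens` is not from the lecture): it recognizes { (^i )^i | i ≥ 0 } using an unbounded integer counter, which is exactly the resource a finite automaton does not have.

```python
def is_nested_parens(s: str) -> bool:
    """Return True iff s is i opening parentheses followed by i closing ones."""
    pos = 0
    opens = 0
    # Count the leading run of '('. This counter can grow without bound,
    # whereas a finite automaton only has a fixed number of states.
    while pos < len(s) and s[pos] == "(":
        opens += 1
        pos += 1
    closes = 0
    # Count the run of ')' that follows.
    while pos < len(s) and s[pos] == ")":
        closes += 1
        pos += 1
    # Accept only if all input was consumed and both runs have equal length.
    return pos == len(s) and opens == closes
```

Any fixed bound on the counter would make the function wrong for some longer input, which mirrors "cannot count, except up to a finite limit".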

The Functionality of the Parser

Input: sequence of tokens from lexer

Output: parse tree of the program

Example

• Cool

if x = y then 1 else 2 fi

• Parser input

IF ID = ID THEN INT ELSE INT FI

• Parser output

IF-THEN-ELSE
    =
        ID
        ID
    INT
    INT

Comparison with Lexical Analysis

Phase     Input                     Output

Lexer     Sequence of characters    Sequence of tokens

Parser    Sequence of tokens        Parse tree
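
To illustrate the first row of this comparison, here is a minimal sketch of a lexer for the `if x = y then 1 else 2 fi` example. The token names follow the slide; the function `lex`, the regular expression, and the keyword set are assumptions made for this illustration, not the course's actual lexer.

```python
import re

# Hypothetical keyword set covering only the example above.
KEYWORDS = {"if", "then", "else", "fi"}

# One token per match: an integer, an identifier-like word, or '='.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(=))")

def lex(source: str) -> list[str]:
    """Turn a sequence of characters into a sequence of token names."""
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        number, word, equals = match.groups()
        if number is not None:
            tokens.append("INT")
        elif word is not None:
            # Keywords become their own token; other words are identifiers.
            tokens.append(word.upper() if word in KEYWORDS else "ID")
        else:
            tokens.append("=")
        pos = match.end()
    return tokens
```

Calling `lex("if x = y then 1 else 2 fi")` yields the token sequence shown as the parser input in the earlier example; the parser then consumes that list rather than the raw characters.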

The Role of the Parser

• Not all sequences of tokens are programs . . .

• . . . Parser must distinguish between valid and invalid sequences of tokens

• We need

– A language for describing valid sequences of tokens

– A method for distinguishing valid from invalid sequences of tokens

Context-Free Grammars

• Programming language constructs have recursive structure

• An EXPR is

if EXPR then EXPR else EXPR fi , or

while EXPR loop EXPR pool , or

…

• Context-free grammars are a natural notation for this recursive structure

CFGs (Cont.)

• A CFG consists of

– A set of terminals T

– A set of non-terminals N

– A start symbol S (a non-terminal)

– A set of productions

Assuming X ∈ N

X => ε , or

X => Y1 Y2 ... Yn where Yi ∈ (N ∪ T)
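
The four components map directly onto a small data structure. The following is a sketch under assumptions: the class name `Grammar`, its field names, and the encoding of ε as an empty tuple are choices made here for illustration, not part of the lecture.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # T
    nonterminals: frozenset   # N
    start: str                # S, which must be a member of N
    productions: tuple        # pairs (X, (Y1, ..., Yn)); () encodes epsilon

    def __post_init__(self):
        # Check the side conditions from the definition.
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        if self.terminals & self.nonterminals:
            raise ValueError("T and N must be disjoint")
        symbols = self.terminals | self.nonterminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not in N")
            if any(sym not in symbols for sym in rhs):
                raise ValueError(f"right-hand side of {lhs!r} uses an unknown symbol")

# Example instance: S => ( S ) and S => epsilon.
PARENS = Grammar(
    terminals=frozenset({"(", ")"}),
    nonterminals=frozenset({"S"}),
    start="S",
    productions=(("S", ("(", "S", ")")), ("S", ())),
)
```

Validating in `__post_init__` means an ill-formed grammar (for example, a production whose left-hand side is a terminal) is rejected at construction time.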

Notational Conventions

• In these lecture notes

– Non-terminals are written upper-case

– Terminals are written lower-case

– The start symbol is the left-hand side of the first production

Examples of CFGs

A fragment of Cool:

EXPR => if EXPR then EXPR else EXPR fi

| while EXPR loop EXPR pool

| id

Examples of CFGs (cont.)

Simple arithmetic expressions:

E => E * E

| E + E

| ( E )

| id

The Language of a CFG

Read productions as replacement rules:

X => Y1 ... Yn means X can be replaced by Y1 ... Yn

X => ε means X can be erased (replaced with the empty string)

Key Idea

1. Begin with a string consisting of the start symbol “S”

2. Replace any non-terminal X in the string by a right-hand side of some production

X => Y1 … Yn

3. Repeat (2) until there are no non-terminals in the string
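
The three steps above can be sketched as a short program. Assumptions in this sketch: the grammar is the Cool fragment from the earlier example, stored in a hypothetical dictionary `COOL`; the function name `derive` is illustrative; and step 2 always picks the leftmost non-terminal, although the procedure allows any non-terminal to be chosen.

```python
# Productions of the Cool fragment, indexed 0 (if), 1 (while), 2 (id).
COOL = {
    "EXPR": [
        ["if", "EXPR", "then", "EXPR", "else", "EXPR", "fi"],
        ["while", "EXPR", "loop", "EXPR", "pool"],
        ["id"],
    ],
}

def derive(start, productions, choices):
    """Apply one production per entry of `choices`; return every intermediate string."""
    form = [start]                      # step 1: begin with the start symbol
    steps = [form]
    for choice in choices:
        # Step 2: locate a non-terminal (here, the leftmost one) and replace it.
        index = next((i for i, sym in enumerate(form) if sym in productions), None)
        if index is None:
            raise ValueError("no non-terminal left to replace")
        rhs = productions[form[index]][choice]
        form = form[:index] + rhs + form[index + 1:]
        steps.append(form)
    # Step 3: the derivation is finished only when no non-terminals remain.
    if any(sym in productions for sym in form):
        raise ValueError("derivation is incomplete: non-terminals remain")
    return steps

for step in derive("EXPR", COOL, [0, 2, 2, 2]):
    print(" ".join(step))
```

With the choices `[0, 2, 2, 2]` the loop prints `EXPR`, then the `if` form with three `EXPR` placeholders, then replaces each placeholder by `id` from left to right, ending with `if id then id else id fi`.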

The Language of a CFG (Cont.)

More formally, write

X1 … Xi … Xn => X1 … Xi-1 Y1 … Ym Xi+1 … Xn

if there is a production

Xi => Y1 … Ym

The Language of a CFG (Cont.)

Write

X1 … Xn =>* Y1 … Ym

if

X1 … Xn => … => … => Y1 … Ym

in 0 or more steps

The Language of a CFG

Let G be a context-free grammar with start symbol S. Then the language of G is:

{ a1 … an | S =>* a1 … an and every ai is a terminal }
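
For small grammars this definition can be explored by brute force. The sketch below searches breadth-first over the strings reachable from S and collects those made only of terminals. Everything named here (`language_up_to`, `PARENS_RULES`) is hypothetical, and the search relies on an assumed cutoff: intermediate strings longer than `max_len + 1` symbols are discarded so that the search terminates. That cutoff suffices for the example grammar but can miss strings in grammars whose derivations pass through longer intermediate strings.

```python
from collections import deque

# Example grammar: S => ( S ) | epsilon, with epsilon encoded as ().
PARENS_RULES = {"S": [("(", "S", ")"), ()]}

def language_up_to(productions, start, max_len):
    """Return the strings of L(G) with at most max_len terminals (see cutoff caveat)."""
    seen = {(start,)}
    queue = deque(seen)
    result = set()
    while queue:
        form = queue.popleft()
        # Expanding only the leftmost non-terminal still reaches every string in L(G).
        index = next((i for i, sym in enumerate(form) if sym in productions), None)
        if index is None:
            # Every symbol is a terminal, so this string belongs to L(G).
            if len(form) <= max_len:
                result.add("".join(form))
            continue
        for rhs in productions[form[index]]:
            new_form = form[:index] + tuple(rhs) + form[index + 1:]
            # Assumed cutoff that guarantees termination.
            if len(new_form) <= max_len + 1 and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return result

print(sorted(language_up_to(PARENS_RULES, "S", 4), key=len))
```

For `max_len = 4` the result contains the empty string, `()`, and `(())`.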

Terminals

• Terminals are so called because there are no rules for replacing them

• Once generated, terminals are permanent

• Terminals ought to be tokens of the language

Examples

L(G) is the language of CFG G

Strings of balanced parentheses: { (^i )^i | i ≥ 0 }

Two grammars:

S => ( S )

S => ε

OR

S => ( S ) | ε

Cool Example

A fragment of COOL:

EXPR => if EXPR then EXPR else EXPR fi

| while EXPR loop EXPR pool

| id

Cool Example (Cont.)

Some elements of the language

id

if id then id else id fi

while id loop id pool

if while id loop id pool then id else id fi

if if id then id else id fi then id else id fi
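
Membership in this fragment's language can be checked mechanically. The following is a minimal sketch, not the course's parser: it is a hand-written recursive-descent recognizer, which works here because each alternative of EXPR begins with a different terminal. The names `parse_expr` and `in_language` are illustrative, and tokens are assumed to be separated by whitespace.

```python
def parse_expr(tokens, pos=0):
    """Consume one EXPR starting at pos; return the position just after it."""
    def expect(word, at):
        # Require a specific terminal at position `at`.
        if at >= len(tokens) or tokens[at] != word:
            raise SyntaxError(f"expected {word!r} at token {at}")
        return at + 1

    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    head = tokens[pos]
    if head == "id":                       # EXPR => id
        return pos + 1
    if head == "if":                       # EXPR => if EXPR then EXPR else EXPR fi
        pos = parse_expr(tokens, pos + 1)
        pos = expect("then", pos)
        pos = parse_expr(tokens, pos)
        pos = expect("else", pos)
        pos = parse_expr(tokens, pos)
        return expect("fi", pos)
    if head == "while":                    # EXPR => while EXPR loop EXPR pool
        pos = parse_expr(tokens, pos + 1)
        pos = expect("loop", pos)
        pos = parse_expr(tokens, pos)
        return expect("pool", pos)
    raise SyntaxError(f"unexpected token {head!r} at token {pos}")

def in_language(text):
    """Return True iff the whitespace-separated tokens form exactly one EXPR."""
    tokens = text.split()
    try:
        return parse_expr(tokens) == len(tokens)
    except SyntaxError:
        return False
```

Under these assumptions `in_language("while id loop id pool")` is true, while `in_language("if id then id else id")` is false because the closing `fi` is missing.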

Arithmetic Example

Simple arithmetic expressions:

E => E + E | E * E | ( E ) | id

Some elements of the language:

id           id + id

(id)         id * id

(id) * id    id * (id)

Notes

The idea of a CFG is a big step. But:

• Membership in a language is a “yes” or “no” answer; we also need the parse tree of the input

• Must handle errors gracefully

• Need an implementation of CFGs (e.g., bison)
