Lexical Analyzer and Parser in LR Parsing | CS 5300, Study notes of Computer Science

Material Type: Notes; Professor: Allan; Class: COMPILER DESIGN; Subject: Computer Science; University: Utah State University; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

LR Parsing

CS 5300 - SJAllan 2

Lexical Analyzer and Parser

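[Diagram: the source program feeds the Lexical Analyzer; the Parser sends "get next token" requests to the Lexical Analyzer, which returns one token per request.]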

CS 5300 - SJAllan 3

Shift-Reduce or LR(k) Parsing

• Bottom-up parsing technique
• LR(k) parsing
  – L – scan the input from left to right
  – R – construct a rightmost derivation in reverse
  – k – number of look-ahead symbols needed
• Called a shift-reduce parsing method
  – Shift and reduce are the major actions done by the parser

CS 5300 - SJAllan 4

LR(k) Parsers

• Advantages
  – Can be constructed for virtually all programming language constructs for which a CFG can be written
  – Is more general than other parsers
    • More general than LL(k) parsers: every LL(k) grammar is also LR(k)
  – Can detect syntactic errors as soon as possible on a left-to-right scan of the input
• Disadvantages
  – Too much work to implement by hand; in practice you use a parser generator

CS 5300 - SJAllan 7

Why LR(k) Grammars are Useful

• The parser knows when to cease scanning: given a sentential form αβγ, it can detect the boundary between β and γ
• The parser is able to identify the handle β
• The parser is able to uniquely select a production A → β that corresponds to the handle and to this sentential form
  – Grammars can be LR(k) and yet have productions A → β, B → β with the same right-hand side
• The parser knows when to stop

CS 5300 - SJAllan 8

LR(k) (Shift-Reduce) Parsing

• Parsing performed by a finite state machine (together with a stack)
• Parsing algorithm is language-independent
• FSM driven by table(s) generated automatically from the grammar
• Language → Generator → Tables

CS 5300 - SJAllan 9

Model of LR Parser

[Diagram: the LR parsing program reads the input a1 … ai … an $, keeps a stack s0 … xm-1 sm-1 xm sm (sm on top), consults the parse table (action and goto parts), and produces the output.]

CS 5300 - SJAllan 10

LR Parsing Algorithm

set ip to point to the first symbol in ω$
initialize stack to s0
repeat forever
    let s be the topmost state on the stack
    let a be the symbol pointed to by ip
    if action[s,a] = shift s’
        push a then s’ onto the stack
        advance ip to the next input symbol
    else if action[s,a] = reduce A → β
        pop 2*|β| symbols off the stack
        let s’ be the state now on top of the stack
        push A then goto[s’,A] onto the stack
        output production A → β
    else if action[s,a] = accept
        return success
    else error()
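The algorithm above can be sketched in Python roughly as follows. The tiny grammar (S → ( S ) | x) and its hand-built SLR(1) table are assumptions made for this illustration and do not come from the notes; the names ACTION, GOTO, PRODUCTIONS and lr_parse are likewise hypothetical.

```python
# Hypothetical grammar, chosen only to keep the table small:
#   0. S' -> S    1. S -> ( S )    2. S -> x
PRODUCTIONS = {1: ("S", ["(", "S", ")"]), 2: ("S", ["x"])}

# action[state, terminal]; any missing entry is an error.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
# goto[state, nonterminal]
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_parse(tokens):
    """Return the list of production numbers used, in reduction order."""
    tokens = list(tokens) + ["$"]
    stack = [0]          # alternating: state, symbol, state, ...
    ip = 0
    output = []
    while True:
        s = stack[-1]
        a = tokens[ip]
        act = ACTION.get((s, a))
        if act is None:
            raise SyntaxError(f"unexpected {a!r} in state {s}")
        if act[0] == "shift":
            stack += [a, act[1]]            # push a, then the new state
            ip += 1
        elif act[0] == "reduce":
            lhs, rhs = PRODUCTIONS[act[1]]
            # pop 2*|rhs| entries (symbols and states)
            del stack[len(stack) - 2 * len(rhs):]
            exposed = stack[-1]
            stack += [lhs, GOTO[(exposed, lhs)]]
            output.append(act[1])
        else:                               # accept
            return output


print(lr_parse(["(", "(", "x", ")", ")"]))
```

Only the two dictionaries depend on the grammar; the driver loop itself would stay the same for any other LR table.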

CS 5300 - SJAllan 13

Another Example

      Stack                 Input              Action
(1)   0                     id * id + id $     shift
(2)   0 id 5                * id + id $        reduce by F → id
(3)   0 F 3                 * id + id $        reduce by T → F
(4)   0 T 2                 * id + id $        shift
(5)   0 T 2 * 7             id + id $          shift
(6)   0 T 2 * 7 id 5        + id $             reduce by F → id
(7)   0 T 2 * 7 F 10        + id $             reduce by T → T * F
(8)   0 T 2                 + id $             reduce by E → T
(9)   0 E 1                 + id $             shift
(10)  0 E 1 + 6             id $               shift
(11)  0 E 1 + 6 id 5        $                  reduce by F → id
(12)  0 E 1 + 6 F 3         $                  reduce by T → F
(13)  0 E 1 + 6 T 9         $                  reduce by E → E + T
(14)  0 E 1                 $                  accept

CS 5300 - SJAllan 14

Yet Another Example (G

Grammar

0. E’→E$
1. E→E+T
2. E→T
3. T→TF
4. T→F
5. F→F*
6. F→(E)
7. F→a
8. F→b
9. F→e

Parse Table

       action                                  goto
State  a    b    e    (    )    +    *    $    E    T    F
0      S5   S6   S7   S4                       1    2    3
1                               S8        acc
2      S5   S6   S7   S4   R2   R2        R2             9
3      R4   R4   R4   R4   R4   R4   S10  R4
4      S5   S6   S7   S4                       11   2    3
5      R7   R7   R7   R7   R7   R7   R7   R7
6      R8   R8   R8   R8   R8   R8   R8   R8
7      R9   R9   R9   R9   R9   R9   R9   R9
8      S5   S6   S7   S4                            12   3
9      R3   R3   R3   R3   R3   R3   S10  R3
10     R5   R5   R5   R5   R5   R5   R5   R5
11                         S13  S8
12     S5   S6   S7   S4   R1   R1        R1             9
13     R6   R6   R6   R6   R6   R6   R6   R6

CS 5300 - SJAllan 15

Yet Another Example

Stack           Input       Action
0               (ab+a)*b$   S4
0 4             ab+a)*b$    S5
0 4 5           b+a)*b$     R7
0 4 3           b+a)*b$     R4
0 4 2           b+a)*b$     S6
0 4 2 6         +a)*b$      R8
0 4 2 9         +a)*b$      R3
0 4 2           +a)*b$      R2
0 4 11          +a)*b$      S8
0 4 11 8        a)*b$       S5
0 4 11 8 5      )*b$        R7
0 4 11 8 3      )*b$        R4
0 4 11 8 12     )*b$        R1
0 4 11          )*b$        S13
0 4 11 13       *b$         R6
0 3             *b$         S10
0 3 10          b$          R5
0 3             b$          R4
0 2             b$          S6
0 2 6           $           R8
0 2 9           $           R3
0 2             $           R2
0 1             $           acc

CS 5300 - SJAllan 16

LR(k) Parsing

• The only difference between parsing one LR language and another is the information in the action and goto fields of the parse table
• The parsing algorithm is always the same

CS 5300 - SJAllan 19

Item Set Construction

• Start operation:
  – If S is the start symbol, and S→α is some production, the item [S→⋅α] is associated with the start state
• Completion operation (closure):
  – If [A→α⋅Bγ], where B∈Vn, is an item in some state I, then every item of the form [B→⋅β] must be included in state I. This rule is repeated until no more new items can be added to state I
• Read operation:
  – Let [A→α⋅Xγ], where X∈(Vn∪Vt), be an item associated with some state I. Then [A→αX⋅γ] is associated with state J (possibly the same as I), and a transition from I to J on symbol X exists
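The completion and read operations can be sketched in Python as below. Items are represented as (left side, right side, dot position) tuples; that representation and the names GRAMMAR, closure and goto are assumptions of this sketch. The grammar is the one from the "Example of LR(0) Items" slide.

```python
# Grammar: S' -> S, S -> aSbS | a
GRAMMAR = {
    "S'": [("S",)],
    "S": [("a", "S", "b", "S"), ("a",)],
}


def closure(items):
    """Completion: add [B -> .beta] whenever the dot stands before nonterminal B."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(state, symbol):
    """Read: move the dot over `symbol`, then complete the resulting state."""
    moved = {(l, r, d + 1) for (l, r, d) in state
             if d < len(r) and r[d] == symbol}
    return closure(moved)


start = closure({("S'", ("S",), 0)})   # start operation plus completion
after_a = goto(start, "a")             # read operation on symbol a
for item in sorted(after_a):
    print(item)
```

Using frozenset for states makes two states with the same items compare equal, which is what lets equivalent states be merged later.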

CS 5300 - SJAllan 20

Construction of Finite State System

1. Give the start state a number and use the start operation to put one item into it
  • Use the completion operation to get more items into this state
    – Eventually the completion operation has to end
2. Use the read operation to start one or more new states, based on the present state
  • It is possible for this new state to be equivalent to some previous state
    – If so, these states are merged
3. Complete the new state started previously by applying the completion operation
4. Repeat steps 2 and 3 until no new states are added
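A minimal sketch of these four steps in Python, assuming items are (left side, right side, dot position) tuples and states are numbered in the order they are discovered. The helper names and the symbol order in SYMBOLS are choices made for this illustration; the grammar is S’→S, S→aSbS | a.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("a", "S", "b", "S"), ("a",)],
}
SYMBOLS = ["S", "a", "b"]   # order in which transitions are tried


def closure(items):
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(state, symbol):
    return closure({(l, r, d + 1) for (l, r, d) in state
                    if d < len(r) and r[d] == symbol})


def build_states():
    # Step 1: start state, filled in by the completion operation.
    states = [closure({("S'", ("S",), 0)})]
    transitions = {}                  # (state number, symbol) -> state number
    i = 0
    while i < len(states):            # step 4: stop when no new states appear
        for x in SYMBOLS:
            target = goto(states[i], x)      # steps 2 and 3
            if not target:
                continue
            if target not in states:         # an equal state is reused (merged)
                states.append(target)
            transitions[(i, x)] = states.index(target)
        i += 1
    return states, transitions


states, transitions = build_states()
print(len(states), "states")
for key in sorted(transitions):
    print(key, "->", transitions[key])
```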

CS 5300 - SJAllan 21

Example of LR(0) Items

Grammar

0. S’→S
1. S→aSbS
2. S→a

LR(0) Items

0. [S’→⋅S] [S→⋅aSbS] [S→⋅a]
1. [S’→S⋅]
2. [S→a⋅SbS] [S→a⋅] [S→⋅aSbS] [S→⋅a]
3. [S→aS⋅bS]
4. [S→aSb⋅S] [S→⋅aSbS] [S→⋅a]
5. [S→aSbS⋅]

Transition Table

State  a    b    S
0      2         1
1
2      2         3
3           4
4      2         5
5

CS 5300 - SJAllan 22

Inadequate or Inconsistent States

• Definition:
  – Any state containing both a completed item [A→α⋅] and any other item is said to be inadequate or inconsistent
  – Such a state represents a conflict in a parsing decision
• If the item sets contain no inadequate states, the grammar is said to be LR(0)
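The definition translates almost directly into code. In this sketch an item is a (left side, right side, dot position) tuple and a state is a set of such items; both conventions, and the two hand-written states for the grammar S→aSbS | a, are assumptions made for illustration.

```python
def is_complete(item):
    """True for an item of the form [A -> alpha .]."""
    lhs, rhs, dot = item
    return dot == len(rhs)


def is_inadequate(state):
    """A state is inadequate if it holds a completed item plus any other item."""
    return len(state) > 1 and any(is_complete(item) for item in state)


# State reached after reading a: contains [S -> a.] together with other items.
state_after_a = {
    ("S", ("a", "S", "b", "S"), 1),
    ("S", ("a",), 1),
    ("S", ("a", "S", "b", "S"), 0),
    ("S", ("a",), 0),
}
# State holding only the completed item [S -> aSbS.].
state_done = {("S", ("a", "S", "b", "S"), 4)}

print(is_inadequate(state_after_a))
print(is_inadequate(state_done))
```

A grammar would pass the LR(0) test only if is_inadequate returned False for every state of its item-set collection.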

CS 5300 - SJAllan 25

FOLLOW

• FOLLOW(A), for A∈Vn, is the set of terminals a that can appear immediately to the right of A in some sentential form
• More formally, a is in FOLLOW(A) if and only if there exists a derivation of the form S ⇒* αAaβ
• $ is in FOLLOW(A) if and only if there exists a derivation of the form S ⇒* αA

CS 5300 - SJAllan 26

Computing FOLLOW

• Place $ in FOLLOW(S), where S is the start symbol
• If there is a production A → αBβ, then everything in FIRST(β) (except for ε) is in FOLLOW(B)
• If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is also in FOLLOW(B)

CS 5300 - SJAllan 27

FIRST and FOLLOW Example

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FOLLOW(E) = FOLLOW(E’) = {), $}
FOLLOW(T) = FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {+, *, ), $}
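The FOLLOW rules can be sketched as a fixed-point computation in Python, applied here to the grammar above. The dictionary encoding of the grammar (an empty list standing for an ε-production) and the function names are assumptions of this sketch; the rules are simply repeated until no set grows any further.

```python
EPS, END = "ε", "$"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E"


def first_of(seq, first):
    """FIRST of a string of grammar symbols (the empty string gives {ε})."""
    out = set()
    for x in seq:
        fx = first[x] if x in GRAMMAR else {x}
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    out.add(EPS)
    return out


def compute_first():
    first = {a: set() for a in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for a, prods in GRAMMAR.items():
            for rhs in prods:
                new = first_of(rhs, first)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {a: set() for a in GRAMMAR}
    follow[START].add(END)                      # $ goes into FOLLOW(S)
    changed = True
    while changed:
        changed = False
        for a, prods in GRAMMAR.items():
            for rhs in prods:
                for i, b in enumerate(rhs):
                    if b not in GRAMMAR:
                        continue
                    fb = first_of(rhs[i + 1:], first)
                    new = fb - {EPS}            # FIRST(β) without ε
                    if EPS in fb:               # β is empty or can vanish
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
for name in GRAMMAR:
    print(name, sorted(FIRST[name]), sorted(FOLLOW[name]))
```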

CS 5300 - SJAllan 28

SLR(1) Parse Table Construction

1. Construct the LR(0) item sets {I0, …, In}
2. State i is constructed from Ii:
  a. If [A→α⋅aβ] ∈ Ii, goto(Ii,a) = Ij, and a ∈ Vt, then set action[i,a] = shift j
  b. If [A→α⋅] ∈ Ii and A ≠ S’, then set action[i,a] = reduce A→α for all a ∈ Follow(A)
  c. If [S’→S⋅] ∈ Ii, then set action[i,$] = accept
3. The goto transitions for state i are constructed for all nonterminals A using the rule: if goto(Ii,A) = Ij, then goto[i,A] = j
4. All entries not defined by rules 2 and 3 are made error
5. The initial state of the parser is the one constructed from the set of items containing [S’→⋅S]
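The construction can be sketched in Python as follows, using the small grammar S’→S, S→aSbS | a. Items are (production number, dot position) pairs, FOLLOW(S) = {b, $} is written in by hand rather than computed, and all names are assumptions of this illustration. Missing table entries play the role of error entries (rule 4), and any cell that receives two different actions is recorded as a conflict.

```python
PRODS = [("S'", ("S",)), ("S", ("a", "S", "b", "S")), ("S", ("a",))]
NONTERMINALS = {"S'", "S"}
SYMBOLS = ["S", "a", "b"]
FOLLOW = {"S": {"b", "$"}}        # worked out by hand for this grammar


def closure(items):
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = PRODS[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for q, (lhs, _) in enumerate(PRODS):
                if lhs == rhs[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)


def goto(state, x):
    return closure({(p, d + 1) for p, d in state
                    if d < len(PRODS[p][1]) and PRODS[p][1][d] == x})


def build_slr_table():
    states = [closure({(0, 0)})]          # rule 5: state 0 holds [S' -> .S]
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, value):
        if action.get((i, a), value) != value:
            conflicts.append((i, a, action[(i, a)], value))
        action[(i, a)] = value

    i = 0
    while i < len(states):
        for x in SYMBOLS:
            target = goto(states[i], x)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if x in NONTERMINALS:
                goto_table[(i, x)] = j                    # rule 3
            else:
                set_action(i, x, ("shift", j))            # rule 2a
        for p, dot in states[i]:
            lhs, rhs = PRODS[p]
            if dot == len(rhs):
                if p == 0:
                    set_action(i, "$", ("accept",))       # rule 2c
                else:
                    for a in FOLLOW[lhs]:
                        set_action(i, a, ("reduce", p))   # rule 2b
        i += 1
    return action, goto_table, conflicts


action, goto_table, conflicts = build_slr_table()
for key in sorted(action):
    print(key, action[key])
print("conflicts:", conflicts)
```

For this grammar the state reached on a contains both a completed item and shift items, yet no conflict is reported, because the shift happens on a while the reduction is restricted to Follow(S).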

CS 5300 - SJAllan 31

Item Sets Using G

I0: [E’→⋅E] [E→⋅E+T] [E→⋅T] [T→⋅TF] [T→⋅F] [F→⋅F*] [F→⋅(E)] [F→⋅a] [F→⋅b] [F→⋅e]
I1: [E’→E⋅] [E→E⋅+T]
I2: [E→T⋅] [T→T⋅F] [F→⋅F*] [F→⋅(E)] [F→⋅a] [F→⋅b] [F→⋅e]
I3: [T→F⋅] [F→F⋅*]
I4: [F→(⋅E)] [E→⋅E+T] [E→⋅T] [T→⋅TF] [T→⋅F] [F→⋅F*] [F→⋅(E)] [F→⋅a] [F→⋅b] [F→⋅e]
I5: [F→a⋅]
I6: [F→b⋅]
I7: [F→e⋅]
I8: [E→E+⋅T] [T→⋅TF] [T→⋅F] [F→⋅F*] [F→⋅(E)] [F→⋅a] [F→⋅b] [F→⋅e]
I9: [T→TF⋅] [F→F⋅*]
I10: [F→F*⋅]
I11: [F→(E⋅)] [E→E⋅+T]
I12: [E→E+T⋅] [T→T⋅F] [F→⋅F*] [F→⋅(E)] [F→⋅a] [F→⋅b] [F→⋅e]
I13: [F→(E)⋅]

Follow(E) = {+, ), $}
Follow(T) = {+, ), $, (, a, b, e}
Follow(F) = {+, ), $, *, (, a, b, e}

The SLR(1) table is shown on slide 14

CS 5300 - SJAllan 32

Grammar G3

0. S’→S
1. S→Aa
2. S→dAb
3. S→dca
4. S→cb
5. A→c

SLR(1) Parse Table

       action                                  goto
State  a       b       c       d       $       S       A
0                      S4      S3              1       2
1                                      acc
2      S5
3                      S7                              6
4      R5      S8/R5
5                                      R1
6              S9
7      R5/S10  R5
8                                      R4
9                                      R2
10                                     R3

CS 5300 - SJAllan 33

Looking at G3

• Consider the string cb
• We run into problems in state 4
  – Notice that in this context we can never have a b following an A (under the assumption that we reduce c to A)
  – Ab is not a viable prefix
• A viable prefix is so called because it is always possible to add terminal symbols to the end of a viable prefix to obtain a right sentential form
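The problem in state 4 can be reproduced with a few lines of Python. State 4 holds the items [S→c⋅b] and [A→c⋅], and Follow(A) = {a, b} because A is followed by a in S→Aa and by b in S→dAb. The function name slr_actions and the way a state is summarised by two sets are assumptions of this sketch.

```python
FOLLOW_A = {"a", "b"}


def slr_actions(shift_terminals, reduce_lookaheads):
    """Collect the actions a state proposes for each terminal."""
    actions = {}
    for t in shift_terminals:
        actions.setdefault(t, []).append("shift")
    for t in reduce_lookaheads:
        actions.setdefault(t, []).append("reduce A -> c")
    return actions


def conflicts(actions):
    return {t for t, acts in actions.items() if len(acts) > 1}


# SLR(1): shift on b (from [S -> c.b]), reduce on everything in Follow(A).
slr_state4 = slr_actions({"b"}, FOLLOW_A)
print(conflicts(slr_state4))

# If the reduction is restricted to the lookahead a alone, the conflict goes away.
restricted_state4 = slr_actions({"b"}, {"a"})
print(conflicts(restricted_state4))
```

The second call anticipates the idea of attaching a specific look-ahead symbol to each item instead of using the whole Follow set.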

CS 5300 - SJAllan 34

Construction of LR(1) Parse Table

• Using LR(0) items, inadequate states are resolved as follows:
  – On the item [A→α⋅], the reduce action is entered for every symbol in Follow(A)
  – As we have seen, this doesn’t always work
• We want to carry more information in the item
  – We will add look-ahead symbols
  – [A→α⋅β,a] where
    • A → αβ is a production
    • a is a terminal or $

CS 5300 - SJAllan 37

LR(1) Parsing Tables – Example 1

0. [S’→⋅S,$] [S→⋅Aa,$] [S→⋅dAb,$] [S→⋅dca,$] [S→⋅cb,$] [A→⋅c,a]
1. [S’→S⋅,$]
2. [S→A⋅a,$]
3. [S→d⋅Ab,$] [S→d⋅ca,$] [A→⋅c,b]
4. [S→c⋅b,$] [A→c⋅,a]
5. [S→Aa⋅,$]
6. [S→dA⋅b,$]
7. [S→dc⋅a,$] [A→c⋅,b]
8. [S→cb⋅,$]
9. [S→dAb⋅,$]
10. [S→dca⋅,$]

Grammar

0. S’→S
1. S→Aa
2. S→dAb
3. S→dca
4. S→cb
5. A→c

CS 5300 - SJAllan 38

LR(1) Parsing Tables – Example 1

       action                   goto
State  a    b    c    d    $    S    A
0                S4   S3        1    2
1                          acc
2      S5
3                S7                  6
4      R5   S8
5                          R1
6           S9
7      S10  R5
8                          R4
9                          R2
10                         R3

CS 5300 - SJAllan 39

LR(1) Parsing Tables – Example 2

0. [S’→⋅S,$] [S→⋅CC,$] [C→⋅eC,e/d] [C→⋅d,e/d]
1. [S’→S⋅,$]
2. [S→C⋅C,$] [C→⋅eC,$] [C→⋅d,$]
3. [C→e⋅C,e/d] [C→⋅eC,e/d] [C→⋅d,e/d]
4. [C→d⋅,e/d]
5. [S→CC⋅,$]
6. [C→e⋅C,$] [C→⋅eC,$] [C→⋅d,$]
7. [C→d⋅,$]
8. [C→eC⋅,e/d]
9. [C→eC⋅,$]

Grammar

0. S’→S
1. S→CC
2. C→eC
3. C→d
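The item sets for this grammar can be reproduced with a short Python sketch. It assumes the usual LR(1) closure rule, which is not spelled out in this excerpt: for an item [A→α⋅Bβ,a], add [B→⋅γ,b] for every production B→γ and every terminal b in FIRST(βa). The FIRST sets are written in by hand, and the shortcut used for FIRST(βa) is only valid because this grammar has no ε-productions. All names are hypothetical.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("e", "C"), ("d",)],
}
# FIRST of single symbols, written by hand for this grammar.
FIRST = {"S": {"e", "d"}, "C": {"e", "d"},
         "e": {"e"}, "d": {"d"}, "$": {"$"}}


def closure1(items):
    """LR(1) closure over items (lhs, rhs, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            # No ε-productions: FIRST(βa) is FIRST of the first symbol of βa.
            nxt = rhs[dot + 1] if dot + 1 < len(rhs) else la
            for prod in GRAMMAR[rhs[dot]]:
                for b in FIRST[nxt]:
                    new = (rhs[dot], prod, 0, b)
                    if new not in result:
                        result.add(new)
                        work.append(new)
    return frozenset(result)


def goto1(state, x):
    """Move the dot over x in every item that allows it, then close."""
    moved = {(l, r, d + 1, a) for (l, r, d, a) in state
             if d < len(r) and r[d] == x}
    return closure1(moved)


state0 = closure1({("S'", ("S",), 0, "$")})
state2 = goto1(state0, "C")
for item in sorted(state2):
    print(item)
```

Each look-ahead is stored as a separate item here, so an entry written e/d in the list above corresponds to two tuples in the set.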

CS 5300 - SJAllan 40

LR(1) Parsing Tables – Example 2

       action         goto
State  e    d    $    S    C
0      S3   S4        1    2
1                acc
2      S6   S7             5
3      S3   S4             8
4      R3   R3
5                R1
6      S6   S7             9
7                R3
8      R2   R2
9                R2