Context Free Grammars: Understanding Syntax in Programming Languages, Study notes of Programming Languages

These notes give an overview of context-free grammars (CFGs), their significance in precisely describing the syntax of programming languages, and their role in the compilation process. They cover the basics of CFGs, their relationship with regular expressions (REs), and tips for designing effective grammars.

Uploaded on 02/13/2009

koofers-user-4jo

CMSC 330: Organization of

Programming Languages

Context Free Grammars 2

Last Lecture

Why should we study CFGs?

  • Precisely describe syntax of programming languages

What are the four parts of a CFG?

  • Terminals, nonterminals, productions, start symbol

How do we tell if a string is accepted by a CFG?

  • Find a derivation from start symbol to string
    ◦ By applying productions to nonterminals at each step

What’s a parse tree?

  • Representation of derivation of string

REs and CFGs in Practice

REs turn raw text into a stream of tokens

  • E.g., “if”, “then”, “identifier”, etc.
  • This process is called scanning or lexing
  • Whitespace and comments are simply skipped
  • These tokens become the input for the parser

CFGs turn tokens into parse trees

  • This process is called parsing
  • Parse trees become the input for the code generator
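
The scanning step can be sketched with Python's `re` module. This is only an illustration: the token names, the `#` comment syntax, and the `tokenize` helper are assumptions made for the example, not part of the lecture.

```python
import re

# Hypothetical token set for illustration; not taken from the lecture.
TOKEN_SPEC = [
    ("IF", r"if\b"),
    ("THEN", r"then\b"),
    ("ELSE", r"else\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("SKIP", r"\s+|#[^\n]*"),  # whitespace and '#' comments are dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn raw text into a list of (token_kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("if x then y  # trailing comment"))
```

The resulting token list is what a parser would consume; the parser itself never sees whitespace or comments.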

Steps of Compilation

Leftmost and Rightmost Derivation

Leftmost derivation

  • Leftmost nonterminal is replaced in each step

Rightmost derivation

  • Rightmost nonterminal is replaced in each step

Example

  • Grammar
    ◦ S → AB, A → a, B → b
  • Leftmost derivation for “ab”
    ◦ S ⇒ AB ⇒ aB ⇒ ab
  • Rightmost derivation for “ab”
    ◦ S ⇒ AB ⇒ Ab ⇒ ab
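
A minimal sketch of the two derivation orders for the grammar S → AB, A → a, B → b. The `derive` helper and the dictionary encoding are assumptions for illustration; the sketch relies on every nonterminal in this grammar having exactly one production.

```python
# Grammar from the example: S -> AB, A -> a, B -> b.
# Uppercase letters are nonterminals; each has a single production here.
PRODUCTIONS = {"S": "AB", "A": "a", "B": "b"}


def derive(start, leftmost=True):
    """Return the sentential forms of a leftmost or rightmost derivation."""
    forms = [start]
    current = start
    while any(symbol in PRODUCTIONS for symbol in current):
        positions = [i for i, symbol in enumerate(current) if symbol in PRODUCTIONS]
        i = positions[0] if leftmost else positions[-1]
        # Replace the chosen nonterminal with the right-hand side of its production.
        current = current[:i] + PRODUCTIONS[current[i]] + current[i + 1:]
        forms.append(current)
    return forms


print(" => ".join(derive("S", leftmost=True)))   # S => AB => aB => ab
print(" => ".join(derive("S", leftmost=False)))  # S => AB => Ab => ab
```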

Parse Tree For Derivations

Parse tree may be same for both leftmost & rightmost derivations

  • Example Grammar: S → a | SbS String: aba
    ◦ Leftmost Derivation: S ⇒ SbS ⇒ abS ⇒ aba
    ◦ Rightmost Derivation: S ⇒ SbS ⇒ Sba ⇒ aba
  • Parse trees don’t show order productions are applied
  • Every parse tree has a unique leftmost and a unique rightmost derivation

Parse Tree For Derivations (cont.)

Not every string has a unique parse tree

  • Example Grammar: S → a | SbS String: ababa
    ◦ Leftmost Derivation: S ⇒ SbS ⇒ abS ⇒ abSbS ⇒ ababS ⇒ ababa
    ◦ Another Leftmost Derivation: S ⇒ SbS ⇒ SbSbS ⇒ abSbS ⇒ ababS ⇒ ababa
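
Since each parse tree corresponds to exactly one leftmost derivation, counting parse trees shows how many leftmost derivations a string has. The sketch below does this for S → a | SbS only; the `count_parse_trees` helper is a hypothetical name and the memoized split-point search is one possible approach, not the lecture's method.

```python
from functools import lru_cache


def count_parse_trees(text):
    """Count the parse trees of `text` under the grammar S -> a | SbS."""

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways S derives the substring text[lo:hi].
        total = 1 if text[lo:hi] == "a" else 0  # S -> a
        # S -> SbS: try every 'b' as the split point at the root of the tree.
        for mid in range(lo + 1, hi - 1):
            if text[mid] == "b":
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(text))


print(count_parse_trees("aba"))    # 1
print(count_parse_trees("ababa"))  # 2
```

A count above 1 for any string is enough to show that the grammar is ambiguous.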

Ambiguity

A grammar is ambiguous if a string may have multiple leftmost (or rightmost) derivations

  • Equivalent to multiple parse trees
  • Can be hard to determine
    1. S → aS | T, T → bT | U, U → cU | ε
    2. S → T | T, T → Tx | Tx | x | x
    3. S → SS | () | (S)

Answers: No, Yes, ?

Ambiguity (cont.)

Example

  • Grammar: S → SS | () | (S) String: ()()()
  • 2 distinct leftmost derivations (and parse trees)
    ◦ S ⇒ SS ⇒ SSS ⇒ ()SS ⇒ ()()S ⇒ ()()()
    ◦ S ⇒ SS ⇒ ()S ⇒ ()SS ⇒ ()()S ⇒ ()()()

More Derivations

Is the following derivation leftmost or rightmost?

  • Grammar: S → aS | T, T → bT | U, U → cU | ε
  • S ⇒ aS ⇒ aT ⇒ aU ⇒ acU ⇒ ac
    ◦ Both! At most one non-terminal in each sentential form, so there's no choice in which non-terminal to expand

How about the following derivation?

  • Grammar: S → a | SbS String: ababa
  • S ⇒ SbS ⇒ SbSbS ⇒ SbabS ⇒ ababS ⇒ ababa
    ◦ Neither! Selects left, center, left, and rightmost nonterminals

Tips for Designing Grammars

1. Use recursive productions to generate an arbitrary number of symbols

A → xA | ε // Zero or more x’s
A → yA | y // One or more y’s

2. Use separate non-terminals to generate disjoint parts of a language, and then combine in a production

{ a*b* } // a’s followed by b’s
S → AB
A → aA | ε // Zero or more a’s
B → bB | ε // Zero or more b’s
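
One way to check a grammar like S → AB, A → aA | ε, B → bB | ε is to turn each nonterminal into a small function that mirrors its productions. The function names below are assumptions for illustration; this is a sketch of a recognizer, not a general parsing technique from the lecture.

```python
def parse_A(text, pos):
    # A -> aA | ε : consume an 'a' and recurse, or match the empty string.
    if pos < len(text) and text[pos] == "a":
        return parse_A(text, pos + 1)
    return pos


def parse_B(text, pos):
    # B -> bB | ε : consume a 'b' and recurse, or match the empty string.
    if pos < len(text) and text[pos] == "b":
        return parse_B(text, pos + 1)
    return pos


def accepts(text):
    # S -> AB : run A, then B, and require that all input was consumed.
    return parse_B(text, parse_A(text, 0)) == len(text)


for candidate in ["", "aab", "bbb", "ba"]:
    print(repr(candidate), accepts(candidate))
```

Each recursive production becomes a recursive call, and the combining production S → AB becomes a sequence of two calls.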

Tips for Designing Grammars (cont.)

3. To generate languages with matching, balanced, or related numbers of symbols, write productions which generate strings from the middle

{a^n b^n | n ≥ 0} // N a’s followed by N b’s
S → aSb | ε
Example derivation: S ⇒ aSb ⇒ aaSbb ⇒ aabb

{a^n b^(2n) | n ≥ 0} // N a’s followed by 2N b’s
S → aSbb | ε
Example derivation: S ⇒ aSbb ⇒ aaSbbbb ⇒ aabbbb
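
The "generate from the middle" idea for S → aSb | ε can be sketched as a recursive recognizer: every `a` consumed on the way in must be paired with a `b` on the way out. The names `parse_S` and `accepts_anbn` are hypothetical, chosen for this example.

```python
def parse_S(text, pos):
    """Match S -> aSb | ε at `pos`; return the position after S, or None on failure."""
    if pos < len(text) and text[pos] == "a":
        # S -> aSb: consume 'a', match the nested S, then require the paired 'b'.
        after_inner = parse_S(text, pos + 1)
        if after_inner is not None and after_inner < len(text) and text[after_inner] == "b":
            return after_inner + 1
        return None
    return pos  # S -> ε


def accepts_anbn(text):
    # The whole input must be consumed by a single S.
    return parse_S(text, 0) == len(text)


for candidate in ["", "ab", "aabb", "aab", "abb", "ba"]:
    print(repr(candidate), accepts_anbn(candidate))
```

The recursion depth equals the number of leading a's, which is how the grammar keeps the two counts equal.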

Parse Tree

Else belongs to outer if

Dealing With Ambiguous Grammars

Ambiguity is bad

  • Syntax is correct
  • But semantics differ depending on choice
    ◦ Different associativity: (a-b)-c vs. a-(b-c)
    ◦ Different precedence: (a-b)*c vs. a-(b*c)
    ◦ Different control flow: if (if else) vs. if (if) else

Two approaches

  • Rewrite grammar
  • Use special parsing rules
    ◦ Depending on parsing method (learn in CMSC 430)

Fixing the Expression Grammar

Require right operand to not be bare expression

E → E+T | E-T | E*T | T

T → a | b | c | (E)

Corresponds to left-associativity

Now only one parse tree for a-b-c

  • Find derivation

What if We Wanted Right-Associativity?

Left-recursive productions

  • Used for left-associative operators
  • Example
    E → E+T | E-T | E*T | T
    T → a | b | c | (E)

Right-recursive productions

  • Used for right-associative operators
  • Example
    E → T+E | T-E | T*E | T
    T → a | b | c | (E)
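
The effect of the two kinds of recursion on grouping can be sketched by folding a list of operands from the left or from the right. The helper names are assumptions for illustration; the parenthesized strings stand in for parse trees.

```python
from functools import reduce


def left_assoc(operands, op="-"):
    # Groups as ((a-b)-c), the tree shape produced by E -> E-T | T.
    return reduce(lambda left, right: f"({left}{op}{right})", operands)


def right_assoc(operands, op="-"):
    # Groups as (a-(b-c)), the tree shape produced by E -> T-E | T.
    return reduce(lambda right, left: f"({left}{op}{right})", reversed(operands))


print(left_assoc(["a", "b", "c"]))   # ((a-b)-c)
print(right_assoc(["a", "b", "c"]))  # (a-(b-c))

# The grouping changes the value for subtraction: (10-4)-3 vs. 10-(4-3).
print((10 - 4) - 3, 10 - (4 - 3))    # 3 9
```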

Parse Tree Shape

The kind of recursion determines the shape of the parse tree

[Figure: parse tree shapes for left recursion and right recursion]

A Different Problem

How about the string a+b*c?

E → E+T | E-T | E*T | T

T → a | b | c | (E)

Doesn’t have correct precedence for *

  • When a nonterminal has productions for several operators, they effectively have the same precedence

Final Expression Grammar

E → E+T | E-T | T // lowest precedence operators
T → T*P | P // higher precedence
P → a | b | c | (E) // highest precedence (parentheses)

Practice

  • Construct tree and left and right derivations for
    ◦ a+b*c, a*(b+c), a*b+c, a-b-c
  • See what happens if you change the last set of productions to P → a | b | c | E | (E)
  • See what happens if you change the first set of productions to E → E+T | E-T | T | P
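
To check answers to the practice problems, the final expression grammar can be sketched as a small recursive-descent parser. This is an assumption-laden illustration, not the lecture's method: the left-recursive productions E → E+T and T → T*P are implemented as loops (a common rewrite that still builds left-associative trees), the input is assumed to contain no whitespace, and `parse_expression` is a hypothetical name. Trees are nested tuples of the form (operator, left, right).

```python
def parse_expression(text):
    """Parse text under E -> E+T | E-T | T, T -> T*P | P, P -> a | b | c | (E)."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_E():
        # E -> E+T | E-T | T, written as a loop so the tree leans left.
        tree = parse_T()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            tree = (op, tree, parse_T())
        return tree

    def parse_T():
        # T -> T*P | P, also a loop; '*' binds tighter because it sits below E.
        tree = parse_P()
        while peek() == "*":
            advance()
            tree = ("*", tree, parse_P())
        return tree

    def parse_P():
        # P -> a | b | c | (E)
        symbol = peek()
        if symbol in ("a", "b", "c"):
            advance()
            return symbol
        if symbol == "(":
            advance()
            tree = parse_E()
            if peek() != ")":
                raise SyntaxError(f"expected ')' at offset {pos}")
            advance()
            return tree
        raise SyntaxError(f"unexpected symbol {symbol!r} at offset {pos}")

    tree = parse_E()
    if pos != len(text):
        raise SyntaxError(f"unexpected symbol {text[pos]!r} at offset {pos}")
    return tree


for source in ["a+b*c", "a*(b+c)", "a*b+c", "a-b-c"]:
    print(source, parse_expression(source))
```

For example, `a+b*c` yields `('+', 'a', ('*', 'b', 'c'))`, showing that `*` is grouped before `+`, and `a-b-c` yields `('-', ('-', 'a', 'b'), 'c')`, showing left associativity.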

Summary

Context free grammars

  • Leftmost & rightmost derivations
  • Ambiguity
  • Designing grammars
  • Associativity & precedence