Principles of Programming Languages XXII: Understanding Grammar and Building Parse Trees, Study notes of Programming Languages

These notes introduce the principles of programming languages, focusing on grammars and building parse trees. They explain the concept of a grammar as a formalism that describes the valid sequences of terminals in a programming language. They also cover context-free grammars, Backus-Naur Form (BNF), and recursive rules, and discuss left- and right-recursive grammars and their implications for evaluation order and associativity.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

Wael Aboelsaadat

http://www.cs.toronto.edu/~wael

Acknowledgement: Some of the material in these lectures is based on material by Prof. Diane Horton & Prof. Anthony Bonner

Principles of Programming Languages XXII

Today

  • Grammar

Language Specification & Compilation

Grammar: introduction

  • Grammar:
    • A grammar is a formalism that describes which sequences of terminals are valid in a PL. Mathematically, it is defined as a quadruple (N, T, P, S) where:
      • N is the set of symbols called nonterminals
      • T is the set of symbols called terminals
      • P is the set of productions
      • S ∈ N is the nonterminal called the starting symbol
  • Example: G = (N, T, P, S) where N = {S}, T = {a, b}, P = {S -> aS, S -> bS, S -> ε} (ε is the empty string)
  • Production:
    • A production is a rule of the form X -> Y, where X is a string of symbols (terminals or nonterminals) containing at least one nonterminal, and Y is a (possibly empty) string of symbols (terminals or nonterminals)
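
The quadruple definition can be made concrete with a small sketch. The Python below encodes the example grammar G as data and searches for a derivation of a given string. The names `Grammar` and `derives`, the use of the empty string for ε, and the encoding of every symbol as a single character are assumptions made for illustration; they are not part of the lecture.

```python
from collections import deque, namedtuple

# A grammar as the quadruple (N, T, P, S).
Grammar = namedtuple("Grammar", ["N", "T", "P", "S"])

# G = ({S}, {a, b}, {S -> aS, S -> bS, S -> ε}, S); "" stands for ε.
G = Grammar(
    N={"S"},
    T={"a", "b"},
    P=[("S", "aS"), ("S", "bS"), ("S", "")],
    S="S",
)

def derives(grammar, target, max_steps=10_000):
    """Breadth-first search over sentential forms.

    Each step rewrites the first occurrence of a production's left-hand
    side. Forms with more terminals than `target` are pruned, and the
    search gives up after `max_steps` forms (an arbitrary safety limit).
    """
    seen = {grammar.S}
    queue = deque([grammar.S])
    steps = 0
    while queue and steps < max_steps:
        form = queue.popleft()
        steps += 1
        if form == target:
            return True
        for lhs, rhs in grammar.P:
            i = form.find(lhs)
            if i < 0:
                continue
            new = form[:i] + rhs + form[i + len(lhs):]
            terminals = sum(ch in grammar.T for ch in new)
            if terminals <= len(target) and new not in seen:
                seen.add(new)
                queue.append(new)
    return False

print(derives(G, "abba"))
print(derives(G, "abc"))
```

With this encoding, "abba" is derivable via S -> aS -> abS -> abbS -> abbaS -> abba, while a string containing a symbol outside T can never be derived.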

Grammar: context free

  • A context free grammar (CFG) is a grammar in which |X| = 1 for every production X -> Y, i.e. X is a single nonterminal
    • LHS: 1 nonterminal
    • RHS: a sequence of terminals and nonterminals
  • E.g.
    • S -> ab (CFG)
    • SA -> ab (non CFG)
  • CFGs are sufficient to describe most of the constructs in programming languages
  • Languages describable by a CFG are recognizable by pushdown automata (analogous to finite state automata (FSA) with a stack)
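
The |X| = 1 condition is easy to check mechanically. The following is a minimal sketch, assuming productions are stored as pairs of symbol tuples; the function name `is_context_free` and this representation are hypothetical choices for illustration.

```python
def is_context_free(productions, nonterminals):
    """True when every left-hand side is exactly one nonterminal (|X| = 1)."""
    return all(
        len(lhs) == 1 and lhs[0] in nonterminals
        for lhs, _rhs in productions
    )

nonterminals = {"S", "A"}
cfg_rules = [(("S",), ("a", "b"))]          # S -> ab
non_cfg_rules = [(("S", "A"), ("a", "b"))]  # SA -> ab

print(is_context_free(cfg_rules, nonterminals))
print(is_context_free(non_cfg_rules, nonterminals))
```

The first rule set passes the check and the second does not, matching the two examples on the slide.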

Grammar: Backus Naur form

  • Backus Naur Form (BNF) is a metalanguage for describing programming languages
  • A BNF grammar is a context free grammar
  • Notation:
    • Nonterminals are enclosed in angle brackets, i.e. "<" and ">"
    • Uses "::=" instead of "->" in productions
    • Productions having the same left hand side can be grouped together using the alternation symbol "|"

e.g ::= a | b |

  • Lists are described using recursive rules
    • A rule is recursive if its left-hand side appears on the right-hand side, e.g. <ident.list> ::= identifier | identifier, <ident.list>
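
A recursive rule maps naturally onto a recursive function. The sketch below parses the right-recursive `<ident.list>` rule from a list of tokens. The token representation (a Python list of strings) and the function name `parse_ident_list` are assumptions for illustration, and Python's `str.isidentifier` stands in for whatever the lecture means by `identifier`.

```python
def parse_ident_list(tokens, pos=0):
    """<ident.list> ::= identifier | identifier , <ident.list>

    Returns (identifiers, next_position).
    """
    if pos >= len(tokens) or not tokens[pos].isidentifier():
        raise SyntaxError(f"identifier expected at position {pos}")
    name = tokens[pos]
    pos += 1
    if pos < len(tokens) and tokens[pos] == ",":
        # Second alternative: identifier , <ident.list>
        rest, pos = parse_ident_list(tokens, pos + 1)
        return [name] + rest, pos
    # First alternative: a single identifier
    return [name], pos

names, end = parse_ident_list(["x", ",", "y", ",", "z"])
print(names, end)
```

Each recursive call consumes one identifier and one comma, so the depth of the recursion equals the length of the list.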

Grammar: BNF recursive rules

  • Left Recursive BNF Grammar:
    • A BNF grammar rule is left recursive if its LHS appears at the left end of the RHS
      • e.g. <ident.list> ::= <ident.list> , identifier | identifier
      • e.g. A -> A x | y, which derives strings such as yxx
  • Right Recursive BNF Grammar:
    • A BNF grammar rule is right recursive if its LHS appears at the right end of the RHS
      • e.g. <ident.list> ::= identifier | identifier , <ident.list>
      • e.g. A -> x A | y, which derives strings such as xxy

The order of recursion has implications on the order of evaluation and associativity: a left recursive rule nests its parse tree to the left, and a right recursive rule nests it to the right.
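
To see why the direction of recursion matters, the sketch below builds the tree shapes that a left recursive and a right recursive rule would produce for the same operands and evaluates both. The rules `E -> E - n | n` and `E -> n - E | n`, and the function names, are hypothetical examples chosen for illustration; they do not appear in the lecture.

```python
def parse_left(nums):
    """Tree shape produced by the left recursive rule E -> E - n | n."""
    tree = nums[0]
    for n in nums[1:]:
        tree = (tree, "-", n)  # earlier operands nest deeper on the left
    return tree

def parse_right(nums):
    """Tree shape produced by the right recursive rule E -> n - E | n."""
    if len(nums) == 1:
        return nums[0]
    return (nums[0], "-", parse_right(nums[1:]))

def evaluate(tree):
    """Evaluate a nested (left, '-', right) tuple bottom-up."""
    if isinstance(tree, tuple):
        left, _op, right = tree
        return evaluate(left) - evaluate(right)
    return tree

print(parse_left([8, 3, 2]), evaluate(parse_left([8, 3, 2])))
print(parse_right([8, 3, 2]), evaluate(parse_right([8, 3, 2])))
```

For the operands 8, 3, 2 the left-nested tree corresponds to (8 - 3) - 2 and the right-nested tree to 8 - (3 - 2), so the same token sequence yields different values depending on the grammar.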

Grammar: extended BNF

  • Notation:
    • (…|…|…) any one of the alternatives
    • […] optional part
    • (…)* or {…} or […]* repeat zero or more times
    • (…)+ or {…}+ or […]+ repeat one or more times
    • "x" or 'x' terminal symbol
    • Unquoted words: non-terminal symbols
  • Example:
    • Using the above notation, <expression> ::= <expression> + <term> | <expression> - <term> | <term> could be written in the form of an iteration, as follows: <expression> ::= <term> [ (+ | -) <term> ]*
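
The iterative EBNF form translates directly into a loop. In the sketch below a `<term>` is simplified to a single integer token, which is an assumption made to keep the example short; the function name `parse_expression` and the list-of-strings token format are likewise illustrative.

```python
def parse_expression(tokens):
    """<expression> ::= <term> [ (+ | -) <term> ]*

    A <term> is simplified here to one integer token.
    Returns the value of the expression, accumulated left to right.
    """
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"term expected at position {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    result = term()
    # The [ ... ]* part of the rule: zero or more (operator, term) pairs.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        operand = term()
        result = result + operand if op == "+" else result - operand
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return result

print(parse_expression("10 - 4 + 1".split()))
```

Because the loop folds each new term into the running result, it reproduces the left-to-right grouping of the original left recursive rule.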

Grammar: grammars are not unique

Grammar: ambiguity

Grammar: inherently ambiguous

Grammar: sources of ambiguity

  • Associativity and precedence of operators
    • Solution:
      • Change the grammar to reflect operator precedence, e.g. X*Y-Z means ((X*Y) - Z)
  • Extent of a substructure
    • E.g.
      • Dangling else
    • Solution…
  • Obscure recursion
    • E.g.
      • exp -> exp exp
      • A -> A B
    • Solution: ??

Grammar: is this ambiguous?

<assign> ::= <identifier> = <expression>

<identifier> ::= A | B | C

<expression> ::= <expression> + <expression> | <expression> - <expression> | ( <expression> ) | <identifier>

Yes, because the sentence A = B - C - A has two different parse trees: one groups the right-hand side as (B - C) - A, the other as B - (C - A). Since subtraction is not associative, the two trees generally denote different values.

The grammar does not force "normal" left-to-right evaluation of addition and subtraction.
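
The claim of two parse trees can be checked by counting. The sketch below counts the distinct parse trees that the ambiguous `<expression>` rule admits for a token sequence; the function name `count_parse_trees` and the memoised span-splitting approach are assumptions chosen for illustration, not a method from the lecture. Only the `<expression>` part is counted, since the `<identifier> =` prefix of `<assign>` can be parsed in just one way.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees for
    <expression> ::= <expression> + <expression>
                   | <expression> - <expression>
                   | ( <expression> )
                   | <identifier>
    with <identifier> ::= A | B | C.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of trees deriving tokens[i:j] from <expression>."""
        span = tokens[i:j]
        total = 0
        if len(span) == 1 and span[0] in ("A", "B", "C"):
            total += 1                       # <identifier>
        if len(span) >= 3 and span[0] == "(" and span[-1] == ")":
            total += count(i + 1, j - 1)     # ( <expression> )
        for k in range(i + 1, j - 1):        # try each operator as the root
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees("B - C - A".split()))
print(count_parse_trees("( B - C ) - A".split()))
```

For `B - C - A` either minus sign can serve as the root of the tree, which gives the two parse trees; adding parentheses leaves only one.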

Grammar: is it really a problem?

  • The operation of addition is associative in mathematics. Hence A + B + C can be performed as either (A + B) + C or A + (B + C).
  • The multiply operation is also associative.
  • Therefore one might say the previous ambiguous grammar would be satisfactory for addition and multiplication.
  • But would it? Computer arithmetic is not exact, and one might want to be able to control the order …
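
A short experiment illustrates the point about inexact computer arithmetic. The values below are arbitrary choices for illustration, and the behaviour shown assumes IEEE 754 double precision floats, which is what Python uses on common platforms.

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # grouping chosen by a left recursive grammar
right = a + (b + c)   # grouping chosen by a right recursive grammar

print(left)
print(right)
print(left == right)
```

The two groupings round differently, so the results are not equal even though addition is associative in mathematics. A grammar that fixes the grouping therefore also fixes the computed result.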

Grammar: an unambiguous version

<assign> ::= <identifier> = <expression>

<identifier> ::= A | B | C

<expression> ::= <expression> + <term> | <expression> - <term> | <term>

<term> ::= <term> * <factor> | <factor>

<factor> ::= ( <expression> ) | <identifier>

Tree for A = B + C * A
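
Since the tree itself is a figure, the sketch below reconstructs its shape in code. It is a hand-written recursive descent parser for the unambiguous grammar, under two stated assumptions: the left recursive rules are implemented as loops that build left-nested nodes (a standard substitution, because a recursive descent function cannot call itself first without looping forever), and the result is an abbreviated tree of `(operator, left, right)` tuples that omits the intermediate `<term>` and `<factor>` nodes. All function names are illustrative.

```python
def parse_assign(tokens):
    """Parse <assign> ::= <identifier> = <expression> into nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def identifier():
        # <identifier> ::= A | B | C
        nonlocal pos
        tok = peek()
        if tok not in ("A", "B", "C"):
            raise SyntaxError(f"identifier expected at position {pos}")
        pos += 1
        return tok

    def factor():
        # <factor> ::= ( <expression> ) | <identifier>
        if peek() == "(":
            expect("(")
            node = expression()
            expect(")")
            return node
        return identifier()

    def term():
        # <term> ::= <term> * <factor> | <factor>
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def expression():
        # <expression> ::= <expression> + <term> | <expression> - <term> | <term>
        nonlocal pos
        node = term()
        while peek() in ("+", "-"):
            op = peek()
            pos += 1
            node = (op, node, term())
        return node

    target = identifier()
    expect("=")
    tree = ("=", target, expression())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree

print(parse_assign("A = B + C * A".split()))
print(parse_assign("A = B - C - A".split()))
```

In the first tree the multiplication sits below the addition, reflecting its higher precedence. In the second, B - C is grouped first, which is the left-to-right evaluation that the ambiguous grammar could not guarantee.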