









An introduction to the fundamentals of programming languages, focusing on syntax and semantics. It covers the concepts of syntax, semantics, and programming language implementation. The document also describes language syntax using lexical grammars, context-free grammars, and Backus-Naur form (BNF). The importance of syntax description is discussed, and the document explains how BNF is used to express context-free grammars. Examples and exercises are included to help students understand the concepts.










Syntax and Semantics
Syntax: the symbols and rules to write legal programs
Semantics: the meaning of legal programs
Programming language implementation: syntax -> semantics
  Translate program syntax into machine actions
Example
  Syntax: date ::= dd/dd/dddd
          d ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  Semantics: 01/02/2005 => Jan 02, 2005 (or Feb 01, 2005)?
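The date example can be sketched in Python to show that a syntax check and a semantic interpretation are separate steps. This is a minimal sketch, not part of the original notes: the function names `is_legal_date`, `month_first`, and `day_first` are hypothetical, and the two readings are simply the two interpretations the example asks about.

```python
import re
from datetime import date

# Syntax: date ::= dd/dd/dddd, where d is a single digit 0-9.
DATE_SYNTAX = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{4}")

def is_legal_date(text):
    """Syntax check only: does the text have the shape dd/dd/dddd?"""
    return DATE_SYNTAX.fullmatch(text) is not None

def month_first(text):
    """One possible semantics: read the text as month/day/year."""
    month, day, year = (int(part) for part in text.split("/"))
    return date(year, month, day)

def day_first(text):
    """Another possible semantics: read the text as day/month/year."""
    day, month, year = (int(part) for part in text.split("/"))
    return date(year, month, day)

print(is_legal_date("01/02/2005"))  # True
print(month_first("01/02/2005"))    # 2005-01-02
print(day_first("01/02/2005"))      # 2005-02-01
```

The same legal string yields two different dates, so the syntax rule alone does not determine the meaning.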
Why Describe Syntax?
Syntax analysis needs to be implemented in C/C++/Java etc.
Describing syntax supports communication between programmers and translators/compilers
It also supports automated generation and validation of syntax analyzers
Every automation requires an interface language
  Regular expressions and BNFs are themselves languages for describing language syntax
BNF: Expressing Context-free Grammars
Each BNF includes
  A set of terminals: the words/tokens of the language
  A set of non-terminals: variables that can be replaced with different sequences of terminals
  A set of productions: rules identifying the structure of each non-terminal
    Each production has the format A ::= B, where
      A is a single non-terminal
      B is a sequence of terminals and non-terminals
  A start non-terminal: the top-level syntax of the language
Example: BNF for expressions
  e ::= n | e+e | e-e | e*e | e/e
  n ::= d | nd
  d ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  Non-terminals: e, n, d; start non-terminal: e
  Terminals: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, -, *, /
What language does the grammar describe?
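One way to make the four parts of a BNF concrete is to write the expression grammar down as plain data. The sketch below is illustrative only: the names `GRAMMAR`, `START`, `TERMINALS`, and `in_language` are hypothetical. It also uses the observation that this grammar has no parentheses, so the strings it generates are numbers separated by single operators, which a regular expression can recognize as well.

```python
import re

# The BNF as plain data: each non-terminal maps to its alternatives.
GRAMMAR = {
    "e": [["n"], ["e", "+", "e"], ["e", "-", "e"], ["e", "*", "e"], ["e", "/", "e"]],
    "n": [["d"], ["n", "d"]],
    "d": [[digit] for digit in "0123456789"],
}
START = "e"

NON_TERMINALS = set(GRAMMAR)
# Every symbol on a right-hand side that is not a non-terminal is a terminal.
TERMINALS = {
    symbol
    for alternatives in GRAMMAR.values()
    for alternative in alternatives
    for symbol in alternative
} - NON_TERMINALS

# number (operator number)*: the strings this particular grammar generates.
EXPRESSION = re.compile(r"[0-9]+([+\-*/][0-9]+)*")

def in_language(text):
    """Membership test for the language of the expression grammar."""
    return EXPRESSION.fullmatch(text) is not None

print(sorted(TERMINALS))
print(in_language("5+15*20"), in_language("5+"))
```

The regular expression answers only the membership question; it says nothing about how an expression is structured, which is what derivations and parse trees are for.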
Derivations and Parse Trees (Semantics of CFG)
Derivation: top-down replacement of non-terminals
  Each replacement follows a production rule
  One or more derivations for each valid program
Derivations for 5 + 15 * 20
  e => e*e => e+e*e => n+e*e => d+e*e => 5+e*e => 5+n*e => 5+nd*e => 5+dd*e => 5+1d*e => 5+15*e => … => 5+15*20
  e => e+e => … => 5+e => 5+e*e => … => 5+15*e => … => 5+15*20
Parse trees:
  [Figure: the two parse trees for 5 + 15 * 20, one for each derivation above]
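A derivation can be replayed mechanically. The sketch below is a minimal illustration under stated assumptions: the helper `derive` and the list `LEFTMOST_CHOICES` are hypothetical names, every grammar symbol is a single character, and the leftmost non-terminal is always the one replaced. It spells out in full the derivation of 5+15*20 that starts with e => e+e.

```python
# Expression grammar with every symbol written as a single character.
GRAMMAR = {
    "e": ["n", "e+e", "e-e", "e*e", "e/e"],
    "n": ["d", "nd"],
    "d": list("0123456789"),
}

def derive(start, choices):
    """Replace the leftmost non-terminal with each chosen right-hand side in turn."""
    form = start
    steps = [form]
    for rhs in choices:
        # Locate the leftmost non-terminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        lhs = form[index]
        if rhs not in GRAMMAR[lhs]:
            raise ValueError(f"{lhs} ::= {rhs} is not a production")
        form = form[:index] + rhs + form[index + 1:]
        steps.append(form)
    return steps

# Right-hand sides chosen at each step of the derivation.
LEFTMOST_CHOICES = ["e+e", "n", "d", "5", "e*e", "n", "nd", "d", "1", "5",
                    "n", "nd", "d", "2", "0"]

steps = derive("e", LEFTMOST_CHOICES)
print(" => ".join(steps))
```

Changing the first choice to `"e*e"` and reordering the rest gives the other derivation, which corresponds to a different parse tree.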
Parse Trees
A parse tree of each program satisfies
  Each leaf node represents a terminal
  Each non-leaf node represents a non-terminal
  The children of each non-leaf node A, from left to right, form the right side of a production rule for A (with A on the left side)
  The root of the parse tree is the start non-terminal
A parse tree represents a syntactically correct program
  The program is regenerated by reading the terminals at its leaves from left to right
Parsing (checking syntactic correctness)
  Constructing a parse tree for a program
  Top-down and bottom-up parsers
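The parse-tree conditions can be checked directly on a tree data structure. The following is a sketch under assumptions of my own: a non-leaf node is a pair `(non_terminal, children)`, a leaf is a terminal string, and the helpers `frontier`, `is_parse_tree`, `number`, and `expr` are hypothetical names.

```python
GRAMMAR = {
    "e": ["n", "e+e", "e-e", "e*e", "e/e"],
    "n": ["d", "nd"],
    "d": list("0123456789"),
}

def label(tree):
    """Symbol at a node: the string itself for a leaf, else the non-terminal."""
    return tree if isinstance(tree, str) else tree[0]

def frontier(tree):
    """Regenerate the program by reading the leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(frontier(child) for child in tree[1])

def well_formed(tree):
    if isinstance(tree, str):
        return tree not in GRAMMAR  # a leaf must not be a non-terminal
    symbol, children = tree
    rhs = "".join(label(child) for child in children)
    # The children, left to right, must form a right-hand side for this symbol.
    return (symbol in GRAMMAR and rhs in GRAMMAR[symbol]
            and all(well_formed(child) for child in children))

def is_parse_tree(tree, start="e"):
    """The root must be the start non-terminal and every node must be well formed."""
    return label(tree) == start and well_formed(tree)

def number(text):
    """Build the tree for a digit string using n ::= d | nd."""
    tree = ("n", [("d", [text[0]])])
    for digit in text[1:]:
        tree = ("n", [tree, ("d", [digit])])
    return tree

def expr(text):
    return ("e", [number(text)])

tree = ("e", [expr("5"), "+", ("e", [expr("15"), "*", expr("20")])])
print(is_parse_tree(tree), frontier(tree))  # True 5+15*20
```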
Abstract vs. Concrete Syntax
Concrete syntax: the syntax that programmers write
  Example: different notations of expressions
    Prefix   + 5 * 15 20
    Infix    5 + 15 * 20
    Postfix  5 15 20 * +
Abstract syntax: the program structure recognized by compilers/interpreters
  Identifies only the meaningful components
  What is the operation and which are the operands?
[Figure: parse tree for 5 + 15 * 20]
[Figure: abstract syntax tree for 5 + 15 * 20]
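The relation between one abstract syntax and several concrete syntaxes can be sketched with a small tree printer. This is an illustration, not the notes' own code: the tuple encoding `(operator, left, right)` and the function names are assumptions.

```python
# An abstract syntax tree: an int, or a tuple (operator, left operand, right operand).
AST = ("+", 5, ("*", 15, 20))

def prefix(node):
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"{op} {prefix(left)} {prefix(right)}"

def infix(node):
    # No parentheses are emitted, so this is only faithful for trees whose
    # shape already agrees with the usual precedence, as AST above does.
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"{infix(left)} {op} {infix(right)}"

def postfix(node):
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"{postfix(left)} {postfix(right)} {op}"

print(prefix(AST))   # + 5 * 15 20
print(infix(AST))    # 5 + 15 * 20
print(postfix(AST))  # 5 15 20 * +
```

All three strings come from the same tree; only the order in which the operator is written relative to its operands changes.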
cs3723: Abstract Syntax Trees
Operators and keywords do not appear as leaves
  They define the meaning of the interior (parent) node
Chains of single productions may be collapsed
[Figure: parse trees and the corresponding abstract syntax trees for an if-then-else statement with condition B, and for a sum of 3 and 5 derived through E and T]
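The two simplifications can be sketched as a small tree transformation. The example assumes a hypothetical parse tree for 3 + 5 under the productions E ::= E + T | T with T deriving a number directly; the node encoding `(non_terminal, children)` and the function name `to_ast` are also assumptions.

```python
def to_ast(tree):
    """Turn a parse tree into an abstract syntax tree."""
    if isinstance(tree, str):
        return tree                      # a terminal operand stays a leaf
    _, children = tree
    if len(children) == 1:
        return to_ast(children[0])       # collapse a chain of single productions
    # Assumed shape of every other node: operand, operator, operand.
    left, op, right = children
    # The operator no longer appears as a leaf; it labels the interior node.
    return (op, to_ast(left), to_ast(right))

# Hypothetical parse tree for 3 + 5: E -> E + T, E -> T, T -> 3, T -> 5.
PARSE_TREE = ("E", [("E", [("T", ["3"])]), "+", ("T", ["5"])])

print(to_ast(PARSE_TREE))  # ('+', '3', '5')
```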
Ambiguous Grammars
A grammar is ambiguous if some program has multiple parse trees
  Multiple choices of production rules during derivation
Parse trees are used to interpret programs
  Multiple parse trees mean multiple ways to interpret a program
[Figure: the two parse trees for 5 + 15 * 20]
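Why ambiguity matters can be seen by evaluating the two tree shapes for 5 + 15 * 20. This is a minimal sketch: the tuple encoding of the trees and the choice of Python's true division for `/` are assumptions.

```python
import operator

OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,  # assumption: the notes do not say which division is meant
}

def evaluate(node):
    """Interpret a tree: an int, or (operator, left, right)."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPERATIONS[op](evaluate(left), evaluate(right))

# The two structures the ambiguous grammar allows for 5 + 15 * 20.
TIMES_AT_ROOT = ("*", ("+", 5, 15), 20)
PLUS_AT_ROOT = ("+", 5, ("*", 15, 20))

print(evaluate(TIMES_AT_ROOT))  # 400
print(evaluate(PLUS_AT_ROOT))   # 305
```

The same program text has two values depending on which parse tree the interpreter builds.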
Rewrite Ambiguous Grammars
Solution 1: introduce precedence and associativity rules to dictate the choice of production rules
  Original grammar: e ::= n | e+e | e-e | e*e | e/e
  Precedence and associativity
    * / >> + - (* and / have higher precedence than + and -)
    All operators are left associative
  Derivation for n+n*n
    e => e+e => n+e => n+e*e => n+n*e => n+n*n
Solution 2: rewrite production rules by introducing additional non-terminals
  Alternative grammar
    E ::= E + T | E - T | T
    T ::= T * F | T / F | F
    F ::= n
  Derivation for n + n * n
    E => E+T => T+T => F+T => n+T => n+T*F => n+F*F => n+n*F => n+n*n
How would you modify the grammar if
  + and - have higher precedence than * and /?
  All operators are right associative?
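The alternative grammar can be turned into a small hand-written parser. The sketch below is one possible approach, not the notes' own: it is a recursive-descent parser in which the left-recursive rules E ::= E + T and T ::= T * F are handled with loops, a standard substitution that keeps the operators left associative. The names `tokenize` and `parse` are hypothetical, and the input is assumed to contain only digits and the four operators.

```python
import re

def tokenize(text):
    # Assumption: any character other than digits and + - * / is ignored.
    return re.findall(r"[0-9]+|[-+*/]", text)

def parse(text):
    """Build an abstract syntax tree following the E / T / F grammar."""
    tokens = tokenize(text)
    pos = 0

    def factor():                        # F ::= n
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected a number")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():                          # T ::= T * F | T / F | F
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] in "*/":
            op = tokens[pos]
            pos += 1
            node = (op, node, factor())  # earlier result becomes the left operand
        return node

    def expression():                    # E ::= E + T | E - T | T
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] in "+-":
            op = tokens[pos]
            pos += 1
            node = (op, node, term())
        return node

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return tree

print(parse("5+15*20"))  # ('+', 5, ('*', 15, 20))
print(parse("8-3-2"))    # ('-', ('-', 8, 3), 2)
```

Because multiplication is parsed one level deeper than addition, `*` and `/` end up lower in the tree and therefore bind tighter, with no separate precedence table.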
Additional Exercises
Terminals: digits ('0', '1', ..., '9'), '(', ')', ';' and '->'
Each node of the graph is represented by an integer number
Each edge is represented by a pair of nodes connected with '->'
  e.g., 3->4 is an edge from node '3' to node '4'
Each graph description is a sequence of edges
  e.g., ( 1->2; 2->5; 5->1)
Write a parse tree and an abstract syntax tree for ( 1->2; 2->5; 5->1)
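To experiment with the graph notation, a small reader can turn a description into a list of edges. This is only a sketch under assumptions that the exercise does not state: whitespace between tokens is allowed, a description contains at least one edge, and there is no trailing semicolon. The function name `read_graph` is hypothetical, and the edge list is just one possible abstract representation.

```python
import re

# One edge: node '->' node, with optional surrounding whitespace (an assumption).
EDGE = re.compile(r"\s*([0-9]+)\s*->\s*([0-9]+)\s*")

def read_graph(text):
    """Read '( a->b; c->d; ... )' into a list of (from_node, to_node) pairs."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise SyntaxError("a graph description must be enclosed in parentheses")
    edges = []
    for part in text[1:-1].split(";"):
        match = EDGE.fullmatch(part)
        if match is None:
            raise SyntaxError(f"not an edge: {part!r}")
        edges.append((int(match.group(1)), int(match.group(2))))
    return edges

print(read_graph("( 1->2; 2->5; 5->1)"))  # [(1, 2), (2, 5), (5, 1)]
```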
Additional Exercises (practice on your own)
Give a CFG to describe the set of symmetric strings over {a,b}
Give a CFG to describe the set of strings over {a,b} that have the same number of a’s and b’s
Give a CFG for the syntax of regular expressions over {0,1}
  For example, “0|1”, “0”, (01|10) are in the language; “0|” and “*0” are not in the language
Can you give a CFG to describe the set of strings that have the format xx, where x is an arbitrary string over {a,b}?