Syntax and Semantics of Programming Languages: Understanding CS3723, Study notes of Programming Languages

An introduction to the fundamentals of programming languages, focusing on syntax and semantics. Syntax refers to the rules and symbols used to write legal programs, while semantics deals with the meaning of these programs. The notes cover how to describe language syntax with lexical and grammar rules, why syntax needs a formal description, and how to describe semantics with informal and formal methods. They also discuss context-free grammars and their role in defining language aspects.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-th3

Fundamentals

Syntax and Semantics of Programming Languages

Syntax and Semantics

 Syntax

 The symbols and rules to write legal programs

 Semantics

 The meaning of legal programs

 Programming language implementation

 Syntax −> semantics

 Translate program syntax into machine actions

 Example: date specification

 Syntax

 date ::= dd/dd/dddd, where d ::= 0|1|2|3|4|5|6|7|8|9

 Semantics

 01/02/2005 => Jan 02, 2005 (or Feb 01, 2005)?

Why Describe Syntax?

 A translator/compiler needs to understand programs via syntax analysis

 Syntax analysis must be implemented in C/C++/Java, etc.

 Why does the syntax need to be formally defined?

 Regular expressions and BNFs are themselves languages for describing language syntax

 Supports communication between programmers and translators/compilers

 Supports automated generation and validation of syntax analyzers

Every configurable automation requires an interface language

Describing Semantics

 Informal definitions

 Tutorials: learn by working examples

 Reference Manuals

 Natural language explanation for each syntax rule

 Formal definitions (skip)

 Attribute grammars

 Associate attributes (values) with each grammar symbol

 Associate semantics rules with each grammar rule

 Operational semantics

 Interpret the language on an abstract machine or using another language

 Denotational semantics

 Define language constructs as mathematical functions

 Axiomatic semantics (proof rules)

 Define properties (invariants) of language constructs

 Goal: communication, automation and validation
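As a small illustration of the attribute-grammar idea, the sketch below attaches one attribute, `val`, to each node of a parse tree and computes it with one semantic rule per grammar rule. The grammar `e ::= num | e + e`, the tuple encoding of tree nodes and the attribute name are assumptions chosen for the example, not definitions from the course.

```python
# Parse tree nodes are tuples: ("num", text) or ("add", left, right).
# Hypothetical semantic rules, one per grammar rule:
#   e ::= num        e.val = int(num.text)
#   e ::= e1 + e2    e.val = e1.val + e2.val

def val(node):
    """Compute the synthesized attribute 'val' bottom-up."""
    kind = node[0]
    if kind == "num":
        return int(node[1])
    if kind == "add":
        return val(node[1]) + val(node[2])
    raise ValueError(f"unknown node kind: {kind}")

# Tree for 5 + (15 + 20)
tree = ("add", ("num", "5"), ("add", ("num", "15"), ("num", "20")))
print(val(tree))  # 40
```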

What language aspects can be defined by CFG?

 Can we use CFG to define more than just syntax?

 e ::= num | string | id | e+e

 Supports both alternatives (|) and recursion

 Cannot incorporate context information

 Cannot determine the type of variable names

 Declaration of variables is in the context (symbol table)

 Cannot ensure variables are always defined before they are used

int w;

0 = w;

for (w = 1; w < 100; w = 2w)

a = “c” + 3;

a = “c” + w
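Checks that depend on context are typically done after parsing, with a symbol table. The sketch below is one hypothetical way to detect use-before-declaration; the statement encoding (`"declare"` and `"use"` tuples) and the function name are invented for illustration and stand in for whatever a real front end would produce.

```python
def check_declared_before_use(statements):
    """Return the names that are used before any declaration of them."""
    symbol_table = {}   # name -> declared type (the "context")
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, type_name, name = stmt
            symbol_table[name] = type_name
        elif stmt[0] == "use":
            name = stmt[1]
            if name not in symbol_table and name not in errors:
                errors.append(name)
    return errors

program = [
    ("declare", "int", "w"),   # int w;
    ("use", "w"),              # w is declared, so this use is fine
    ("use", "a"),              # a = ...  but a was never declared
]
print(check_declared_before_use(program))  # ['a']
```

The grammar accepts both uses equally; only the table built from earlier declarations tells them apart.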

Derivations and Parse Trees

(Semantics of CFG)

 Derivation: top-down replacement of non-terminals

 Each replacement follows a production rule

 One or more derivations for each program

 Derivations for 5 + 15 * 20

 e => e*e => e+e*e => n+e*e => d+e*e => 5+e*e => 5+n*e => 5+nd*e => 5+dd*e => 5+1d*e => 5+15*e => … => 5+15*20

 e => e+e => … => 5+e => 5+e*e => … => 5+15*e => … => 5+15*20

Parse trees: [figure not preserved: two parse trees for 5 + 15 * 20, built from e, n and d nodes]

Abstract vs. Concrete Syntax

 Concrete syntax: the syntax that programmers write

 Example: different notations of expressions

 Prefix + 5 * 15 20

 Infix 5 + 15 * 20

 Postfix 5 15 20 * +

 Abstract syntax: the program structure recognized by compilers/interpreters

 Identifies only the meaningful components

 What is the operation and which are the operands?

[Figure not preserved: parse tree and abstract syntax tree for 5 + 15 * 20]
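To see that the three notations are different concrete syntaxes for one abstract structure, the sketch below prints a single tree in prefix, infix and postfix form. The tuple encoding `(operator, left, right)` is an assumption made for the example; the infix printer adds full parentheses so that the structure stays explicit.

```python
# AST nodes: an int leaf, or a tuple (operator, left, right).
AST = ("+", 5, ("*", 15, 20))

def prefix(node):
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"{op} {prefix(left)} {prefix(right)}"

def infix(node):
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"({infix(left)} {op} {infix(right)})"

def postfix(node):
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"{postfix(left)} {postfix(right)} {op}"

print(prefix(AST))   # + 5 * 15 20
print(infix(AST))    # (5 + (15 * 20))
print(postfix(AST))  # 5 15 20 * +
```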


Abstract syntax trees

 Condensed form of parse tree for representing language constructs

 Operators and keywords do not appear as leaves

 They define the meaning of the interior (parent) node

 Chains of single productions may be collapsed

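[Figure not preserved: the parse tree for S ::= IF B THEN S1 ELSE S2 condenses to an if-then-else node with children B, S1 and S2; the parse tree for E ::= E + T condenses to a + node]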
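The condensing steps listed above can be sketched as a small tree transformation. The parse-tree encoding used here (a tuple of a non-terminal name followed by its children, with tokens as plain strings) and the function name `to_ast` are assumptions for illustration, and the sketch only handles single productions and binary-operator productions.

```python
def to_ast(node):
    """Condense a parse tree into an abstract syntax tree."""
    if isinstance(node, str):            # a token (leaf of the parse tree)
        return node
    symbol, *children = node
    if len(children) == 1:               # chain of single productions: collapse
        return to_ast(children[0])
    if len(children) == 3:               # left operator right
        left, operator, right = children
        # The operator labels the interior node instead of being a leaf.
        return (operator, to_ast(left), to_ast(right))
    raise ValueError(f"unsupported production for {symbol}")

# Parse tree for 5 + 15 * 20 under E ::= E + T | T, T ::= T * F | F, F ::= n
parse_tree = (
    "E",
    ("E", ("T", ("F", "5"))),
    "+",
    ("T", ("T", ("F", "15")), "*", ("F", "20")),
)
print(to_ast(parse_tree))  # ('+', '5', ('*', '15', '20'))
```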

Ambiguous Grammars

 A grammar is syntactically ambiguous if

 some program has multiple parse trees

 Multiple choices of production rules during derivation

 Consequence of multiple parse trees

 Parse trees are used to interpret programs

 Multiple ways to interpret a program

[Figure not preserved: two parse trees for 5 + 15 * 20]
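The consequence of ambiguity can be made concrete by enumerating every parse tree. The sketch below assumes the grammar e ::= n | e + e | e * e, treats each number as a single token, and expects a well-formed token list that alternates numbers and operators; the function names are invented for the example.

```python
def parse_trees(tokens):
    """All parse trees for tokens under e ::= n | e + e | e * e."""
    if len(tokens) == 1:
        return [tokens[0]]                     # e ::= n
    trees = []
    for i in range(1, len(tokens), 2):         # operators sit at odd positions
        op = tokens[i]
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((op, left, right))
    return trees

def evaluate(tree):
    """Interpret a tree: the value depends on the tree's shape."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r

trees = parse_trees([5, "+", 15, "*", 20])
for tree in trees:
    print(tree, "=", evaluate(tree))
# ('+', 5, ('*', 15, 20)) = 305
# ('*', ('+', 5, 15), 20) = 400
```

Two trees give two different values for the same program text, which is why the grammar has to be disambiguated.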

Rewrite Ambiguous Grammars

 Solution 1: introduce precedence and associativity rules to dictate the choice of production rules

 Original grammar: e ::= n | e+e | e-e | e*e | e/e

 Precedence and associativity

 * and / have higher precedence than + and -; all operators are left associative

 Derivation for n+n*n

 e => e+e => n+e => n+e*e => n+n*e => n+n*n

 Solution 2: rewrite production rules by introducing additional non-terminals

 Alternative grammar:

E ::= E + T | E - T | T

T ::= T * F | T / F | F

F ::= n

 Derivation for n + n * n

 E => E+T => T+T => F+T => n+T => n+T*F => n+F*F => n+n*F => n+n*n

 How to modify the grammar if

 + and - have higher precedence than * and /?

 All operators are right associative?
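The E/T/F grammar of Solution 2 can be sketched as a hand-written recursive-descent parser. This is one possible implementation, not the course's: because a recursive-descent parser cannot call itself on a left-recursive rule, each left-recursive rule is written as a loop, which still builds left-associative trees. The tokenizer and all function names are assumptions for the example.

```python
import re

def tokenize(text):
    """Split text into number and operator tokens (other characters are skipped)."""
    return re.findall(r"[0-9]+|[-+*/]", text)

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_F():                        # F ::= n
        token = peek()
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected a number, found {token!r}")
        return int(advance())

    def parse_T():                        # T ::= T * F | T / F | F
        node = parse_F()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, parse_F())  # grows to the left: left associative
        return node

    def parse_E():                        # E ::= E + T | E - T | T
        node = parse_T()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, parse_T())
        return node

    tree = parse_E()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("5 + 15 * 20"))  # ('+', 5, ('*', 15, 20))
print(parse("8 - 3 - 2"))    # ('-', ('-', 8, 3), 2)
```

Because `parse_E` calls `parse_T` for its operands, every * or / is grouped before any + or -, which is how the layered grammar encodes precedence.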

Additional Exercises

 Give a context-free grammar for a small graph description language

 Terminals: digits ('0', '1', ..., '9'), '(', ')', ';' and '->'

 Each node of the graph is represented by an integer number

 Each edge is represented by a pair of nodes connected with '->'

 E.g., 3->4 is an edge from node 3 to node 4

 Each graph description is a sequence of edges

 E.g., ( 1->2; 2->5; 5->1)

 Write a parse tree and an abstract syntax tree for ( 1->2; 2->5; 5->1)

Additional Exercises

(practice on your own)

 Give a CFG to describe the set of symmetric strings over {a,b}

 Give a CFG to describe the set of strings over {a,b} that have the same number of a’s and b’s

 Give a CFG for the syntax of regular expressions over {0,1}. For example:

 “0|1”, “0” and “(01|10)” are in the language

 “0|” and “*0” are not in the language

 Can you give a CFG to describe the set of strings that have the format xx, where x is an arbitrary string over {a,b}?