Syntactic Analysis-Compiler Construction-Lecture Notes, Study notes of Compiler Construction

Compile is software which translates high level programming language to computer basic language. Many compilers exist e.g. Borland, Java etc. Compiler construction have few steps which are taught in this course. This lecture includes: Syntatic, Analysis, Errors, C , Int, Return, Boolean, Expression, Parser, Tokens, Semantic

Typology: Study notes

2011/2012

Uploaded on 08/04/2012

qader.khan
qader.khan 🇵🇰

4.3

(6)

12 documents

1 / 3

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
L
Le
ec
ct
tu
ur
re
e
1
11
1
Syntactic Analysis
Consider the following C++ function. There are a number of syntax errors present.
1. int* foo(int i, int j))
2. {
3. for(k=0; i j; )
4. fi( i > j )
5. return j;
6. }
Line 1 has extra parenthesis at the end. The boolean expression in the for loop in line 3 is
incorrect. Line 4 has a missing semicolon at the end. All such errors are due to the fact
the function does not abide by the syntax of the C++ language grammar.
Semantic Analysis
Consider the English language sentence “He wrote the computer”. The sentence is
syntactically correct but semantically wrong. The meaning of the sentence is incorrect;
one does not “write” a computer. Issues related to meaning fall under the heading of
semantic analysis. The following C++ function has semantic errors. The type of the local
variable sum has not been declared. The returned value does not match the return value
type of the function (int*). The function is syntactically correct.
int* foo(int i, int j)
{
for(k=0; i < j; j++ )
if( i < j-2 )
sum = sum+i
return sum;
}
Role of the Parser
Not all sequences of tokens are program. Parser must distinguish between valid and
invalid sequences of tokens. What we need is an expressive way to describe the syntax of
programs and an acceptor mechanism that determines if input token stream satisfies the
syntax of the programming language. The acceptor mechanism determines if input token
stream satisfies the syntax of a programming language.
Parsing is the process of discovering a derivation for some sentence of a language. The
mathematical model of syntax is represented by a grammar G. The language generated
pf3

Partial preview of the text

Download Syntactic Analysis-Compiler Construction-Lecture Notes and more Study notes Compiler Construction in PDF only on Docsity!

LeLeccttuurree 1 111

Syntactic Analysis

Consider the following C++ function. There are a number of syntax errors present.

*1. int foo(int i, int j))

  1. {
  2. for(k=0; i j; )
  3. fi( i > j )
  4. return j;
  5. }**

Line 1 has extra parenthesis at the end. The boolean expression in the for loop in line 3 is incorrect. Line 4 has a missing semicolon at the end. All such errors are due to the fact the function does not abide by the syntax of the C++ language grammar.

Semantic Analysis

Consider the English language sentence “He wrote the computer”. The sentence is syntactically correct but semantically wrong. The meaning of the sentence is incorrect; one does not “write” a computer. Issues related to meaning fall under the heading of semantic analysis. The following C++ function has semantic errors. The type of the local variable sum has not been declared. The returned value does not match the return value type of the function ( int* ). The function is syntactically correct.

int foo(int i, int j) { for(k=0; i < j; j++ ) if( i < j-2 ) sum = sum+i return sum; }*

Role of the Parser

Not all sequences of tokens are program. Parser must distinguish between valid and invalid sequences of tokens. What we need is an expressive way to describe the syntax of programs and an acceptor mechanism that determines if input token stream satisfies the syntax of the programming language. The acceptor mechanism determines if input token stream satisfies the syntax of a programming language.

Parsing is the process of discovering a derivation for some sentence of a language. The mathematical model of syntax is represented by a grammar G. The language generated

by the grammar is indicated by L(G). Syntax of most programming languages can be represented by Context Free Grammars (CFG).

A CFG is a four tuple G=(S,N,T,P)

  1. S is the _start symbol
  2. N_ is a set of _non-terminals
  3. T_ is a set of _terminals
  4. P_ is a set of productions

Why aren’t Regular Expressions used to represent syntax? The reasons is that regular languages do not have enough power to express syntax of programming languages. Moreover, finite automaton can’t remember number of times it has visited a particular state.

Consider the following example of CFG

SheepNoise? SheepNoise baa | baa

This CFG defines the set of noises sheep make. We can use the SheepNoise grammar to create sentences of the language. We use the productions as rewriting rules

Rule Sentential Form

- SheepNoise 1 SheepNoise baa 1 SheepNoise baa baa 2 baa baa baa

While it is cute, this example quickly runs out intellectual steam. To explore uses of CFGs, we need a more complex grammar. Consider the grammar for arithmetic expressions:

1 expr? expr op expr 2 | num 3 | id 4 op? + 5 | – 6 | * 7 | /

Grammar rules in a similar form were first used in the description of the Algol- programming language. The syntax of C, C++ and Java is derived heavily from Algol-60.