Languages, Regular Expression - Compiler Construction - Lecture Notes, Study notes of Compiler Construction

Quaid-i-Azam University (QAU), Compiler Construction

Topics covered in this lecture: languages, alphabets, sets of strings of characters, finite sequences of characters, regular expressions, finite automata, sets of transitions, and sets of accepting states.

Typology: Study notes

2011/2012

Uploaded on 11/06/2012

asim.amjid 🇵🇰


Sohail Aslam Compiler Construction Notes

Lecture 6

How to Describe Tokens?

Regular languages are the most popular formalism for specifying tokens because

• they are based on a simple and useful theory,
• they are easy to understand, and
• efficient implementations exist for generating lexical analysers from such languages.

Languages

Let Σ be a set of characters. Σ is called the alphabet. A language over Σ is a set of strings of characters drawn from Σ. Here are some examples of languages:

• Alphabet = English characters
  Language = English sentences
• Alphabet = ASCII
  Language = C++, Java, C# programs

Languages are sets of strings (finite sequences of characters). We need some notation for specifying which sets we want. For lexical analysis we care about regular languages. Regular languages can be described using regular expressions. Each regular expression is a notation for a regular language (a set of words). If A is a regular expression, we write L(A) to refer to the language denoted by A.

Regular Expression

A regular expression (RE) is defined inductively:

a      an ordinary character from Σ
ε      the empty string
R|S    either R or S
RS     R followed by S (concatenation)
R*     concatenation of R zero or more times (R* = ε|R|RR|RRR|...)
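The inductive definition can be read as a recursive procedure. The sketch below is one possible illustration, not part of the original notes: the class names (Char, Eps, Alt, Cat, Star) and the function language are hypothetical, and because L(R*) is infinite the function only lists the strings of L(R) up to a chosen length.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical node types, one per rule of the inductive definition.
@dataclass(frozen=True)
class Char:          # a: an ordinary character from the alphabet
    c: str

@dataclass(frozen=True)
class Eps:           # ε: the empty string
    pass

@dataclass(frozen=True)
class Alt:           # R|S: either R or S
    r: "Regex"
    s: "Regex"

@dataclass(frozen=True)
class Cat:           # RS: R followed by S
    r: "Regex"
    s: "Regex"

@dataclass(frozen=True)
class Star:          # R*: zero or more repetitions of R
    r: "Regex"

Regex = Union[Char, Eps, Alt, Cat, Star]

def language(r, max_len):
    """Return the strings of L(r) whose length is at most max_len (>= 0)."""
    if isinstance(r, Char):
        return {r.c} if max_len >= 1 else set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Alt):
        return language(r.r, max_len) | language(r.s, max_len)
    if isinstance(r, Cat):
        left = language(r.r, max_len)
        right = language(r.s, max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if isinstance(r, Star):
        # Drop ε from the base so every round adds strictly longer strings.
        base = language(r.r, max_len) - {""}
        result = {""}
        frontier = {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression node: {r!r}")

# (ab)* restricted to strings of length <= 4
print(sorted(language(Star(Cat(Char("a"), Char("b"))), 4)))  # ['', 'ab', 'abab']
```

Each branch of language mirrors one row of the definition, so the code has the same five cases as the notation.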

Regular expression extensions are used as convenient notation for complex REs:

R?       ε|R (zero or one R)
R+       RR* (one or more R)
(R)      R (grouping)
[abc]    a|b|c (any of the listed characters)
[a-z]    a|b|...|z (range)
[^ab]    c|d|... (anything but 'a' or 'b')
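Each extension is only shorthand for an expression built from the core operators. As a rough check, the sketch below uses Python's standard re module to compare each extended form with its expansion on every string up to a small length. The helper name same_language and the three-letter test alphabet are assumptions made for illustration, and a bounded comparison like this is evidence, not a proof of equivalence.

```python
import re
from itertools import product

def same_language(extended, core, alphabet="abc", max_len=3):
    """Compare two patterns on every string over `alphabet` up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            w = "".join(chars)
            if bool(re.fullmatch(extended, w)) != bool(re.fullmatch(core, w)):
                return False
    return True

# In Python's re syntax an empty alternative plays the role of ε.
pairs = [
    (r"a?", r"(|a)"),        # R?    = ε|R
    (r"a+", r"aa*"),         # R+    = RR*
    (r"(a)", r"a"),          # (R)   = R
    (r"[abc]", r"a|b|c"),    # [abc] = a|b|c
    (r"[a-c]", r"a|b|c"),    # [a-c] = a|b|c
    (r"[^ab]", r"c"),        # holds only because the test alphabet is {a, b, c}
]

for extended, core in pairs:
    print(extended, "vs", core, "->", same_language(extended, core))
```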

Here are some Regular Expressions and the strings of the language denoted by the RE.

RE        Strings in L(RE)
a         "a"
ab        "ab"
a|b       "a" "b"
(ab)*     "" "ab" "abab" ...
(a|ε)b    "ab" "b"

Here are examples of common tokens found in programming languages.

digit         '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
integer       digit digit*
identifier    [a-zA-Z_][a-zA-Z0-9_]*
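These token definitions carry over almost directly to a regular expression library. The following minimal sketch uses Python's standard re module. The function name classify and the order in which the patterns are tried are assumptions for illustration, not something prescribed by the notes.

```python
import re

# Token patterns transcribed from the definitions above.
DIGIT = r"[0-9]"                          # '0'|'1'|...|'9'
INTEGER = DIGIT + DIGIT + "*"             # digit digit*
IDENTIFIER = r"[a-zA-Z_][a-zA-Z0-9_]*"

def classify(lexeme):
    """Return the token class whose pattern matches the whole lexeme, or None."""
    if re.fullmatch(INTEGER, lexeme):
        return "integer"
    if re.fullmatch(IDENTIFIER, lexeme):
        return "identifier"
    return None

for lexeme in ["42", "x1", "_tmp", "1x"]:
    print(repr(lexeme), "->", classify(lexeme))
```

re.fullmatch is used rather than re.match so that the whole lexeme must belong to the language, which is why "1x" is rejected by both patterns.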

Finite Automaton

We need a mechanism to determine whether an input string w belongs to L(R), the language denoted by the regular expression R. Such a mechanism is called an acceptor.

[Figure: an acceptor takes an input string w and a language L, and answers yes if w ∈ L or no if w ∉ L.]

The acceptor is based on Finite Automata (FA). A Finite Automaton consists of

• An input alphabet Σ
• A set of states
• A start (initial) state
• A set of transitions
• A set of accepting (final) states

A finite automaton accepts a string if we can follow transitions labeled with the characters of the string, in order, from the start state to some accepting state. Here are some examples of FA.

An FA that accepts only "1"

An FA that accepts any number of 1's followed by a single 0
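The five components of a finite automaton and the acceptance rule can be sketched in code as follows. This is an illustration under stated assumptions: the class name FiniteAutomaton, the state names "A" and "B", and the choice of a deterministic automaton (at most one transition per state and character) are not taken from the notes, whose diagrams are not reproduced in this text.

```python
from dataclasses import dataclass

@dataclass
class FiniteAutomaton:
    alphabet: frozenset      # input alphabet Σ
    states: frozenset        # set of states
    start: str               # start (initial) state
    transitions: dict        # (state, character) -> next state
    accepting: frozenset     # accepting (final) states

    def accepts(self, w):
        """Follow one transition per character; accept if we end in a final state."""
        state = self.start
        for ch in w:
            key = (state, ch)
            if key not in self.transitions:
                return False          # no transition to follow: reject
            state = self.transitions[key]
        return state in self.accepting

# An FA that accepts only "1" (state names are hypothetical).
only_one = FiniteAutomaton(
    alphabet=frozenset("01"),
    states=frozenset({"A", "B"}),
    start="A",
    transitions={("A", "1"): "B"},
    accepting=frozenset({"B"}),
)

# An FA that accepts any number of 1's followed by a single 0.
ones_then_zero = FiniteAutomaton(
    alphabet=frozenset("01"),
    states=frozenset({"A", "B"}),
    start="A",
    transitions={("A", "1"): "A", ("A", "0"): "B"},
    accepting=frozenset({"B"}),
)

print(only_one.accepts("1"), only_one.accepts("11"))               # True False
print(ones_then_zero.accepts("110"), ones_then_zero.accepts("1"))  # True False
```

Storing the transitions in a dictionary keyed by (state, character) keeps each step of accepts a single lookup, and a missing key doubles as the rejection case.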