DFA Minimization - Compiler Construction - Lecture Notes, Study notes of Compiler Construction

Topics in this lecture: DFA minimization, lexical analyzers, lexical analyzer generators, using Flex, DFAs with a large number of states, Hopcroft's algorithm, groups of equivalent states, and the optimized acceptor.

Typology: Study notes

2011/2012

Uploaded on 11/06/2012

asim.amjid

Sohail Aslam Compiler Construction Notes
Lecture 9
DFA Minimization
The generated DFA may have a large number of states. Hopcroft's algorithm can be
used to minimize the number of DFA states. The idea behind the algorithm is to find
groups of equivalent states: for each input symbol, all transitions from states in one
group G1 go to states in the same group G2. Construct the minimized DFA such that
there is one state for each group of states from the initial DFA. Here is the minimized
version of the DFA created earlier; states A and C have been merged.
[Figure: minimized DFA with states {A,C}, B, D, and E; transitions are labeled a and b.]
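The grouping idea can be sketched in C as follows. This is a simple iterative partition refinement, not Hopcroft's worklist-based algorithm, and the five-state transition table is an assumption: it is the usual textbook DFA for (a|b)*abb, which may differ from the DFA built earlier in these notes. The names `delta`, `accepting`, and `minimize` are illustrative only.

```c
#include <stdbool.h>

#define NSTATES 5   /* states A..E encoded as 0..4 (assumed example DFA) */
#define NSYMS   2   /* input symbols: 0 = 'a', 1 = 'b' */

/* Assumed transition table for a DFA accepting (a|b)*abb. */
static const int delta[NSTATES][NSYMS] = {
    /* A */ {1, 2},
    /* B */ {1, 3},
    /* C */ {1, 2},
    /* D */ {1, 4},
    /* E */ {1, 2},
};
static const bool accepting[NSTATES] = {false, false, false, false, true};

/* Fills group[s] with the group number of state s and returns the
   number of groups. Two states stay in the same group only if they
   were in the same group before and, for every input symbol, their
   successors were in the same group too. */
int minimize(int group[NSTATES])
{
    int ngroups = 0, prev;

    /* Initial partition: accepting states versus all the others. */
    for (int s = 0; s < NSTATES; s++)
        group[s] = accepting[s] ? 1 : 0;

    do {
        int next[NSTATES];
        prev = ngroups;
        ngroups = 0;
        for (int s = 0; s < NSTATES; s++) {
            next[s] = -1;
            /* Look for an earlier state that s is still equivalent to. */
            for (int t = 0; t < s && next[s] < 0; t++) {
                bool same = group[s] == group[t];
                for (int c = 0; c < NSYMS && same; c++)
                    same = group[delta[s][c]] == group[delta[t][c]];
                if (same)
                    next[s] = next[t];
            }
            if (next[s] < 0)
                next[s] = ngroups++;   /* s starts a new group */
        }
        for (int s = 0; s < NSTATES; s++)
            group[s] = next[s];
    } while (ngroups != prev);         /* stop when no group was split */

    return ngroups;
}
```

On this assumed table the refinement ends with four groups, and A and C share a group, which agrees with the merge described above. Each group then becomes one state of the minimized DFA.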
We can construct an optimized acceptor with the following structure:
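[Figure: acceptor pipeline. The regular expression R passes through the stages RE=>NFA, NFA=>DFA, and Min. DFA. The Simulate DFA stage then runs the minimized DFA on the input string w and answers yes if w ∈ L(R), no if w ∉ L(R).]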
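The last stage, simulating the DFA on the input string, can be sketched as a table-driven loop in C. The table below is an assumption: it encodes a four-state minimized DFA for (a|b)*abb with {A,C} as the start state and E as the only accepting state, which may not match the figure exactly. The names `delta` and `accepts` are illustrative.

```c
#include <stdbool.h>

/* States of the assumed minimized DFA. */
enum { S_AC, S_B, S_D, S_E, NSTATES };

/* delta[state][symbol], where symbol 0 is 'a' and symbol 1 is 'b'. */
static const int delta[NSTATES][2] = {
    [S_AC] = {S_B, S_AC},
    [S_B]  = {S_B, S_D},
    [S_D]  = {S_B, S_E},
    [S_E]  = {S_B, S_AC},
};

/* Returns true if w is in the language of the DFA, false otherwise. */
bool accepts(const char *w)
{
    int state = S_AC;                    /* assumed start state */
    for (; *w != '\0'; w++) {
        if (*w != 'a' && *w != 'b')
            return false;                /* symbol outside the alphabet */
        state = delta[state][*w == 'b']; /* one table lookup per character */
    }
    return state == S_E;                 /* assumed accepting state */
}
```

The loop does constant work per input character, so the running time depends only on the length of w and not on the size of the regular expression.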
Lexical Analyzers
Lexical analyzers (scanners) use the same mechanism, but they have multiple RE
descriptions, one for each kind of token, and read a character stream as input. The
lexical analyzer returns a sequence of matching tokens as output (or an error), and it
always returns the longest matching token.
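A minimal C sketch of the longest-match rule follows. It does not use DFAs: each token pattern is a small hand-written matcher, which is a simplification made here for brevity. The token codes, the matcher functions, and `next_token` are all hypothetical names. Ties between patterns that match the same length go to the pattern listed first, which is also how Flex resolves them.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes, listed in priority order. */
enum { TOK_ERROR = 0, TOK_ASSIGN, TOK_EQ, TOK_IF, TOK_ID, TOK_NUM };

/* Each matcher returns the length of the longest prefix of s that it
   accepts, or 0 if it accepts no prefix. */
static size_t match_lit(const char *s, const char *lit)
{
    size_t n = strlen(lit);
    return strncmp(s, lit, n) == 0 ? n : 0;
}

static size_t match_id(const char *s)
{
    size_t n = 0;
    if (isalpha((unsigned char)s[0]) || s[0] == '_')
        for (n = 1; isalnum((unsigned char)s[n]) || s[n] == '_'; n++)
            ;
    return n;
}

static size_t match_num(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns the code of the token at the start of s and stores its
   length in *len. Returns TOK_ERROR with *len == 0 if nothing matches. */
int next_token(const char *s, size_t *len)
{
    /* lens[t] is the match length for token code t. */
    size_t lens[] = {
        0,
        match_lit(s, "="),
        match_lit(s, "=="),
        match_lit(s, "if"),
        match_id(s),
        match_num(s),
    };
    int best = TOK_ERROR;

    *len = 0;
    for (int t = TOK_ASSIGN; t <= TOK_NUM; t++) {
        if (lens[t] > *len) {   /* strict '>' lets the earlier rule win ties */
            *len = lens[t];
            best = t;
        }
    }
    return best;
}
```

With these rules the input "==x" yields TOK_EQ of length 2 rather than two TOK_ASSIGN tokens, "iffy" yields a single TOK_ID of length 4, and "if(" yields TOK_IF because the keyword rule precedes the identifier rule.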

Lexical Analyzer Generators

The process of constructing a lexical analyzer can be automated. We only need to specify regular expressions for the tokens and rules for assigning priorities when more than one pattern matches. For example, on the input “==” both “==” and “=” match, and “==” is preferred because it is longer.

Two popular lexical analyzer generators are

  • Flex: generates a lexical analyzer in C or C++. It is a more modern version of the original Lex tool that was part of the AT&T Bell Labs version of Unix.
  • JLex: written in Java. It generates a lexical analyzer in Java.

Using Flex

We will use Flex for the projects in this course. To use Flex, one has to provide a specification file as input. Flex reads this file and produces an output file that contains the lexical analyzer source in C or C++.

The input specification file consists of three sections:

C or C++ and flex definitions
%%
token definitions and actions
%%
user code

The symbols “%%” separate the sections. A detailed guide to Flex is included in the supplementary reading material for this course. We will go through a simple example.

The following is the Flex specification file for recognizing tokens found in a C++ function. The file is named “lex.l”; it is customary to use the “.l” extension for Flex input files.

Specification file lex.l:

%{
#include "tokdefs.h"
%}
D       [0-9]
L       [a-zA-Z_]
id      {L}({L}|{D})*
%%
"void"  {return(TOK_VOID);}
"int"   {return(TOK_INT);}
"if"    {return(TOK_IF);}
"else"  {return(TOK_ELSE);}
"while" {return(TOK_WHILE);}
"<="    {return(TOK_LE);}