FLAT Theorems, Lecture notes of Theory of Formal Languages for Automata

Formal language and automata theory

Typology: Lecture notes

2011/2012

Uploaded on 09/19/2012

courageouscse

FORMAL LANGUAGES AND AUTOMATA THEORY

NFA without $\epsilon$-moves and an equivalent DFA

Theorem 1.1: Let $L$ be a set accepted by a nondeterministic finite automaton without $\epsilon$-moves. Then there exists a deterministic finite automaton that accepts $L$.

Proof:

Let $M = (Q, \Sigma, \delta, q_0, F)$ be an NFA (without $\epsilon$-moves) accepting $L$.

Construct a DFA $M' = (Q', \Sigma, \delta', q_0', F')$ as follows. The states of $M'$ are all the subsets of $Q$, i.e. $Q' = 2^Q$. A subset $\{q_0, q_1, \ldots, q_n\}$ will be denoted by $[q_0, q_1, \ldots, q_n]$ as a state of $M'$.

The initial state of $M'$ is $q_0' = [q_0]$.

$F'$ is the set of states in $Q'$ containing a final state of $M$.

We also define $\delta'([q_0, q_1, \ldots, q_i], a) = [p_0, p_1, \ldots, p_j]$ iff $\delta(\{q_0, q_1, \ldots, q_i\}, a) = \{p_0, p_1, \ldots, p_j\}$.

We shall now prove that $M'$ is a DFA such that $L(M) = L(M')$. For this, we use induction on the length of the input string.

First, we prove that the transitions in $M'$ are exactly the same as the transitions in $M$ on any input string $x$:

$\delta'(q_0', x) = [q_1, q_2, \ldots, q_i]$ iff $\delta(q_0, x) = \{q_1, q_2, \ldots, q_i\}$   (I)

where $x$ is any input string.

Basis (for length of string = 0, i.e. for the input string $\epsilon$): The result is obvious, since in $M'$ the transition is $\delta'(q_0', \epsilon) = \delta'([q_0], \epsilon) = [q_0]$ and in $M$ the transition is $\delta(q_0, \epsilon) = \{q_0\}$.

Induction: Let us assume that statement (I) is true for input strings of length up to $m$. Let $xa$ be a string of length $m+1$ with $a$ in $\Sigma$. Then

$\delta'(q_0', xa) = \delta'(\delta'(q_0', x), a)$.

By our assumption, since $x$ is a string of length $m$,

$\delta'(q_0', x) = [p_1, p_2, \ldots, p_i]$ iff $\delta(q_0, x) = \{p_1, p_2, \ldots, p_i\}$.

By definition of $\delta'$,

$\delta'([p_1, p_2, \ldots, p_i], a) = [r_1, r_2, \ldots, r_k]$ iff $\delta(\{p_1, p_2, \ldots, p_i\}, a) = \{r_1, r_2, \ldots, r_k\}$.

Thus $\delta'(q_0', xa) = \delta'(\delta'(q_0', x), a) = \delta'([p_1, p_2, \ldots, p_i], a) = [r_1, r_2, \ldots, r_k]$
iff $\delta(q_0, xa) = \delta(\delta(q_0, x), a) = \delta(\{p_1, p_2, \ldots, p_i\}, a) = \{r_1, r_2, \ldots, r_k\}$.

Hence if statement (I) is true for strings of length up to $m$, it is also true for strings of length $m+1$. Thus statement (I) holds for every input string $x$.

Further, $\delta'(q_0', x)$ is in $F'$ exactly when $\delta(q_0, x)$ contains a state in $F$. So $M'$ accepts $x$ if and only if $M$ accepts $x$.

Therefore, L(M) = L(M’).
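The subset construction used in this proof can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: unlike the proof, which takes all of $2^Q$ as the state set, the sketch builds only the subsets reachable from $[q_0]$, and the example NFA (strings over {0,1} ending in "01") and all function names are assumptions made for the demonstration.

```python
from itertools import chain

def nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction, building only the reachable subsets.

    delta maps (state, symbol) to a set of states; a missing key means
    the empty set. DFA states are frozensets of NFA states.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            # delta'([q0..qi], a) is the union of delta(q, a) over the subset
            target = frozenset(
                chain.from_iterable(delta.get((q, a), ()) for q in subset)
            )
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # F' = subsets containing at least one final state of the NFA
    dfa_finals = {s for s in seen if s & set(finals)}
    return dfa_delta, start_set, dfa_finals

def dfa_accepts(dfa_delta, start_set, dfa_finals, word):
    state = start_set
    for a in word:
        state = dfa_delta[(state, a)]
    return state in dfa_finals

# Hypothetical example NFA: strings over {0, 1} that end in "01"
nfa_delta = {("p", "0"): {"p", "q"}, ("p", "1"): {"p"}, ("q", "1"): {"r"}}
dfa = nfa_to_dfa(nfa_delta, "p", {"r"}, "01")
```

For this example only three of the eight possible subsets are reachable, which is why practical implementations usually explore subsets on demand.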

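NFA with $\epsilon$-moves and a corresponding NFA without $\epsilon$-moves

Theorem 1.2: If $L$ is accepted by an NFA with $\epsilon$-transitions, then $L$ is accepted by an NFA without $\epsilon$-transitions.

Proof: Let $M = (Q, \Sigma, \delta, q_0, F)$ be an NFA with $\epsilon$-transitions, and let $\hat\delta$ denote its extended transition function, which takes $\epsilon$-closures into account.

Construct an NFA $M' = (Q, \Sigma, \delta', q_0, F')$ where

$F' = F \cup \{q_0\}$ if $\epsilon$-closure$(q_0)$ contains a state of $F$, and $F' = F$ otherwise,

and $\delta'(q, a) = \hat\delta(q, a)$ for $q$ in $Q$ and $a$ in $\Sigma$. There is no move in $M'$ on $\epsilon$.

We assert that $M'$ is an NFA without $\epsilon$-moves such that $L(M) = L(M')$.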

There are two parts to the proof of the theorem.
(i) The first part is to prove that, on an input string $x$, the subsets of states reached by $M$ and $M'$ are one and the same, i.e. $\delta'(q_0, x) = \hat\delta(q_0, x)$ for $|x| \ge 1$.
(ii) The second part is to prove that $M'$ accepts any string $x$ (including $\epsilon$) iff $M$ accepts it, i.e. $\delta'(q_0, x)$ contains a state of $F'$ if and only if $\hat\delta(q_0, x)$ contains a state of $F$.

Part I of the proof: We shall first prove, by induction on the length $|x|$ of the input string, that

$\delta'(q_0, x) = \hat\delta(q_0, x)$.   (I)

Basis: The statement may not be true for $x = \epsilon$, i.e. for a string of length 0, since $\delta'(q_0, \epsilon) = \{q_0\}$ whereas $\hat\delta(q_0, \epsilon) = \epsilon$-closure$(q_0)$. Therefore we start our induction from length 1. For $x = a$ with $a$ in $\Sigma$ the result is obvious, since $\delta'(q_0, a) = \hat\delta(q_0, a)$ by definition.

Induction:

Let us assume that statement (I) is true for input strings of length $m \ge 1$. Let $xa$ be a string of length $m+1$ with $a$ in $\Sigma$. Then $\delta'(q_0, xa) = \delta'(\delta'(q_0, x), a)$. By our assumption, since $x$ is a string of length $m$, $\delta'(q_0, x) = \hat\delta(q_0, x)$. So $\delta'(q_0, xa) = \delta'(P, a)$ where $P = \hat\delta(q_0, x)$. We now have to show that $\delta'(P, a) = \hat\delta(q_0, xa)$. But

$\delta'(P, a) = \bigcup_{q \in P} \delta'(q, a) = \bigcup_{q \in P} \hat\delta(q, a)$.

Since $P = \hat\delta(q_0, x)$, we have $\bigcup_{q \in P} \hat\delta(q, a) = \hat\delta(q_0, xa)$. Thus $\delta'(q_0, xa) = \hat\delta(q_0, xa)$.

Hence if statement (I) is true for strings of length $m$, then it is true for strings of length $m+1$ also. Thus we establish the truth of statement (I).
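The construction of $M'$ can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: $\epsilon$ is represented by the key symbol `None`, the function names are invented for the example, and the example automaton (accepting 0*1*2*) is a standard textbook one rather than something taken from these notes.

```python
def eps_closure(states, delta):
    """All states reachable from `states` using only epsilon moves (symbol None)."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, None), ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure

def delta_hat(q, a, delta):
    """Extended transition on one symbol: closure, step on a, closure again."""
    step = set()
    for p in eps_closure({q}, delta):
        step |= set(delta.get((p, a), ()))
    return eps_closure(step, delta)

def remove_eps(states, alphabet, delta, start, finals):
    """Return (delta', F') of the epsilon-free NFA M'."""
    new_delta = {(q, a): delta_hat(q, a, delta) for q in states for a in alphabet}
    new_finals = set(finals)
    if eps_closure({start}, delta) & set(finals):
        new_finals.add(start)          # F' = F U {q0}
    return new_delta, new_finals

def nfa_accepts(delta, start, finals, word):
    current = {start}
    for a in word:
        current = set().union(*(delta.get((q, a), set()) for q in current))
    return bool(current & finals)

# Example epsilon-NFA for 0*1*2* (assumed for illustration)
states = {"q0", "q1", "q2"}
eps_delta = {
    ("q0", "0"): {"q0"}, ("q0", None): {"q1"},
    ("q1", "1"): {"q1"}, ("q1", None): {"q2"},
    ("q2", "2"): {"q2"},
}
nfa_delta, nfa_finals = remove_eps(states, "012", eps_delta, "q0", {"q2"})
```

Note that $q_0$ becomes final here because its $\epsilon$-closure reaches $q_2$, which is exactly the case that makes the empty string accepted by $M'$.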

Part II of the proof:

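[Figure: transition diagram with Start, $q_0$ and $q_1$.]

For each $a$ in $\Sigma$:

[Figure: transition diagram, Start → $q_0$, with an arc labeled $a$ from $q_0$ to $q_1$.]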

Hence the theorem is true when the number of operators is zero.

Induction: Let us assume that the theorem is true for regular expressions with fewer than $m$ operators, $m \ge 1$. Let $r$ have $m$ operators. Then, depending on the form of $r$, we have one of three cases: $r = r_1 + r_2$, $r = r_1 r_2$ or $r = r_1^*$. In all these cases, since $r_1$ and $r_2$ have fewer than $m$ operators, we have NFAs $M_1$, $M_2$ with $\epsilon$-moves, each having one final state and no transitions out of that final state, such that $L(M_1) = L(r_1)$ and $L(M_2) = L(r_2)$. Let $M_1$ be $(Q_1, \Sigma_1, \delta_1, q_1, \{f_1\})$ and $M_2$ be $(Q_2, \Sigma_2, \delta_2, q_2, \{f_2\})$. Without loss of generality, we may assume that $Q_1$ and $Q_2$ are disjoint.

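Case when $r = r_1 + r_2$

Construct $M = (Q_1 \cup Q_2 \cup \{q_0, f_0\}, \Sigma_1 \cup \Sigma_2, \delta, q_0, \{f_0\})$, where $q_0$ and $f_0$ are new states and
$\delta(q_0, \epsilon) = \{q_1, q_2\}$,
$\delta(q, a) = \delta_1(q, a)$ for $q$ in $Q_1 - \{f_1\}$ and $a$ in $\Sigma_1 \cup \{\epsilon\}$,
$\delta(q, a) = \delta_2(q, a)$ for $q$ in $Q_2 - \{f_2\}$ and $a$ in $\Sigma_2 \cup \{\epsilon\}$,
$\delta(f_1, \epsilon) = \delta(f_2, \epsilon) = \{f_0\}$.

The construction is depicted in the following diagram.

[Figure: NFA $M$. Start → $q_0$, which branches into $M_1$ (from $q_1$ to $f_1$) and $M_2$ (from $q_2$ to $f_2$); both $f_1$ and $f_2$ lead to $f_0$.]

It follows immediately that there is a path labeled $x$ in $M$ from $q_0$ to $f_0$ iff there is a path labeled $x$ in $M_1$ from $q_1$ to $f_1$ or a path labeled $x$ in $M_2$ from $q_2$ to $f_2$.

Hence $L(M) = L(M_1) \cup L(M_2)$.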

Case when $r = r_1 r_2$

Construct $M = (Q_1 \cup Q_2, \Sigma_1 \cup \Sigma_2, \delta, q_1, \{f_2\})$ where
$\delta(q, a) = \delta_1(q, a)$ for $q$ in $Q_1 - \{f_1\}$ and $a$ in $\Sigma_1 \cup \{\epsilon\}$,
$\delta(q, a) = \delta_2(q, a)$ for $q$ in $Q_2$ and $a$ in $\Sigma_2 \cup \{\epsilon\}$,
$\delta(f_1, \epsilon) = \{q_2\}$.

The construction is depicted in the following diagram.

[Figure: NFA $M$. Start → $q_1$, through $M_1$ to $f_1$, then to $q_2$, through $M_2$ to $f_2$.]

It follows immediately that there is a path labeled $xy$ in $M$ from $q_1$ to $f_2$ iff there is a path labeled $x$ in $M_1$ from $q_1$ to $f_1$ and a path labeled $y$ in $M_2$ from $q_2$ to $f_2$. Hence $L(M) = L(M_1) L(M_2)$.

Case when $r = r_1^*$

Construct $M = (Q_1 \cup \{q_0, f_0\}, \Sigma_1, \delta, q_0, \{f_0\})$, where $q_0$ and $f_0$ are new states and
$\delta(q_0, \epsilon) = \delta(f_1, \epsilon) = \{q_1, f_0\}$,
$\delta(q, a) = \delta_1(q, a)$ for $q$ in $Q_1 - \{f_1\}$ and $a$ in $\Sigma_1 \cup \{\epsilon\}$.

The construction is depicted in the following diagram.

[Figure: NFA $M$. Start → $q_0$, then $q_1$, through $M_1$ to $f_1$, then $f_0$.]

Any path in $M$ from $q_0$ to $f_0$ consists of either the path from $q_0$ to $f_0$ labeled $\epsilon$, or a path from $q_0$ to $q_1$ labeled $\epsilon$, followed by a path from $q_1$ to $f_1$, followed by any number (possibly zero) of repetitions of a path from $f_1$ to $q_1$ labeled $\epsilon$ and a path from $q_1$ to $f_1$, followed by the path from $f_1$ to $f_0$ on $\epsilon$. Thus there is a path in $M$ from $q_0$ to $f_0$ labeled $x$ iff $x = \epsilon$ or $x = x_1 x_2 \ldots x_j$ for some $j \ge 1$ such that each $x_i$ is in $L(M_1)$. Hence $L(M) = (L(M_1))^*$.

Hence the theorem is true for all m.
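The three inductive cases translate directly into a recursive builder. The sketch below is an illustration under stated assumptions: regular expressions are given as nested tuples (a representation chosen for the example), only the single-symbol basis case is handled, states are fresh integers, and `None` stands for $\epsilon$.

```python
from itertools import count

_fresh = count()  # supplies new state names 0, 1, 2, ...

def build(r, delta):
    """Return (start, final) of an epsilon-NFA for r, adding moves to delta.

    r is ("sym", a), ("union", r1, r2), ("cat", r1, r2) or ("star", r1).
    delta maps (state, symbol) to a set of states.
    """
    def add(p, a, q):
        delta.setdefault((p, a), set()).add(q)

    kind = r[0]
    if kind == "sym":                      # basis: q0 --a--> f0
        q0, f0 = next(_fresh), next(_fresh)
        add(q0, r[1], f0)
        return q0, f0
    if kind == "union":                    # case r = r1 + r2
        q1, f1 = build(r[1], delta)
        q2, f2 = build(r[2], delta)
        q0, f0 = next(_fresh), next(_fresh)
        add(q0, None, q1); add(q0, None, q2)
        add(f1, None, f0); add(f2, None, f0)
        return q0, f0
    if kind == "cat":                      # case r = r1 r2
        q1, f1 = build(r[1], delta)
        q2, f2 = build(r[2], delta)
        add(f1, None, q2)
        return q1, f2
    if kind == "star":                     # case r = r1*
        q1, f1 = build(r[1], delta)
        q0, f0 = next(_fresh), next(_fresh)
        for p in (q0, f1):
            add(p, None, q1); add(p, None, f0)
        return q0, f0
    raise ValueError(f"unknown regex node: {kind!r}")

def closure(states, delta):
    seen, stack = set(states), list(states)
    while stack:
        for p in delta.get((stack.pop(), None), ()):
            if p not in seen:
                seen.add(p); stack.append(p)
    return seen

def matches(r, word):
    delta = {}
    start, final = build(r, delta)
    current = closure({start}, delta)
    for a in word:
        step = set()
        for q in current:
            step |= delta.get((q, a), set())
        current = closure(step, delta)
    return final in current

# (a + b)* c, written as a nested tuple
regex = ("cat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "c"))
```

Each case adds at most two states, and the final state of every sub-automaton has no outgoing moves until the enclosing case attaches its $\epsilon$-moves, matching the invariant assumed in the induction.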

is in L. Further, n is no greater than the number of states of the smallest FA accepting L.

Proof: Assume that $L$ is a regular set. Therefore, it is accepted by some DFA $M = (Q, \Sigma, \delta, q_0, F)$. Let $n$ be the number of states in $Q$. Consider a word $z$ in $L$ of $n$ or more symbols. Let $z = a_1 a_2 \ldots a_m$, $m \ge n$. Let $\delta(q_0, a_1 a_2 \ldots a_i) = p_i$ for $i = 1, 2, \ldots, m$, and denote $q_0$ by $p_0$. Since there are only $n$ different states, it is not possible for the $n+1$ states $p_0, p_1, \ldots, p_n$ to be all distinct. Thus there are two integers $j$ and $k$ such that $p_j = p_k$, where $0 \le j < k \le n$. Let $u = a_1 a_2 \ldots a_j$; $v = a_{j+1} \ldots a_k$; $w = a_{k+1} \ldots a_m$, so that $\delta(q_0, u) = p_j$; $\delta(p_j, v) = p_k$; $\delta(p_k, w) = p_m$. Further, $p_j = p_k$.

We have now found a constant $n$ such that any word $z$ in $L$ with $|z| \ge n$ can be written as $z = uvw$, with $|uv| \le n$ and $|v| \ge 1$. The path on the input string $z = a_1 a_2 \ldots a_m$ in the transition diagram of $M$ is as illustrated below.

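[Figure: transition diagram of $M$. Start → $p_0 = q_0$, a path labeled $u = a_1 a_2 \ldots a_j$ to $p_j = p_k$, a loop labeled $v = a_{j+1} \ldots a_k$ at $p_j = p_k$, and a path labeled $w = a_{k+1} \ldots a_m$ to $p_m \in F$.]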

Since $z$ is in $L(M)$, $p_m$ is an accepting state of $M$. Now, the path leading to the accepting state could equally be along the string $a_1 a_2 \ldots a_j a_{k+1} \ldots a_m$ (i.e. $uw$), avoiding the part $a_{j+1} \ldots a_k$ (i.e. $v$), since $\delta(q_0, uw) = \delta(\delta(q_0, u), w) = \delta(p_j, w) = \delta(p_k, w) = p_m$, an accepting state of $M$.

Therefore, $uv^0w = uw$ also is in $L(M)$.

Similarly, the path could go around the loop $v$ $(= a_{j+1} \ldots a_k)$ any number of times, since for $i \ge 1$

$\delta(\delta(q_0, u), v^i w) = \delta(p_j, v^i w) = \delta(\delta(p_j, v), v^{i-1} w) = \delta(p_k, v^{i-1} w) = \delta(p_j, v^{i-1} w) = \ldots = \delta(p_k, w) = p_m \in F$.

Thus, the string $uv^iw$ also is in $L(M)$ for any $i \ge 0$.

Hence the theorem.
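The decomposition $z = uvw$ found in the proof can be computed by tracing the DFA and stopping at the first repeated state. The following is a small sketch; the example DFA (binary numbers divisible by 3, with $n = 3$ states) and the function names are assumptions chosen for illustration.

```python
def pump_split(delta, start, z):
    """Split z = u v w at the first repeated state p_j = p_k on the DFA path."""
    seen = {start: 0}          # state -> index at which it was first visited
    state = start
    for k, a in enumerate(z, start=1):
        state = delta[(state, a)]
        if state in seen:      # p_k equals an earlier p_j
            j = seen[state]
            return z[:j], z[j:k], z[k:]
        seen[state] = k
    raise ValueError("no state repeats: z is shorter than the number of states")

def accepts(delta, start, finals, word):
    state = start
    for a in word:
        state = delta[(state, a)]
    return state in finals

# Hypothetical DFA with n = 3 states: state r is the value read so far, mod 3
delta = {(r, b): (2 * r + int(b)) % 3 for r in range(3) for b in "01"}
u, v, w = pump_split(delta, 0, "1001")   # "1001" is 9 in binary
```

Here the path is $p_0 = 0$, $p_1 = 1$, $p_2 = 2$, $p_3 = 1$, so $j = 1$, $k = 3$ and the split is $u$ = "1", $v$ = "00", $w$ = "1", with $|uv| = 3 \le n$.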

Theorem 2.3: Let $G = (V, T, P, S)$ be a context free grammar. Then $A \Rightarrow^* \alpha$ if and only if there is an A-tree in $G$ with yield $\alpha$.

Proof:

If part: Let $\alpha$ be the yield of an A-tree. We prove that $A \Rightarrow^* \alpha$ by induction on the number of interior vertices in the tree.

Basis: If there is only one interior vertex, the tree must look like the one below:

[Figure: tree with root $A$ and children $X_1, X_2, X_3, \ldots, X_n$.]

In this case, $X_1 X_2 \ldots X_n$ must be $\alpha$, and $A \to \alpha$ must be a production of $P$ by the definition of a derivation tree.

Induction: Assume that the result is true for trees with up to $k-1$ interior vertices, for some $k > 1$.

Let $\alpha$ be the yield of an A-tree with $k$ interior vertices. Consider the sons of the root $A$. Not all of them can be leaves. Let the labels of the sons in order be $X_1, X_2, \ldots, X_n$. Then $A \to X_1 X_2 \ldots X_n$ must be a production of $G$. The $i$-th child, labeled $X_i$, is either a leaf or an interior vertex. If the $i$-th child is a leaf, let $\alpha_i = X_i$. If the $i$-th child is not a leaf, then it is the root of an $X_i$-tree, a proper subtree of the A-tree with at most $k-1$ interior vertices; let $\alpha_i$ be its yield.

Therefore, $X_i \Rightarrow^* \alpha_i$ for each $i$. Putting all these partial derivations of the $X_i$'s together, we have

$A \Rightarrow X_1 X_2 \ldots X_n \Rightarrow^* \alpha_1 X_2 \ldots X_n \Rightarrow^* \alpha_1 \alpha_2 X_3 \ldots X_n \Rightarrow^* \ldots \Rightarrow^* \alpha_1 \alpha_2 \ldots \alpha_n = \alpha$.

Thus $A \Rightarrow^* \alpha$.
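The "if" direction is constructive: it reads a derivation off the tree by expanding the root and then the subtrees from left to right. A short sketch follows, assuming trees are stored as `(label, children)` pairs with single-character labels; the example grammar ($S \to aAS \mid a$, $A \to SbA \mid ba$) is a textbook example chosen for illustration, not one from these notes.

```python
def tree_yield(tree):
    """Concatenate the leaf labels from left to right."""
    label, children = tree
    if not children:
        return label
    return "".join(tree_yield(c) for c in children)

def derivation(tree):
    """Sentential forms A => X1...Xn =>* a1 X2...Xn =>* ... =>* a1 a2 ... an."""
    label, children = tree
    if not children:
        return [label]
    labels = [c[0] for c in children]
    forms = [label, "".join(labels)]          # apply A -> X1 X2 ... Xn
    prefix = ""
    for i, child in enumerate(children):
        rest = "".join(labels[i + 1:])
        # splice the derivation X_i =>* alpha_i between prefix and rest
        for form in derivation(child)[1:]:
            forms.append(prefix + form + rest)
        prefix += tree_yield(child)
    return forms

def leaf(symbol):
    return (symbol, [])

# S-tree for the hypothetical grammar S -> aAS | a, A -> SbA | ba
tree = ("S", [
    leaf("a"),
    ("A", [("S", [leaf("a")]), leaf("b"), ("A", [leaf("b"), leaf("a")])]),
    ("S", [leaf("a")]),
])
```

For this tree, `derivation(tree)` produces the forms S, aAS, aSbAS, aabAS, aabbaS, aabbaa, and the last form equals `tree_yield(tree)`.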

Only If part:

Let $A \Rightarrow^* \alpha$. We have to prove that there is an A-tree with yield $\alpha$.

Basis: If $A \Rightarrow \alpha = X_1 X_2 \ldots X_n$ in one step, then $A \to X_1 X_2 \ldots X_n$ is a production.

Then the tree with root $A$ and $X_1, X_2, \ldots, X_n$ as children of the root is the A-tree with yield $\alpha$.

Induction:

Equivalence of acceptance by final states and empty stack.

There are two definitions of acceptance of an input string by a PDA: acceptance by final state and acceptance by empty stack. We shall now prove that both are equivalent.

Theorem 3.1. If $L$ is $L(M_2)$ for some PDA $M_2$, then $L$ is $N(M_1)$ for some PDA $M_1$.

(In other words, if $L$ is the language accepted by final state by some PDA $M_2$, then it is $N(M_1)$, the language accepted by empty stack by some PDA $M_1$.)

Proof: Let $M_2 = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ be a PDA, accepting by final state, such that $L(M_2) = L$. Construct $M_1 = (Q \cup \{q_0', q_e\}, \Sigma, \Gamma \cup \{X_0\}, \delta', q_0', X_0, \emptyset)$ where $\delta'$ is defined as follows:

  1. $\delta'(q_0', \epsilon, X_0) = \{(q_0, Z_0 X_0)\}$;
  2. $\delta'(q, a, Z)$ includes all the elements of $\delta(q, a, Z)$ for $q$ in $Q$, $a$ in $\Sigma \cup \{\epsilon\}$ and $Z$ in $\Gamma$;
  3. $\delta'(q, \epsilon, Z)$ contains $(q_e, \epsilon)$ for all $q$ in $F$ and $Z$ in $\Gamma \cup \{X_0\}$;
  4. $\delta'(q_e, \epsilon, Z)$ contains $(q_e, \epsilon)$ for all $Z$ in $\Gamma \cup \{X_0\}$.

Rule 1 puts $M_1$ into the initial configuration of $M_2$, with $X_0$ kept as a bottom-of-stack marker below $Z_0$, before $M_1$ starts to simulate $M_2$. Rule 2 lets $M_1$ simulate all the moves of $M_2$. Rule 3 ensures that whenever a final state of $M_2$ is reached, $M_1$ has the choice of entering the 'erasing state' $q_e$, thus accepting the input read so far, or of continuing to simulate $M_2$. Rule 4 ensures that once the erasing state is entered, the entire stack of $M_1$ is emptied.

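To prove that $L(M_2) = N(M_1)$:

Let $x$ be in $L(M_2)$. Therefore $(q_0, x, Z_0) \vdash^* (q, \epsilon, \gamma)$ in $M_2$ for some $q$ in $F$ and $\gamma$ in $\Gamma^*$.

Since all moves of $M_2$ are legal moves of $M_1$, we have $(q_0', x, X_0) \vdash (q_0, x, Z_0 X_0) \vdash^* (q, \epsilon, \gamma X_0)$ for some $q$ in $F$, and then $(q, \epsilon, \gamma X_0) \vdash^* (q_e, \epsilon, \epsilon)$ by rules 3 and 4. Thus $M_1$ accepts $x$ whenever $M_2$ accepts $x$.

Conversely, let $M_1$ accept $x$ by empty stack. We will prove that $M_2$ also accepts $x$.

The sequence of moves can only be as follows: $(q_0', x, X_0) \vdash (q_0, x, Z_0 X_0) \vdash^* (q, \epsilon, \gamma X_0)$ for some $q$ in $F$, followed by $(q, \epsilon, \gamma X_0) \vdash^* (q_e, \epsilon, \epsilon)$, since $X_0$ can be popped only by rules 3 and 4, and $q_e$ can be entered only from a state in $F$. Therefore, $(q_0, x, Z_0 X_0) \vdash^* (q, \epsilon, \gamma X_0)$ for some $q$ in $F$. Hence $(q_0, x, Z_0) \vdash^* (q, \epsilon, \gamma)$ in $M_2$ for some $q$ in $F$. Hence the theorem.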
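Rules 1 to 4 can be tried out on a small machine. The sketch below is an illustration only: stack symbols are single characters, stacks are strings with the top at the left, the names `"q0'"`, `"qe"` and `"X"` are assumed not to clash with the given PDA, and the example $M_2$ (accepting $a^n b^n$, $n \ge 1$, by final state) is invented for the demonstration. The simulator is a bounded breadth-first search, not an efficient recognizer.

```python
from collections import deque

EPS = ""  # epsilon: empty input symbol, or empty string of stack symbols

def run(delta, start, stack0, x, accept, limit=10_000):
    """Breadth-first search over configurations (state, input position, stack).

    delta maps (state, input symbol or EPS, top symbol) to a set of
    (next state, string replacing the top symbol); the leftmost character
    of that string becomes the new top. accept(state, stack) is checked
    once the whole input has been read.
    """
    first = (start, 0, stack0)
    seen, queue = {first}, deque([first])
    while queue and len(seen) < limit:
        q, i, stack = queue.popleft()
        if i == len(x) and accept(q, stack):
            return True
        if not stack:
            continue                      # no move is possible on an empty stack
        top, rest = stack[0], stack[1:]
        moves = [(i, m) for m in delta.get((q, EPS, top), ())]
        if i < len(x):
            moves += [(i + 1, m) for m in delta.get((q, x[i], top), ())]
        for j, (p, push) in moves:
            cfg = (p, j, push + rest)
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False

def final_to_empty(delta, gamma, q0, z0, finals):
    """Build delta' of M1 from M2, following rules 1 to 4."""
    new = {key: set(val) for key, val in delta.items()}          # rule 2
    new[("q0'", EPS, "X")] = {(q0, z0 + "X")}                    # rule 1
    for z in list(gamma) + ["X"]:
        for q in finals:
            new.setdefault((q, EPS, z), set()).add(("qe", EPS))  # rule 3
        new.setdefault(("qe", EPS, z), set()).add(("qe", EPS))   # rule 4
    return new, "q0'", "X"

# Hypothetical M2 accepting a^n b^n (n >= 1) by final state "f"
m2 = {
    ("p", "a", "Z"): {("p", "AZ")}, ("p", "a", "A"): {("p", "AA")},
    ("p", "b", "A"): {("q", EPS)}, ("q", "b", "A"): {("q", EPS)},
    ("q", EPS, "Z"): {("f", "Z")},
}
by_final = lambda q, stack: q == "f"
by_empty = lambda q, stack: stack == ""
m1, m1_start, m1_bottom = final_to_empty(m2, "ZA", "p", "Z", {"f"})
```

Running `run(m2, "p", "Z", w, by_final)` and `run(m1, m1_start, m1_bottom, w, by_empty)` on the same word `w` should always give the same answer, which is the content of the theorem for this example.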

Theorem 3.2.

If $L$ is $N(M_1)$ for some PDA $M_1$, then $L$ is $L(M_2)$ for some PDA $M_2$.

(In other words, if $L$ is the language accepted by empty stack by some PDA $M_1$, then it is $L(M_2)$, the language accepted by final state by some PDA $M_2$.)

Proof: Let $M_1 = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, \emptyset)$ be a PDA, accepting by empty stack, such that $N(M_1) = L$.

Construct $M_2 = (Q \cup \{q_0', q_f\}, \Sigma, \Gamma \cup \{X_0\}, \delta', q_0', X_0, \{q_f\})$ where $\delta'$ is defined as follows:

  1. $\delta'(q_0', \epsilon, X_0) = \{(q_0, Z_0 X_0)\}$;
  2. $\delta'(q, a, Z)$ includes all the elements of $\delta(q, a, Z)$ for $q$ in $Q$, $a$ in $\Sigma \cup \{\epsilon\}$ and $Z$ in $\Gamma$;
  3. $\delta'(q, \epsilon, X_0)$ contains $(q_f, \epsilon)$ for all $q$ in $Q$.

Note that

  1. Rule 1 places $X_0$ as a bottom-of-stack marker below $Z_0$ before starting to simulate $M_1$.
  2. Rule 2 simulates all the moves of $M_1$.
  3. Rule 3 ensures that whenever $M_1$ has emptied its own stack, so that $X_0$ is exposed on top of the stack of $M_2$, the final state of $M_2$ is entered, thus accepting the input read so far.

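To prove that $L(M_2) = N(M_1)$: Let $x$ be in $N(M_1)$. Therefore $(q_0, x, Z_0) \vdash^* (q, \epsilon, \epsilon)$ in $M_1$ for some $q$ in $Q$. Since all moves of $M_1$ are legal moves of $M_2$, we have $(q_0', x, X_0) \vdash (q_0, x, Z_0 X_0) \vdash^* (q, \epsilon, X_0) \vdash (q_f, \epsilon, \epsilon)$ in $M_2$.

Thus $M_2$ accepts $x$ whenever $M_1$ accepts $x$.

Conversely, let $M_2$ accept $x$. We will prove that $M_1$ also accepts $x$. The sequence of moves can only be as follows: $(q_0', x, X_0) \vdash (q_0, x, Z_0 X_0) \vdash^* (q, \epsilon, X_0) \vdash (q_f, \epsilon, \epsilon)$ for some $q$ in $Q$, since $q_f$ can be entered only by rule 3. Therefore, $(q_0, x, Z_0 X_0) \vdash^* (q, \epsilon, X_0)$ for some $q$ in $Q$. Hence $(q_0, x, Z_0) \vdash^* (q, \epsilon, \epsilon)$ in $M_1$. Hence the theorem.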
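The transition table of $M_2$ can be assembled mechanically from that of $M_1$. A minimal sketch, assuming stacks are strings with the top at the left, that the names `"q0'"`, `"qf"` and `"X"` are fresh, and that rule 3 fires on the bottom marker $X_0$ (the symbol exposed exactly when $M_1$ has emptied its own stack); the example $M_1$ is invented for illustration.

```python
EPS = ""  # epsilon

def empty_to_final(delta, states, q0, z0):
    """Build (delta', start state, start symbol, final states) of M2 from M1."""
    new = {key: set(val) for key, val in delta.items()}        # rule 2
    new[("q0'", EPS, "X")] = {(q0, z0 + "X")}                  # rule 1
    for q in states:
        new.setdefault((q, EPS, "X"), set()).add(("qf", EPS))  # rule 3
    return new, "q0'", "X", {"qf"}

# Hypothetical M1 accepting a^n b^n (n >= 1) by empty stack
m1 = {
    ("p", "a", "Z"): {("p", "A")}, ("p", "a", "A"): {("p", "AA")},
    ("p", "b", "A"): {("q", EPS)}, ("q", "b", "A"): {("q", EPS)},
}
m2, m2_start, m2_bottom, m2_finals = empty_to_final(m1, {"p", "q"}, "p", "Z")
```

Since $M_1$ never sees $X_0$, the only moves defined on $X_0$ in $M_2$ are the initial move and the moves into $q_f$.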

Construction of a PDA for a context free language L.

Theorem 3.3. If $L$ is a context free language, then there exists a PDA $M$ such that $L$ is $N(M)$.

Proof: Let $G = (V, T, P, S)$ be a context free grammar in Greibach normal form such that $L(G) = L$. Construct the PDA $M = (\{q\}, T, V, \delta, q, S, \emptyset)$, where $\delta(q, a, A)$ contains $(q, \gamma)$ whenever $A \to a\gamma$ is in $P$.
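Because the PDA has a single state and every move consumes one input symbol, it can be simulated by tracking the set of possible stacks. The sketch below assumes single-character variable names and a dictionary encoding of the productions; the example grammar ($S \to aSB \mid aB$, $B \to b$, generating $a^n b^n$ for $n \ge 1$) is an assumption made for illustration.

```python
def gnf_accepts(productions, start, word):
    """Simulate the one-state PDA built from a Greibach-normal-form grammar.

    productions maps a variable A to a list of (a, gamma) pairs, one per
    production A -> a gamma, where gamma is a (possibly empty) string of
    variables. Stacks are strings with the top at the left; acceptance is
    by empty stack after the whole input has been read.
    """
    stacks = {start}                       # stacks reachable on the input read so far
    for a in word:
        nxt = set()
        for stack in stacks:
            if not stack:
                continue                   # empty stack: no further move is possible
            top, rest = stack[0], stack[1:]
            for b, gamma in productions.get(top, ()):
                if b == a:                 # delta(q, a, A) contains (q, gamma)
                    nxt.add(gamma + rest)
        stacks = nxt
    return "" in stacks

# Hypothetical GNF grammar for a^n b^n, n >= 1
grammar = {"S": [("a", "SB"), ("a", "B")], "B": [("b", "")]}
```

At every step the stack contents are the variables still to be rewritten in a leftmost derivation, which is why the PDA accepts exactly the words the grammar generates.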

Theorem 3.4. If $L$ is $N(M)$ for some PDA $M$, then $L$ is a context free language.

Proof: Let $M = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, \emptyset)$ be the PDA.

Then $G = (V, \Sigma, P, S)$ is a context free grammar such that $L(G) = N(M)$, where $V = \{[q, A, p] \mid q \text{ and } p \text{ in } Q \text{ and } A \text{ in } \Gamma\} \cup \{S\}$ and $P$ is the set of productions given by the following rules:

  1. $S \to [q_0, Z_0, q]$ for each $q$ in $Q$;
  2. If $\delta(q, a, A)$ contains $(q_1, B_1 B_2 \ldots B_m)$, then $[q, A, q_{m+1}] \to a [q_1, B_1, q_2] [q_2, B_2, q_3] \ldots [q_m, B_m, q_{m+1}]$ for each $q, q_1, q_2, \ldots, q_{m+1}$ in $Q$, each $a$ in $\Sigma \cup \{\epsilon\}$, and $A, B_1, B_2, \ldots, B_m$ in $\Gamma$;
  3. If $\delta(q, a, A)$ contains $(q_1, \epsilon)$, then the production is $[q, A, q_1] \to a$.

The variables and productions are defined in such a way that a leftmost derivation in $G$ of a sentence $x$ simulates the PDA $M$ on the input $x$.

The variable $[q, A, p]$ stands for the requirement that the PDA, starting from state $q$ and ending in state $p$, has to pop and erase $A$ from the top of the stack, and be ready to pop the next lower symbol in the stack, by a sequence of moves.

Therefore, the PDA must be started in state $q_0$, with only the starting symbol $Z_0$ on the stack, ready to empty the stack and to enter any state. Thus we provide the productions $S \to [q_0, Z_0, q]$ for each $q$ in $Q$.

Further, if $\delta(q, a, A)$ contains $(q_1, B_1 B_2 \ldots B_m)$, then the system can erase $A$ from the top of the stack, starting from state $q$ and ending up in state $q_1$, on scanning $a$ on the input tape, but will push the string of stack symbols $B_1 B_2 \ldots B_m$. Therefore, the necessity to erase $A$ has to be replaced by the necessity to remove all the symbols $B_1, B_2, \ldots, B_m$ successively from the top of the stack, by sequences of moves. Hence $[q, A, q_{m+1}] \to a [q_1, B_1, q_2] [q_2, B_2, q_3] \ldots [q_m, B_m, q_{m+1}]$ for each $q, q_1, q_2, \ldots, q_{m+1}$ in $Q$, each $a$ in $\Sigma \cup \{\epsilon\}$, and $A, B_1, B_2, \ldots, B_m$ in $\Gamma$.

Further, if $\delta(q, a, A)$ contains $(q_1, \epsilon)$, then the system can erase $A$ from the top of the stack, starting from state $q$ and ending up in state $q_1$, on scanning $a$ on the input tape, without pushing any string onto the stack. Hence $[q, A, q_1] \to a$ whenever $\delta(q, a, A)$ contains $(q_1, \epsilon)$.

Thus the leftmost derivations of the grammar so designed simulate all and only the sequences of moves by which $M$ empties its stack. Hence $L(G) = N(M)$.
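The three production rules can be generated mechanically from the transition table. The following sketch is illustrative: variables $[q, A, p]$ are represented as Python triples, pushed strings as tuples of stack symbols (top first), $\epsilon$ as the empty string, and the example PDA (accepting $0^n 1^n$, $n \ge 1$, by empty stack) is an assumption for the demonstration.

```python
from itertools import product

def pda_to_cfg(states, delta, q0, z0):
    """Return the productions of G as (head, body) pairs.

    delta maps (q, a, A) to a set of (q1, pushed), where a is an input
    symbol or "" for epsilon and pushed is a tuple (B1, ..., Bm).
    A body is a tuple mixing terminal strings and variable triples.
    """
    prods = [("S", ((q0, z0, q),)) for q in states]              # rule 1
    for (q, a, A), moves in delta.items():
        terminal = (a,) if a else ()
        for q1, pushed in moves:
            m = len(pushed)
            if m == 0:
                prods.append(((q, A, q1), terminal))             # rule 3
                continue
            for rest in product(states, repeat=m):               # choices of q2 .. q_{m+1}
                chain = (q1,) + rest
                body = tuple(
                    (chain[i], pushed[i], chain[i + 1]) for i in range(m)
                )
                prods.append(((q, A, chain[m]), terminal + body))  # rule 2
    return prods

# Hypothetical PDA accepting 0^n 1^n (n >= 1) by empty stack
states = ["q0", "q1"]
delta = {
    ("q0", "0", "Z"): {("q0", ("X", "Z"))},
    ("q0", "0", "X"): {("q0", ("X", "X"))},
    ("q0", "1", "X"): {("q1", ())},
    ("q1", "1", "X"): {("q1", ())},
    ("q1", "", "X"): {("q1", ())},
    ("q1", "", "Z"): {("q1", ())},
}
prods = pda_to_cfg(states, delta, "q0", "Z")
```

Rule 2 produces $|Q|^m$ productions for every move that pushes $m$ symbols, so many generated variables are useless; a practical implementation would follow this with removal of useless symbols.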

Normal form theorems

Theorem 4.1 (Chomsky normal form):

Any context free language $L$ without $\epsilon$ is generated by a grammar in which all productions are of the form $A \to BC$ or $A \to a$, where $A$, $B$ and $C$ are variables and $a$ is a terminal.

Proof: Step 1: Eliminate all $\epsilon$-productions, unit productions and useless symbols from the CFG generating $L$, and let $G = (V, T, P, S)$ be the resulting grammar.

Step 2: Consider the productions in $P$ that contain more than one symbol on the right hand side. Consider any such production of the form $A \to X_1 X_2 \ldots X_n$, for $n > 1$. If $X_i$ is a terminal $a$, replace $X_i$ by a new variable $C_a$ and introduce a new production $C_a \to a$, which is in acceptable form. After this, the productions with more than one symbol on the right hand side are all of the form $A \to B_1 B_2 \ldots B_n$, for $n > 1$, where all the $B_i$'s are variables.

Let the modified grammar be $G' = (V', T, P', S)$, where $V'$ is the union of $V$ and the newly introduced variables and $P'$ is the modified set of productions.

Step 3: The productions of $G'$ are of the form $A \to a$ or $A \to B_1 B_2 \ldots B_n$, for $n > 1$. For each production of $P'$ of the form $A \to B_1 B_2 \ldots B_n$ with $n > 2$, we create new variables $D_1, D_2, \ldots, D_{n-2}$ and replace the production $A \to B_1 B_2 \ldots B_n$ with the set of productions $\{A \to B_1 D_1,\ D_1 \to B_2 D_2,\ D_2 \to B_3 D_3,\ \ldots,\ D_{n-2} \to B_{n-1} B_n\}$. The new set of variables is $V''$ and the new set of productions is $P''$. Now $G'' = (V'', T, P'', S)$ is a CFG in Chomsky normal form. It is easily seen that $L(G'') = L(G') = L(G)$.
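Steps 2 and 3 are purely mechanical and can be sketched as below. This is an illustration under assumptions: step 1 is taken as already done, productions are `(head, body)` pairs with the body a tuple of symbols, and the generated names `C_a` and `D1, D2, ...` are assumed not to clash with existing variables. The example grammar ($S \to aSb \mid ab$) is chosen for the demonstration.

```python
def to_cnf(productions, terminals):
    """Apply steps 2 and 3 to a grammar without epsilon- and unit productions."""
    out, step2 = [], []
    term_var = {}                                   # terminal a -> variable C_a
    # Step 2: in bodies longer than one symbol, replace each terminal a by C_a
    for head, body in productions:
        if len(body) > 1:
            new_body = []
            for x in body:
                if x in terminals:
                    if x not in term_var:
                        term_var[x] = f"C_{x}"
                        out.append((term_var[x], (x,)))   # C_a -> a
                    x = term_var[x]
                new_body.append(x)
            body = tuple(new_body)
        step2.append((head, body))
    # Step 3: break A -> B1 B2 ... Bn (n > 2) into a chain of binary productions
    counter = 0
    for head, body in step2:
        while len(body) > 2:
            counter += 1
            d = f"D{counter}"
            out.append((head, (body[0], d)))        # A -> B1 D1, D1 -> B2 D2, ...
            head, body = d, body[1:]
        out.append((head, body))
    return out

# Hypothetical grammar S -> aSb | ab
cnf = to_cnf([("S", ("a", "S", "b")), ("S", ("a", "b"))], {"a", "b"})
```

For this grammar the result is $C_a \to a$, $C_b \to b$, $S \to C_a D_1$, $D_1 \to S C_b$ and $S \to C_a C_b$.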

The pumping lemma for CFLs

Theorem 4.2. Let $L$ be a CFL. Then there exists a constant $n$, depending on $L$, such that, if $z$ is in $L$ and $|z| \ge n$, then we may write $z = uvwxy$ such that

  1. $|vx| \ge 1$;
  2. $|vwx| \le n$; and
  3. for all $i \ge 0$, $uv^iwx^iy$ is in $L$.

Proof: First we prove the following lemma.

Lemma: If the parse tree of a word generated by a grammar in Chomsky normal form has no path of length greater than $i$, then the word is of length no greater than $2^{i-1}$. (I.e. if $w$ is the yield of a parse tree and $i$ is the length of the longest path in it, then $|w| \le 2^{i-1}$.)

Proof: Basis: for $i = 1$,

The tree then must be of the following form, corresponding to a production $S \to a$, where $a$ is a terminal.

[Figure: parse tree with root $S$ and a single leaf $a$.]