
**Alphabet**
An **alphabet** is any finite set of symbols.
∑ = {a, b, c, d} is an **alphabet set** where ‘a’, ‘b’, ‘c’, and ‘d’ are **symbols**.

**String**
A **string** is a finite sequence of symbols taken from ∑.
‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d}.

**Language**
A language is a subset of ∑* for some alphabet ∑. It can be finite or infinite.
If the language takes all possible strings of length 2 over ∑ = {a, b}, then L = {aa, ab, ba, bb}.
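The length-2 language above can be enumerated mechanically. The following is a minimal sketch; the helper name `strings_of_length` is an assumption made for illustration.

```python
from itertools import product

def strings_of_length(alphabet, n):
    """Return every string of exactly n symbols over the given alphabet."""
    # product(..., repeat=n) yields all n-tuples of symbols in sorted order.
    return ["".join(p) for p in product(sorted(alphabet), repeat=n)]

print(strings_of_length({"a", "b"}, 2))  # ['aa', 'ab', 'ba', 'bb']
```

An alphabet with k symbols has k^n strings of length n, so the alphabet {a, b, c, d} has 64 strings of length 3.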

**Grammar**
A grammar G can be formally written as a 4-tuple (V, T, S, P) where −

• V is a set of variables or non-terminal symbols.
• T or ∑ is a set of terminal symbols.
• S is a special variable called the start symbol, S ∈ V.
• P is the set of production rules for terminals and non-terminals. A production rule has the form α → β, where α and β are strings over (V ∪ ∑)* and at least one symbol of α belongs to V.

**Chomsky Hierarchy**
According to the Chomsky hierarchy, grammars are divided into 4 types:

• Type 0, known as unrestricted grammar. Type-0 grammars include all formal grammars. Type-0 languages are recognized by a **Turing machine**. These languages are also known as the **recursively enumerable languages**.

• Type 1, known as context-sensitive grammar. Type-1 grammars generate the **context-sensitive languages**. These languages are recognized by a **linear bounded automaton**.

• Type 2, known as context-free grammar. Type-2 grammars generate the **context-free languages**. These languages are recognized by a **nondeterministic pushdown automaton**.

• Type 3, known as regular grammar. Type-3 grammars generate the **regular languages**. These languages are exactly the languages that can be decided by **a finite state automaton**.

**Different Types of Automata in Language Theory**
**Automata** are abstract models of machines that perform computations on an input by moving through a series of states or configurations. At each state of the computation, a transition function determines the next configuration on the basis of a finite portion of the present configuration. As a result, once the computation reaches an accepting configuration, it accepts that input.

Characteristics of such machines include:
• **Inputs:** assumed to be sequences of symbols selected from a finite set ∑ (the input alphabet).
• **Outputs:** sequences of symbols selected from a finite set (the output alphabet).
• **States:** a finite set *Q*, whose definition depends on the type of automaton.

There are **four major families of automata**:
• Finite-state machine
• Pushdown automata
• Linear-bounded automata
• Turing machine

**Finite Automata and Regular (Type 3) Grammar**
A finite automaton (FA) is the simplest machine for recognizing patterns.

A finite automaton can be represented by a 5-tuple (Q, ∑, δ, q0, F), where −

• Q : finite set of states.
• ∑ : set of input symbols.
• δ : transition function.
• q0 : initial state.
• F : set of final states.

**1) Deterministic Finite Automata (DFA)**

In a deterministic FA, there is exactly one move from every state on every input symbol. The transition function is defined as **δ : Q × ∑ → Q**.

For example, a two-state DFA with ∑ = {0, 1} accepts all strings ending with 0: it moves to its final state whenever it reads 0 and back to its initial state whenever it reads 1.
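A DFA for the strings ending with 0 can be simulated with a plain dictionary as the transition function. This is a minimal sketch; the state names `q0` and `q1` and the helper `dfa_accepts` are assumptions made for illustration.

```python
def dfa_accepts(delta, start, finals, word):
    """Run a DFA: follow exactly one transition per input symbol."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

# Hypothetical encoding: q1 means "the last symbol read was 0".
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

print(dfa_accepts(DELTA, "q0", {"q1"}, "1010"))  # True
print(dfa_accepts(DELTA, "q0", {"q1"}, "101"))   # False
```

Because δ is a total function Q × ∑ → Q, the dictionary has one entry for every (state, symbol) pair.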

**2) Nondeterministic Finite Automata (NFA)**
An NFA is similar to a DFA except for the following additional features:
1. Null (or ε) moves are allowed, i.e., it can move forward without reading symbols.
2. It can transition to any number of states for a particular input.
However, these features don’t add any power to the NFA. In terms of power, both are equivalent.

For example, an NFA for the same problem (strings over {0, 1} ending with 0) stays in its initial state on both 0 and 1, and can additionally move to its final state on 0.

**Some Important Points:**
1. Every DFA is an NFA, but not vice versa.
2. Both NFA and DFA have the same power, and each NFA can be translated into a DFA.
3. There can be multiple final states in both DFA and NFA.
4. NFA is more of a theoretical concept.
5. DFA is used in lexical analysis in compilers.
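The translation from NFA to DFA mentioned in point 2 is usually done with the subset construction. The sketch below is one way to write it; the dictionary encoding of the NFA, the empty string as the ε label, and the state names `p` and `q` are assumptions made for illustration.

```python
from collections import deque

EPS = ""  # label used in this sketch for ε-moves

def eps_closure(nfa, states):
    """All states reachable from `states` using only ε-moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def nfa_to_dfa(nfa, start, finals, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    d_start = eps_closure(nfa, {start})
    delta, seen, queue = {}, {d_start}, deque([d_start])
    while queue:
        current = queue.popleft()
        for a in alphabet:
            moved = set()
            for s in current:
                moved |= set(nfa.get((s, a), ()))
            target = eps_closure(nfa, moved)
            delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is final if it contains at least one final NFA state.
    d_finals = {S for S in seen if S & set(finals)}
    return delta, d_start, d_finals

def run(delta, start, finals, word):
    state = start
    for a in word:
        state = delta[(state, a)]
    return state in finals

# Hypothetical NFA for strings ending with 0: p loops on 0 and 1, and guesses the last 0.
NFA = {("p", "0"): {"p", "q"}, ("p", "1"): {"p"}}
D_DELTA, D_START, D_FINALS = nfa_to_dfa(NFA, "p", {"q"}, "01")
print(run(D_DELTA, D_START, D_FINALS, "110"))  # True
print(run(D_DELTA, D_START, D_FINALS, "01"))   # False
```

For this NFA only the subsets {p} and {p, q} are reachable, so the resulting DFA has two states.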

**Regular Expression**
A regular expression can be *defined by the following rules*:

1. Every letter of the alphabet ∑ is a regular expression.
2. The null string ε and the empty set Φ are regular expressions.
3. If r1 and r2 are regular expressions, then
   (i) r1, r2
   (ii) r1r2 (concatenation of r1 and r2)
   (iii) r1 + r2 (union of r1 and r2)
   (iv) r1*, r2* (Kleene closure of r1 and r2)
   are also regular expressions.
4. If a string can be derived from rules 1, 2 and 3, then it is also a regular expression.

**Regular Grammar**

A grammar is regular if it has rules of the form **A -> a** or **A -> aB** or **A -> ɛ**, where ɛ is a special symbol called NULL.

**Regular Languages**

A language is regular if it can be expressed in terms of a regular expression.

**Closure Properties of Regular Languages**

• **Union:** If L1 and L2 are two regular languages, their union L1 ∪ L2 will also be regular.

• **Intersection:** If L1 and L2 are two regular languages, their intersection L1 ∩ L2 will also be regular.

• **Concatenation:** If L1 and L2 are two regular languages, their concatenation L1.L2 will also be regular.

• **Kleene Closure:** If L1 is a regular language, its Kleene closure L1* will also be regular.
• **Complement:** If L(G) is a regular language, its complement L’(G) will also be regular.

The complement of a language can be found by subtracting the strings which are in L(G) from all possible strings.

• For example, if L(G) = {a^n | n > 3} over ∑ = {a}, then L’(G) = {a^n | n ≤ 3}.
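One way to see why the complement of a regular language is regular: take a DFA for the language and swap final and non-final states. The sketch below does this for {a^n | n > 3}; the five-state DFA and the helper names are assumptions made for illustration.

```python
def accepts(delta, start, finals, word):
    """Run a DFA and report whether it ends in a final state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

def complement_finals(states, finals):
    """Final states of the complement DFA: every state that was not final."""
    return set(states) - set(finals)

# Hypothetical DFA: state i counts the a's read so far, capped at 4 ("more than 3").
STATES = range(5)
DELTA = {(i, "a"): min(i + 1, 4) for i in STATES}
FINALS = {4}
CO_FINALS = complement_finals(STATES, FINALS)

print(accepts(DELTA, 0, FINALS, "aaaa"))     # True:  n = 4 > 3
print(accepts(DELTA, 0, CO_FINALS, "aaaa"))  # False
print(accepts(DELTA, 0, CO_FINALS, "aa"))    # True:  n = 2 <= 3
```

The trick relies on the DFA being deterministic and total; it does not work directly on an NFA.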

**Construction of a Finite Automaton from a Regular Expression**
We can use **Thompson's Construction** to obtain a finite automaton from a regular expression. We reduce the regular expression into its smallest sub-expressions, convert these to NFAs, and finally convert the result to a DFA.

**Method**
**Step 1** Construct an NFA with null moves from the given regular expression.
**Step 2** Remove the null transitions from the NFA and convert it into its equivalent DFA.

Some basic regular expressions are the following −
**Case 1** − For a regular expression ‘a’, we can construct the following FA: q0 −a→ qf, where qf is the final state.

**Case 2** − For a regular expression ‘ab’, we can construct the following FA: q0 −a→ q1 −b→ qf.

**Case 3** − For a regular expression (a+b), we can construct the following FA: q0 −a→ qf and q0 −b→ qf.

**Case 4** − For a regular expression (a+b)*, we can construct the following FA: a single state that is both initial and final, with a self-loop on a and on b.

**Pumping Lemma For Regular Grammars**

**Theorem**
Let L be a regular language. Then there exists a constant **c** such that every string **w** in **L** with **|w| ≥ c** can be broken into three strings, **w = xyz**, such that −
• |y| > 0
• |xy| ≤ c
• For all k ≥ 0, the string xy^k z is also in L.

**Example**

Prove that **L = {a^i b^i | i ≥ 0}** is not regular.
**Solution** −

• At first, we assume that **L** is regular and n is the number of states of an automaton accepting it.
• Let w = a^n b^n. Thus |w| = 2n ≥ n.
• By the pumping lemma, let w = xyz, where |xy| ≤ n.
• Since |xy| ≤ n, both x and y consist only of a's. Let x = a^p, y = a^q, and z = a^r b^n, where p + q + r = n and q ≠ 0. Thus |y| ≠ 0.
• Let k = 2. Then xy^2z = a^p a^2q a^r b^n.
• Number of a's = (p + 2q + r) = (p + q + r) + q = n + q.
• Hence, xy^2z = a^(n+q) b^n. Since q ≠ 0, xy^2z is not of the form a^n b^n.
• Thus, xy^2z is not in L. Hence L is not regular.
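The argument can be checked by brute force for small n: try every split w = xyz with |xy| ≤ n and |y| > 0 and see whether pumping keeps the string in L. This is only an illustrative sketch, not a proof; the helper names `in_L` and `pumpable_splits` are assumptions, and only k = 0..3 is tested (a failure for any single k already rules a split out).

```python
def in_L(s):
    """Membership test for L = { a^i b^i | i >= 0 }."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * half + "b" * half

def pumpable_splits(w, n):
    """Splits w = xyz with |xy| <= n and |y| > 0 that survive pumping for k = 0..3."""
    good = []
    for i in range(0, n + 1):          # i = |x|
        for j in range(i + 1, n + 1):  # j = |xy|, so |y| = j - i > 0
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_L(x + y * k + z) for k in range(4)):
                good.append((x, y, z))
    return good

for n in range(1, 6):
    w = "a" * n + "b" * n
    print(n, pumpable_splits(w, n))  # every line ends with []
```

No split survives, which matches the contradiction derived in the solution.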

**Moore and Mealy Machines**
**Mealy Machine**
A Mealy machine is an FSM whose output depends on the present state as well as the present input.
It can be described by a 6-tuple (Q, ∑, O, δ, X, q0) where −

• **Q** is a finite set of states.
• **∑** is a finite set of symbols called the input alphabet.
• **O** is a finite set of symbols called the output alphabet.
• **δ** is the input transition function where δ: Q × ∑ → Q
• **X** is the output transition function where X: Q × ∑ → O
• **q0** is the initial state from where any input is processed (q0 ∈ Q).
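A Mealy machine can be simulated by looking up the output on each (state, input) pair before moving. The machine below is a hypothetical example (it outputs 1 whenever the current bit differs from the previously remembered bit); the state names and the helper `run_mealy` are assumptions made for illustration.

```python
def run_mealy(delta, out, q0, word):
    """Emit X(state, symbol) for each input symbol, then follow δ."""
    state, produced = q0, []
    for a in word:
        produced.append(out[(state, a)])
        state = delta[(state, a)]
    return "".join(produced)

# Hypothetical machine: s0 remembers "last bit was 0" (also the start), s1 "last bit was 1".
DELTA = {("s0", "0"): "s0", ("s0", "1"): "s1",
         ("s1", "0"): "s0", ("s1", "1"): "s1"}
OUT = {("s0", "0"): "0", ("s0", "1"): "1",
       ("s1", "0"): "1", ("s1", "1"): "0"}

print(run_mealy(DELTA, OUT, "s0", "0110"))  # 0101
```

The output has exactly one symbol per input symbol, because X is defined on Q × ∑.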

**Moore Machine**

A Moore machine is an FSM whose outputs depend only on the present state. A Moore machine can be described by a 6-tuple (Q, ∑, O, δ, X, q0) where −

• **Q** is a finite set of states.
• **∑** is a finite set of symbols called the input alphabet.
• **O** is a finite set of symbols called the output alphabet.
• **δ** is the input transition function where δ: Q × ∑ → Q
• **X** is the output transition function where X: Q → O
• **q0** is the initial state from where any input is processed (q0 ∈ Q).
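A Moore machine can be simulated the same way, except that the output is looked up on the state alone. The parity machine below is a hypothetical example, and the sketch assumes the common convention that the output of the initial state is emitted before any input is read.

```python
def run_moore(delta, out, q0, word):
    """Emit X(state) for the initial state and after every transition."""
    state = q0
    produced = [out[state]]
    for a in word:
        state = delta[(state, a)]
        produced.append(out[state])
    return "".join(produced)

# Hypothetical machine: the state output is the parity of the 1s read so far.
DELTA = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
OUT = {"even": "0", "odd": "1"}

print(run_moore(DELTA, OUT, "even", "1101"))  # 01001
```

Under this convention the output is one symbol longer than the input.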

**Moore Machine to Mealy Machine**

**Step 1** − Take a blank Mealy Machine transition table format.
**Step 2** − Copy all the Moore Machine transition states into this table format.
**Step 3** − Check the present states and their corresponding outputs in the Moore Machine state
table; if for a state Qi output is m, copy it into the output columns of the Mealy Machine state
table wherever Qi appears in the next state.
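
The three steps above amount to labelling each transition with the output of the state it leads to. A minimal sketch, assuming dictionary-encoded machines and a hypothetical parity Moore machine:

```python
def moore_to_mealy(delta, moore_out):
    """Attach to each transition the Moore output of its next state."""
    mealy_out = {(q, a): moore_out[nxt] for (q, a), nxt in delta.items()}
    return dict(delta), mealy_out

def run_mealy(delta, out, q0, word):
    state, produced = q0, []
    for a in word:
        produced.append(out[(state, a)])
        state = delta[(state, a)]
    return "".join(produced)

# Hypothetical Moore machine: the state output is the parity of the 1s read so far.
DELTA = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
MOORE_OUT = {"even": "0", "odd": "1"}

M_DELTA, MEALY_OUT = moore_to_mealy(DELTA, MOORE_OUT)
print(run_mealy(M_DELTA, MEALY_OUT, "even", "1101"))  # 1001
```

The Mealy machine keeps the same states and transitions; only the output moves from states to transitions, so the initial-state output of the Moore machine is no longer emitted.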

**Mealy Machine to Moore Machine**

**Step 1** − Calculate the number of different outputs for each state (Qi) that are available in the state table of the Mealy machine.
**Step 2** − If all the outputs of Qi are the same, copy state Qi. If it has n distinct outputs, break Qi into n states named Qin, where **n** = 0, 1, 2, ...
**Step 3** − If the output of the initial state is 1, insert a new initial state at the beginning which gives 0 output.
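
The state splitting in Step 2 can be sketched by using pairs (Mealy state, output on entry) as Moore states. This is a variant of the table method above, not a literal transcription: instead of inserting a new initial state as in Step 3, it assumes the start state is given an initial output chosen by the caller (`initial_output`, a hypothetical parameter), and it assumes the Mealy transition function is total.

```python
def mealy_to_moore(delta, out, q0, initial_output="0"):
    """Build a Moore machine whose states are (mealy_state, entry_output) pairs."""
    start = (q0, initial_output)
    m_delta, m_out = {}, {start: initial_output}
    alphabet = sorted({a for (_, a) in delta})
    todo = [start]
    while todo:
        state = todo.pop()
        q, _ = state
        for a in alphabet:
            # Entering delta(q, a) via this transition produces out(q, a).
            nxt = (delta[(q, a)], out[(q, a)])
            m_delta[(state, a)] = nxt
            if nxt not in m_out:
                m_out[nxt] = nxt[1]
                todo.append(nxt)
    return m_delta, m_out, start

def run_moore(delta, out, q0, word):
    state = q0
    produced = [out[state]]
    for a in word:
        state = delta[(state, a)]
        produced.append(out[state])
    return "".join(produced)

# Hypothetical Mealy machine: outputs 1 when the bit differs from the remembered one.
DELTA = {("s0", "0"): "s0", ("s0", "1"): "s1",
         ("s1", "0"): "s0", ("s1", "1"): "s1"}
OUT = {("s0", "0"): "0", ("s0", "1"): "1",
       ("s1", "0"): "1", ("s1", "1"): "0"}

M_DELTA, M_OUT, M_START = mealy_to_moore(DELTA, OUT, "s0")
print(run_moore(M_DELTA, M_OUT, M_START, "0110"))  # 00101
```

Each Mealy state here is entered with two distinct outputs, so it is split into two Moore states, giving four states in total.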

**Pushdown Automata and Context-Free (Type 2) Grammar**
Basically, a pushdown automaton is −
**"Finite state machine" + "a stack"**
A pushdown automaton has three components −

• an input tape,
• a control unit, and
• a stack with infinite size.

A PDA can be formally described as a 7-tuple (Q, ∑, S, δ, q0, I, F) −
• **Q** is the finite set of states
• **∑** is the input alphabet
• **S** is the set of stack symbols
• **δ** is the transition function: Q × (∑ ∪ {ε}) × S → finite subsets of Q × S*
• **q0** is the initial state (q0 ∈ Q)
• **I** is the initial stack top symbol (I ∈ S)
• **F** is a set of accepting states (F ⊆ Q)
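
A small nondeterministic PDA can be simulated by searching over configurations (state, input position, stack). The sketch below accepts by final state; the machine for {a^n b^n | n ≥ 0}, the stack symbol `A`, the state names, and the `max_steps` bound are all assumptions made for illustration.

```python
EPS = ""  # label used in this sketch for ε-moves

def pda_accepts(delta, q0, bottom, finals, word, max_steps=10_000):
    """Search the configurations of a PDA; the stack is a string with its top at index 0."""
    frontier = [(q0, 0, bottom)]
    seen = set(frontier)
    steps = 0
    while frontier and steps < max_steps:
        state, pos, stack = frontier.pop()
        steps += 1
        if pos == len(word) and state in finals:
            return True
        if not stack:
            continue  # no move is possible on an empty stack
        top, rest = stack[0], stack[1:]
        options = [(EPS, pos)]
        if pos < len(word):
            options.append((word[pos], pos + 1))
        for symbol, new_pos in options:
            for new_state, push in delta.get((state, symbol, top), ()):
                config = (new_state, new_pos, push + rest)
                if config not in seen:
                    seen.add(config)
                    frontier.append(config)
    return False

# Hypothetical PDA for { a^n b^n | n >= 0 }: push A per a, pop A per b.
DELTA = {
    ("q0", "a", "I"): {("q0", "AI")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q0", EPS, "I"): {("qf", "I")},
    ("q1", EPS, "I"): {("qf", "I")},
}

print(pda_accepts(DELTA, "q0", "I", {"qf"}, "aabb"))  # True
print(pda_accepts(DELTA, "q0", "I", {"qf"}, "aab"))   # False
```

Each entry of `DELTA` maps (state, input symbol or ε, stack top) to a set of (next state, string pushed in place of the top), which mirrors the shape of δ above.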

**Context-Free Grammar**
A context-free grammar (CFG) consisting of a finite set of grammar rules is a quadruple **(V, T, S, P)** where

• **V** is a set of non-terminal symbols.
• **T** is a set of terminals where **V ∩ T = ∅**.
• **P** is a set of rules, **P: V → (V ∪ T)***, i.e., the left-hand side of a production rule in **P** does not have any right context or left context.
• **S** is the start symbol.

**Example**
• The grammar ({A}, {a, b, c}, A, P), P: A → aA, A → abc.
• The grammar ({S}, {a, b}, S, P), P: S → aSa, S → bSb, S → ε
• The grammar ({S, F}, {0, 1}, S, P), P: S → 00S | 11F, F → 00F | ε
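
The strings generated by these example grammars can be listed by expanding the leftmost non-terminal repeatedly. This is a minimal sketch: the dictionary encoding, the single-character symbols, and the helper name `generate` are assumptions, and it only terminates for grammars (like the ones above) that cannot grow non-terminals indefinitely without adding terminals.

```python
from collections import deque

def generate(productions, start, max_len):
    """Terminal strings of length <= max_len derivable from `start`."""
    results, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Index of the leftmost non-terminal, or None for a terminal string.
        idx = next((i for i, s in enumerate(form) if s in productions), None)
        if idx is None:
            results.add(form)
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Prune: terminals already present can never disappear.
            terminals = sum(1 for s in new if s not in productions)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=lambda s: (len(s), s))

PALINDROMES = {"S": ["aSa", "bSb", ""]}         # S → aSa | bSb | ε
THIRD = {"S": ["00S", "11F"], "F": ["00F", ""]}  # S → 00S | 11F, F → 00F | ε

print(generate(PALINDROMES, "S", 4))  # ['', 'aa', 'bb', 'aaaa', 'abba', 'baab', 'bbbb']
print(generate(THIRD, "S", 4))        # ['11', '0011', '1100']
```

The second grammar generates the even-length palindromes over {a, b}; the third generates strings of the form (00)^i 11 (00)^j.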

**CFL Closure Property**
Context-free languages are **closed** under −

• Union
• Concatenation
• Kleene Star operation

Context-free languages are **not closed** under −

• **Intersection** − If L1 and L2 are context free languages, then L1 ∩ L2 is not necessarily
context free.

• **Complement** − If L1 is a context free language, then L1’ may not be context free.

**Intersection with Regular Language** − If L1 is a regular language and L2 is a context free
language, then L1 ∩ L2 is a context free language.

**Chomsky and Greibach Normal Form**

**Chomsky Normal Form**
A CFG is in Chomsky Normal Form if the productions are in the following forms −
• A → a
• A → BC
• S → ε
where A, B, and C are non-terminals, **a** is a terminal, and S is the start symbol.

Any context-free language is generated by a context-free grammar in Chomsky normal form.

**Greibach Normal Form**
A CFG is in Greibach Normal Form if the productions are in the following forms −

• A → b
• A → bD1…Dn
• S → ε

where A, D1, ..., Dn are non-terminals and b is a terminal.
Every CFL L, where ε ∉ L, can be generated by a CFG in Greibach normal form.

**Pumping Lemma for CFG**
If **L** is a context-free language, there is a pumping length **p** such that any string **w ∈ L** of length **≥ p** can be written as **w = uvxyz**, where **vy ≠ ε**, **|vxy| ≤ p**, and for all **i ≥ 0, u v^i x y^i z ∈ L**.

**Example**
Find out whether the language **L = {0^n 1^n 2^n | n ≥ 1}** is context free or not.
**Solution**
Assume **L** is context free. Then **L** must satisfy the pumping lemma.
At first, choose the pumping length **p** of the pumping lemma. Then, take w = 0^p 1^p 2^p.
Break **w** into **uvxyz**, where **|vxy| ≤ p and vy ≠ ε**.
Hence **vxy** cannot involve both 0s and 2s, since the last 0 and the first 2 are at least (p+1) positions apart. There are two cases −

**Case 1** − **vxy** has no 2s. Then **vy** has only 0s and 1s. Then **uxz** (the case i = 0), which would have to be in **L**, has **p** 2s, but fewer than **p** 0s or 1s.

**Case 2** − **vxy** has no 0s. Then **uxz** has **p** 0s, but fewer than **p** 1s or 2s.
In both cases a contradiction occurs.
Hence, **L** is not a context-free language.

**Turing Machine and Recursively Enumerable Language (Type 0)**
The Turing machine was invented in 1936 by Alan Turing.
A Turing Machine (TM) is a mathematical model which consists of an infinite-length tape divided into cells on which input is given.

A TM can be formally described as a 7-tuple (Q, X, ∑, δ, q0, B, F) where −
• Q is a finite set of states
• X is the tape alphabet
• ∑ is the input alphabet
• δ is a transition function; δ : Q × X → Q × X × {Left_shift, Right_shift}.
• q0 is the initial state
• B is the blank symbol
• F is the set of final states

Time complexity for all reasonable functions −
**T(n) = O(n log n)**

TM's space complexity −
**S(n) = O(n)**

**Example of Turing machine**
Turing machine M = (Q, X, ∑, δ, q0, B, F) with
• Q = {q0, q1, q2, qf}
• X = {a, b}
• ∑ = {1}
• q0 = initial state
• B = blank symbol
• F = {qf}
δ is given by −

| Tape alphabet symbol | Present State ‘q0’ | Present State ‘q1’ | Present State ‘q2’ |
|---|---|---|---|
| a | 1Rq1 | 1Lq0 | 1Lqf |
| b | 1Lq2 | 1Rq1 | 1Rqf |

Here the transition 1Rq1 implies that the write symbol is 1, the tape moves right, and the next state is q1. Similarly, the transition 1Lq2 implies that the write symbol is 1, the tape moves left, and the next state is q2.
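
The transition table can be run with a small simulator. This is a sketch under stated assumptions: R and L are interpreted as moving the read/write head right and left, the machine halts when no rule applies, the blank is written as `B`, and the starting head position `head` and the bound `max_steps` are hypothetical parameters added for illustration.

```python
def run_tm(delta, start, finals, tape, head=0, blank="B", max_steps=1000):
    """Run a single-tape TM; return (accepted, tape contents between the outermost used cells)."""
    cells = dict(enumerate(tape))
    state = start
    for _ in range(max_steps):
        if state in finals:
            break
        symbol = cells.get(head, blank)
        if (state, symbol) not in delta:
            break  # no applicable rule: halt without accepting
        write, move, state = delta[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    if not cells:
        return state in finals, ""
    lo, hi = min(cells), max(cells)
    return state in finals, "".join(cells.get(i, blank) for i in range(lo, hi + 1))

# The table above: (state, read symbol) -> (write symbol, move, next state).
DELTA = {
    ("q0", "a"): ("1", "R", "q1"), ("q0", "b"): ("1", "L", "q2"),
    ("q1", "a"): ("1", "L", "q0"), ("q1", "b"): ("1", "R", "q1"),
    ("q2", "a"): ("1", "L", "qf"), ("q2", "b"): ("1", "R", "qf"),
}

print(run_tm(DELTA, "q0", {"qf"}, "ab", head=1))  # (True, '11')
print(run_tm(DELTA, "q0", {"qf"}, "ab", head=0))  # (False, '11')
```

Starting on the b, the machine takes 1Lq2 and then 1Lqf and reaches the final state; starting on the a, it runs off the right end of the input in q1 and halts without accepting.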

**Recursively Enumerable (RE) or Type-0 Language**
RE languages or type-0 languages are generated by type-0 grammars. An RE language can be accepted or recognized by a Turing machine, which means it will enter a final state for the strings of the language and may or may not enter a rejecting state for the strings which are not part of the language. This means the TM can loop forever on strings which are not part of the language. RE languages are also called Turing-recognizable languages.

**Recursive Language (REC)**
A recursive language (subset of RE) can be decided by a Turing machine, which means it will enter a final state for the strings of the language and a rejecting state for the strings which are not part of the language. REC languages are also called Turing-decidable languages.

**Closure Properties of Recursive Languages**

• **Union:** If L1 and L2 are two recursive languages, their union L1 ∪ L2 will also be recursive, because if a TM halts for L1 and a TM halts for L2, a TM will also halt for L1 ∪ L2.

• **Concatenation:** If L1 and L2 are two recursive languages, their concatenation L1.L2 will also be recursive.

• **Kleene Closure:** If L1 is recursive, its Kleene closure L1* will also be recursive.
• **Intersection and complement:** If L1 and L2 are two recursive languages, their intersection L1 ∩ L2 will also be recursive. Similarly, the complement of a recursive language L1, which is ∑* - L1, will also be recursive.

*Recursively enumerable languages are not closed under set difference or complementation. The set difference L - P may or may not be recursively enumerable. If L is recursively enumerable, then the complement of L is recursively enumerable if and only if L is also recursive.*

**Linear Bound Automata and Context Sensitive Languages (Type 1)**
We can restrict the power of a Turing machine in the following ways:

1. If we use the TAPE as a STACK, then it will be a "PDA".
2. If we make the TAPE finite, then it will be a "Finite Automata".
3. If the TAPE size is equal to the input size, then it will be an "LBA".

An LBA is more powerful than a PDA.

**A context-sensitive language**
A context-sensitive language is a language over some alphabet generated by some grammar known as a context-sensitive grammar.

Formally, a context-sensitive grammar is a formal grammar G = (V, T, S, P) such that every production in P either

• has the form uXv → uwv, where X ∈ V, u, v, w ∈ (V ∪ T)* and w ≠ λ,
• or is S → λ, provided that the start symbol S does not occur on the right side of any production in P.

**Examples of Context Sensitive Languages**
1. L = {a^n b^n c^n | n ≥ 1}
2. L = {a^(n!) | n ≥ 0}
3. L = {a^n | n = m^2, m ≥ 1}, meaning n is a perfect square
4. L = {a^n | n is prime}
5. L = {a^n | n is not a prime}
6. L = {ww | w ∈ {a, b}+}
7. L = {w^n | w ∈ {a, b}+, n ≥ 1}
8. L = {w w w^R | w ∈ {a, b}+}