Operations on Languages and Regular Expressions: Theory of Computation CS 373 - Prof. Mahe, Study notes of Computer Science

These notes cover operations on languages, including union, concatenation, and Kleene closure, using examples and formal definitions. They also cover regular expressions as a formula for representing complex languages using these operations.

Typology: Study notes

Pre 2010

Uploaded on 03/16/2009

CS 373: Theory of Computation
Manoj Prabhakaran Mahesh Viswanathan
Fall 2008

1 Operations on Languages

Operations on Languages

  • Recall: A language is a set of strings
  • We can consider new languages derived from operations on given languages
    • e.g., $L_1 \cup L_2$, $L_1 \cap L_2$, $\frac{1}{2}L$, ...
  • A simple but powerful collection of operations:
    • Union, Concatenation and Kleene Closure

Union is a familiar operation on sets. We define and explain the other two operations below.

Concatenation of Languages

Definition 1. Given languages $L_1$ and $L_2$, we define their concatenation to be the language $L_1 \circ L_2 = \{xy \mid x \in L_1, y \in L_2\}$.

Example 2.

  • If $L_1 = \{\texttt{hello}\}$ and $L_2 = \{\texttt{world}\}$, then $L_1 \circ L_2 = \{\texttt{helloworld}\}$.
  • $L_1 = \{00, 10\}$; $L_2 = \{0, 1\}$. $L_1 \circ L_2 = \{000, 001, 100, 101\}$.
  • $L_1$ = set of strings ending in 0; $L_2$ = set of strings beginning with 01. $L_1 \circ L_2$ = set of strings containing 001 as a substring.
  • $L \circ \{\epsilon\} = L$. $L \circ \emptyset = \emptyset$.
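For finite languages, the definition of concatenation can be tried out directly. The following is a minimal Python sketch, assuming languages are represented as sets of strings and ε as the empty string `""`; the names `concat` and `EPSILON` are illustrative and not part of the notes.

```python
EPSILON = ""  # the empty string plays the role of ε (an assumption of this sketch)

def concat(l1, l2):
    """Return L1 ◦ L2 = {xy | x in L1, y in L2} for finite languages."""
    return {x + y for x in l1 for y in l2}

# Second item of Example 2.
print(sorted(concat({"00", "10"}, {"0", "1"})))  # ['000', '001', '100', '101']

# Concatenating with {ε} leaves L unchanged; concatenating with ∅ gives ∅.
L = {"0", "01"}
print(concat(L, {EPSILON}) == L)   # True
print(concat(L, set()) == set())   # True
```

The last two lines illustrate why $\{\epsilon\}$ and $\emptyset$ behave differently: there is exactly one way to pick a string from $\{\epsilon\}$, but no way to pick one from $\emptyset$.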

Kleene Closure

Definition 3.

\[
L^n = \begin{cases} \{\epsilon\} & \text{if } n = 0 \\ L^{n-1} \circ L & \text{otherwise} \end{cases}
\qquad\qquad
L^* = \bigcup_{i \ge 0} L^i
\]

i.e., $L^i$ is $L \circ L \circ \cdots \circ L$ (concatenation of $i$ copies of $L$), for $i > 0$. $L^*$, the Kleene closure of $L$, is the set of strings formed by taking any number of strings (possibly none) from $L$, possibly with repetitions, and concatenating all of them.

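  • If $L = \{0, 1\}$, then $L^0 = \{\epsilon\}$ and $L^2 = \{00, 01, 10, 11\}$. $L^*$ = set of all binary strings (including $\epsilon$).
  • $\emptyset^0 = \{\epsilon\}$. For $i > 0$, $\emptyset^i = \emptyset$. Hence $\emptyset^* = \{\epsilon\}$.
  • $\emptyset$ is one of only two languages whose Kleene closure is finite. Which is the other? $\{\epsilon\}$, since $\{\epsilon\}^* = \{\epsilon\}$.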
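The recursive definition of $L^n$ translates almost line by line into code. The sketch below is an illustration only: it assumes finite languages stored as Python sets, and because $L^*$ is usually infinite it enumerates only the strings of $L^*$ up to a chosen length. The helper names `power` and `star_up_to` are hypothetical.

```python
def concat(l1, l2):
    """L1 ◦ L2 for finite languages given as sets of strings."""
    return {x + y for x in l1 for y in l2}

def power(lang, n):
    """L^n: {ε} when n == 0, otherwise L^(n-1) ◦ L."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

def star_up_to(lang, max_len):
    """All strings of L* whose length is at most max_len."""
    result = {""}            # L^0 always contributes ε
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from L, keep only new short ones.
        frontier = {w for w in concat(frontier, lang) if len(w) <= max_len} - result
        result |= frontier
    return result

L = {"0", "1"}
print(power(L, 0))                 # {''}
print(sorted(power(L, 2)))         # ['00', '01', '10', '11']
print(power(set(), 1))             # set()
print(star_up_to(set(), 5))        # {''}
print(len(star_up_to(L, 3)))       # 15
```

The last line counts all binary strings of length at most 3 (1 + 2 + 4 + 8), matching the claim that $\{0,1\}^*$ is the set of all binary strings.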

R                                          L(R)
$(0 \cup 1)^*$                             $(\{0\} \cup \{1\})^* = \{0, 1\}^*$
$0^* \cup (0^* 1 0^* 1 0^* 1 0^*)^*$       Strings where the number of 1s is divisible by 3
$(0 \cup 1)^* 001 (0 \cup 1)^*$            Strings that have 001 as a substring

More Examples

R                                                      L(R)
$(10)^* \cup (01)^* \cup 0(10)^* \cup 1(01)^*$         Strings that consist of alternating 0s and 1s
$(\epsilon \cup 1)(01)^*(\epsilon \cup 0)$             Strings that consist of alternating 0s and 1s
$(0 \cup \epsilon)(1 \cup 10)^*$                       Strings that do not have two consecutive 0s
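One way to gain confidence in descriptions like these is to compare each expression against its English description on all short strings. The sketch below does this with Python's `re` module, under the assumption that $\cup$ is written as `|` and $\epsilon$ as an empty alternative; the list `CASES` and the helper names are illustrative. Agreement on strings up to length 8 is evidence, not a proof.

```python
import re
from itertools import product

def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

# (regex in Python syntax, predicate stating the English description)
CASES = [
    (r"0*|(0*10*10*10*)*",         lambda w: w.count("1") % 3 == 0),
    (r"(0|1)*001(0|1)*",           lambda w: "001" in w),
    (r"(10)*|(01)*|0(10)*|1(01)*", lambda w: "00" not in w and "11" not in w),
    (r"(|1)(01)*(|0)",             lambda w: "00" not in w and "11" not in w),
    (r"(0|)(1|10)*",               lambda w: "00" not in w),
]

def agrees(pattern, predicate, max_len=8):
    """True if the regex and the predicate accept exactly the same short strings."""
    return all(
        (re.fullmatch(pattern, w) is not None) == predicate(w)
        for w in binary_strings(max_len)
    )

for pattern, predicate in CASES:
    print(pattern, agrees(pattern, predicate))
```

Here "alternating 0s and 1s" is read as "no two adjacent symbols are equal", which includes $\epsilon$ and the one-symbol strings.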

Some Regular Expression Identities

We say $R_1 = R_2$ if $L(R_1) = L(R_2)$.

  • Commutativity: $R_1 \cup R_2 = R_2 \cup R_1$ (but typically $R_1 \circ R_2 \neq R_2 \circ R_1$)
  • Associativity: $(R_1 \cup R_2) \cup R_3 = R_1 \cup (R_2 \cup R_3)$ and $(R_1 \circ R_2) \circ R_3 = R_1 \circ (R_2 \circ R_3)$
  • Distributivity: $R \circ (R_1 \cup R_2) = R \circ R_1 \cup R \circ R_2$ and $(R_1 \cup R_2) \circ R = R_1 \circ R \cup R_2 \circ R$
  • Concatenating with $\epsilon$: $R \circ \epsilon = \epsilon \circ R = R$
  • Concatenating with $\emptyset$: $R \circ \emptyset = \emptyset \circ R = \emptyset$
  • $R \cup \emptyset = R$. $R \cup \epsilon = R$ iff $\epsilon \in L(R)$
  • $(R^*)^* = R^*$
  • $\emptyset^* = \epsilon$
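Several of these identities can be spot-checked on small finite languages. The following sketch assumes each regular expression is replaced by the finite language it denotes, stored as a Python set; the sample languages `R`, `R1`, `R2` are arbitrary choices made for illustration. A check on one example cannot prove an identity, but it does give a concrete counterexample to commutativity of concatenation.

```python
def concat(l1, l2):
    """L1 ◦ L2 for finite languages given as sets of strings."""
    return {x + y for x in l1 for y in l2}

EPS, EMPTY = {""}, set()                      # L(ε) and L(∅)
R, R1, R2 = {"a", "ab"}, {"b"}, {"", "ba"}    # arbitrary sample languages

checks = {
    "union commutes": R1 | R2 == R2 | R1,
    "concatenation need not commute": concat(R1, R2) != concat(R2, R1),
    "left distributivity": concat(R, R1 | R2) == concat(R, R1) | concat(R, R2),
    "right distributivity": concat(R1 | R2, R) == concat(R1, R) | concat(R2, R),
    "epsilon is an identity": concat(R, EPS) == R == concat(EPS, R),
    "empty set annihilates": concat(R, EMPTY) == EMPTY == concat(EMPTY, R),
    "R ∪ ε = R when ε ∈ L(R)": R2 | EPS == R2,
    "R ∪ ε ≠ R when ε ∉ L(R)": R | EPS != R,
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

With these sample languages, $R_1 \circ R_2 = \{b, bba\}$ while $R_2 \circ R_1 = \{b, bab\}$, which is why the second check succeeds.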

Useful Notation

Definition 4. Define $R^+ = RR^*$. Thus, $R^* = R^+ \cup \epsilon$. In addition, $R^+ = R^*$ iff $\epsilon \in L(R)$.
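The two claims in Definition 4 follow from the definition of Kleene closure. A short derivation, written in terms of the languages involved and using $L(R) \circ L(R)^i = L(R)^{i+1}$ (which holds by associativity of concatenation), might read:

```latex
\begin{align*}
L(R^+) &= L(R) \circ L(R)^*
        = L(R) \circ \bigcup_{i \ge 0} L(R)^i
        = \bigcup_{i \ge 1} L(R)^i, \\
L(R^*) &= \bigcup_{i \ge 0} L(R)^i
        = L(R)^0 \cup \bigcup_{i \ge 1} L(R)^i
        = \{\epsilon\} \cup L(R^+).
\end{align*}
```

For the "iff": if $\epsilon \in L(R)$ then $\epsilon \in L(R)^1 \subseteq L(R^+)$, so adding $\epsilon$ changes nothing. Conversely, if $\epsilon \in L(R^+)$ then $\epsilon = x_1 x_2 \cdots x_i$ for some $i \ge 1$ with every $x_j \in L(R)$, which forces each $x_j = \epsilon$ and hence $\epsilon \in L(R)$.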

2.2 Regular Expressions and Regular Languages

Regular Expressions and Regular Languages

Why do they have such similar names?

Theorem 5. L is a regular language if and only if there is a regular expression R such that L(R) = L

i.e., Regular expressions have the same “expressive power” as finite automata.

Proof.

  • Given a regular expression $R$, we will construct an NFA $N$ such that $L(N) = L(R)$.
  • Given a DFA $M$, we will construct a regular expression $R$ such that $L(M) = L(R)$.

2.3 Regular Expressions to NFA

Regular Expressions to Finite Automata

... to Non-deterministic Finite Automata

Lemma 6. For any regex $R$, there is an NFA $N_R$ s.t. $L(N_R) = L(R)$.

Proof Idea. We will build the NFA $N_R$ for $R$ inductively, based on the number of operators in $R$, $\#(R)$.

  • Base Case: $\#(R) = 0$ means that $R$ is $\emptyset$, $\epsilon$, or $a$ (for some $a \in \Sigma$). We will build NFAs for these cases.
  • Induction Hypothesis: Assume that for regular expressions $R$ with $\#(R) \le n$, there is an NFA $N_R$ s.t. $L(N_R) = L(R)$.
  • Induction Step: Consider $R$ with $\#(R) = n + 1$. Based on the form of $R$, the NFA $N_R$ will be built using the induction hypothesis.

Regular Expression to NFA

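Base Cases. If $R$ is an elementary regular expression, the NFA $N_R$ is constructed as follows.

  • $R = \emptyset$: a single start state $q_0$ that is not accepting.
  • $R = \epsilon$: a single start state $q_0$ that is accepting.
  • $R = a$: a start state $q_0$ with an $a$-transition to an accepting state $q_1$.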
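The base cases are small enough to write down and run. The sketch below is one possible encoding and not the notes' formal definition: an NFA is a triple `(start, accepting, delta)`, `delta` maps `(state, symbol)` to a set of states, and the symbol `""` stands for an ε-transition. The choice of accepting states follows the standard construction and is an assumption here, since the diagrams are not available; all names (`accepts`, `nfa_symbol`, and so on) are illustrative.

```python
def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of currently possible states."""
    start, accepting, delta = nfa
    current = eps_closure({start}, delta)
    for a in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, a), set())
        current = eps_closure(moved, delta)
    return bool(current & accepting)

NFA_EMPTY = ("q0", set(), {})      # R = ∅: q0 is not accepting
NFA_EPS = ("q0", {"q0"}, {})       # R = ε: q0 is accepting

def nfa_symbol(a):
    """R = a: q0 --a--> q1, with q1 accepting."""
    return ("q0", {"q1"}, {("q0", a): {"q1"}})

print(accepts(NFA_EMPTY, ""))           # False
print(accepts(NFA_EPS, ""))             # True
print(accepts(nfa_symbol("a"), "a"))    # True
print(accepts(nfa_symbol("a"), "aa"))   # False
```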

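($\Leftarrow$) Suppose $w \in L(N_1) \cup L(N_2)$. Consider $w \in L(N_1)$; the case of $w \in L(N_2)$ is similar. Then $q_1 \xrightarrow{w}_{N_1} q$ for some $q \in F_1$. Thus $q_0 \xrightarrow{\epsilon}_{N} q_1 \xrightarrow{w}_{N} q$, and $q \in F$. This means that $w \in L(N)$.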

Induction Step: Concatenation

Case $R = R_1 \circ R_2$

  • By induction hypothesis, there are $N_1$, $N_2$ s.t. $L(N_1) = L(R_1)$ and $L(N_2) = L(R_2)$
  • Build NFA $N$ s.t. $L(N) = L(N_1) \circ L(N_2)$

Figure 3: NFA for $L(N_1) \circ L(N_2)$

Formal definition and proof of correctness left as exercise.
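As a starting point for the exercise, here is a sketch of the usual concatenation construction, which is assumed (not confirmed by the text, since the figure is unavailable) to be the one intended: keep the start state of $N_1$, add an ε-transition from every accepting state of $N_1$ to the start state of $N_2$, and accept only in the accepting states of $N_2$. The encoding of an NFA as `(start, accepting, delta)` with `""` for ε, and the names `tag` and `concat_nfa`, are choices made for this illustration.

```python
def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of currently possible states."""
    start, accepting, delta = nfa
    current = eps_closure({start}, delta)
    for a in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, a), set())
        current = eps_closure(moved, delta)
    return bool(current & accepting)

def tag(nfa, label):
    """Rename every state q to (label, q) so two NFAs have disjoint state sets."""
    start, accepting, delta = nfa
    return (
        (label, start),
        {(label, q) for q in accepting},
        {((label, q), a): {(label, r) for r in rs} for (q, a), rs in delta.items()},
    )

def concat_nfa(n1, n2):
    """NFA for L(N1) ◦ L(N2): ε-edges from N1's accepting states to N2's start."""
    s1, f1, d1 = tag(n1, 1)
    s2, f2, d2 = tag(n2, 2)
    delta = {**d1, **d2}
    for q in f1:
        delta[(q, "")] = delta.get((q, ""), set()) | {s2}
    return (s1, f2, delta)

A = ("q0", {"q1"}, {("q0", "a"): {"q1"}})   # accepts exactly "a"
B = ("q0", {"q1"}, {("q0", "b"): {"q1"}})   # accepts exactly "b"
AB = concat_nfa(A, B)
print([w for w in ["", "a", "b", "ab", "ba", "abb"] if accepts(AB, w)])  # ['ab']
```

Tagging the states matters: both `A` and `B` use the names `q0` and `q1`, and without renaming their transitions would be merged into one machine.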

Induction Step: Kleene Closure, First Attempt

Case $R = R_1^*$

  • By induction hypothesis, there is $N_1$ s.t. $L(N_1) = L(R_1)$
  • Build NFA $N$ s.t. $L(N) = (L(N_1))^*$

Figure 4: NFA accepts $(L(N_1))^+$

Problem: May not accept $\epsilon$! One can show that $L(N) = (L(N_1))^+$.

Induction Step: Kleene Closure, Second Attempt

Case $R = R_1^*$

  • By induction hypothesis, there is $N_1$ s.t. $L(N_1) = L(R_1)$
  • Build NFA $N$ s.t. $L(N) = (L(N_1))^*$

Figure 5: NFA accepts $\supseteq (L(N_1))^*$

Problem: May accept strings that are not in (L(N 1 ))∗!

Example demonstrating the problem

Figure 6: Example NFA $N$

Figure 7: Incorrect Kleene Closure of $N$

$L(N) = (0 \cup 1)^* 1 (0 \cup 1)^*$. Thus, $(L(N))^* = \epsilon \cup (0 \cup 1)^* 1 (0 \cup 1)^*$. The previous construction gives an NFA that accepts $0 \notin (L(N))^*$!

Induction Step: Kleene Closure, Correct Construction

Case $R = R_1^*$

  • First build $N_1$ s.t. $L(N_1) = L(R_1)$
  • Given $N_1$, build NFA $N$ s.t. $L(N) = (L(N_1))^*$
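The difference between a flawed and a working Kleene closure construction can be seen by running both on the example language $(0 \cup 1)^* 1 (0 \cup 1)^*$. The sketch below rests on several assumptions, because the figures are not available: the flawed version is taken to be "make the old start state accepting and add ε-edges from accepting states back to it", the working version is the textbook one that adds a fresh accepting start state, and the two-state example NFA is a guess at the shape of the example. An NFA is encoded as `(start, accepting, delta)` with `""` for ε; all names are illustrative.

```python
def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of currently possible states."""
    start, accepting, delta = nfa
    current = eps_closure({start}, delta)
    for a in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, a), set())
        current = eps_closure(moved, delta)
    return bool(current & accepting)

def add_eps(delta, src, dst):
    """Add an ε-transition src -> dst without mutating existing target sets."""
    delta[(src, "")] = delta.get((src, ""), set()) | {dst}

def star_flawed(nfa):
    """Assumed flawed attempt: reuse the old start state as an accepting state."""
    start, accepting, delta = nfa
    delta = dict(delta)
    for q in accepting:
        add_eps(delta, q, start)
    return (start, accepting | {start}, delta)

def star_correct(nfa, new_start="new_q0"):
    """Textbook construction: a fresh accepting start state with an ε-edge to the old start."""
    start, accepting, delta = nfa
    delta = dict(delta)
    add_eps(delta, new_start, start)
    for q in accepting:
        add_eps(delta, q, start)
    return (new_start, accepting | {new_start}, delta)

# Assumed example NFA with L(N) = (0 ∪ 1)*1(0 ∪ 1)*: strings containing at least one 1.
N = ("q0", {"q1"}, {
    ("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1"},
})

print(accepts(star_flawed(N), "0"))    # True: wrongly accepted, since "0" has no 1
print(accepts(star_correct(N), "0"))   # False
print(accepts(star_correct(N), ""))    # True: ε is in the Kleene closure
print(accepts(star_correct(N), "010")) # True
```

The flawed version fails because the old start state has a self-loop: once it is accepting, any string that merely loops there is accepted. A fresh start state has no incoming transitions on input symbols, so it only accounts for $\epsilon$.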

Today

  • Defined Regular Expressions
    • Syntax: what a regex is built out of, namely $\emptyset$, $\epsilon$, characters in $\Sigma$, and the operators $\cup$, $\circ$, $^*$.
    • Semantics: what language a regex stands for.
  • Expressive power of regular expressions: can express (any and only) regular languages
    • Today: Languages represented by regular expressions are regular (we showed how to build NFAs for them).
    • Coming up: Regular languages can be represented by regular expressions (by building regex for any given DFA).