Understanding Context-Free Grammars: From Basics to Applications, Lecture notes of Programming Languages

These notes introduce context-free grammars (CFGs) and their relation to regular languages. They show how CFGs describe languages, using arithmetic expressions as a running example, and explain CFG notation, derivations, and the language of a grammar. They then show how regular expressions can be converted to CFGs and why every regular language is context-free. The notes end with a problem set and tips for designing CFGs.

Typology: Lecture notes

2021/2022

Uploaded on 05/11/2023

seshadrinathan_hin

Context-Free Grammars

Describing Languages

● We've seen two models for the regular languages:
  ● Finite automata accept precisely the strings in the language.
  ● Regular expressions describe precisely the strings in the language.
● Finite automata recognize strings in the language.
  ● Perform a computation to determine whether a specific string is in the language.
● Regular expressions match strings in the language.
  ● Describe the general shape of all strings in the language.

Arithmetic Expressions

● Suppose we want to describe all legal arithmetic expressions using addition, subtraction, multiplication, and division.
● Here is one possible CFG:

E → int
E → E Op E
E → (E)
Op → +
Op → -
Op → *
Op → /

E
⇒ E Op E
⇒ E Op (E)
⇒ E Op (E Op E)
⇒ E * (E Op E)
⇒ int * (E Op E)
⇒ int * (int Op E)
⇒ int * (int Op int)
⇒ int * (int + int)

The same grammar also produces shorter expressions:

E
⇒ E Op E
⇒ E Op int
⇒ int Op int
⇒ int / int

Some CFG Notation

● Capital letters in Bold Red Uppercase will represent nonterminals.
  ● e.g. A, B, C, D
● Lowercase letters in blue monospace will represent terminals.
  ● e.g. t, u, v, w
● Lowercase Greek letters in gray italics will represent arbitrary strings of terminals and nonterminals.
  ● e.g. α, γ, ω

A Notational Shorthand

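E → int
E → E Op E
E → (E)
Op → +
Op → -
Op → *
Op → /

Productions that share a left-hand side can be combined, with | separating the alternatives:

E → int | E Op E | (E)
Op → + | - | * | /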

Derivations

E
⇒ E Op E
⇒ E Op (E)
⇒ E Op (E Op E)
⇒ E * (E Op E)
⇒ int * (E Op E)
⇒ int * (int Op E)
⇒ int * (int Op int)
⇒ int * (int + int)

● A sequence of steps where nonterminals are replaced by the right-hand side of a production is called a derivation.
● If string α derives string ω, we write α ⇒* ω.
● In the example above, we see E ⇒* int * (int + int).

The grammar used:

E → E Op E | int | (E)
Op → + | * | - | /
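The derivation of int * (int + int) can be replayed mechanically. The following is a minimal sketch, not part of the original notes: the names `GRAMMAR`, `apply_step`, `replay`, and `STEPS` are illustrative assumptions, and each step names the position of the nonterminal being replaced together with the right-hand side used.

```python
# Grammar from the notes: each nonterminal maps to its alternative right-hand sides.
GRAMMAR = {
    "E": [["int"], ["E", "Op", "E"], ["(", "E", ")"]],
    "Op": [["+"], ["-"], ["*"], ["/"]],
}

def apply_step(form, index, rhs):
    """Replace the nonterminal at form[index] with rhs, if that is a production."""
    symbol = form[index]
    if rhs not in GRAMMAR.get(symbol, []):
        raise ValueError(f"no production {symbol} -> {' '.join(rhs)}")
    return form[:index] + rhs + form[index + 1:]

def replay(start, steps):
    """Return every sentential form of the derivation, starting from [start]."""
    forms = [[start]]
    for index, rhs in steps:
        forms.append(apply_step(forms[-1], index, rhs))
    return forms

# (position of the nonterminal to expand, right-hand side to substitute)
STEPS = [
    (0, ["E", "Op", "E"]),   # E => E Op E
    (2, ["(", "E", ")"]),    # => E Op ( E )
    (3, ["E", "Op", "E"]),   # => E Op ( E Op E )
    (1, ["*"]),              # => E * ( E Op E )
    (0, ["int"]),            # => int * ( E Op E )
    (3, ["int"]),            # => int * ( int Op E )
    (5, ["int"]),            # => int * ( int Op int )
    (4, ["+"]),              # => int * ( int + int )
]

for form in replay("E", STEPS):
    print(" ".join(form))
```

Because `apply_step` rejects any replacement that is not a production of the grammar, a successful replay confirms that each step of the derivation is legal.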

The Language of a Grammar

● If G is a CFG with alphabet Σ and start symbol S, then the language of G is the set

ℒ(G) = { ω ∈ Σ* | S ⇒* ω }

● That is, ℒ(G) is the set of strings derivable from the start symbol.
● Note: ω must be in Σ*, the set of strings made from terminals. Strings involving nonterminals aren't in the language.
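To make the definition concrete, the short strings of ℒ(G) can be listed by searching over derivations. This is a rough sketch under stated assumptions, not a general algorithm from the notes: the function name `language_up_to` and the `slack` parameter are hypothetical, lengths are counted in terminal symbols, and the cap on sentential-form length means strings could be missed for grammars where many nonterminals vanish via ε.

```python
from collections import deque

def language_up_to(grammar, start, max_len, slack=2):
    """Collect terminal strings with at most max_len symbols, by leftmost expansion."""
    found = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal, if there is one.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            found.add("".join(form))  # only terminals remain: a string in the language
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for s in new if s not in grammar)
            # Terminals never disappear, so too many of them is a dead end.
            # The slack bounds forms padded with nonterminals (an assumption).
            if terminals > max_len or len(new) > max_len + slack:
                continue
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return found

ARITH = {
    "E": [["int"], ["E", "Op", "E"], ["(", "E", ")"]],
    "Op": [["+"], ["-"], ["*"], ["/"]],
}

print(sorted(language_up_to(ARITH, "E", 3)))
```

Only forms with no nonterminals left are added to the result, which mirrors the requirement that ω be in Σ*.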

From Regexes to CFGs

● CFGs consist purely of production rules of the form A → ω. They do not have the regular expression operators * or ∪.
● However, we can convert regular expressions to CFGs as follows:

S → a*b

The star is handled by a new nonterminal A that generates any number of a's:

S → Ab
A → Aa | ε

A union gets one production per alternative. For example, the regular expression a(b ∪ c*) converts to:

S → aX
X → b | C
C → Cc | ε
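The conversion can be sketched as a recursive procedure over the structure of a regular expression. This is an illustrative sketch only: the tuple encoding of regexes and the names `regex_to_cfg`, `build`, and `N0, N1, …` are assumptions, and it creates one nonterminal per subexpression, so its output is more verbose than the hand-simplified grammars above.

```python
def regex_to_cfg(ast):
    """Return (start_symbol, grammar) for a regex given as nested tuples.

    Assumed node shapes: ("lit", c), ("cat", left, right),
    ("alt", left, right), ("star", inner).
    """
    grammar = {}

    def build(node):
        name = f"N{len(grammar)}"
        grammar[name] = []  # reserve the name before recursing into children
        kind = node[0]
        if kind == "lit":
            grammar[name] = [[node[1]]]                            # N → c
        elif kind == "cat":
            grammar[name] = [[build(node[1]), build(node[2])]]     # N → L R
        elif kind == "alt":
            grammar[name] = [[build(node[1])], [build(node[2])]]   # N → L | R
        elif kind == "star":
            grammar[name] = [[name, build(node[1])], []]           # N → N X | ε
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
        return name

    return build(ast), grammar

# a*b, the first example above
A_STAR_B = ("cat", ("star", ("lit", "a")), ("lit", "b"))

start, grammar = regex_to_cfg(A_STAR_B)
for head, bodies in grammar.items():
    print(head, "→", " | ".join(" ".join(body) or "ε" for body in bodies))
```

The star case uses the same left-recursive shape as A → Aa | ε, and the union case uses one alternative per branch, as in X → b | C.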

Regular Languages and CFLs

● Theorem: Every regular language is context-free.
● Proof Idea: Use the construction from the previous slides to convert a regular expression for L into a CFG for L. ■
● Problem Set Exercise: Instead, show how to convert a DFA/NFA into a CFG.

[Figure: nested sets showing Regular Languages ⊂ CFLs ⊂ All Languages]
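For the problem-set exercise, one standard approach (offered here as a sketch, not as the solution the notes intend) uses one nonterminal per state: each transition from q to r on symbol a becomes a production Q → aR, and each accepting state gets Q → ε. The function name `dfa_to_cfg` and the dictionary encoding of the automaton are assumptions, and state names are assumed to be distinct from alphabet symbols.

```python
def dfa_to_cfg(transitions, start, accepting):
    """Build a CFG with one nonterminal per automaton state.

    transitions maps (state, symbol) -> target state.
    """
    grammar = {}
    for (state, symbol), target in transitions.items():
        grammar.setdefault(state, []).append([symbol, target])  # Q → a R
        grammar.setdefault(target, [])
    grammar.setdefault(start, [])
    for state in accepting:
        grammar.setdefault(state, []).append([])                # Q → ε
    return start, grammar

# Automaton for a*b, with the dead state omitted for brevity (an assumption).
TRANSITIONS = {("q0", "a"): "q0", ("q0", "b"): "q1"}

start, grammar = dfa_to_cfg(TRANSITIONS, "q0", {"q1"})
for head, bodies in grammar.items():
    print(head, "→", " | ".join(" ".join(body) or "ε" for body in bodies))
```

A derivation in the resulting grammar traces a path through the automaton, and it can only finish by applying an ε-production at an accepting state.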

Why the Extra Power?

● Why do CFGs have more power than regular expressions?
● Intuition: Derivations of strings have unbounded “memory.”

S → aSb | ε

a a a a b b b b
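The grammar S → aSb | ε can be mirrored directly in code. In this minimal sketch (the function names `derivation` and `derives` are illustrative), each use of S → aSb adds one a and one matching b, so the number of pending b's is carried by the derivation itself rather than by a fixed set of states.

```python
def derivation(n):
    """Sentential forms when S → aSb is applied n times, then S → ε."""
    forms = ["a" * k + "S" + "b" * k for k in range(n + 1)]
    forms.append("a" * n + "b" * n)  # final step: S → ε
    return forms

def derives(s):
    """True if S ⇒* s for the grammar S → aSb | ε."""
    if s == "":
        return True                  # S → ε
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b":
        return derives(s[1:-1])      # S → aSb, then derive the middle
    return False

print(" ⇒ ".join(derivation(4)))
print(derives("aaaabbbb"), derives("aaabbbb"))
```

The recursion depth in `derives` grows with the input length, which is the unbounded memory the intuition refers to.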