Context-Free Grammars: Description, Natural Language Examples, and Programming Languages (slides, Theory of Automata, Docsity.com)

Context-free languages

Context-free grammar
• This is a different model for describing languages
• The language is specified by productions (substitution rules) that tell how strings can be obtained, e.g.
    A → 0A1
    A → B
    B → #
  A, B are variables; 0, 1, # are terminals; A is the start variable
• Using these rules, we can derive strings like this:
    A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111

Programming languages
• Context-free grammars are also used to describe (parts of) programming languages
• For instance, expressions like (2 + 3) * 5 or 3 + 8 + 2 * 7 can be described by the CFG
    <expr> → <expr> + <expr>
    <expr> → <expr> * <expr>
    <expr> → (<expr>)
    <expr> → 0
    <expr> → 1
    …
    <expr> → 9
  Variables: <expr>
  Terminals: +, *, (, ), 0, 1, …, 9

Motivation for studying CFGs
• Context-free grammars are essential for understanding the meaning of computer programs
• They are used in compilers
    code: (2 + 3) * 5
    meaning: “add 2 and 3, and then multiply by 5”

Definition of context-free grammar
• A context-free grammar (CFG) is a 4-tuple (V, T, P, S) where
  – V is a finite set of variables or non-terminals
  – T is a finite set of terminals (V ∩ T = ∅)
  – P is a finite set of productions or substitution rules of the form A → α, where A is a symbol in V and α is a string over V ∪ T
  – S is a variable in V called the start variable

Language of a CFG
• The language of a CFG (V, T, P, S) is the set of all strings containing only terminals that can be derived from the start variable S:
    L = {ω | ω ∈ T* and S ⇒* ω}
• This is a language over the alphabet T
• A language L is context-free if it is the language of some CFG

Example 1
• Is the string 00#11 in L?
• How about 00#111, 00#0#1#11?
• What is the language of this CFG?
    A → 0A1 | B
    B → #
  variables: A, B; terminals: 0, 1, #; start variable: A
    L = {0^n#1^n : n ≥ 0}

Example 2
• Give derivations of (), (()())
• How about ())?
    S → SS | (S) | ε
  convention: variables in uppercase, terminals in lowercase, start variable first
    S ⇒ (S)        (rule 2)
      ⇒ ()         (rule 3)

    S ⇒ (S)        (rule 2)
      ⇒ (SS)       (rule 1)
      ⇒ ((S)S)     (rule 2)
      ⇒ ((S)(S))   (rule 2)
      ⇒ (()(S))    (rule 3)
      ⇒ (()())     (rule 3)

From regular to context-free
    regular expression       CFG
    ∅                        grammar with no rules
    ε                        S → ε
    a (alphabet symbol)      S → a
    E1 + E2                  S → S1 | S2
    E1E2                     S → S1S2
    E1*                      S → SS1 | ε
  Here S1 and S2 are the start symbols of the grammars for E1 and E2
  In all cases, S becomes the new start symbol

Context-free versus regular
• Is every context-free language regular?
• No! We already saw some examples:
    A → 0A1 | B
    B → #
    L = {0^n#1^n : n ≥ 0}
• This language is context-free but not regular

Parse tree
• Derivations can also be represented using parse trees
    E → E + E | E − E | (E) | V
    V → x | y | z

    E ⇒ E + E
      ⇒ V + E
      ⇒ x + E
      ⇒ x + (E)
      ⇒ x + (E − E)
      ⇒ x + (V − E)
      ⇒ x + (y − E)
      ⇒ x + (y − V)
      ⇒ x + (y − z)
  [Figure: parse tree for x + (y − z)]

Ambiguity
• A grammar is ambiguous if some strings have more than one parse tree
• Example:
    E → E + E | E − E | (E) | V
    V → x | y | z
  [Figure: two different parse trees for x + y + z]

Why ambiguity matters
• The parse tree represents the intended meaning:
  [Figure: two parse trees for x + y + z]
    “first add y and z, and then add this to x”
    “first add x and y, and then add z to this”

Why ambiguity matters
• Suppose we also had multiplication:
    E → E + E | E − E | E × E | (E) | V
    V → x | y | z
  [Figure: two parse trees for x × y + z]
    “first x × y, then + z”
    “first y + z, then x ×”

Disambiguation
• Can we always disambiguate a grammar?
• No, for two reasons
  – There exists an inherently ambiguous context-free language L: every CFG for this language is ambiguous
  – There is no general procedure that can tell if a grammar is ambiguous
• However, grammars used in programming languages can typically be disambiguated

Another Example
• Is ab, baba, abbbaa in L?
• How about a, bba?
• What is the language of this CFG?
• Is the CFG ambiguous?
    S → aB | bA
    A → a | aS | bAA
    B → b | bS | aBB
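
One way to explore the membership and ambiguity questions for the last grammar (S → aB | bA, A → a | aS | bAA, B → b | bS | aBB) is to count leftmost derivations by brute force. The sketch below is only an illustration and not part of the original slides. The names GRAMMAR and count_leftmost_derivations are hypothetical, and the search assumes a grammar with no ε-productions and no cyclic unit rules, so that sentential forms never shrink and the search terminates. Since each parse tree corresponds to exactly one leftmost derivation, a count of 0 means the string is not in L, and a count above 1 shows the grammar is ambiguous.

```python
from functools import lru_cache

# Grammar from the "Another Example" slide.
# Convention from the slides: uppercase letters are variables, lowercase are terminals.
GRAMMAR = {
    "S": ["aB", "bA"],
    "A": ["a", "aS", "bAA"],
    "B": ["b", "bS", "aBB"],
}


def count_leftmost_derivations(grammar, start, target):
    """Count the leftmost derivations of `target` from the variable `start`.

    Assumption (not from the slides): the grammar has no epsilon-productions
    and no cyclic unit rules, so a sentential form longer than `target`
    can be discarded safely.
    """

    @lru_cache(maxsize=None)
    def count(form):
        # Locate the leftmost variable in the sentential form.
        for i, symbol in enumerate(form):
            if symbol in grammar:
                break
        else:
            # Only terminals remain: the derivation is finished.
            return 1 if form == target else 0

        # Prune forms that are too long or whose terminal prefix
        # already disagrees with the target string.
        if len(form) > len(target) or form[:i] != target[:i]:
            return 0

        # Expand the leftmost variable with every production for it.
        return sum(
            count(form[:i] + rhs + form[i + 1:]) for rhs in grammar[symbol]
        )

    return count(start)


if __name__ == "__main__":
    for w in ["ab", "baba", "abbbaa", "a", "bba", "aabbab"]:
        n = count_leftmost_derivations(GRAMMAR, "S", w)
        print(f"{w}: {n} leftmost derivation(s)")
```

Under these assumptions the search reports at least one derivation for ab, baba and abbbaa, none for a and bba, and two for aabbab (the suffix bbab can be split between the two B's as b·bab or as bba·b), which is enough to conclude that this grammar is ambiguous. The pruning step does not apply to grammars with ε-productions such as S → SS | (S) | ε from Example 2, where sentential forms can shrink again.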