Building a Recursive Descent Parser: Understanding BNF and Java Code, Slides of Programming Languages

Aligarh Muslim University Programming Languages

An in-depth explanation of how to build a recursive descent parser using BNF (Backus-Naur Form) and Java code. It covers the basics of BNF, extended BNF, recognizing simple alternatives, helper methods, and sequences. The document also discusses the importance of the DRY (Don't Repeat Yourself) principle and provides Java code examples.

Typology: Slides

2012/2013

Uploaded on 09/29/2013

dhanvant 🇮🇳

Recognizers

Parsers and recognizers

 Given a grammar (say, in BNF) and a string,
     A recognizer will tell whether the string belongs to the language defined by the grammar
     A parser will try to build a tree corresponding to the string, according to the rules of the grammar

Input string    Recognizer result    Parser result
2 + 3 * 4       true                 [parse tree figure]
2 + 3 *         false                Error

Review of BNF

 “Plain” BNF

     < > indicate a nonterminal that needs to be further expanded
     Symbols not enclosed in < > are terminals; they represent themselves, for example, if, while, (
     The symbol ::= means "is defined as"
     The symbol | means "or"; it separates alternatives, for example, <add_operator> ::= + | -

 Extended BNF

     [ ] enclose an optional part of the rule
         Example: ::= if ( ) [ else ]
     { } mean the enclosed can be repeated zero or more times
         Example: ::= ( ) | ( { , } )

Recognizing simple alternatives, I

 Consider the following BNF rule:

     <add_operator> ::= + | -
     That is, an add operator is a plus sign or a minus sign

 To recognize an add operator, we need to get the next token, and test whether it is one of these characters

     If it is a plus or a minus, we simply return true
     But what if it isn’t?
         We not only need to return false, but we also need to put the token back because it doesn’t belong to us, and some other grammar rule probably wants it

 Our tokenizer needs to be able to take back tokens

     Usually, it’s enough to be able to put just one token back
     More complex grammars may require the ability to put back several tokens

Java code

 public boolean addOperator() {
      Token t = myTokenizer.next();
      if (t.type == Type.SYMBOL && t.value.equals("+")) {
          return true;
      }
      if (t.type == Type.SYMBOL && t.value.equals("-")) {
          return true;
      }
      // Not an add operator: put the token back for some other rule
      myTokenizer.pushBack(1);
      return false;
  }

 While this code isn’t particularly long or hard to read, we are going to have a lot of very similar methods
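The method above assumes a tokenizer that can hand out tokens and take them back. As a minimal sketch of what that might look like, the following uses a list-backed tokenizer; the names Type, Token, ListTokenizer, AddOperatorRecognizer, and the END token are assumptions made for illustration and are not part of the original slides.

```java
import java.util.List;

// Hypothetical token model: the slides only show that a token has a type and a value.
enum Type { SYMBOL, NUMBER, NAME, END }

class Token {
    final Type type;
    final String value;

    Token(Type type, String value) {
        this.type = type;
        this.value = value;
    }
}

// List-backed stand-in for the slides' tokenizer (an assumption, not the original class).
class ListTokenizer {
    private static final Token END = new Token(Type.END, "");
    private final List<Token> tokens;
    private int position = 0; // index of the next token to hand out

    ListTokenizer(List<Token> tokens) {
        this.tokens = tokens;
    }

    // Returns the next token, or an END token once the input is exhausted.
    Token next() {
        Token t = position < tokens.size() ? tokens.get(position) : END;
        position++;
        return t;
    }

    // Un-reads the last n tokens so that later calls to next() see them again.
    void pushBack(int n) {
        position = Math.max(0, position - n);
    }
}

class AddOperatorRecognizer {
    private final ListTokenizer myTokenizer;

    AddOperatorRecognizer(ListTokenizer tokenizer) {
        this.myTokenizer = tokenizer;
    }

    // <add_operator> ::= + | -
    public boolean addOperator() {
        Token t = myTokenizer.next();
        if (t.type == Type.SYMBOL && (t.value.equals("+") || t.value.equals("-"))) {
            return true;
        }
        myTokenizer.pushBack(1); // not ours: leave it for another rule
        return false;
    }
}
```

Because a failed match puts its token back, a caller can try addOperator() and, on false, try a different rule against the same token.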

Helper methods

 Remember the DRY principle: Don’t Repeat Yourself

 If we turn each BNF production directly into Java, we will be writing a lot of very similar code

 We should write some auxiliary or “helper” methods to hide some of the details for us

 First helper method:

 private boolean symbol(String expectedSymbol)

     Gets the next token and tests whether it matches the expectedSymbol
     If it matches, returns true
     If it doesn’t match, puts the symbol back and returns false

 We’ll look more closely at this method in a moment

First implementation of symbol

 Here’s what symbol does:

     Gets a token
     Makes sure that the token is a symbol
     Compares the symbol to the desired symbol (by value)
     If all the above is satisfied, returns true
     Else (if not satisfied) puts the token back, and returns false

 private boolean symbol(String value) {
      Token t = tokenizer.next();
      if (t.type() == Type.SYMBOL && value.equals(t.value())) {
          return true;
      } else {
          tokenizer.pushBack(1);
          return false;
      }
  }

Implementing symbol

 We can implement methods name, number, and maybe eol the same way

 All this code will look pretty much alike

     The main difference is in checking for the type
     The DRY principle suggests we should use a helper method for symbol

 private boolean symbol(String expectedValue) {
      return nextTokenMatches(Type.SYMBOL, expectedValue);
  }

nextTokenMatches

 The previous method is fine for symbols, but what if we only care about the type?

     For example, we want to get a number, any number
     We need to compare only type, not value

 private boolean nextTokenMatches(Type type, String value) {   // type-only version: omit the value parameter
      Token t = tokenizer.next();
      if (type == t.type() && value.equals(t.value())) {        // type-only version: omit the value.equals(...) test
          return true;
      }
      tokenizer.pushBack(1);
      return false;
  }

 It’s easier to overload nextTokenMatches than to combine the two versions, and both versions are fairly short, so we are probably better off with the code duplication
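The two overloads might sit side by side as in the following sketch. The Type, Token, and ListTokenizer classes here are hypothetical stand-ins for the slides' tokenizer, and the number() and name() helpers are assumed to follow the same pattern as symbol().

```java
import java.util.List;

enum Type { SYMBOL, NUMBER, NAME, END }

// Hypothetical token class with accessor methods, as used by nextTokenMatches.
class Token {
    private final Type type;
    private final String value;

    Token(Type type, String value) {
        this.type = type;
        this.value = value;
    }

    Type type() { return type; }
    String value() { return value; }
}

// List-backed stand-in for the slides' tokenizer (an assumption).
class ListTokenizer {
    private final List<Token> tokens;
    private int position = 0;

    ListTokenizer(List<Token> tokens) { this.tokens = tokens; }

    Token next() {
        Token t = position < tokens.size() ? tokens.get(position) : new Token(Type.END, "");
        position++;
        return t;
    }

    void pushBack(int n) { position = Math.max(0, position - n); }
}

class Recognizer {
    private final ListTokenizer tokenizer;

    Recognizer(ListTokenizer tokenizer) { this.tokenizer = tokenizer; }

    // Version 1: both the type and the value must match.
    private boolean nextTokenMatches(Type type, String value) {
        Token t = tokenizer.next();
        if (type == t.type() && value.equals(t.value())) return true;
        tokenizer.pushBack(1);
        return false;
    }

    // Version 2 (overload): only the type must match; any value is accepted.
    private boolean nextTokenMatches(Type type) {
        Token t = tokenizer.next();
        if (type == t.type()) return true;
        tokenizer.pushBack(1);
        return false;
    }

    boolean symbol(String expectedValue) { return nextTokenMatches(Type.SYMBOL, expectedValue); }
    boolean number() { return nextTokenMatches(Type.NUMBER); }
    boolean name() { return nextTokenMatches(Type.NAME); }
}
```

The duplicated three lines are the price of keeping each overload trivially readable; merging them would need a null or sentinel value for "any value".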

addOperator reprise

 public boolean addOperator() {
      return symbol("+") || symbol("-");
  }

 private boolean symbol(String expectedValue) {
      return nextTokenMatches(Type.SYMBOL, expectedValue);
  }

 private boolean nextTokenMatches(Type type, String value) {
      Token t = tokenizer.next();
      if (type == t.type() && value.equals(t.value())) {
          return true;
      }
      tokenizer.pushBack(1);
      return false;
  }

Sequences, II

 The grammar rule is <empty_list> ::= “[” “]”

 And the token string contains [ 5 ]

 Solution #1: Write a pushBack method that pushes back more than one token at a time
     This will allow you to put back both the “[” and the “5”
     You have to be very careful of the order in which you return tokens
     This is a good use for a Stack
 Solution #2: Call it an error
     You might be able to get away with this, depending on the grammar
     For example, for any reasonable grammar, (2 + 3 +) is clearly an error
 Solution #3: Change the grammar
     Tricky, and may not be possible
 Solution #4: Combine rules
     See the next slide
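Solution #1 can be sketched with two stacks: one remembering tokens already delivered, one holding tokens waiting to be delivered again. The class names StackTokenizer and EmptyListRecognizer, the use of plain strings as tokens, and the "<end>" marker are assumptions made to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical tokenizer supporting pushBack(n) for any n, using stacks.
class StackTokenizer {
    private final Iterator<String> source;
    private final Deque<String> delivered = new ArrayDeque<>();  // most recently delivered on top
    private final Deque<String> pushedBack = new ArrayDeque<>(); // next token to re-deliver on top

    StackTokenizer(List<String> tokens) {
        this.source = tokens.iterator();
    }

    String next() {
        String t;
        if (!pushedBack.isEmpty()) {
            t = pushedBack.pop();          // re-deliver a returned token first
        } else if (source.hasNext()) {
            t = source.next();             // otherwise read a never-before-seen token
        } else {
            t = "<end>";                   // assumed end-of-input marker
        }
        delivered.push(t);
        return t;
    }

    // Returns the last n delivered tokens. The newest token is moved first,
    // so the oldest ends up on top and is re-delivered first.
    void pushBack(int n) {
        for (int i = 0; i < n && !delivered.isEmpty(); i++) {
            pushedBack.push(delivered.pop());
        }
    }
}

class EmptyListRecognizer {
    private final StackTokenizer tokenizer;

    EmptyListRecognizer(StackTokenizer tokenizer) {
        this.tokenizer = tokenizer;
    }

    // <empty_list> ::= "[" "]"
    boolean emptyList() {
        if (!tokenizer.next().equals("[")) {
            tokenizer.pushBack(1);
            return false;
        }
        if (!tokenizer.next().equals("]")) {
            tokenizer.pushBack(2);         // put back both the "[" and whatever followed it
            return false;
        }
        return true;
    }
}
```

Moving tokens from one stack to the other reverses them twice overall, which is what restores the original order.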

Implementing a fancier pushBack()

 java.io.StreamTokenizer does almost everything you need in a tokenizer

 Its pushBack() method only “puts back” a single token

 If you need more than that, you have to extend StreamTokenizer

 To push back more than one token, you need to either:

     Make your tokenizer keep track of the last several tokens (and have a pushBack(int n) method), or
     Expect the calling program to tell you what tokens to push back (with a pushBack(Token t) method)

 Plus, you will have to override nextToken()

     Inside your nextToken() method, you can call super.nextToken() to get the next never-before-seen token
     Your nextToken() method will also have to do something about nval and sval, such as provide methods to get these values
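One way the first option might look is sketched below. The subclass name BufferedStreamTokenizer, the Snapshot helper, and the PushBackDemo class are hypothetical; only StreamTokenizer, nextToken(), pushBack(), ttype, nval, and sval come from java.io. For simplicity this sketch remembers every token seen rather than only the last several, and it restores the public ttype, nval, and sval fields instead of adding getter methods.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class BufferedStreamTokenizer extends StreamTokenizer {
    // Everything needed to replay one token later.
    private static final class Snapshot {
        final int ttype;
        final double nval;
        final String sval;

        Snapshot(int ttype, double nval, String sval) {
            this.ttype = ttype;
            this.nval = nval;
            this.sval = sval;
        }
    }

    private final List<Snapshot> seen = new ArrayList<>(); // all tokens read so far
    private int position = 0;                              // index of the next token to deliver

    BufferedStreamTokenizer(Reader reader) {
        super(reader);
    }

    @Override
    public int nextToken() throws IOException {
        if (position == seen.size()) {
            super.nextToken();                             // a never-before-seen token
            seen.add(new Snapshot(ttype, nval, sval));
        }
        Snapshot s = seen.get(position++);
        ttype = s.ttype;                                   // restore the token's fields
        nval = s.nval;
        sval = s.sval;
        return ttype;
    }

    // Puts back the last n tokens.
    public void pushBack(int n) {
        position = Math.max(0, position - n);
    }

    @Override
    public void pushBack() {
        pushBack(1);
    }
}

class PushBackDemo {
    // Reads two tokens, pushes both back, then returns the types of all tokens read again.
    static List<Integer> rereadAfterPushBack(String input) {
        try {
            BufferedStreamTokenizer st = new BufferedStreamTokenizer(new StringReader(input));
            st.nextToken();
            st.nextToken();
            st.pushBack(2);
            List<Integer> types = new ArrayList<>();
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                types.add(st.ttype);
            }
            return types;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Overriding the no-argument pushBack() as well keeps the superclass's own single-token mechanism out of the picture, so the two never disagree about which token comes next.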

Sequences, IV

 Another possibility is to revise the grammar (but make sure the new grammar is equivalent to the old one!)

 Old grammar:

     <list> ::= “[” “]” | “[” <number> “]”

 New grammar:

     <list> ::= “[” <rest_of_list>
     <rest_of_list> ::= “]” | <number> “]”

 New pseudocode:

 public boolean list() {
      if first token is “[” {
          if restOfList() return true
      }
      else put back first token
      return false
  }

 private boolean restOfList() {
      if first token is “]”, return true
      if first token is a number and second token is a “]”, return true
      else return false
  }
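The pseudocode might translate into Java roughly as follows. This is a sketch under assumptions: tokens are plain strings held in the recognizer itself, a number is any run of digits, and the consumed() method exists only so the behavior can be inspected. As in the pseudocode, only single-token pushback is used, so a failure after the “[” has been accepted does not restore the input; a caller would typically treat that as an error.

```java
import java.util.Arrays;
import java.util.List;

class ListRecognizer {
    private final List<String> tokens; // pre-split tokens, e.g. "[", "5", "]"
    private int position = 0;

    ListRecognizer(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    private String next() {
        String t = position < tokens.size() ? tokens.get(position) : "";
        position++;
        return t;
    }

    // Single-token pushback: called at most once after each next().
    private void pushBack() {
        position--;
    }

    private boolean symbol(String expected) {
        if (next().equals(expected)) return true;
        pushBack();
        return false;
    }

    private boolean number() {
        if (next().matches("\\d+")) return true;
        pushBack();
        return false;
    }

    // <list> ::= "[" <rest_of_list>
    public boolean list() {
        if (symbol("[")) {
            return restOfList();
        }
        return false; // symbol() already put the first token back
    }

    // <rest_of_list> ::= "]" | <number> "]"
    private boolean restOfList() {
        if (symbol("]")) return true;
        return number() && symbol("]");
    }

    // Number of tokens consumed so far (for inspection only).
    int consumed() {
        return position;
    }
}
```

Each decision now looks at exactly one token, which is why the revised grammar no longer needs a multi-token pushBack.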

Simple sequences in Java

 Suppose you have this rule:

 ::= ( )

 A good way to do this is often to test whether the grammar rule

is not met

 public boolean factor() { if (symbol("(")) { if (!expression()) error("Error in parenthesized expression"); if (!symbol(")")) error("Unclosed parenthetical expression"); return true; } return false; }

 To do this, you need to be careful that the “(” is not the start of some other production that can be used where a factor can be used  In other words, be sure that if you get a “(” it must start a factor

 Also, error(String) must throw an Exception—why?
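A possible answer can be seen by running the method: factor() falls through to return true after each call to error, so if error merely printed a message and returned, malformed input such as ( 5 would be accepted. The sketch below makes that concrete. The SyntaxErrorException type (unchecked here), the string tokens, and the stand-in expression() rule that accepts a single number or a nested factor are all assumptions; the helper also peeks at the token instead of reading and pushing back, which has the same effect.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical unchecked exception used to abandon recognition on a syntax error.
class SyntaxErrorException extends RuntimeException {
    SyntaxErrorException(String message) {
        super(message);
    }
}

class FactorRecognizer {
    private final List<String> tokens;
    private int position = 0;

    FactorRecognizer(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    // Consumes the next token only if it equals the expected symbol.
    private boolean symbol(String expected) {
        if (position < tokens.size() && tokens.get(position).equals(expected)) {
            position++;
            return true;
        }
        return false;
    }

    // Stand-in for the real <expression> rule: a number or a nested factor.
    private boolean expression() {
        if (position < tokens.size() && tokens.get(position).matches("\\d+")) {
            position++;
            return true;
        }
        return factor();
    }

    // Must not return normally: the code after each call assumes success.
    private void error(String message) {
        throw new SyntaxErrorException(message);
    }

    // <factor> ::= ( <expression> )
    public boolean factor() {
        if (symbol("(")) {
            if (!expression()) error("Error in parenthesized expression");
            if (!symbol(")")) error("Unclosed parenthetical expression");
            return true;
        }
        return false;
    }
}
```

With this version, ( 5 ) yields true, a lone 5 yields false because it does not start a factor, and ( 5 raises the exception instead of being reported as a valid factor.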