Parser Definition and Implementation in Haskell, Study notes of Computer Science

An explanation of parsers in Haskell: their definition, their implementation using different methods, and various examples. It covers the Monad instance for parsers, simple parsers, parsers with predicates, and more sophisticated parsers. The document also includes the definitions of operators such as 'plus' and 'string'.

Lecture 26
Lectured by Prof. Caldwell and scribed by Sunil Kothari
November 25, 2008
1 Review

We will start from the beginning. The goal is to apply a parser to a string and obtain a tree as the result.

The book first defines parsers as functions of type String → Tree. This is then generalized to String → (Tree, String), so that the parser also returns the unconsumed input. Since a string can be parsed in multiple ways, the author then accounts for that, and the resulting type is String → [(Tree, String)]. One nice feature is that the empty list indicates failure. Finally, the type of a parser is Parser α = String → [(α, String)].

We hope to use sequencing operations from the Monad.

There are three different ways of defining a type: newtype, type, and data. The newtype form is something new.

newtype Parser a = MkP (String -> [(a, String)])

The only difference here is that there is another version

type String = [Char]

Note that String is a synonym of [Char].

What happens in the data declaration?

data Parser a = MkP (String -> [(a, String)])

Typically, data is used for inductive data types so that you can do case and pattern matching.

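type Parser a = String -> [(a, String)]

Wrapping the function in a constructor (MkP here) tells the type checker to treat Parser as a type of its own rather than as a synonym. Choosing newtype over data is more or less an efficiency matter. Note that there is no recursion in this newtype declaration.

The newtype constructor is not really there after compilation: it exists for type checking only and has no run-time representation.

In any case, we adopt the newtype declaration for Parser.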

newtype Parser a = MkP (String -> [(a,String)])

apply is given as:

apply :: Parser a -> String -> [(a,String)]
apply (MkP f) i = f i

applyParser is given as:

applyParser :: Parser a -> String -> a
applyParser p = fst . head . apply p
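Note that applyParser uses head, so it raises a run-time error when the parser fails and returns the empty list. A minimal sketch of a total alternative follows; the name applyParserMaybe and the two demonstration parsers always and never are hypothetical and do not appear in the notes.

```haskell
import Data.Maybe (listToMaybe)

newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

-- Hypothetical total variant of applyParser: Nothing signals a failed parse,
-- Just x carries the value of the first successful parse.
applyParserMaybe :: Parser a -> String -> Maybe a
applyParserMaybe p = fmap fst . listToMaybe . apply p

-- Demonstration parsers (assumptions, for illustration only).
always :: a -> Parser a          -- succeeds without consuming input
always v = MkP (\i -> [(v, i)])

never :: Parser a                -- fails on every input
never = MkP (const [])
```

Here applyParserMaybe (always 'x') "abc" evaluates to Just 'x', while applyParserMaybe never "abc" evaluates to Nothing instead of crashing.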

Remember that the type of the bind operator (>>=) is:

Hugs> :t (>>=)
(>>=) :: Monad a => a b -> (b -> a c) -> a c

Now, we can define an instance of Monad for Parser as given in Bird’s book:

instance Monad Parser where
  -- (>>=) :: Parser a -> (a -> Parser b) -> Parser b
  p >>= q = MkP f
    where f s = [(y,s'') | (x,s') <- apply p s, (y,s'') <- apply (q x) s']

  -- return :: a -> Parser a
  return v = MkP (\i -> [(v,i)])
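On a recent GHC (7.10 or later) a Monad instance also requires Functor and Applicative instances, so the Hugs-era declaration above does not compile there on its own. The following is a minimal sketch of how the same parser monad might be set up today. The Functor and Applicative instances, the item parser, and the twoItems example are assumptions added for illustration, not part of the notes.

```haskell
import Control.Monad (ap, liftM)

newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

-- Assumed boilerplate for modern GHC: Functor and Applicative are
-- derived from the Monad operations.
instance Functor Parser where
  fmap = liftM

instance Applicative Parser where
  pure v = MkP (\i -> [(v, i)])   -- plays the role of the notes' return
  (<*>)  = ap

instance Monad Parser where
  -- Run p, then feed each result x and its remaining input s' to q x.
  p >>= q = MkP f
    where f s = [(y, s'') | (x, s') <- apply p s, (y, s'') <- apply (q x) s']

-- Assumed primitive: consume one character, fail on empty input.
item :: Parser Char
item = MkP f
  where f []     = []
        f (c:cs) = [(c, cs)]

-- Sequencing two parsers with do notation.
twoItems :: Parser (Char, Char)
twoItems = do x <- item
              y <- item
              return (x, y)
```

With these definitions, apply twoItems "abc" gives [(('a','b'),"c")], and apply twoItems "a" gives [] because the second item has nothing left to consume.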

Note that list comprehension is like do notation for lists. Interestingly, Hutton’s book defines the bind operator as follows:

instance Monad Parser where
  -- Hutton's version
  p >>= f = MkP (\i -> case (apply p i) of
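The listing above is cut off in these notes, so the following is not a reconstruction of it. It is only a sketch of how a case-based bind of this general shape can differ from the list-comprehension one: the comprehension continues every result of the first parser, whereas a case analysis that looks at the head of the list continues only the first. The names bindAll, bindFirst, ambiguous and timesTen are hypothetical.

```haskell
newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

-- Bind via list comprehension: every result of p is continued.
bindAll :: Parser a -> (a -> Parser b) -> Parser b
bindAll p q =
  MkP (\s -> [(y, s'') | (x, s') <- apply p s, (y, s'') <- apply (q x) s'])

-- Bind via case analysis: only the first result of p is continued.
-- (Assumed shape; the notes do not show the case alternatives.)
bindFirst :: Parser a -> (a -> Parser b) -> Parser b
bindFirst p f = MkP (\i -> case apply p i of
                             []           -> []
                             ((v, out):_) -> apply (f v) out)

-- A deliberately ambiguous parser that returns two results.
ambiguous :: Parser Int
ambiguous = MkP (\s -> [(1, s), (2, s)])

-- Multiply the previous result by ten without consuming input.
timesTen :: Int -> Parser Int
timesTen n = MkP (\s -> [(n * 10, s)])
```

Applying bindAll ambiguous timesTen to "x" yields [(10,"x"),(20,"x")], while bindFirst ambiguous timesTen yields only [(10,"x")].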

This business of defining monad instances and working out how they fit together is fascinating work by the functional programming community. The downside is that it is complicated, but it becomes easy once you start using it.

So far we have simple parsers. Now we build a parser that uses a predicate:

sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
           if p x then return x else zero

Here are some examples:

Parser> apply (sat (=='x')) "wxyzzy"
[] :: [(Char,String)]
Parser> apply (sat (=='x')) "xyzzy"
[('x',"yzzy")] :: [(Char,String)]

This is interesting: when asked for the type of the section (=='x'), Hugs echoes it as an expression containing the function flip, which is normally not the case.

:t (=='x')
flip (==) 'x' :: Char -> Bool

We can do more:

Parser> apply (sat (\c -> c `elem` "xyzzy")) "wxyzzy"
[] :: [(Char,String)]

Parser> apply (sat (\c -> c `elem` "xyzzy")) "xyzzy"
[('x',"yzzy")] :: [(Char,String)]

Parser> apply (sat (\c -> c `elem` "xy")) "xyzzy"
[('x',"yzzy")] :: [(Char,String)]
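The sat examples above depend on item and zero, which this part of the notes uses but does not define. A minimal self-contained sketch follows, assuming item consumes one character and zero always fails. Those two definitions, and the Functor and Applicative instances that modern GHC needs, are assumptions rather than the notes' own code.

```haskell
import Control.Monad (ap, liftM)

newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

instance Functor Parser where
  fmap = liftM
instance Applicative Parser where
  pure v = MkP (\i -> [(v, i)])
  (<*>)  = ap
instance Monad Parser where
  p >>= q =
    MkP (\s -> [(y, s'') | (x, s') <- apply p s, (y, s'') <- apply (q x) s'])

-- Assumed: consume one character, fail on empty input.
item :: Parser Char
item = MkP f
  where f []     = []
        f (c:cs) = [(c, cs)]

-- Assumed: the parser that always fails (empty result list).
zero :: Parser a
zero = MkP (const [])

-- As in the notes: accept one character only if it satisfies p.
sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
           if p x then return x else zero
```

When the predicate rejects the first character, the empty list from zero propagates through bind, which is why apply (sat (=='x')) "wxyzzy" yields [].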

Then we have a parser that accepts only digits, and another that returns the integer value of the digit it parses.

digit :: Parser Char
digit = sat isDigit

digit' :: Parser Int
digit' = do d <- digit; return (ord d - ord '0')

Parser> apply digit "1234"
[('1',"234")] :: [(Char,String)]

Parser> apply digit "abcd"
[] :: [(Char,String)]

Parser> apply digit' "abcd"
[] :: [(Int,String)]

Parser> apply digit' "1234"
[(1,"234")] :: [(Int,String)]

More sophisticated parsers can now be defined:

lower :: Parser Char
lower = sat isLower

Parser> apply lowers "sUpper"
[("s","Upper")] :: [([Char],String)]
Parser> apply lowers "ssUpper"
[("ss","Upper")] :: [([Char],String)]

upper :: Parser Char
upper = sat isUpper

char :: Char -> Parser Char
char x = sat (==x)

sat ('!'==) is a parser that succeeds when the first character of the input is '!'.

string []     = return []
string (x:xs) = do char x
                   string xs
                   return (x:xs)

If the argument is a non-empty string, consume the character x and then recursively call string on xs.

Parser> apply (string "xy") "xyzzy"
[("xy","zzy")] :: [([Char],String)]

Parser> apply (string "wxy") "xyzzy"
[] :: [([Char],String)]
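The recursion in string can be tried out with the sketch below, which bundles the pieces it depends on. The definitions of item and zero and the Functor and Applicative instances are assumptions (the notes use them without showing them here). stringM is a hypothetical name for an equivalent one-line formulation using mapM.

```haskell
import Control.Monad (ap, liftM)

newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

instance Functor Parser where
  fmap = liftM
instance Applicative Parser where
  pure v = MkP (\i -> [(v, i)])
  (<*>)  = ap
instance Monad Parser where
  p >>= q =
    MkP (\s -> [(y, s'') | (x, s') <- apply p s, (y, s'') <- apply (q x) s'])

item :: Parser Char              -- assumed: consume one character
item = MkP f
  where f []     = []
        f (c:cs) = [(c, cs)]

zero :: Parser a                 -- assumed: always fail
zero = MkP (const [])

sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
           if p x then return x else zero

char :: Char -> Parser Char
char x = sat (== x)

-- Match the given string character by character.
string :: String -> Parser String
string []     = return []
string (x:xs) = do _ <- char x
                   _ <- string xs
                   return (x:xs)

-- Hypothetical alternative: run char on each character in turn.
stringM :: String -> Parser String
stringM = mapM char
```

Both versions fail as soon as one character does not match, because a failing char returns the empty list and bind has nothing to continue with.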

Two parsers can also be combined using a choice operator (written +++ in Hutton's book). Here it is defined as plus:

p `plus` q = MkP f
  where f s = apply p s ++ apply q s

Hutton’s choice operator
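The notes do not include the code for this second choice operator. As a hedged sketch of the difference between the two styles of choice: plus collects the results of both parsers, while a first-result choice tries the second parser only if the first one fails. The name orElse and the helper parsers firstChar and constant are hypothetical, and orElse is not claimed to match the book's definition exactly.

```haskell
newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

-- As in the notes: concatenate the results of both alternatives.
plus :: Parser a -> Parser a -> Parser a
p `plus` q = MkP f
  where f s = apply p s ++ apply q s

-- Assumed first-result choice: use q only when p fails,
-- and keep at most one result of p.
orElse :: Parser a -> Parser a -> Parser a
p `orElse` q = MkP f
  where f s = case apply p s of
                []    -> apply q s
                (r:_) -> [r]

-- Helper parsers for the demonstration (assumptions).
firstChar :: Parser Char
firstChar = MkP f
  where f []     = []
        f (c:cs) = [(c, cs)]

constant :: a -> Parser a
constant v = MkP (\s -> [(v, s)])
```

On the input "ab", firstChar `plus` constant '?' returns both [('a',"b"),('?',"ab")], whereas firstChar `orElse` constant '?' returns only [('a',"b")]. On empty input the orElse version falls back to [('?',"")].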

nat :: Parser Int
nat = do xs <- many1 digit
         return (read xs)

Parser> apply nat "123"
[(123,"")] :: [(Int,String)]
Parser> apply nat "123abc"
[(123,"abc")] :: [(Int,String)]

many is a parser combinator that applies a parser zero or more times.

:t many lower
many lower :: Parser [Char]
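The notes use many and many1 without giving their definitions. One common way to write them is as a pair of mutually recursive functions built on a first-result choice. The sketch below assumes that approach; orElse, item, zero and the instance boilerplate are assumptions, not code from the lecture.

```haskell
import Control.Monad (ap, liftM)
import Data.Char (isDigit, isLower)

newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

instance Functor Parser where
  fmap = liftM
instance Applicative Parser where
  pure v = MkP (\i -> [(v, i)])
  (<*>)  = ap
instance Monad Parser where
  p >>= q =
    MkP (\s -> [(y, s'') | (x, s') <- apply p s, (y, s'') <- apply (q x) s'])

item :: Parser Char              -- assumed: consume one character
item = MkP f
  where f []     = []
        f (c:cs) = [(c, cs)]

zero :: Parser a                 -- assumed: always fail
zero = MkP (const [])

sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
           if p x then return x else zero

-- Assumed first-result choice: try q only if p fails.
orElse :: Parser a -> Parser a -> Parser a
p `orElse` q = MkP f
  where f s = case apply p s of
                []    -> apply q s
                (r:_) -> [r]

-- Zero or more repetitions of p.
many :: Parser a -> Parser [a]
many p = many1 p `orElse` return []

-- One or more repetitions of p.
many1 :: Parser a -> Parser [a]
many1 p = do v  <- p
             vs <- many p
             return (v:vs)

digit, lower :: Parser Char
digit = sat isDigit
lower = sat isLower

nat :: Parser Int
nat = do xs <- many1 digit
         return (read xs)
```

Because many falls back to return [] when no repetition is possible, apply (many lower) "Upper" succeeds with [("","Upper")], while nat, which uses many1, fails on input that does not start with a digit.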

A standard step in tokenizing is to consume as many spaces as possible.

space :: Parser ()
space = do many (char ' ')
           return ()

Here's an example.

Parser> apply space " xyzzy"
[((),"xyzzy")] :: [((),String)]

The following parser parses a bracketed, comma-separated list of natural numbers, skipping any spaces around the tokens.

natural :: Parser Int
natural = token nat

symbol :: String -> Parser String
symbol xs = token (string xs)

natlist :: Parser [Int]
natlist = do symbol "["
             n <- natural
             ns <- many (do symbol ","
                            natural)
             symbol "]"
             return (n:ns)

Parser> apply natlist "[1, 2,3, 5]"
[([1,2,3,5],"")] :: [([Int],String)]
Parser> apply natlist "[ ]"
[] :: [([Int],String)]
Parser> apply natlist "[1, 2,3, 5]zbcd "
[([1,2,3,5],"zbcd ")] :: [([Int],String)]
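To try natlist end to end, the following sketch collects everything it depends on into one file. The notes use token without defining it; it is assumed here to skip spaces before and after the given parser. That definition, along with item, zero, orElse, many, many1 and the instance boilerplate, is an assumption rather than the lecture's own code.

```haskell
import Control.Monad (ap, liftM)
import Data.Char (isDigit)

newtype Parser a = MkP (String -> [(a, String)])

apply :: Parser a -> String -> [(a, String)]
apply (MkP f) i = f i

instance Functor Parser where
  fmap = liftM
instance Applicative Parser where
  pure v = MkP (\i -> [(v, i)])
  (<*>)  = ap
instance Monad Parser where
  p >>= q =
    MkP (\s -> [(y, s'') | (x, s') <- apply p s, (y, s'') <- apply (q x) s'])

item :: Parser Char              -- assumed: consume one character
item = MkP f
  where f []     = []
        f (c:cs) = [(c, cs)]

zero :: Parser a                 -- assumed: always fail
zero = MkP (const [])

sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
           if p x then return x else zero

char :: Char -> Parser Char
char x = sat (== x)

string :: String -> Parser String
string = mapM char

orElse :: Parser a -> Parser a -> Parser a   -- assumed first-result choice
p `orElse` q = MkP f
  where f s = case apply p s of
                []    -> apply q s
                (r:_) -> [r]

many, many1 :: Parser a -> Parser [a]
many p  = many1 p `orElse` return []
many1 p = do { v <- p; vs <- many p; return (v:vs) }

nat :: Parser Int
nat = do { xs <- many1 (sat isDigit); return (read xs) }

space :: Parser ()
space = do { _ <- many (char ' '); return () }

-- Assumed: run p, ignoring spaces on either side.
token :: Parser a -> Parser a
token p = do { space; v <- p; space; return v }

natural :: Parser Int
natural = token nat

symbol :: String -> Parser String
symbol xs = token (string xs)

natlist :: Parser [Int]
natlist = do _  <- symbol "["
             n  <- natural
             ns <- many (do { _ <- symbol ","; natural })
             _  <- symbol "]"
             return (n:ns)
```

Since natlist requires at least one number after the opening bracket, the input "[ ]" fails, which matches the session shown above.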