- Parsing with Data.Yaml.Combinators, part 1
- Parsing with Data.Yaml.Combinators, part 2
- Parsing with Data.Yaml.Combinators, part 3
So, I need to make a parser. The input is YAML and I want to learn Haskell. The first hit I get is Data.Yaml, which sounds promising. But while searching for an example usage, I happen upon a blog post named Better YAML parsing, and who doesn’t want better, right? Alas, I’m a Haskell newbie and don’t even understand most of the operators used in the very terse example, so in building my own parser, there is a lot of trial and error and pondering the meaning of function types.
Writing this post has multiple purposes. My example could help others who want to use the excellent yaml-combinators package. Someone with a much better understanding of Haskell may read it and point out a better way to structure the code. I’m grateful for any comments which may further my understanding of Haskell or parser combinators.
Since my actual input is technical and boring, I will make up an example input for the purposes of this post. The example will be less technical, but equally boring:
persons:
  - name: 'Coco Lenoix'
    address: '1612 Havenhurst Drive'
  - name: 'Diane Selwyn'
    address: '2900 Griffith Park Boulevard'
organizations:
  - name: 'StudioCanal'
    address: '1 Place du Spectacle'
  - name: 'ABC Studios'
    address: '77 West 66th Street'
Don’t fret, this is just the start. I promise it will get more intricate.
We’ll model this like so:
module Contacts
  ( Contacts,
    contactsParser
  ) where

import Data.Vector (Vector)
import Data.Text (Text)
import Data.Yaml.Combinators

data Contacts = Contacts
  (Vector Person)
  (Vector Organization)
  deriving Show

data Person = Person
  Text -- name
  Text -- address
  deriving Show

data Organization = Organization
  Text -- name
  Text -- address
  deriving Show
We need to import Vector and Text of course. To make it clean we’ll stick it all in a module Contacts in the file Contacts.hs. Deriving Show makes it easy to print the address book in a fairly readable format.
We need a main function to read the file and initiate the parsing. We’ll stick that in Main.hs.
module Main where

import qualified Data.ByteString.Char8 as BS
import qualified Data.Yaml.Combinators as Y
import System.Environment (getArgs)
import System.IO
import Contacts

main :: IO ()
main = do
  args <- getArgs
  bs <- BS.readFile $ head args
  let parsedContent = Y.parse contactsParser bs :: Either String Contacts
  case parsedContent of
    (Left e) -> error e
    (Right ts) -> putStrLn $ show ts
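The main above uses head, which crashes with an unhelpful message when no file name is given. A minimal sketch of a friendlier variant, using only base: inputPath is a helper name I made up for illustration, and the actual parsing is left out so the example stands on its own.

```haskell
module Main where

import System.Environment (getArgs)
import System.Exit (die)

-- Hypothetical helper: accept exactly one argument, otherwise return a usage message.
inputPath :: [String] -> Either String FilePath
inputPath [path] = Right path
inputPath _      = Left "usage: contacts-parser FILE"

main :: IO ()
main = do
  args <- getArgs
  case inputPath args of
    -- die prints the message to stderr and exits with a failure code.
    Left usage -> die usage
    -- In the real program, this is where the file would be read and parsed.
    Right path -> putStrLn ("would parse " ++ path)
```

In the real Main.hs, the Right branch would carry on with BS.readFile and Y.parse as before.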
Now we need to slip in that contactsParser. We’ll put that in the Contacts.hs file.
contactsParser :: Parser Contacts
contactsParser = object $ Contacts
  <$> field "persons" personListParser
  <*> field "organizations" organizationListParser

personListParser :: Parser (Vector Person)
personListParser = array personParser

organizationListParser :: Parser (Vector Organization)
organizationListParser = array organizationParser

personParser :: Parser Person
personParser = object $ Person
  <$> field "name" string
  <*> field "address" string

organizationParser :: Parser Organization
organizationParser = object $ Organization
  <$> field "name" string
  <*> field "address" string
Trying to build this we get an error:
• Couldn’t match expected type ‘Text’ with actual type ‘[Char]’
This is because the field function in yaml-combinators takes the field name as Text, while a plain string literal has type String, which is just another name for [Char]. Instead of converting every string literal by hand (with Data.Text.pack, for instance), we’ll just enable the OverloadedStrings extension at the top of Contacts.hs. Data.Yaml.Combinators must of course be imported as well.
{-# LANGUAGE OverloadedStrings #-}
module Contacts
  ( Contacts,
    contactsParser
  ) where

import Data.Vector (Vector)
import Data.Text (Text)
import Data.Yaml.Combinators
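To see what OverloadedStrings actually changes, here is a small sketch that needs nothing beyond base. With the extension on, a string literal can become any type that has an IsString instance, and Text is one such type. FieldName and describe are stand-ins I invented for this illustration; they are not part of yaml-combinators.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.String (IsString (..))

-- Hypothetical stand-in for Text: a wrapper that is not String.
newtype FieldName = FieldName String
  deriving (Show, Eq)

-- This instance is what lets a string literal be read as a FieldName.
instance IsString FieldName where
  fromString = FieldName

-- Hypothetical function that, like field, refuses a plain String.
describe :: FieldName -> String
describe (FieldName n) = "field: " ++ n

main :: IO ()
main = putStrLn (describe "persons")
```

Removing the LANGUAGE line from this sketch should bring back the same kind of type mismatch error as above, since the literal "persons" is then a plain String again.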
We build and run it.
> contacts-parser input.yaml
Contacts [Person "Coco Lenoix" "1612 Havenhurst Drive",Person "Diane Selwyn" "2900 Griffith Park Boulevard"] [Organization "StudioCanal" "1 Place du Spectacle",Organization "ABC Studios" "77 West 66th Street"]
This part was pretty straightforward. Without understanding the meaning of <$> and <*>, I could pretty much whip it up based on the example from the Better YAML parsing article. All fields were parsed from YAML strings to Text. In the next post, I will show how to parse from a string to a type of your own.
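For anyone as puzzled by <$> and <*> as I was: as far as I can tell, they are the general Applicative operators from base, not something specific to yaml-combinators. A minimal sketch of the same pattern, with Maybe and lookup taking the place of the YAML parser. The Person type here uses String fields and lookupPerson is a made-up helper, so this only mirrors the shape of personParser; it says nothing about how the package works internally.

```haskell
module Main where

-- Stand-in record with String fields, to keep the example base-only.
data Person = Person String String
  deriving (Show, Eq)

-- Hypothetical helper: build a Person only if both keys are present.
-- <$> applies Person to the first result; <*> feeds in the next one.
lookupPerson :: [(String, String)] -> Maybe Person
lookupPerson kvs = Person
  <$> lookup "name" kvs
  <*> lookup "address" kvs

main :: IO ()
main = do
  -- Both keys present: prints Just (Person "Coco Lenoix" "1612 Havenhurst Drive")
  print (lookupPerson [("name", "Coco Lenoix"), ("address", "1612 Havenhurst Drive")])
  -- "address" missing: prints Nothing
  print (lookupPerson [("name", "Diane Selwyn")])
```

The way I read it, the constructor is applied to its arguments one at a time, and if any step fails, the whole result fails. In the parsers above, a missing field presumably turns into a parse error in the same way.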