Easy to use text parsing library for Elm.
This library does not attempt to provide the fastest parsing API, nor does it allow for the most flexibility when it comes to error messages and contextual feedback. Rather, this library seeks to provide an intuitive, easy to use API for writing parsers that don't need to be crazy fast or provide detailed error messages.
Status: This library is pretty new and probably has a bunch of bugs. Please try it out and report things to the GitHub issue tracker! However, don't expect this to be rock solid quite yet.
Let's say you want to parse URLs such as https://hello.com:123/greetings?recipient=world#message into a nice data type, e.g.
type alias Url =
    { protocol : String
    , host : String
    , port_ : Maybe Int
    , path : String
    , query : Maybe String
    , fragment : Maybe String
    }
In this case, we'd want a value that looks like this:
{ protocol = "https"
, host = "hello.com"
, port_ = Just 123
, path = "/greetings"
, query = Just "recipient=world"
, fragment = Just "message"
}
You also want an error if the parsing fails.
Let's start by assuming we've already defined the individual pieces:
import Parser exposing (..)
-- Parser.Common has useful high level parsers.
import Parser.Common exposing (..)

url : Parser Url
url =
    into Url
        |> grab protocol
        |> ignore (string "://")
        |> grab host
        |> grab (maybe port_)
        |> grab path
        |> grab (maybe query)
        |> grab (maybe fragment)
There's already a lot of functionality here:
- url has type Parser Url, meaning that when it's run, it'll parse a Url value when successful.
- We're building a Url, and the into function starts a pipeline that allows for that. Any function can be used here as long as its arguments match the following grab lines.
- The |> (pipe) operator is used to feed values into the Url constructor.
- Each pipeline element can either grab or ignore the value returned by a parser.
- The string function creates a parser that matches an exact string.
- The maybe function makes matching a parser optional: it will succeed with Nothing if there's no match.
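As a smaller illustration of the same pipeline style, here is a minimal sketch that builds a record from input such as 3,4. Point and point are hypothetical names made up for this example. The sketch assumes that the int parser used later in this walkthrough is exposed by one of the two imports, and it uses into, grab, ignore and string only in the way described above.

```elm
import Parser exposing (..)
import Parser.Common exposing (..)


-- Hypothetical record, used only for this illustration.
type alias Point =
    { x : Int
    , y : Int
    }


-- Point takes two Int arguments, so the pipeline has two grab lines.
-- The comma is matched, but its value is dropped by ignore.
point : Parser Point
point =
    into Point
        |> grab int
        |> ignore (string ",")
        |> grab int
```

The number and order of the grab lines mirror the arguments of the function passed to into, which is the same rule the url parser follows with the Url constructor.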
Now let's define each individual component. Most of them are Parser String, so they'll succeed with a String value. The exception is port_, which is a Parser Int.
protocol : Parser String
protocol =
    stringWith (oneOrMore alpha)
Here we match one or more alphabetic characters, then map the characters to a String. We need to do this because alpha is a Parser Char, and oneOrMore turns a Parser a into a Parser (List a). Therefore we'll end up with Parser (List Char), but stringWith allows us to convert that into a Parser String.
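To make the type progression explicit, the sketch below splits the protocol parser into two named steps. The names letters and word are hypothetical, and the type annotations simply restate what the paragraph above says about alpha, oneOrMore and stringWith.

```elm
import Parser exposing (..)
import Parser.Common exposing (..)


-- alpha parses a single Char; oneOrMore collects the matches into a list.
letters : Parser (List Char)
letters =
    oneOrMore alpha


-- stringWith converts the parser of characters into a parser of String.
word : Parser String
word =
    stringWith letters
```

Written this way, word should behave like the protocol parser above; the intermediate name only makes the Parser (List Char) step visible.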
host : Parser String
host =
    separatedBy (char '.') (stringWith (oneOrMore alphaNum))
        |> map (String.join ".")
Here we first use separatedBy to get all the components of the host. Each component matches a string with one or more alphanumeric characters, and the components are separated by a dot (.). If this matches we'll end up with a Parser (List String). By using map we'll turn that into a Parser String that succeeds with the components joined with a dot again.
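The same separatedBy and map combination can be sketched for other element types. The example below is an assumption-laden illustration: it assumes separatedBy accepts any element parser (here int, used later in this walkthrough) and that map accepts any function over the parsed value. The names numbers and total are hypothetical.

```elm
import Parser exposing (..)
import Parser.Common exposing (..)


-- Integers separated by commas, e.g. input shaped like 1,2,3.
-- Assumes separatedBy is not limited to String elements.
numbers : Parser (List Int)
numbers =
    separatedBy (char ',') int


-- map transforms the parsed list once parsing has succeeded.
total : Parser Int
total =
    numbers
        |> map List.sum
```

As with host, the separator comes first and the element parser second, and map only changes the value the parser succeeds with, not what it matches.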
port_ : Parser Int
port_ =
    char ':'
        |> followedBy int
Here we match a : followed by an integer, e.g. 42. The followedBy function discards the value of the previous parser in the pipeline in favor of its argument's value, so we'll just get a Parser Int.
path : Parser String
path =
    stringWith (oneOrMore (except (oneOf [ char '?', char '#' ])))
        |> orElse (succeed "/")
We match a string with one or more characters that are not ? or #. If that fails, i.e. there's no path, we fall back to succeeding with / as the path using orElse.
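The fallback pattern is not specific to paths. As a rough sketch, a default value could be supplied for a missing port instead of wrapping the parser in maybe. The name portOrDefault and the default of 443 are hypothetical, and the sketch assumes orElse and succeed work for Int values the same way they do for String values above.

```elm
import Parser exposing (..)
import Parser.Common exposing (..)


-- Parse ":<integer>", or fall back to a default when that fails.
-- 443 is an arbitrary default chosen for this illustration.
portOrDefault : Parser Int
portOrDefault =
    char ':'
        |> followedBy int
        |> orElse (succeed 443)
```

The trade-off compared with maybe is that the result is a plain Int rather than a Maybe Int, so the caller can no longer tell whether a port was present in the input.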
query : Parser String
query =
    char '?'
        |> followedBy (stringWith (zeroOrMore (except (char '#'))))
We match the character ? followed by a string with zero or more characters that are not #.
fragment : Parser String
fragment =
    char '#'
        |> followedBy (stringWith (until end anyChar))
We match the character # followed by a string with zero or more of any character until the end of the parsed string.
Now we have a parser that matches most valid URLs, but how do we run it? Here's how:
parsedUrl : Result Error Url
parsedUrl =
    Parser.parse "https://hello.com:123/greetings?recipient=world#message" url
parse takes an input string and a parser of type Parser a and returns a Result Error a, with Error specifying the error message and position into the input if the parser fails.
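Since parse returns a standard Elm Result, the outcome can be handled with an ordinary case expression. The sketch below uses a hypothetical helper named parseInt and deliberately ignores the contents of Error, because its exact fields are not shown in this walkthrough.

```elm
import Parser exposing (..)
import Parser.Common exposing (..)


-- Hypothetical helper: run the int parser and collapse any failure
-- into Nothing. The Error value is ignored here on purpose.
parseInt : String -> Maybe Int
parseInt input =
    case Parser.parse input int of
        Ok value ->
            Just value

        Err _ ->
            Nothing
```

The same shape works for the url parser: match on Ok to get the Url record, and on Err to report or inspect the failure.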