ex_spirit v0.2.4 ExSpirit.Parser
ExSpirit.Parser is the parsing section of ExSpirit, designed to parse out some kind of stream of data (whether via a binary, a list, or perhaps an actual stream) into a data structure of your own design.
Definitions
Terminal Parser
A terminal parser is one that does not operate over any other parser; it is ‘terminal’ in its location.
Combination Parser
A combination parser is one that takes a parser as input and does something with it, such as repeating it, surrounding it, or ignoring its output.
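The distinction can be sketched in plain Elixir. This is a toy model under stated assumptions, not ExSpirit's implementation: the module `ToyParsers`, its map-based context, and the functions `new/1`, `digit/1`, and `ignore/2` are all hypothetical names chosen for illustration.

```elixir
defmodule ToyParsers do
  # Toy model only: a context is a plain map, not ExSpirit's context struct.
  def new(input), do: %{rest: input, result: nil, error: nil}

  # Terminal parser: consumes input directly and depends on no other parser.
  def digit(%{error: nil, rest: <<c, rest::binary>>} = ctx) when c in ?0..?9 do
    %{ctx | rest: rest, result: c - ?0}
  end

  def digit(%{error: nil} = ctx), do: %{ctx | result: nil, error: "expected a digit"}
  def digit(ctx), do: ctx

  # Combination parser: takes another parser (a function) and alters what it
  # does, here by running it and discarding its result.
  def ignore(%{error: nil} = ctx, parser) do
    case parser.(ctx) do
      %{error: nil} = ok -> %{ok | result: nil}
      failed -> failed
    end
  end

  def ignore(ctx, _parser), do: ctx
end

ctx = ToyParsers.new("7x")

# The terminal reads one digit and leaves the remainder.
terminal = ToyParsers.digit(ctx)
{7, "x"} = {terminal.result, terminal.rest}

# The combinator consumes the same input but drops the result.
combined = ToyParsers.ignore(ctx, &ToyParsers.digit/1)
{nil, "x"} = {combined.result, combined.rest}
```

Because every parser in this model maps a context to a context, terminals and combinators compose freely with the pipe operator.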
Usage
Just add `use ExSpirit.Parser` to a module to make it into a parsing module.
To add the text parsing functions from the `ExSpirit.Parser.Text` module, add `text: true` to the `use` call.
Example
defmodule MyModule do
  use ExSpirit.Parser, text: true
end
parse
The parse function is applied to the input and a parser call, such as in:
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint())
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
Skippers
A skipper is a parser that is passed to the `:skipper` key in the `parse` call and that will be called at the start of every built-in terminal. So when you parse, for example, `uint()`, it will be like calling `(skipper |> uint())` in its place. There are a few related functions as well.
Examples
# A skipper runs only once per terminal. If you want it to repeat, set the
# skipper up so that it repeats; `repeat(lit(?\s))` is a good one, for example.
iex> import ExSpirit.Tests.Parser
iex> context = parse(" 42 ", uint(), skipper: lit(?\s))
iex> {context.error, context.result, context.rest}
{nil, 42, " "}
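Why a single-shot skipper leaves extra whitespace behind can be sketched in plain Elixir. This is a toy model, not ExSpirit's code: `ToySkipper`, its map-based context, and the functions `one_space/1`, `all_spaces/1`, and `uint/1` are hypothetical names used only for illustration.

```elixir
defmodule ToySkipper do
  # Toy model only: the context is a plain map carrying an optional skipper function.
  def new(input, skipper \\ nil), do: %{rest: input, result: nil, skipper: skipper}

  # A skipper that drops exactly one leading space, if there is one.
  def one_space(%{rest: " " <> rest} = ctx), do: %{ctx | rest: rest}
  def one_space(ctx), do: ctx

  # A skipper that drops every leading space.
  def all_spaces(ctx), do: %{ctx | rest: String.trim_leading(ctx.rest, " ")}

  # A "terminal" that runs the skipper once and then reads a leading integer.
  def uint(ctx) do
    ctx = if ctx.skipper, do: ctx.skipper.(ctx), else: ctx

    case Integer.parse(ctx.rest) do
      {value, rest} -> %{ctx | result: value, rest: rest}
      :error -> %{ctx | result: :error}
    end
  end
end

# One leading space: the single-shot skipper is enough.
ok = ToySkipper.new(" 42 ", &ToySkipper.one_space/1) |> ToySkipper.uint()
{42, " "} = {ok.result, ok.rest}

# Two leading spaces: the single-shot skipper leaves one behind, so the terminal fails.
short = ToySkipper.new("  42", &ToySkipper.one_space/1) |> ToySkipper.uint()
:error = short.result

# A skipper that repeats by itself handles any amount of leading whitespace.
full = ToySkipper.new("  42", &ToySkipper.all_spaces/1) |> ToySkipper.uint()
42 = full.result
```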
Parsers
Elixir Standard Pipe Operator |>
The `|>` pipe operator can be used to run a parser and then another parser; however, it will only return the result of the last parser in the pipe chain. This library does not override Elixir’s pipe operator; it uses it verbatim. Use `seq` for the usual expected sequence parsing.
Examples
# `|>` returns the result of the last parser in the pipe chain;
# `lit`, for example, always returns nil
iex> import ExSpirit.Tests.Parser
iex> context = parse("42Test", uint() |> lit("Test"))
iex> {context.error, context.result, context.rest}
{nil, nil, ""}
# `|>` Returns the result of the last parser in the pipe chain
iex> import ExSpirit.Tests.Parser
iex> context = parse("42Test64", uint() |> lit("Test") |> uint())
iex> {context.error, context.result, context.rest}
{nil, 64, ""}
seq
The sequence operator runs all of the parsers in the inline list (it cannot be a variable) and returns their results as a list. Any `nil`s returned are not added to the result list, and if the result list ends up holding only a single value then that value is returned directly, without being wrapped in a list.
Examples
# `seq` parses a sequence and returns the results of all of the parsers with
# nils removed: a list if there is more than one, or the raw value if there is
# only one. If any parser fails then the whole sequence fails.
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", seq([uint(), lit(" "), uint()]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, [42, 64], ""}
# Here `seq` returns only a single value
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42Test", seq([uint(), lit("Test")]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, 42, ""}
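The result-shaping rule that `seq` applies can be written as a small pure function. This is a sketch of the rule as described above, not the library's code; `ToySeq.shape/1` is a hypothetical name.

```elixir
defmodule ToySeq do
  # Toy model of the result-shaping rule only; not ExSpirit's implementation.
  def shape(results) do
    case Enum.reject(results, &is_nil/1) do
      # Exactly one non-nil result: return it unwrapped.
      [single] -> single
      # Otherwise return the list. An all-nil sequence falls through here as [];
      # the text above does not specify that case, so this is an assumption.
      many -> many
    end
  end
end

# uint, lit, uint: the nil from `lit` is dropped and two values remain.
[42, 64] = ToySeq.shape([42, nil, 64])

# uint, lit: only one value remains, so it is returned without a list.
42 = ToySeq.shape([42, nil])
```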
alt
The alternative parser runs the parsers in the inline list (cannot be a variable) and returns the result of the first one that succeeds, or the error of the last one.
Examples
# `alt` parses a set of alternatives in order and returns the first success
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("FF", alt([uint(16), lit("Test")]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, 255, ""}
# `alt` parses a set of alternatives in order and returns the first success
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("Test", alt([uint(16), lit("Test")]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, nil, ""}
defrule
Defining a rule defines a parser along with some associated information, such as its name for error reporting purposes and a mapping function so you can convert the output on the fly (fantastic for in-line AST generation, for example!), among other things. A rule is used like any other terminal parser.
Examples
All of the following examples use this definition of rules in a module:
defmodule ExSpirit.Tests.Parser do
  use ExSpirit.Parser, text: true

  defrule testrule(
    seq([ uint(), lit(? ), uint() ])
  )

  defrule testrule_map(
    seq([ uint(), lit(? ), uint() ])
  ), map: Enum.map(fn i -> i-40 end)

  defrule testrule_fun(
    seq([ uint(), lit(? ), uint() ])
  ), fun: (fn context -> %{context | result: {"altered", context.result}} end).()

  defrule testrule_context(context) do
    %{context | result: "always success"}
  end
end
# You can use `defrule`s as any other terminal parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, [42, 64], ""}
# `defrule`s also set up a stack of calls down a context so you know
# 'where' an error occurred, so name the rules descriptively
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 fail", testrule())
iex> {contexts.error.context.rulestack, contexts.result, contexts.rest}
{[:testrule], nil, "fail"}
# `defrule`s can map the result to return a different one:
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_map())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, [2, 24], ""}
# `defrule`s can also operate over the context itself to do anything
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_fun())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, {"altered", [42, 64]}, ""}
# `defrule`s can also be a context function by only passing in `context`
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_context())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, "always success", "42 64"}
no_skip
The `no_skip` combination parser takes a parser and clears the skipper for it, so that no skipping is done within that parser. It is good for parsing non-skippable content within a larger parser.
Examples
# You can turn off a skipper for a parser with `no_skip`
iex> import ExSpirit.Tests.Parser
iex> context = parse(" Test:42 ", lit("Test:") |> no_skip(uint()), skipper: lit(?\s))
iex> {context.error, context.result, context.rest}
{nil, 42, " "}
skipper
The `skipper` combination parser takes a parser and changes the skipper to the one you pass in for the duration of that parser.
Examples
# You can change a skipper for a parser as well with `skipper`
iex> import ExSpirit.Tests.Parser
iex> context = parse(" Test:\t42 ", lit("Test:") |> skipper(uint(), lit(?\t)), skipper: lit(?\s))
iex> {context.error, context.result, context.rest}
{nil, 42, " "}
ignore
The `ignore` combination parser takes and runs a parser but ignores its result, returning `nil` instead.
Examples
# `ignore` will run the parser but return no result
iex> import ExSpirit.Tests.Parser
iex> context = parse("Test", ignore(char([?a..?z, ?T])))
iex> {context.error, context.result, context.rest}
{nil, nil, "est"}
branch
The `branch` combination parser is designed for efficient branching based on the result from another parser. It allows you to parse something and then use the result of that parser either to look up a value in a map or to call a user function, either of which can return a parser function that will then be used to continue parsing.
It takes two arguments. The first is the initial parser; the second is either a user function of `value -> parserFn` or a map of `values => parserFn`, where the key is looked up from the result of the first parser. If the parserFn is `nil` then `branch` fails, else the parserFn is executed to continue parsing. Because of the anonymous function calls this has a slight overhead, so only use it when switching parsers dynamically based on a parsed value in a way that is more complex than a simple `alt` parser, or when the count is more than a few branches in size.
This returns only the output from the parser in the map, not the output of the lookup parser.
Examples
iex> import ExSpirit.Tests.Parser
iex> symbol_map = %{?b => &uint(&1, 2), ?d => &uint(&1, 10), ?x => &uint(&1, 16)}
iex> context = parse("b101010", branch(char(), symbol_map))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
iex> context = parse("d213478", branch(char(), symbol_map))
iex> {context.error, context.result, context.rest}
{nil, 213478, ""}
iex> context = parse("xe1DCf", branch(char(), symbol_map))
iex> {context.error, context.result, context.rest}
{nil, 925135, ""}
iex> context = parse("a", branch(char(), symbol_map))
iex> {context.error.message, context.result, context.rest}
{"Tried to branch to `97` but it was not found in the symbol_map", nil, ""}
iex> import ExSpirit.Tests.Parser
iex> symbol_mapper = fn
...>   ?b -> &uint(&1, 2)
...>   ?d -> &uint(&1, 10)
...>   ?x -> &uint(&1, 16)
...>   _value -> nil # Always have a default case. :-)
...> end
iex> context = parse("b101010", branch(char(), symbol_mapper))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
iex> context = parse("d213478", branch(char(), symbol_mapper))
iex> {context.error, context.result, context.rest}
{nil, 213478, ""}
iex> context = parse("xe1DCf", branch(char(), symbol_mapper))
iex> {context.error, context.result, context.rest}
{nil, 925135, ""}
iex> context = parse("a", branch(char(), symbol_mapper))
iex> {context.error.message, context.result, context.rest}
{"Tried to branch to `97` but it was not found in the symbol_map", nil, ""}
tag
The tagger combination parser wraps the result of the passed-in parser in a standard Erlang 2-tuple: the first element is the tag that you pass in, and the second is the result of the parser.
Examples
# `tag` can tag the output from a parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("ff", tag(:integer, uint(16)))
iex> {context.error, context.result, context.rest}
{nil, {:integer, 255}, ""}
expect
The expectation parser takes a parser, and if that parser fails it returns a hard error that will prevent further parsers, even in branch tests, from running. Its purpose is to report parsing errors at the correct parsing site. For example, if you are parsing an `alt` of parsers and you parse out a ‘let’ followed by an identifier, then when the identifier fails you do not want the `alt` to try the next alternative. Instead you want to fail hard, with an error message tied to the place where the parse actually failed, rather than trying other parsers that you know will not succeed anyway.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("do 10", lit("do ") |> expect(uint()))
iex> {context.error, context.result, context.rest}
{nil, 10, ""}
iex> import ExSpirit.Tests.Parser
iex> context = parse("do nope", lit("do ") |> expect(uint()))
iex> %ExSpirit.Parser.ExpectationFailureException{} = context.error
iex> {context.error.message, context.result, context.rest}
{"Parsing uint with radix of 10 had 0 digits but 1 minimum digits were required", nil, "nope"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("do nope", alt([ lit("do ") |> expect(uint()), lit("blah") ]))
iex> %ExSpirit.Parser.ExpectationFailureException{} = context.error
iex> {context.error.message, context.result, context.rest}
{"Parsing uint with radix of 10 had 0 digits but 1 minimum digits were required", nil, "nope"}
# Difference without the `expect`
iex> import ExSpirit.Tests.Parser
iex> context = parse("do nope", alt([ lit("do ") |> uint(), lit("blah") ]))
iex> %ExSpirit.Parser.ParseException{} = context.error
iex> {context.error.message, context.result, context.rest}
{"literal `blah` did not match the input", nil, "do nope"}
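The difference between a soft failure and a hard failure inside an alternative can be modelled in plain Elixir. This is a toy sketch under stated assumptions, not ExSpirit's implementation: `ToyExpect`, its tuple-based return values (`{:ok, result, rest}`, `{:error, message}`, `{:hard_error, message}`), and the helper `chain/2` are hypothetical.

```elixir
defmodule ToyExpect do
  # Toy model only. A parser is a function from input to one of:
  #   {:ok, result, rest}     success
  #   {:error, message}       soft failure, an alternative may still be tried
  #   {:hard_error, message}  hard failure, parsing stops here

  def lit(text) do
    fn input ->
      if String.starts_with?(input, text) do
        {:ok, nil, String.replace_prefix(input, text, "")}
      else
        {:error, "literal `#{text}` did not match"}
      end
    end
  end

  def uint do
    fn input ->
      case Integer.parse(input) do
        {value, rest} when value >= 0 -> {:ok, value, rest}
        _ -> {:error, "expected an unsigned integer"}
      end
    end
  end

  # Runs two parsers in order and keeps the result of the second.
  def chain(first, second) do
    fn input ->
      case first.(input) do
        {:ok, _result, rest} -> second.(rest)
        failure -> failure
      end
    end
  end

  # Upgrades a soft failure of the wrapped parser into a hard one.
  def expect(parser) do
    fn input ->
      case parser.(input) do
        {:error, message} -> {:hard_error, message}
        other -> other
      end
    end
  end

  # Tries each alternative in order; only a soft failure moves on to the next.
  def alt(parsers) do
    fn input ->
      Enum.reduce_while(parsers, {:error, "no alternatives"}, fn parser, _last ->
        case parser.(input) do
          {:error, _message} = soft -> {:cont, soft}
          decided -> {:halt, decided}
        end
      end)
    end
  end
end

alias ToyExpect, as: P

with_expect = P.alt([P.chain(P.lit("do "), P.expect(P.uint())), P.lit("blah")])
without_expect = P.alt([P.chain(P.lit("do "), P.uint()), P.lit("blah")])

# With `expect`, the error comes from the site that actually failed.
{:hard_error, "expected an unsigned integer"} = with_expect.("do nope")

# Without it, the alternative moves on and reports the last failure instead.
{:error, "literal `blah` did not match"} = without_expect.("do nope")
```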
repeat
The repeat parser repeats over a parser a bounded number of times, returning the results as a list. It has a slight overhead compared to a repeat count known ahead of time, due to an anonymous function call, but that is necessary when performing a dynamic number of repetitions without mutable variables.
The optional arguments are the minimum number of repeats required, default of 0, and the maximum number of repeats, default of -1 (infinite).
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T)))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 1))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 1, 10))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 1, 2))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T], "TX"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 4))
iex> {context.error.message, context.result, context.rest}
{"Repeating over a parser failed due to not reaching the minimum amount of 4 with only a repeat count of 3", nil, "X"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTT", repeat(char(?T), 4))
iex> {context.error.message, context.result, context.rest}
{"Repeating over a parser failed due to not reaching the minimum amount of 4 with only a repeat count of 3", nil, ""}
iex> import ExSpirit.Tests.Parser
iex> context = parse("", repeat(char(?T)))
iex> {context.error, context.result, context.rest}
{nil, [], ""}
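How a dynamic number of repetitions can be driven by an anonymous function and recursion, without mutable variables, is sketched below in plain Elixir. This is a toy model, not the library's implementation: `ToyRepeat`, its tuple-based return values, and the error message wording are hypothetical.

```elixir
defmodule ToyRepeat do
  # Toy model only: `parser` is a function from input to {:ok, result, rest} or :error.
  # A max of -1 means "no upper bound", mirroring the default described above.
  def repeat(input, parser, min \\ 0, max \\ -1) do
    loop(input, parser, min, max, 0, [])
  end

  # Stop as soon as the maximum has been reached.
  defp loop(input, _parser, _min, max, count, acc) when max >= 0 and count >= max do
    {:ok, Enum.reverse(acc), input}
  end

  # Otherwise try the parser once more, carrying the count and results as arguments.
  defp loop(input, parser, min, max, count, acc) do
    case parser.(input) do
      {:ok, result, rest} ->
        loop(rest, parser, min, max, count + 1, [result | acc])

      :error when count >= min ->
        {:ok, Enum.reverse(acc), input}

      :error ->
        {:error, "needed at least #{min} repeats but only got #{count}", input}
    end
  end
end

# A tiny parser function that accepts a single "T".
t = fn
  "T" <> rest -> {:ok, ?T, rest}
  _ -> :error
end

{:ok, [?T, ?T, ?T], "X"} = ToyRepeat.repeat("TTTX", t)
{:ok, [?T, ?T], "TX"} = ToyRepeat.repeat("TTTX", t, 1, 2)
{:error, "needed at least 4 repeats but only got 3", "X"} = ToyRepeat.repeat("TTTX", t, 4)
{:ok, [], ""} = ToyRepeat.repeat("", t)
```

The count and the accumulated results travel as function arguments, which is what stands in for a mutable loop counter.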
repeatFn
The repeat function parser allows you to pass in a parser function to repeat over, but is otherwise identical to `repeat`, especially as `repeat` delegates to `repeatFn`.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeatFn(fn c -> c |> char(?T) end))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
# See `repeat` for more.
success
The success parser always succeeds, returning the passed-in value (default of nil) as though it were a parsed value.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("", success(42))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
fail
The fail parser always fails, recording the user information passed in.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("", fail(42))
iex> {context.error.extradata, context.result, context.rest}
{42, nil, ""}
map_context
Runs a function with the context and continues with the context that the function returns.
Examples
iex> import ExSpirit.Tests.Parser
iex> fun = fn c -> %{c|result: 42} end
iex> context = parse("a", map_context(fun.()))
iex> {context.error, context.result, context.rest}
{nil, 42, "a"}
map_result
Runs a function with the result and replaces the result with the value that the function returns.
Examples
iex> import ExSpirit.Tests.Parser
iex> fun = fn nil -> 42 end
iex> context = parse("a", map_result(fun.()))
iex> {context.error, context.result, context.rest}
{nil, 42, "a"}
map_context_around
Runs a parser and then calls a function with both the context from before the parser ran and the context from after it.
Examples
iex> import ExSpirit.Tests.Parser
iex> fun = fn {pre, post} -> %{post|result: {pre, post}} end
iex> context = parse("42", map_context_around(fun.(), uint()))
iex> {pre, post} = context.result
iex> {context.error, pre.column, post.column, context.rest}
{nil, 1, 3, ""}
skip
Runs the current skipper immediately.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse(" a", skip(), skipper: chars(?\s, 0))
iex> {context.error, context.result, context.rest}
{nil, nil, "a"}
put_state
Puts something (the current result, when given `:result`) into the state at the specified key.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> put_state(:test, :result))
iex> {context.error, context.result, context.rest, context.state}
{nil, 42, "", %{test: 42}}
push_state
Pushes something (the current result, when given `:result`) onto a list in the state at the specified key.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> push_state(:test, :result))
iex> {context.error, context.result, context.rest, context.state}
{nil, 42, "", %{test: [42]}}
get_state_into
Gets one or more values from the state and puts them into the locations in the parser that are marked with `&1`-style bindings. Given a single key and `:result`, it puts the value into the result instead.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("A:A", char() |> put_state(:test, :result) |> lit(?:) |> get_state_into([:test], char(&1)))
iex> {context.error, context.result, context.rest}
{nil, ?A, ""}
iex> import ExSpirit.Tests.Parser
iex> context = parse("A:B", char() |> put_state(:test, :result) |> lit(?:) |> get_state_into([:test], char(&1)))
iex> {String.starts_with?(context.error.message, "Tried parsing out any of the the characters of"), context.result, context.rest}
{true, nil, "B"}
iex> import ExSpirit.Tests.Parser
iex> context = parse("A:B", char() |> put_state(:test, :result) |> lit(?:) |> get_state_into(:test, :result))
iex> {context.error, context.result, context.rest}
{nil, ?A, "B"}
lookahead
Looks ahead to confirm success, but does not update the context when successful.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("AA", lit(?A) |> lookahead(lit(?A)) |> char())
iex> {context.error, context.result, context.rest}
{nil, ?A, ""}
iex> import ExSpirit.Tests.Parser
iex> context = parse("AB", lit(?A) |> lookahead(lit(?A)) |> char())
iex> {String.starts_with?(context.error.message, "Lookahead failed"), context.result, context.rest}
{true, nil, "B"}
lookahead_not
Looks ahead to confirm failure, but does not update the context when failed.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("AB", lit(?A) |> lookahead_not(lit(?A)) |> char())
iex> {context.error, context.result, context.rest}
{nil, ?B, ""}
iex> import ExSpirit.Tests.Parser
iex> context = parse("AA", lit(?A) |> lookahead_not(lit(?A)) |> char())
iex> {String.starts_with?(context.error.message, "Lookahead_not failed"), context.result, context.rest}
{true, nil, "A"}
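Both lookahead variants come down to running a parser and then throwing away whatever it consumed. A toy sketch in plain Elixir, not ExSpirit's implementation; `ToyLookahead` and its tuple-based return values are hypothetical.

```elixir
defmodule ToyLookahead do
  # Toy model only: `parser` is a function from input to {:ok, result, rest} or :error.

  # Succeeds when the parser succeeds, but hands back the untouched input.
  def lookahead(input, parser) do
    case parser.(input) do
      {:ok, _result, _rest} -> {:ok, nil, input}
      :error -> :error
    end
  end

  # Succeeds when the parser fails, again leaving the input untouched.
  def lookahead_not(input, parser) do
    case parser.(input) do
      {:ok, _result, _rest} -> :error
      :error -> {:ok, nil, input}
    end
  end
end

# A tiny parser function that accepts a single "A".
a = fn
  "A" <> rest -> {:ok, ?A, rest}
  _ -> :error
end

{:ok, nil, "AB"} = ToyLookahead.lookahead("AB", a)
:error = ToyLookahead.lookahead("BA", a)
{:ok, nil, "BA"} = ToyLookahead.lookahead_not("BA", a)
:error = ToyLookahead.lookahead_not("AB", a)
```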
lexeme
Returns the entire parsed text from the parser, regardless of the actual return value.
Examples
iex> import ExSpirit.Tests.Parser
iex> context = parse("A256B", lexeme(char() |> uint()))
iex> {context.error, context.result, context.rest}
{nil, "A256", "B"}