NimbleParsec v0.3.2

NimbleParsec is a simple and fast library for text-based parser combinators.

Combinators are built during runtime and compiled into multiple clauses with binary matching. This provides the following benefits:

  • Performance: since it compiles to binary matching, it leverages many Erlang VM optimizations to generate extremely fast parser code with low memory usage

  • Composable: this library does not rely on macros for building and composing parsers, therefore they are fully composable. The only macros are defparsec/3 and defparsecp/3 which emit the compiled clauses with binary matching

  • No runtime dependency: after compilation, the generated parser clauses have no runtime dependency on NimbleParsec. This makes it possible to compile parsers without imposing a dependency on users of your library

  • No footprints: NimbleParsec only needs to be imported in your modules. There is no need for use NimbleParsec, leaving no footprints on your modules

The goal of this library is to focus on a set of primitives for writing efficient parser combinators. The composition aspect means you should be able to use those primitives to implement higher level combinators.
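
As a rough sketch of that composition aspect, a higher level combinator can be written as a plain function that takes and returns combinators. The ParserHelpers and NumberList modules and the separated_list/2 helper below are hypothetical names used only for illustration; they are not part of NimbleParsec.

```elixir
defmodule ParserHelpers do
  import NimbleParsec

  # Hypothetical helper: one `item`, then zero or more `separator item`
  # pairs. The separator is parsed but dropped from the results.
  def separated_list(item, separator) do
    item
    |> repeat(ignore(separator) |> concat(item))
  end
end

defmodule NumberList do
  import NimbleParsec

  # The helper lives in another module, so it can be called while this
  # module is being compiled.
  defparsec :numbers, ParserHelpers.separated_list(integer(min: 1), string(","))
end

NumberList.numbers("1,22,333")
#=> {:ok, [1, 22, 333], "", %{}, {1, 0}, 8}
```

Because the helper is an ordinary function, it can be combined further with any other combinator in this module.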

Note this library does not handle low-level binary parsing. In such cases, we recommend using Elixir’s bitstring syntax.
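
For comparison, here is a minimal sketch of low-level binary parsing with plain bitstring matching and no NimbleParsec involved. The wire format (a one-byte version, a 16-bit big-endian length, then the payload) and the BinaryHeader module are assumptions made up for this example.

```elixir
defmodule BinaryHeader do
  # Hypothetical format: 1-byte version, 16-bit big-endian payload length,
  # then exactly that many payload bytes. Anything left over is returned.
  def parse(<<version::8, len::16-big, payload::binary-size(len), rest::binary>>) do
    {:ok, %{version: version, payload: payload}, rest}
  end

  # Input that is too short or otherwise malformed.
  def parse(_other), do: :error
end

BinaryHeader.parse(<<1, 0, 2, "hi", "!">>)
#=> {:ok, %{version: 1, payload: "hi"}, "!"}
```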

Examples

defmodule MyParser do
  import NimbleParsec

  date =
    integer(4)
    |> ignore(string("-"))
    |> integer(2)
    |> ignore(string("-"))
    |> integer(2)

  time =
    integer(2)
    |> ignore(string(":"))
    |> integer(2)
    |> ignore(string(":"))
    |> integer(2)
    |> optional(string("Z"))

  defparsec :datetime, date |> ignore(string("T")) |> concat(time), debug: true
end

MyParser.datetime("2010-04-17T14:12:34Z")
#=> {:ok, [2010, 4, 17, 14, 12, 34, "Z"], "", %{}, {1, 0}, 20}

If you add debug: true to defparsec/3, it will print the generated clauses, which are shown below:

defp datetime__0(<<x0, x1, x2, x3, "-", x4, x5, "-", x6, x7, "T",
                   x8, x9, ":", x10, x11, ":", x12, x13, rest::binary>>,
                 acc, stack, comb__context, comb__line, comb__column)
     when x0 >= 48 and x0 <= 57 and (x1 >= 48 and x1 <= 57) and
         (x2 >= 48 and x2 <= 57) and (x3 >= 48 and x3 <= 57) and
         (x4 >= 48 and x4 <= 57) and (x5 >= 48 and x5 <= 57) and
         (x6 >= 48 and x6 <= 57) and (x7 >= 48 and x7 <= 57) and
         (x8 >= 48 and x8 <= 57) and (x9 >= 48 and x9 <= 57) and
         (x10 >= 48 and x10 <= 57) and (x11 >= 48 and x11 <= 57) and
         (x12 >= 48 and x12 <= 57) and (x13 >= 48 and x13 <= 57) do
  datetime__1(
    rest,
    [(x13 - 48) * 1 + (x12 - 48) * 10, (x11 - 48) * 1 + (x10 - 48) * 10,
     (x9 - 48) * 1 + (x8 - 48) * 10, (x7 - 48) * 1 + (x6 - 48) * 10, (x5 - 48) * 1 + (x4 - 48) * 10,
     (x3 - 48) * 1 + (x2 - 48) * 10 + (x1 - 48) * 100 + (x0 - 48) * 1000] ++ acc,
    stack,
    comb__context,
    comb__line,
    comb__column + 19
  )
end

defp datetime__0(rest, acc, _stack, context, line, column) do
  {:error, "...", rest, context, line, column}
end

defp datetime__1(<<"Z", rest::binary>>, acc, stack, comb__context, comb__line, comb__column) do
  datetime__2(rest, ["Z"] ++ acc, stack, comb__context, comb__line, comb__column + 1)
end

defp datetime__1(rest, acc, stack, context, line, column) do
  datetime__2(rest, acc, stack, context, line, column)
end

defp datetime__2(rest, acc, _stack, context, line, column) do
  {:ok, acc, rest, context, line, column}
end

As you can see, it generates highly inlined code, comparable to hand-written parsers. This gives NimbleParsec an order-of-magnitude performance gain over other parser combinators. Further performance can be gained by giving the inline: true option to defparsec/3.

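Summary

Functions

ascii_char(combinator \\ empty(), ranges)
Defines a single ascii codepoint in the given ranges

ascii_string(combinator \\ empty(), range, count_or_opts)
Defines an ascii string combinator of exact length or min and max length

byte_offset(combinator \\ empty(), to_wrap)
Puts the result of the given combinator as the first element of a tuple with the byte_offset as second element

choice(combinator \\ empty(), choices)
Chooses one of the given combinators

concat(left, right)
Concatenates two combinators

debug(combinator \\ empty(), to_debug)
Inspects the combinator state given to to_debug with the given opts

defparsec(name, combinator, opts \\ [])
Defines a public parser combinator with the given name and opts

defparsecp(name, combinator, opts \\ [])
Defines a private parser combinator

duplicate(combinator \\ empty(), to_duplicate, n)
Duplicates the combinator to_duplicate n times

empty()
Returns an empty combinator

ignore(combinator \\ empty(), to_ignore)
Ignores the output of combinator given in to_ignore

integer(combinator \\ empty(), count_or_opts)
Defines an integer combinator of exact length or min and max length

label(combinator \\ empty(), to_label, label)
Adds a label to the combinator to be used in error reports

line(combinator \\ empty(), to_wrap)
Puts the result of the given combinator as the first element of a tuple with the line as second element

lookahead(combinator \\ empty(), call)
Looks ahead the rest of the binary to be parsed alongside the context

map(combinator \\ empty(), to_map, call)
Maps over the combinator results with the remote or local function in call

optional(combinator \\ empty(), optional)
Marks the given combinator as optional

parsec(combinator \\ empty(), name)
Invokes an already compiled parsec with name name in the same module

quoted_repeat_while(combinator \\ empty(), to_repeat, while)
Invokes while to emit the AST that will repeat to_repeat while the AST code returns {:cont, context}

quoted_traverse(combinator, to_traverse, call)
Invokes call to emit the AST that traverses the to_traverse combinator results

reduce(combinator \\ empty(), to_reduce, call)
Reduces over the combinator results with the remote or local function in call

repeat(combinator \\ empty(), to_repeat)
Allows the combinator given on to_repeat to appear zero or more times

repeat_until(combinator \\ empty(), to_repeat, choices)
Repeats to_repeat until one of the combinators in choices matches

repeat_while(combinator \\ empty(), to_repeat, while)
Repeats while the given remote or local function while returns {:cont, context}

replace(combinator \\ empty(), to_replace, value)
Replaces the output of combinator given in to_replace by a single value

string(combinator \\ empty(), binary)
Defines a string binary value

tag(combinator \\ empty(), to_tag, tag)
Tags the result of the given combinator in to_tag in a tuple with tag as first element

times(combinator \\ empty(), to_repeat, count_or_min_max)
Allows the combinator given on to_repeat to appear at least, at most or exactly a given amount of times

traverse(combinator \\ empty(), to_traverse, call)
Traverses the combinator results with the remote or local function call

unwrap_and_tag(combinator \\ empty(), to_tag, tag)
Unwraps and tags the result of the given combinator in to_tag in a tuple with tag as first element

utf8_char(combinator \\ empty(), ranges)
Defines a single utf8 codepoint in the given ranges

utf8_string(combinator \\ empty(), range, count_or_opts)
Defines a utf8 string combinator of exact length or min and max codepoint length

wrap(combinator \\ empty(), to_wrap)
Wraps the results of the given combinator in to_wrap in a list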

Types

bin_modifiers() :: :integer | :utf8 | :utf16 | :utf32

exclusive_range() :: {:not, Range.t()} | {:not, char()}

fargs() :: {atom(), args :: [term()]}

inclusive_range() :: Range.t() | char()

mfargs() :: {module(), atom(), args :: [term()]}

min_and_max() :: {:min, non_neg_integer()} | {:max, pos_integer()}

t() :: [combinator()]

Functions

ascii_char(combinator \\ empty(), ranges)
ascii_char(t(), [range()]) :: t()

Defines a single ascii codepoint in the given ranges.

ranges is a list containing one of:

  • a min..max range expressing supported codepoints
  • a codepoint integer expressing a supported codepoint
  • {:not, min..max} expressing not supported codepoints
  • {:not, codepoint} expressing a not supported codepoint

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :digit_and_lowercase,
            empty()
            |> ascii_char([?0..?9])
            |> ascii_char([?a..?z])
end

MyParser.digit_and_lowercase("1a")
#=> {:ok, [?1, ?a], "", %{}, {1, 0}, 2}

MyParser.digit_and_lowercase("a1")
#=> {:error, "expected a byte in the range ?0..?9, followed by a byte in the range ?a..?z", "a1", %{}, {1, 0}, 0}
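
The {:not, codepoint} form can be sketched in the same way. The NonSpace module below is a hypothetical example that accepts any single byte except a space.

```elixir
defmodule NonSpace do
  import NimbleParsec

  # Accepts exactly one byte, as long as it is not a space.
  defparsec :non_space, ascii_char([{:not, ?\s}])
end

NonSpace.non_space("a b")
#=> {:ok, [?a], " b", %{}, {1, 0}, 1}
```

An input that starts with a space, such as " a", should return an {:error, ...} tuple instead.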

ascii_string(combinator \\ empty(), range, count_or_opts)
ascii_string(t(), [range()], pos_integer() | [min_and_max()]) :: t()

Defines an ascii string combinator of exact length or min and max length.

The ranges specify the allowed characters in the ascii string. See ascii_char/2 for more information.

If you want a string of unknown size, use ascii_string(ranges, min: 1). If you want a literal string, use string/2.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :two_lowercase_letters, ascii_string([?a..?z], 2)
end

MyParser.two_lowercase_letters("abc")
#=> {:ok, ["ab"], "c", %{}, {1, 0}, 2}
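
For the unknown-size case mentioned above, a small sketch using min: 1 could look like this. The IdentifierParser module and its character set are assumptions chosen for illustration.

```elixir
defmodule IdentifierParser do
  import NimbleParsec

  # One or more lowercase letters or underscores; parsing stops at the first
  # byte outside those ranges.
  defparsec :identifier, ascii_string([?a..?z, ?_], min: 1)
end

IdentifierParser.identifier("snake_case rest")
#=> {:ok, ["snake_case"], " rest", %{}, {1, 0}, 10}
```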

byte_offset(combinator \\ empty(), to_wrap)
byte_offset(t(), t()) :: t()

Puts the result of the given combinator as the first element of a tuple with the byte_offset as second element.

byte_offset is a non-negative integer.
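
A possible use is recording where a token was found. The sketch below uses a hypothetical OffsetExample module; it assumes the wrapped results arrive as a list in the first tuple element, and the exact offset value is deliberately not shown.

```elixir
defmodule OffsetExample do
  import NimbleParsec

  # Skips the first word and the space, then wraps the second word in a
  # tuple together with a byte offset.
  defparsec :second_word,
            ignore(ascii_string([?a..?z], min: 1))
            |> ignore(string(" "))
            |> byte_offset(ascii_string([?a..?z], min: 1))
end

# Expected to return a single {results, byte_offset} tuple in the results.
OffsetExample.second_word("hello world")
```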

choice(combinator \\ empty(), choices)
choice(t(), t()) :: t()

Chooses one of the given combinators.

Expects at least two choices.

Beware! Char combinators

Note both utf8_char/2 and ascii_char/2 allow multiple ranges to be given. Therefore, instead of this:

choice([
  ascii_char([?a..?z]),
  ascii_char([?A..?Z]),
])

One should simply prefer:

ascii_char([?a..?z, ?A..?Z])

As the latter is compiled more efficiently by NimbleParsec.

Beware! Always successful combinators

If a combinator that always succeeds is given as a choice, that choice will always succeed which may lead to unused function warnings since any further choice won’t ever be attempted. For example, because repeat/2 always succeeds, the string/2 combinator below it won’t ever run:

choice([
  repeat(ascii_char([?0..?9])),
  string("OK")
])

Instead of repeat/2, you may want to use times/3 with the flags :min and :max.
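
A minimal sketch of that alternative, using a hypothetical DigitsOrOk module: with min: 1 the first choice can fail, so the second choice remains reachable.

```elixir
defmodule DigitsOrOk do
  import NimbleParsec

  defparsec :digits_or_ok,
            choice([
              # Requires at least one digit, so it fails on "OK".
              times(ascii_char([?0..?9]), min: 1),
              string("OK")
            ])
end

DigitsOrOk.digits_or_ok("42")
#=> {:ok, [?4, ?2], "", %{}, {1, 0}, 2}

DigitsOrOk.digits_or_ok("OK")
#=> {:ok, ["OK"], "", %{}, {1, 0}, 2}
```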

concat(left, right)
concat(t(), t()) :: t()

Concatenates two combinators.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :digit_upper_lower_plus,
            concat(
              concat(ascii_char([?0..?9]), ascii_char([?A..?Z])),
              concat(ascii_char([?a..?z]), ascii_char([?+..?+]))
            )
end

MyParser.digit_upper_lower_plus("1Az+")
#=> {:ok, [?1, ?A, ?z, ?+], "", %{}, {1, 0}, 4}

debug(combinator \\ empty(), to_debug)
debug(t(), t()) :: t()

Inspects the combinator state given to to_debug with the given opts.

defparsec(name, combinator, opts \\ []) (macro)

Defines a public parser combinator with the given name and opts.

Beware!

defparsec/3 is executed during compilation. This means you can’t invoke a function defined in the same module. The following will error because the date function has not yet been defined:

defmodule MyParser do
  import NimbleParsec

  def date do
    integer(4)
    |> ignore(string("-"))
    |> integer(2)
    |> ignore(string("-"))
    |> integer(2)
  end

  defparsec :date, date()
end

This can be solved in different ways. You may define date in another module and then invoke it. You can also store the parsec in a variable or a module attribute and use that instead. For example:

defmodule MyParser do
  import NimbleParsec

  date =
    integer(4)
    |> ignore(string("-"))
    |> integer(2)
    |> ignore(string("-"))
    |> integer(2)

  defparsec :date, date
end

Options

  • :inline - when true, inlines clauses that work as redirection for other clauses. It is disabled by default because of a bug in Elixir v1.5 and v1.6 where unused functions that are inlined cause a compilation error

  • :debug - when true, writes generated clauses to :stderr for debugging
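
Options are passed as the third argument. The YearParser module below is a hypothetical sketch that enables :inline; it assumes an Elixir version without the inlining bug mentioned above.

```elixir
defmodule YearParser do
  import NimbleParsec

  # `inline: true` asks for redirection clauses to be inlined. Adding
  # `debug: true` to the same keyword list would also print the clauses.
  defparsec :year, integer(4), inline: true
end

YearParser.year("2024-")
#=> {:ok, [2024], "-", %{}, {1, 0}, 4}
```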

defparsecp(name, combinator, opts \\ []) (macro)

Defines a private parser combinator.

It cannot be invoked directly, only via parsec/2.

Receives the same options as defparsec/3.
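
As a sketch of how a private parser is reached through parsec/2, consider the hypothetical PairParser module below, where :number is only available inside the module.

```elixir
defmodule PairParser do
  import NimbleParsec

  # Private building block, compiled once and reused through parsec/2.
  defparsecp :number, integer(min: 1)

  defparsec :pair,
            parsec(:number)
            |> ignore(string(","))
            |> parsec(:number)
end

PairParser.pair("10,20")
#=> {:ok, [10, 20], "", %{}, {1, 0}, 5}
```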

duplicate(combinator \\ empty(), to_duplicate, n)
duplicate(t(), t(), non_neg_integer()) :: t()

Duplicates the combinator to_duplicate n times.
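
A short sketch of duplicate/3, using a hypothetical DottedQuad module. It only checks the shape of the input (four groups of one to three digits) and does not validate that each group is at most 255.

```elixir
defmodule DottedQuad do
  import NimbleParsec

  octet = integer(min: 1, max: 3)

  # One octet, followed by exactly three ".octet" groups.
  defparsec :dotted_quad,
            octet
            |> duplicate(ignore(string(".")) |> concat(octet), 3)
end

DottedQuad.dotted_quad("192.168.0.1")
#=> {:ok, [192, 168, 0, 1], "", %{}, {1, 0}, 11}
```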

empty()
empty() :: t()

Returns an empty combinator.

An empty combinator cannot be compiled on its own.

ignore(combinator \\ empty(), to_ignore)
ignore(t(), t()) :: t()

Ignores the output of combinator given in to_ignore.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :ignorable, string("T") |> ignore() |> integer(2)
end

MyParser.ignorable("T12")
#=> {:ok, [12], "", %{}, {1, 0}, 3}

integer(combinator \\ empty(), count_or_opts)
integer(t(), pos_integer() | [min_and_max()]) :: t()

Defines an integer combinator of exact length or min and max length.

If you want an integer of unknown size, use integer(min: 1).

This combinator does not parse the sign and is always on base 10.

Examples

With exact length:

defmodule MyParser do
  import NimbleParsec

  defparsec :two_digits_integer, integer(2)
end

MyParser.two_digits_integer("123")
#=> {:ok, [12], "3", %{}, {1, 0}, 2}

MyParser.two_digits_integer("1a3")
#=> {:error, "expected a two digits integer", "1a3", %{}, {1, 0}, 0}

With min and max:

defmodule MyParser do
  import NimbleParsec

  defparsec :two_digits_integer, integer(min: 2, max: 4)
end

MyParser.two_digits_integer("123")
#=> {:ok, [123], "", %{}, {1, 0}, 3}

MyParser.two_digits_integer("1a3")
#=> {:error, "expected a two digits integer", "1a3", %{}, {1, 0}, 0}

If the size of the integer has a min and max close to each other, such as from 2 to 4 or from 1 to 2, using choice may emit more efficient code:

choice([integer(4), integer(3), integer(2)])

Note you should start from bigger to smaller.
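
Since integer/2 does not parse a sign, one way to support negative numbers is to parse an optional "-" and combine it with the digits using reduce/3. The SignedInteger module and the apply_sign/1 helper below are hypothetical names for this sketch.

```elixir
defmodule SignedInteger do
  import NimbleParsec

  defparsec :signed,
            optional(string("-"))
            |> integer(min: 1)
            |> reduce({:apply_sign, []})

  # Receives the parsed results in order: either ["-", value] or [value].
  defp apply_sign(["-", value]), do: -value
  defp apply_sign([value]), do: value
end

SignedInteger.signed("-42")
#=> {:ok, [-42], "", %{}, {1, 0}, 3}

SignedInteger.signed("42")
#=> {:ok, [42], "", %{}, {1, 0}, 2}
```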

label(combinator \\ empty(), to_label, label)

Adds a label to the combinator to be used in error reports.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :digit_and_lowercase,
            empty()
            |> ascii_char([?0..?9])
            |> ascii_char([?a..?z])
            |> label("digit followed by lowercase letter")
end

MyParser.digit_and_lowercase("1a")
#=> {:ok, [?1, ?a], "", %{}, {1, 0}, 2}

MyParser.digit_and_lowercase("a1")
#=> {:error, "expected a digit followed by lowercase letter", "a1", %{}, {1, 0}, 0}

line(combinator \\ empty(), to_wrap)
line(t(), t()) :: t()

Puts the result of the given combinator as the first element of a tuple with the line as second element.

line is a tuple where the first element is the current line and the second element is the byte offset immediately after the newline.

lookahead(combinator \\ empty(), call)
lookahead(t(), call()) :: t()

Looks ahead the rest of the binary to be parsed alongside the context.

call is either a {module, function, args} representing a remote call, a {function, args} representing a local call or an atom function representing {function, []}.

The function given in call will receive 4 additional arguments. The rest of the parsed binary, the parser context, the current line and the current offset will be prepended to the given args. The args will be injected at the compile site and therefore must be escapable via Macro.escape/1.

The call must return a tuple {acc, context} with a list of results to be added to the accumulator in reverse order as first element and a context as second element. It may also return {:error, reason} to stop processing.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :letters_no_zero,
            ascii_char([?a..?z])
            |> times(min: 3)
            |> lookahead(:error_when_next_is_0)

  defp error_when_next_is_0(<<?0, _::binary>>, _context, _line, _offset) do
    {:error, "next is 0"}
  end

  defp error_when_next_is_0(_rest, context, _line, _offset) do
    {[], context}
  end
end

MyParser.letters_no_zero("abc")
#=> {:ok, [?a, ?b, ?c], "", %{}, {1, 0}, 3}

MyParser.letters_no_zero("abc1")
#=> {:ok, [?a, ?b, ?c], "1", %{}, {1, 0}, 3}

MyParser.letters_no_zero("abc0")
#=> {:error, "next is 0", "0", %{}, {1, 0}, 3}

map(combinator \\ empty(), to_map, call)
map(t(), t(), call()) :: t()

Maps over the combinator results with the remote or local function in call.

call is either a {module, function, args} representing a remote call, a {function, args} representing a local call or an atom function representing {function, []}.

The call is invoked once for each parser result. Each result will be prepended to the given args. The args will be injected at the compile site and therefore must be escapable via Macro.escape/1.

See traverse/3 for a low level version of this function.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :letters_to_string_chars,
            ascii_char([?a..?z])
            |> ascii_char([?a..?z])
            |> ascii_char([?a..?z])
            |> map({Integer, :to_string, []})
end

MyParser.letters_to_string_chars("abc")
#=> {:ok, ["97", "98", "99"], "", %{}, {1, 0}, 3}

optional(combinator \\ empty(), optional)
optional(t(), t()) :: t()

Marks the given combinator as optional.

It is equivalent to choice([optional, empty()]).
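
A minimal sketch, using a hypothetical OptionalExample module, where a trailing "Z" may or may not be present:

```elixir
defmodule OptionalExample do
  import NimbleParsec

  # Two digits, optionally followed by a literal "Z".
  defparsec :hour, integer(2) |> optional(string("Z"))
end

OptionalExample.hour("12Z")
#=> {:ok, [12, "Z"], "", %{}, {1, 0}, 3}

OptionalExample.hour("12")
#=> {:ok, [12], "", %{}, {1, 0}, 2}
```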

parsec(combinator \\ empty(), name)

Invokes an already compiled parsec with name name in the same module.

It is useful to implement recursive definitions.

It can also be used to trade runtime performance for shorter compilation time. If you have a parser used over and over again, you can compile it using defparsecp and rely on it via this function. The tree size built at compile time will be reduced, although runtime performance is degraded because every invocation of this function introduces a stacktrace entry.

Examples

A very limited but recursive XML parser could be written as follows:

defmodule SimpleXML do
  import NimbleParsec

  tag = ascii_string([?a..?z, ?A..?Z], min: 1)
  text = ascii_string([not: ?<], min: 1)

  opening_tag =
    ignore(string("<"))
    |> concat(tag)
    |> ignore(string(">"))

  closing_tag =
    ignore(string("</"))
    |> concat(tag)
    |> ignore(string(">"))

  defparsec :xml,
            opening_tag
            |> repeat_until(choice([parsec(:xml), text]), [string("</")])
            |> concat(closing_tag)
            |> wrap()
end

SimpleXML.xml("<foo>bar</foo>")
#=> {:ok, [["foo", "bar", "foo"]], "", %{}, {1, 0}, 14}

quoted_repeat_while(combinator \\ empty(), to_repeat, while)
quoted_repeat_while(t(), t(), mfargs()) :: t()

Invokes while to emit the AST that will repeat to_repeat while the AST code returns {:cont, context}.

In case repetition should stop, while must return {:halt, context}.

while is a {module, function, args} and it will receive 4 additional arguments. The AST representations of the binary to be parsed, context, line and offset will be prepended to args. while is invoked at compile time and is useful in combinators that avoid injecting runtime dependencies.

quoted_traverse(combinator, to_traverse, call)
quoted_traverse(t(), t(), mfargs()) :: t()

Invokes call to emit the AST that traverses the to_traverse combinator results.

call is a {module, function, args} and it will receive 5 additional arguments. The AST representation of the rest of the parsed binary, the parser results, context, line and offset will be prepended to args. call is invoked at compile time and is useful in combinators that avoid injecting runtime dependencies.

The call must return a list of results to be added to the accumulator. Notice the received results are in reverse order and must be returned in reverse order too.

The number of elements returned does not need to be the same as the number of elements given.

reduce(combinator \\ empty(), to_reduce, call)
reduce(t(), t(), call()) :: t()

Reduces over the combinator results with the remote or local function in call.

call is either a {module, function, args} representing a remote call, a {function, args} representing a local call or an atom function representing {function, []}.

The parser results to be reduced will be prepended to the given args. The args will be injected at the compile site and therefore must be escapable via Macro.escape/1.

See traverse/3 for a low level version of this function.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :letters_to_reduced_chars,
            ascii_char([?a..?z])
            |> ascii_char([?a..?z])
            |> ascii_char([?a..?z])
            |> reduce({Enum, :join, ["-"]})
end

MyParser.letters_to_reduced_chars("abc")
#=> {:ok, ["97-98-99"], "", %{}, {1, 0}, 3}

repeat(combinator \\ empty(), to_repeat)
repeat(t(), t()) :: t()

Allows the combinator given on to_repeat to appear zero or more times.

Beware! Since repeat/2 allows zero entries, it cannot be used inside choice/2, because it will always succeed and may lead to unused function warnings since any further choice won’t ever be attempted. For example, because repeat/2 always succeeds, the string/2 combinator below it won’t ever run:

choice([
  repeat(ascii_char([?a..?z])),
  string("OK")
])

Instead of repeat/2, you may want to use times/3 with the flags :min and :max.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :repeat_lower, repeat(ascii_char([?a..?z]))
end

MyParser.repeat_lower("abcd")
#=> {:ok, [?a, ?b, ?c, ?d], "", %{}, {1, 0}, 4}

MyParser.repeat_lower("1234")
#=> {:ok, [], "1234", %{}, {1, 0}, 0}

repeat_until(combinator \\ empty(), to_repeat, choices)
repeat_until(t(), t(), [t()]) :: t()

Repeats to_repeat until one of the combinators in choices matches.

Each of the combinators given in choices must be optimizable into a single pattern, otherwise this function will refuse to compile. Use repeat_while/3 for a general mechanism for repeating.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :string_with_quotes,
            ascii_char([?"])
            |> repeat_until(
              choice([
                ~S(\") |> string() |> replace(?"),
                utf8_char([])
              ]),
              [ascii_char([?"])]
            )
            |> ascii_char([?"])
            |> reduce({List, :to_string, []})

end

MyParser.string_with_quotes(~S("string with quotes \" inside"))
#=> {:ok, ["\"string with quotes \" inside\""], "", %{}, {1, 0}, 30}

repeat_while(combinator \\ empty(), to_repeat, while)
repeat_while(t(), t(), call()) :: t()

Repeats while the given remote or local function while returns {:cont, context}.

In case repetition should stop, while must return {:halt, context}.

while is either a {module, function, args} representing a remote call, a {function, args} representing a local call or an atom function representing {function, []}.

The function given in while will receive 4 additional arguments. The rest of the binary to be parsed, the parser context, the current line and the current offset will be prepended to the given args. The args will be injected at the compile site and therefore must be escapable via Macro.escape/1.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :string_with_quotes,
            ascii_char([?"])
            |> repeat_while(
              choice([
                ~S(\") |> string() |> replace(?"),
                utf8_char([])
              ]),
              {:not_quote, []}
            )
            |> ascii_char([?"])
            |> reduce({List, :to_string, []})

  defp not_quote(<<?", _::binary>>, context, _, _), do: {:halt, context}
  defp not_quote(_, context, _, _), do: {:cont, context}
end

MyParser.string_with_quotes(~S("string with quotes \" inside"))
#=> {:ok, ["\"string with quotes \" inside\""], "", %{}, {1, 0}, 30}

replace(combinator \\ empty(), to_replace, value)
replace(t(), t(), term()) :: t()

Replaces the output of combinator given in to_replace by a single value.

The value will be injected at the compile site and therefore must be escapable via Macro.escape/1.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :replaceable, string("T") |> replace("OTHER") |> integer(2)
end

MyParser.replaceable("T12")
#=> {:ok, ["OTHER", 12], "", %{}, {1, 0}, 3}

string(combinator \\ empty(), binary)
string(t(), binary()) :: t()

Defines a string binary value.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :string_t, string("T")
end

MyParser.string_t("T")
#=> {:ok, ["T"], "", %{}, {1, 0}, 1}

MyParser.string_t("not T")
#=> {:error, "expected a string \"T\"", "not T", %{}, {1, 0}, 0}

tag(combinator \\ empty(), to_tag, tag)

Tags the result of the given combinator in to_tag in a tuple with tag as first element.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :tagged_integer, integer(min: 1) |> tag(:integer)
end

MyParser.tagged_integer("1234")
#=> {:ok, [integer: [1234]], "", %{}, {1, 0}, 4}

Notice, however, that the integer result is wrapped in a list, because the parser is expected to emit multiple tokens. When you are sure that only a single token is emitted, you should use unwrap_and_tag/3.

times(combinator \\ empty(), to_repeat, count_or_min_max)
times(t(), t(), pos_integer() | [min_and_max()]) :: t()

Allows the combinator given on to_repeat to appear at least, at most or exactly a given amount of times.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :minimum_lower, times(ascii_char([?a..?z]), min: 2)
end

MyParser.minimum_lower("abcd")
#=> {:ok, [?a, ?b, ?c, ?d], "", %{}, {1, 0}, 4}

MyParser.minimum_lower("ab12")
#=> {:ok, [?a, ?b], "12", %{}, {1, 0}, 2}

MyParser.minimum_lower("a123")
#=> {:error, "expected a byte in the range ?a..?z, followed by a byte in the range ?a..?z", "a123", %{}, {1, 0}, 0}

traverse(combinator \\ empty(), to_traverse, call)
traverse(t(), t(), call()) :: t()

Traverses the combinator results with the remote or local function call.

call is either a {module, function, args} representing a remote call, a {function, args} representing a local call or an atom function representing {function, []}.

The function given in call will receive 5 additional arguments. The rest of the parsed binary, the parser results to be traversed, the parser context, the current line and the current offset will be prepended to the given args. The args will be injected at the compile site and therefore must be escapable via Macro.escape/1.

The call must return a tuple {acc, context} with a list of results to be added to the accumulator as first element and a context as second element. It may also return {:error, reason} to stop processing. Notice the received results are in reverse order and must be returned in reverse order too.

The number of elements returned does not need to be the same as the number of elements given.

This is a low-level function for changing the parsed result. On top of this function, other functions are built, such as map/3 if you want to map over each individual element and not worry about ordering, reduce/3 to reduce all elements into a single one, replace/3 if you want to replace the parsed result by a single value and ignore/2 if you want to ignore the parsed result.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :letters_to_chars,
            ascii_char([?a..?z])
            |> ascii_char([?a..?z])
            |> ascii_char([?a..?z])
            |> traverse({:join_and_wrap, ["-"]})

  defp join_and_wrap(_rest, args, context, _line, _offset, joiner) do
    {args |> Enum.join(joiner) |> List.wrap(), context}
  end
end

MyParser.letters_to_chars("abc")
#=> {:ok, ["99-98-97"], "", %{}, {1, 0}, 3}

unwrap_and_tag(combinator \\ empty(), to_tag, tag)

Unwraps and tags the result of the given combinator in to_tag in a tuple with tag as first element.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :tagged_integer, integer(min: 1) |> unwrap_and_tag(:integer)
end

MyParser.tagged_integer("1234")
#=> {:ok, [integer: 1234], "", %{}, {1, 0}, 4}

In case the combinator emits more than one token, an error will be raised. See tag/3 for more information.
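
Because each tagged result is a {tag, value} tuple, several single-token combinators can be chained to produce a keyword list. The KeyValue module below is a hypothetical sketch of that idea.

```elixir
defmodule KeyValue do
  import NimbleParsec

  # Each side emits exactly one token, so unwrap_and_tag/3 is safe here.
  defparsec :key_value,
            ascii_string([?a..?z], min: 1)
            |> unwrap_and_tag(:key)
            |> ignore(string("="))
            |> unwrap_and_tag(integer(min: 1), :value)
end

KeyValue.key_value("port=8080")
#=> {:ok, [key: "port", value: 8080], "", %{}, {1, 0}, 9}
```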

utf8_char(combinator \\ empty(), ranges)
utf8_char(t(), [range()]) :: t()

Defines a single utf8 codepoint in the given ranges.

ranges is a list containing one of:

  • a min..max range expressing supported codepoints
  • a codepoint integer expressing a supported codepoint
  • {:not, min..max} expressing not supported codepoints
  • {:not, codepoint} expressing a not supported codepoint

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :digit_and_utf8,
            empty()
            |> utf8_char([?0..?9])
            |> utf8_char([])
end

MyParser.digit_and_utf8("1é")
#=> {:ok, [?1, ?é], "", %{}, {1, 0}, 3}

MyParser.digit_and_utf8("a1")
#=> {:error, "expected a utf8 codepoint in the range ?0..?9, followed by a utf8 codepoint", "a1", %{}, {1, 0}, 0}

utf8_string(combinator \\ empty(), range, count_or_opts)
utf8_string(t(), [range()], pos_integer() | [min_and_max()]) :: t()

Defines a utf8 string combinator of exact length or min and max codepoint length.

The ranges specify the allowed characters in the utf8 string. See utf8_char/2 for more information.

If you want a string of unknown size, use utf8_string(ranges, min: 1). If you want a literal string, use string/2.

Examples

defmodule MyParser do
  import NimbleParsec

  defparsec :two_letters, utf8_string([], 2)
end

MyParser.two_letters("áé")
#=> {:ok, ["áé"], "", %{}, {1, 0}, 4}

wrap(combinator \\ empty(), to_wrap)
wrap(t(), t()) :: t()

Wraps the results of the given combinator in to_wrap in a list.
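
A minimal sketch, using a hypothetical WrapExample module, that groups two integers into one nested list:

```elixir
defmodule WrapExample do
  import NimbleParsec

  # Without wrap/2 the results would be [12, 30]; with it they are nested.
  defparsec :hh_mm,
            integer(2)
            |> ignore(string(":"))
            |> integer(2)
            |> wrap()
end

WrapExample.hh_mm("12:30")
#=> {:ok, [[12, 30]], "", %{}, {1, 0}, 5}
```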