Parsey (parsey v0.0.2)

A library to set up basic parsing requirements for non-complex nested inputs.

Parsing behaviours are defined using rulesets, which take the form [rule]. Rules are matched in the order they are defined, so the first rule in a set has a higher priority than the last.
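To make the ordering concrete, here is a minimal sketch that does not call Parsey itself. The module OrderSketch and its first_match/2 helper are hypothetical; they only mimic the "first matching rule wins" idea using plain regex matchers and are not taken from the library's implementation.

```elixir
defmodule OrderSketch do
  # Hypothetical helper (not part of Parsey): returns the name of the first
  # rule whose regex matches the input, or nil when no rule matches.
  def first_match(input, rules) do
    Enum.find_value(rules, fn { name, regex } ->
      if Regex.match?(regex, input), do: name
    end)
  end
end

rules = [
  integer: ~r/\A\d+/,
  word: ~r/\A\w+/
]

# Both rules match "123", but :integer is listed first, so it is chosen.
by_priority = OrderSketch.first_match("123", rules)

# Only :word matches "abc".
fallback = OrderSketch.first_match("abc", rules)
```

Swapping the two entries in the list would make :word shadow :integer for every numeric input, which is why more specific rules usually belong earlier in a ruleset.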

A rule is a named matching expression. The name of a rule can be any atom, and multiple rules can share the same name. The matching expression can be either a Regex or a function.
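As an illustration, the sketch below builds a ruleset in which two rules share one name, one using a function matcher and one using a regex. The hex matcher and the :number name are made up for this example; only the { name, matcher } shape comes from the type definitions on this page.

```elixir
# Hypothetical function matcher for a hexadecimal literal such as "0x1F".
# It returns a list of { index, length } tuples, or nil when nothing matches.
hex = fn
  input = <<"0x", _ :: binary>> ->
    Regex.run(~r/\A0x[0-9a-fA-F]+/, input, return: :index)
  _ -> nil
end

rules = [
  number: hex,       # function matcher
  number: ~r/\A\d+/  # regex matcher sharing the same name
]

# A keyword list may hold several entries under one key.
same_name_count = Keyword.get_values(rules, :number) |> length()
```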

Rules may additionally be configured to specify the option that will be returned in the ast, the ruleset modification behaviour (which rules to exclude, include or re-define), and whether the rule should be ignored (not added to the ast).

The default behaviour of a matched rule is to remove all rules with the same name from the ruleset, and then try to further match the matched input against the new ruleset. The ast is returned on completion.
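The ruleset reduction described above can be pictured with a small sketch. It uses Keyword.delete/2 as a stand-in for what happens to the ruleset after a match; this is an assumption made for illustration, not the library's internal code, and the rule names are invented.

```elixir
rules = [
  group: ~r/\A\(.*\)/,
  word: ~r/\A\w+/,
  group: ~r/\A\[.*\]/
]

# Suppose a :group rule has just matched. By default every rule named :group
# is dropped, and the matched portion is parsed with what remains.
remaining = Keyword.delete(rules, :group)

remaining_names = Keyword.keys(remaining)
```

Because both :group entries are removed, a group cannot directly contain another group unless the rule is configured otherwise.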

A matcher (whether a regex or a function) returns a list of indices [{ index, length }], or nil when there is no match. The first tuple in the list (List.first) indicates the portion of the input to be removed, while the last tuple (List.last) indicates the portion of the input to be focused on (parsed further).
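The following sketch shows the shape of such a return value and how its first and last tuples could be read. The group matcher is hypothetical, and the slicing at the end is only a demonstration of what the indices refer to, not a description of Parsey's internals.

```elixir
input = "(1 2) rest"

# Hypothetical function matcher for a parenthesised group. With return: :index,
# Regex.run/3 yields [{ index, length }] tuples (byte offsets): the whole
# match first, followed by the capture group.
group = fn
  text = <<"(", _ :: binary>> -> Regex.run(~r/\A\((.*?)\)/, text, return: :index)
  _ -> nil
end

indices = group.(input)

# First tuple: the portion of the input to be removed, here "(1 2)".
{ remove_index, remove_length } = List.first(indices)
# Last tuple: the portion to be parsed further, here the text inside the parentheses.
{ focus_index, focus_length } = List.last(indices)

removed = binary_part(input, remove_index, remove_length)
focused = binary_part(input, focus_index, focus_length)
```

Here the removed portion includes the delimiters while the focused portion excludes them, which is the same pattern the element matcher in the example further down relies on.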

Summary

Functions

parse(input, rules)

Parse the given input using the specified ruleset

Types

ast ::
  String.t |
  {name, [ast]} |
  {name, [ast], option}
matcher ::
  Regex.t |
  (String.t -> nil | [{integer, integer}])
name :: atom
option :: any
rule ::
  {name, matcher} |
  {name, %{
    match: matcher,
    capture: non_neg_integer,
    format: formatter,
    option: option,
    ignore: boolean,
    exclude: excluder | [excluder],
    include: rule | [rule],
    rules: rule | [rule]
  }}

Functions

parse(input, rules)

Specs

parse(String.t, [rule]) :: [ast]

Parse the given input using the specified ruleset.

Example

iex> rules = [
...>     whitespace: %{ match: ~r/\A\s/, ignore: true },
...>     element_end: %{ match: ~r/\A<\/.*?>/, ignore: true },
...>     element: %{ match: fn
...>         input = <<"<", _ :: binary>> ->
...>             elements = String.splitter(input, "<", trim: true)
...>
...>             [first] = Enum.take(elements, 1)
...>             [{ 0, tag_length }] = Regex.run(~r/\A.*?>/, first, return: :index)
...>             tag_length = tag_length + 1
...>
...>             { 0, length } = Stream.drop(elements, 1) |> Enum.reduce_while({ 1, 0 }, fn
...>                 element = <<"/", _ :: binary>>, { 1, length } ->
...>                     [{ 0, tag_length }] = Regex.run(~r/\A.*?>/, element, return: :index)
...>                     { :halt, { 0, length + tag_length + 1 } }
...>                 element = <<"/", _ :: binary>>, { count, length } -> { :cont, { count - 1, length + String.length(element) + 1 } }
...>                 element, { count, length } -> { :cont, { count + 1, length + String.length(element) + 1 } }
...>             end)
...>
...>             length = length + String.length(first) + 1
...>             [{ 0, length }, {1, tag_length - 2}, { tag_length, length - tag_length }]
...>         _ -> nil
...>     end, exclude: nil, option: fn input, [_, { index, length }, _] -> String.slice(input, index, length) end },
...>     value: %{ match: ~r/\A\d+/, rules: [] }
...> ]
iex> input = """
...> <array>
...>     <integer>1</integer>
...>     <integer>2</integer>
...> </array>
...> <array>
...>     <integer>3</integer>
...>     <integer>4</integer>
...> </array>
...> """
iex> Parsey.parse(input, rules)
[
    { :element, [
        { :element, [value: ["1"]], "integer" },
        { :element, [value: ["2"]], "integer" }
    ], "array" },
    { :element, [
        { :element, [value: ["3"]], "integer" },
        { :element, [value: ["4"]], "integer" }
    ], "array" }
]