WikitextEx.Parser (WikitextEx v0.1.1)

View Source

A robust MediaWiki wikitext parser that converts wikitext markup into structured AST nodes.

This parser supports the complete range of MediaWiki wikitext syntax including:

  • Headers (=, ==, ===, etc.)
  • Templates with named and positional arguments
  • Internal links, categories, files, and interlanguage links
  • Text formatting (bold, italic, combinations)
  • Lists (ordered and unordered with nesting)
  • Tables with headers and data cells
  • HTML tags and comments
  • Reference tags and nowiki sections

Usage

iex> {:ok, ast, _, _, _, _} = WikitextEx.Parser.parse("'''Bold''' text")
iex> ast
[
  %WikitextEx.AST{type: :bold, value: nil, children: [%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: "Bold"}, children: []}]},
  %WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: " text"}, children: []}
]

Parser Architecture

The parser uses NimbleParsec combinators to build a comprehensive grammar that handles:

  • Proper precedence for nested formatting
  • Context-aware parsing for different wikitext environments
  • Robust error recovery and graceful degradation

Summary

Functions

Parses the given binary as bold_italic_text.

Parses the given binary as bold_text.

Parses the given binary as container_html_tag.

Parses the given binary as container_ref_tag.

Parses the given binary as header.

Parses the given binary as html_comment.

Parses the given binary as italic_text.

Parses the given binary as link.

Parses the given binary as list_item.

Parses the given binary as nowiki_tag.

Parses the given binary as parse.

Parses the given binary as self_closing_html_tag.

Parses the given binary as self_closing_ref_tag.

Parses the given binary as table.

Parses the given binary as table_cell_attributes.

Parses the given binary as table_cell_content.

Parses the given binary as table_data_cell.

Parses the given binary as table_header_cell.

Parses the given binary as template.

Parses the given binary as template_value.

Functions

bold_italic_text(binary, opts \\ [])

@spec bold_italic_text(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as bold_italic_text.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the bold_italic_text (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

bold_text(binary, opts \\ [])

@spec bold_text(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as bold_text.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the bold_text (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

container_html_tag(binary, opts \\ [])

@spec container_html_tag(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as container_html_tag.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the container_html_tag (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

container_ref_tag(binary, opts \\ [])

@spec container_ref_tag(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as container_ref_tag.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the container_ref_tag (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

header(binary, opts \\ [])

@spec header(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as header.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the header (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

html_comment(binary, opts \\ [])

@spec html_comment(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as html_comment.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the html_comment (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

italic_text(binary, opts \\ [])

@spec italic_text(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as italic_text.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the italic_text (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

link(binary, opts \\ [])

@spec link(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as link.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the link (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

list_item(binary, opts \\ [])

@spec list_item(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as list_item.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the list_item (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

nowiki_tag(binary, opts \\ [])

@spec nowiki_tag(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as nowiki_tag.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the nowiki_tag (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

parse(binary, opts \\ [])

@spec parse(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as parse.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the parse (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

self_closing_html_tag(binary, opts \\ [])

@spec self_closing_html_tag(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as self_closing_html_tag.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the self_closing_html_tag (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

self_closing_ref_tag(binary, opts \\ [])

@spec self_closing_ref_tag(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as self_closing_ref_tag.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the self_closing_ref_tag (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

table(binary, opts \\ [])

@spec table(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as table.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the table (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

table_cell_attributes(binary, opts \\ [])

@spec table_cell_attributes(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as table_cell_attributes.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the table_cell_attributes (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

table_cell_content(binary, opts \\ [])

@spec table_cell_content(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as table_cell_content.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the table_cell_content (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

table_data_cell(binary, opts \\ [])

@spec table_data_cell(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as table_data_cell.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the table_data_cell (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

table_header_cell(binary, opts \\ [])

@spec table_header_cell(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as table_header_cell.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the table_header_cell (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

template(binary, opts \\ [])

@spec template(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as template.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the template (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

template_value(binary, opts \\ [])

@spec template_value(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()

Parses the given binary as template_value.

Returns {:ok, [token], rest, context, position, byte_offset} or {:error, reason, rest, context, line, byte_offset} where position describes the location of the template_value (start position) as {line, offset_to_start_of_line}.

To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line.

Options

  • :byte_offset - the byte offset for the whole binary, defaults to 0
  • :line - the line and the byte offset into that line, defaults to {1, byte_offset}
  • :context - the initial context value. It will be converted to a map

to_attribute(list)

to_attributes(attrs)

to_bare_value(value_list)

to_bold(children)

to_bold_italic(children)

to_cell_content(content_parts)

to_container_html_tag(list)

to_container_ref_tag(children)

to_data_cell(list)

to_header(list)

to_header_cell(list)

to_html_comment(chars)

to_italic(children)

to_kv(list)

to_link(chars)

to_list_item(list)

to_nowiki(chars)

to_self_closing_html_tag(list)

to_self_closing_ref_tag(list)

to_table_with_nimble_parsec(chars)

to_template(list)

to_text(list)