WikitextEx.Parser (WikitextEx v0.1.1)
View SourceA robust MediaWiki wikitext parser that converts wikitext markup into structured AST nodes.
This parser supports the complete range of MediaWiki wikitext syntax including:
- Headers (=, ==, ===, etc.)
- Templates with named and positional arguments
- Internal links, categories, files, and interlanguage links
- Text formatting (bold, italic, combinations)
- Lists (ordered and unordered with nesting)
- Tables with headers and data cells
- HTML tags and comments
- Reference tags and nowiki sections
Usage
iex> {:ok, ast, _, _, _, _} = WikitextEx.Parser.parse("'''Bold''' text")
iex> ast
[
%WikitextEx.AST{type: :bold, value: nil, children: [%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: "Bold"}, children: []}]},
%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: " text"}, children: []}
]
Parser Architecture
The parser uses NimbleParsec combinators to build a comprehensive grammar that handles:
- Proper precedence for nested formatting
- Context-aware parsing for different wikitext environments
- Robust error recovery and graceful degradation
Summary
Functions
Parses the given binary
as bold_italic_text.
Parses the given binary
as bold_text.
Parses the given binary
as container_html_tag.
Parses the given binary
as container_ref_tag.
Parses the given binary
as header.
Parses the given binary
as html_comment.
Parses the given binary
as italic_text.
Parses the given binary
as link.
Parses the given binary
as list_item.
Parses the given binary
as nowiki_tag.
Parses the given binary
as parse.
Parses the given binary
as self_closing_html_tag.
Parses the given binary
as self_closing_ref_tag.
Parses the given binary
as table.
Parses the given binary
as table_cell_attributes.
Parses the given binary
as table_cell_content.
Parses the given binary
as table_data_cell.
Parses the given binary
as table_header_cell.
Parses the given binary
as template.
Parses the given binary
as template_value.
Functions
@spec bold_italic_text(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as bold_italic_text.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the bold_italic_text (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec bold_text(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as bold_text.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the bold_text (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec container_html_tag(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as container_html_tag.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the container_html_tag (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec container_ref_tag(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as container_ref_tag.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the container_ref_tag (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec header(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as header.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the header (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec html_comment(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as html_comment.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the html_comment (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec italic_text(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as italic_text.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the italic_text (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec link(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as link.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the link (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec list_item(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as list_item.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the list_item (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec nowiki_tag(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as nowiki_tag.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the nowiki_tag (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec parse(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as parse.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the parse (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec self_closing_html_tag(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as self_closing_html_tag.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the self_closing_html_tag (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec self_closing_ref_tag(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as self_closing_ref_tag.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the self_closing_ref_tag (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec table(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as table.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the table (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec table_cell_attributes(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as table_cell_attributes.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the table_cell_attributes (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec table_cell_content(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as table_cell_content.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the table_cell_content (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec table_data_cell(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as table_data_cell.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the table_data_cell (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec table_header_cell(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as table_header_cell.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the table_header_cell (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec template(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as template.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the template (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map
@spec template_value(binary(), keyword()) :: {:ok, [term()], rest, context, line, byte_offset} | {:error, reason, rest, context, line, byte_offset} when line: {pos_integer(), byte_offset}, byte_offset: non_neg_integer(), rest: binary(), reason: String.t(), context: map()
Parses the given binary
as template_value.
Returns {:ok, [token], rest, context, position, byte_offset}
or
{:error, reason, rest, context, line, byte_offset}
where position
describes the location of the template_value (start position) as {line, offset_to_start_of_line}
.
To column where the error occurred can be inferred from byte_offset - offset_to_start_of_line
.
Options
:byte_offset
- the byte offset for the whole binary, defaults to 0:line
- the line and the byte offset into that line, defaults to{1, byte_offset}
:context
- the initial context value. It will be converted to a map