WikitextEx (WikitextEx v0.1.1)
WikitextEx - A robust MediaWiki wikitext parser for Elixir.
WikitextEx parses MediaWiki wikitext markup into structured AST nodes, making wiki content easy to process and analyze.
Quick Start
iex> # Parse wikitext into AST
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("'''Bold''' and ''italic'' text")
iex> # Work with the parsed AST
iex> templates = WikitextEx.find_templates(ast)
iex> length(templates)
0
iex> text_content = WikitextEx.extract_text(ast)
iex> text_content
"Bold and italic text"
Main Functions
- parse/1 - Parse wikitext string into AST
- find_templates/1 - Extract all template nodes from AST
- find_links/1 - Extract all link nodes from AST
- extract_text/1 - Get plain text content from AST
Summary
Functions
extract_text/1 - Extract plain text content from AST nodes.
find_headers/1 - Find all header nodes in an AST.
find_links/1 - Find all link nodes in an AST (including categories and files).
find_templates/1 - Find all template nodes in an AST.
parse/1 - Parse wikitext markup into an AST.
Functions
@spec extract_text([WikitextEx.AST.t()]) :: String.t()
Extract plain text content from AST nodes.
This function recursively traverses the AST and extracts all text content, ignoring markup and structure.
Examples
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("'''Bold''' and ''italic'' text")
iex> WikitextEx.extract_text(ast)
"Bold and italic text"
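The recursive traversal described above can be sketched as follows. This is an illustration only, not the library's implementation: `TextSketch` and `collect/1` are hypothetical names, and the sketch assumes just the node shape shown in the parse/1 examples (a `type`, a `value`, and a list of `children`, with text nodes holding their string in `value.content`). It matches on plain map keys, so it accepts both `%WikitextEx.AST{}` structs and ordinary maps.

```elixir
defmodule TextSketch do
  # Hypothetical helper, not part of WikitextEx: walks a list of nodes
  # depth-first and concatenates the content of every :text node.
  def collect(nodes) when is_list(nodes) do
    nodes
    |> Enum.map(&collect_node/1)
    |> IO.iodata_to_binary()
  end

  # A text node carries its string in value.content.
  defp collect_node(%{type: :text, value: %{content: content}}), do: content

  # Any other node contributes only the text found in its children.
  defp collect_node(%{children: children}), do: Enum.map(children, &collect_node/1)
end
```

With the library loaded, `TextSketch.collect(ast)` could be called on the list returned by `WikitextEx.parse/1`. Nodes that keep their content in `value` rather than in `children` (templates, for instance) contribute nothing in this sketch, which may differ from what `extract_text/1` actually does.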
@spec find_headers([WikitextEx.AST.t()]) :: [WikitextEx.AST.t()]
Find all header nodes in an AST.
Examples
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("== Header ==\nContent")
iex> headers = WikitextEx.find_headers(ast)
iex> length(headers)
1
@spec find_links([WikitextEx.AST.t()]) :: [WikitextEx.AST.t()]
Find all link nodes in an AST (including categories and files).
Examples
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("[[Article]] [[Category:Example]]")
iex> links = WikitextEx.find_links(ast)
iex> length(links)
2
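Because find_links/1 returns ordinary articles, categories, and files together, it can help to bucket the result by each node's `type` field. The sketch below is a minimal illustration under that assumption; `LinkSketch` is a hypothetical module, and the exact type atoms the parser emits for each kind of link are not stated here, so inspect real output before matching on specific atoms.

```elixir
defmodule LinkSketch do
  # Hypothetical helper, not part of WikitextEx.
  # Groups nodes into a map keyed by their :type field.
  def by_type(nodes), do: Enum.group_by(nodes, & &1.type)

  # Counts how many nodes of each type are present.
  def type_counts(nodes), do: Enum.frequencies_by(nodes, & &1.type)
end
```

With the library loaded, `ast |> WikitextEx.find_links() |> LinkSketch.type_counts()` would show which link types a page contains and how often.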
@spec find_templates([WikitextEx.AST.t()]) :: [WikitextEx.AST.t()]
Find all template nodes in an AST.
Examples
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("{{template1}} and {{template2|arg}}")
iex> templates = WikitextEx.find_templates(ast)
iex> length(templates)
2
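A common next step is to read the template names out of the returned nodes. The sketch below assumes the node shape shown in the parse/1 examples, where a template node has `type: :template` and a `value` with a `name` field; `TemplateSketch` and its functions are hypothetical names used for illustration.

```elixir
defmodule TemplateSketch do
  # Hypothetical helper, not part of WikitextEx.
  # Returns the name of every template node in the given list,
  # silently skipping nodes that do not match the template shape.
  def names(template_nodes) do
    for %{type: :template, value: %{name: name}} <- template_nodes, do: name
  end

  # Counts how often each template name occurs.
  def name_counts(template_nodes) do
    template_nodes |> names() |> Enum.frequencies()
  end
end
```

With the library loaded, this could be used as `ast |> WikitextEx.find_templates() |> TemplateSketch.names()`.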
@spec parse(String.t()) ::
  {:ok, [WikitextEx.AST.t()], String.t(), map(), {non_neg_integer(), non_neg_integer()}, non_neg_integer()}
  | {:error, String.t(), String.t(), map(), {non_neg_integer(), non_neg_integer()}, non_neg_integer()}
Parse wikitext markup into an AST.
Returns the same tuple format as NimbleParsec for consistency:
{:ok, ast, rest, context, position, byte_offset}
on success or
{:error, reason, rest, context, position, byte_offset}
on failure.
Examples
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("'''Bold text'''")
iex> [%WikitextEx.AST{type: :bold, value: nil, children: [%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: "Bold text"}, children: []}]}] = ast
[%WikitextEx.AST{type: :bold, value: nil, children: [%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: "Bold text"}, children: []}]}]
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("{{template|arg}}")
iex> [%WikitextEx.AST{type: :template, value: %WikitextEx.AST.Template{name: "template", args: [positional: "arg"]}, children: []}] = ast
[%WikitextEx.AST{type: :template, value: %WikitextEx.AST.Template{name: "template", args: [positional: "arg"]}, children: []}]
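Callers that do not need every element of the six-element tuple may prefer to collapse it at the boundary. The following is a minimal sketch, not part of the library: `ParseResultSketch` is a hypothetical module, and the column calculation assumes the position element follows NimbleParsec's convention of `{line, byte offset at which that line starts}`.

```elixir
defmodule ParseResultSketch do
  # Hypothetical helper, not part of WikitextEx.
  # Collapses the six-element tuple into {:ok, ast} or {:error, message}.
  def simplify({:ok, ast, _rest, _context, _position, _byte_offset}), do: {:ok, ast}

  def simplify({:error, reason, _rest, _context, {line, line_start}, byte_offset}) do
    # Assumption: line_start is the byte offset where the current line begins,
    # so the difference plus one gives a one-based column (in bytes).
    column = byte_offset - line_start + 1
    {:error, "#{reason} (line #{line}, column #{column})"}
  end
end
```

With the library loaded, `input |> WikitextEx.parse() |> ParseResultSketch.simplify()` yields a two-element tuple that fits the usual `case` or `with` patterns.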