WikitextEx (WikitextEx v0.1.1)

WikitextEx - A robust MediaWiki wikitext parser for Elixir.

It parses MediaWiki wikitext markup into structured AST nodes, which makes wiki content easy to process and analyze.

Quick Start

iex> # Parse wikitext into AST
iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("'''Bold''' and ''italic'' text")
iex> # Work with the parsed AST
iex> templates = WikitextEx.find_templates(ast)
iex> length(templates)
0
iex> text_content = WikitextEx.extract_text(ast)
iex> text_content
"Bold and italic text"

Summary

Functions

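extract_text(ast_nodes)

Extract plain text content from AST nodes.

find_headers(ast_nodes)

Find all header nodes in an AST.

find_links(ast_nodes)

Find all link nodes in an AST (including categories and files).

find_templates(ast_nodes)

Find all template nodes in an AST.

parse(wikitext)

Parse wikitext markup into an AST.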

Functions

extract_text(ast_nodes)

@spec extract_text([WikitextEx.AST.t()]) :: String.t()

Extract plain text content from AST nodes.

This function recursively traverses the AST and extracts all text content, ignoring markup and structure.
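
The recursive traversal described above can be sketched roughly as follows. This is not the library's implementation: TextSketch is a hypothetical module, the nodes are plain maps standing in for %WikitextEx.AST{} structs (map patterns also match structs with the same keys), and the :italic type and the way the sample text is split into nodes are assumptions.

```elixir
defmodule TextSketch do
  # Hypothetical stand-in for WikitextEx.extract_text/1. It matches on plain
  # map keys, so it accepts any map or struct shaped like the nodes shown in
  # the parse/1 examples (:type, :value, :children).

  # A list of nodes: extract each one and concatenate the results.
  def extract_text(nodes) when is_list(nodes) do
    Enum.map_join(nodes, "", &extract_text/1)
  end

  # A text leaf carries its string under value.content.
  def extract_text(%{type: :text, value: %{content: content}}), do: content

  # Any other node contributes only the text of its children.
  def extract_text(%{children: children}), do: extract_text(children)
end

# Hand-built AST (assumed shape) for "'''Bold''' and ''italic'' text".
ast = [
  %{type: :bold, value: nil, children: [
    %{type: :text, value: %{content: "Bold"}, children: []}
  ]},
  %{type: :text, value: %{content: " and "}, children: []},
  %{type: :italic, value: nil, children: [
    %{type: :text, value: %{content: "italic"}, children: []}
  ]},
  %{type: :text, value: %{content: " text"}, children: []}
]

IO.puts(TextSketch.extract_text(ast))
# prints: Bold and italic text
```

Markup nodes such as :bold add nothing of their own here; only :text leaves contribute characters, which is why the markup disappears from the result.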

Examples

iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("'''Bold''' and ''italic'' text")
iex> WikitextEx.extract_text(ast)
"Bold and italic text"

find_headers(ast_nodes)

@spec find_headers([WikitextEx.AST.t()]) :: [WikitextEx.AST.t()]

Find all header nodes in an AST.

Examples

iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("== Header ==\nContent")
iex> headers = WikitextEx.find_headers(ast)
iex> length(headers)
1

find_links(ast_nodes)

@spec find_links([WikitextEx.AST.t()]) :: [WikitextEx.AST.t()]

Find all link nodes in an AST (including categories and files).

Examples

iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("[[Article]] [[Category:Example]]")
iex> links = WikitextEx.find_links(ast)
iex> length(links)
2

find_templates(ast_nodes)

@spec find_templates([WikitextEx.AST.t()]) :: [WikitextEx.AST.t()]

Find all template nodes in an AST.

Examples

iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("{{template1}} and {{template2|arg}}")
iex> templates = WikitextEx.find_templates(ast)
iex> length(templates)
2
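
The find_* functions all return lists of AST nodes, so a common next step is to read fields off the matches. The sketch below shows one way such a lookup could work and how template names might be pulled out afterwards. WalkSketch.find_by_type/2 is a hypothetical helper, not part of WikitextEx; the nodes are plain maps mirroring the struct fields shown in the parse/1 examples, and args: [] for an argument-less template is an assumption. It only descends through :children, so templates nested elsewhere (for example inside arguments) would not be found.

```elixir
defmodule WalkSketch do
  # Hypothetical helper, not part of WikitextEx: collect every node whose
  # :type equals `type`, in document order, descending through :children.
  def find_by_type(nodes, type) when is_list(nodes) do
    Enum.flat_map(nodes, fn node ->
      nested = find_by_type(Map.get(node, :children, []), type)
      if node.type == type, do: [node | nested], else: nested
    end)
  end
end

# Hand-built AST (assumed shape) for "{{template1}} and {{template2|arg}}".
ast = [
  %{type: :template, value: %{name: "template1", args: []}, children: []},
  %{type: :text, value: %{content: " and "}, children: []},
  %{type: :template, value: %{name: "template2", args: [positional: "arg"]}, children: []}
]

names =
  ast
  |> WalkSketch.find_by_type(:template)
  |> Enum.map(& &1.value.name)

IO.inspect(names)
# prints: ["template1", "template2"]
```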

parse(wikitext)

Parse wikitext markup into an AST.

Returns the same six-element tuple as a NimbleParsec parser: {:ok, ast, rest, context, position, byte_offset} on success, or {:error, reason, rest, context, position, byte_offset} on failure.

Examples

iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("'''Bold text'''")
iex> [%WikitextEx.AST{type: :bold, value: nil, children: [%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: "Bold text"}, children: []}]}] = ast
[%WikitextEx.AST{type: :bold, value: nil, children: [%WikitextEx.AST{type: :text, value: %WikitextEx.AST.Text{content: "Bold text"}, children: []}]}]

iex> {:ok, ast, _, _, _, _} = WikitextEx.parse("{{template|arg}}")
iex> [%WikitextEx.AST{type: :template, value: %WikitextEx.AST.Template{name: "template", args: [positional: "arg"]}, children: []}] = ast
[%WikitextEx.AST{type: :template, value: %WikitextEx.AST.Template{name: "template", args: [positional: "arg"]}, children: []}]
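
Callers who do not need the extra tuple fields may prefer to collapse the result first. The following is a minimal sketch under stated assumptions: ParseResultSketch is a hypothetical wrapper, not part of WikitextEx, and the :partial result is an invented convention for the case where rest is non-empty. Whether WikitextEx.parse/1 ever returns :ok with leftover input is not stated in these docs, so that clause is purely defensive. The tuples below are hand-built so the block runs without the library.

```elixir
defmodule ParseResultSketch do
  # Hypothetical wrapper, not part of WikitextEx: collapse the six-element
  # tuple into a smaller result that is easier to match on.

  # Success with all input consumed.
  def simplify({:ok, ast, "", _context, _position, _byte_offset}), do: {:ok, ast}

  # Success, but some input was left unparsed; surface it instead of dropping it.
  def simplify({:ok, ast, rest, _context, _position, _byte_offset}) when is_binary(rest) do
    {:partial, ast, rest}
  end

  # Failure: keep the reason and the byte offset reported by the parser.
  def simplify({:error, reason, _rest, _context, _position, byte_offset}) do
    {:error, {reason, byte_offset}}
  end
end

# Hand-built tuples in the documented shape (the values are illustrative only).
IO.inspect(ParseResultSketch.simplify({:ok, [], "", %{}, {1, 0}, 0}))
IO.inspect(ParseResultSketch.simplify({:ok, [], "{{", %{}, {1, 0}, 5}))
IO.inspect(ParseResultSketch.simplify({:error, "boom", "x", %{}, {1, 0}, 3}))
```

With the library loaded, the same function could be applied directly to the return value of WikitextEx.parse/1.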