Tableau extension to extract excerpts from posts.
Extraction Rules
- If the post frontmatter already has an :excerpt field, it is preserved unchanged;
- If the content contains range markers (default <!--excerpt:start--> and <!--excerpt:end-->), extract the content between them;
- If the content contains the split marker (default <!--more-->), extract everything before it;
- Otherwise, use structural extraction to extract content based on text structure.
The post frontmatter is updated to add an :excerpt field, and the post body may be
updated to remove the split marker (depending on configuration). Range markers are
left in the body unless the range :remove option is enabled.
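
The precedence of the rules above can be illustrated with a minimal sketch. This is not the extension's implementation: the module name ExcerptRulesSketch and its functions are hypothetical, the patterns are the documented defaults written as regex sigils, and the fallback is simplified to "first paragraph".

```elixir
defmodule ExcerptRulesSketch do
  # Hypothetical module: mirrors the documented rule order only.

  # Rule 1: an existing :excerpt in the frontmatter wins and is left untouched.
  def extract(%{excerpt: excerpt}, _body) when is_binary(excerpt), do: {:frontmatter, excerpt}

  def extract(_frontmatter, body) do
    with :nomatch <- between_range(body),
         :nomatch <- before_marker(body) do
      # Rule 4: structural extraction, simplified here to the first paragraph.
      {:fallback, body |> String.split(~r/\n{2,}/, parts: 2) |> hd() |> String.trim()}
    end
  end

  # Rule 2: content between the range markers.
  defp between_range(body) do
    with [_head, rest] <- String.split(body, ~r/<!--\s*excerpt:start\s*-->/, parts: 2),
         [inside, _tail] <- String.split(rest, ~r/<!--\s*excerpt:end\s*-->/, parts: 2) do
      {:range, String.trim(inside)}
    else
      _ -> :nomatch
    end
  end

  # Rule 3: everything before the split marker.
  defp before_marker(body) do
    case String.split(body, ~r/<!--\s*more\s*-->/, parts: 2) do
      [head, _tail] -> {:marker, String.trim(head)}
      _ -> :nomatch
    end
  end
end

body = "Intro paragraph.\n\n<!--more-->\n\nRest of the post."
IO.inspect(ExcerptRulesSketch.extract(%{}, body))
# => {:marker, "Intro paragraph."}
```

The tagged tuples exist only to show which rule matched; they are not part of the extension's output.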
Configuration
    config :tableau, TableauExcerptExtension,
      enabled: true,
      range: %{
        start: "<!--\s*excerpt:start\s*-->",
        end: "<!--\s*excerpt:end\s*-->"
      },
      marker: %{
        pattern: "<!--\s*more\s*-->",
        remove: true
      },
      fallback: %{
        count: 1,
        more: "…",
        strategy: :paragraph
      },
      processors: [
        md: TableauExcerptExtension.Processor.Markdown
      ]

Configuration Options
- :enabled (default false): Enable or disable the extension
- :range: Range marker configuration. Set to false to disable range extraction
  - :start (default "<!--\s*excerpt:start\s*-->"): Pattern for the start marker
  - :end (default "<!--\s*excerpt:end\s*-->"): Pattern for the end marker
  - :remove (default false): Remove the markers from the post body (content between markers is preserved)
- :marker: Split marker configuration. Set to false to disable marker matching
  - :pattern (default "<!--\s*more\s*-->"): A string converted into a Regex pattern for split marker matching
  - :remove (default true): Remove the marker from the post body
- :fallback: Structural extraction strategy when no markers are found. Set to false to disable.
  - :more (default "…"): If using :word mode and the excerpt is truncated mid-sentence, this string will be appended
  - :count: The count of paragraphs, sentences or words to extract; the default depends on the strategy selected

    | strategy  | default |
    |-----------|---------|
    | paragraph | 1       |
    | sentence  | 2       |
    | word      | 25      |

  - :strategy (default :paragraph): The extraction strategy to use
    - :paragraph: Extract count complete paragraphs
    - :sentence: Extract count sentences, stopping at the first paragraph boundary
    - :word: Extract count words, stopping at the first paragraph boundary and appending the more string if mid-sentence
- :processors: Map of file extensions (atoms) to processor modules. Processors handle format-specific filtering and cleaning. The default is %{md: TableauExcerptExtension.Processor.Markdown}; content without an explicit processor will be passed to the Passthrough processor.
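
To make the three fallback strategies concrete, here is a rough sketch of how they might behave. The module FallbackSketch is hypothetical and not part of the extension. Its sentence splitting is deliberately naive, and "mid-sentence" is interpreted as "the cut does not end on ., ! or ?", which is an assumption.

```elixir
defmodule FallbackSketch do
  # Hypothetical helper illustrating the documented strategies.

  def extract(text, strategy, count, more \\ "…")

  # :paragraph takes `count` complete paragraphs.
  def extract(text, :paragraph, count, _more) do
    text |> paragraphs() |> Enum.take(count) |> Enum.join("\n\n")
  end

  # :sentence takes `count` sentences, never leaving the first paragraph.
  def extract(text, :sentence, count, _more) do
    text
    |> first_paragraph()
    |> String.split(~r/(?<=[.!?])\s+/)
    |> Enum.take(count)
    |> Enum.join(" ")
  end

  # :word takes `count` words, never leaving the first paragraph.
  def extract(text, :word, count, more) do
    words = text |> first_paragraph() |> String.split()
    excerpt = words |> Enum.take(count) |> Enum.join(" ")

    # Append `more` only when text was cut and the cut is not at sentence-ending punctuation.
    if length(words) > count and not String.match?(excerpt, ~r/[.!?]$/) do
      excerpt <> more
    else
      excerpt
    end
  end

  defp paragraphs(text), do: text |> String.trim() |> String.split(~r/\n{2,}/)
  defp first_paragraph(text), do: text |> paragraphs() |> hd()
end

text = "One two three. Four five.\n\nSecond paragraph here."
IO.inspect(FallbackSketch.extract(text, :sentence, 1))
# => "One two three."
IO.inspect(FallbackSketch.extract(text, :word, 2))
# => "One two…"
```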
Format Processing
Excerpts are processed by format-specific processors based on the post's file extension.
Processors implement the TableauExcerptExtension.Processor behaviour with two
callbacks:
- filter_paragraphs/1: Filters paragraph-like blocks (e.g., remove headings/rules). This is only called when using the fallback structural extraction.
- clean/2: Cleans format-specific syntax from excerpts (e.g., footnotes, reference links). This is called for all extracted excerpts (excerpts already present in post frontmatter are ignored).
Built-in Processors
- TableauExcerptExtension.Processor.Markdown: Filters headings/rules, cleans footnotes and reference links
- TableauExcerptExtension.Processor.Passthrough: Passthrough processor for unknown formats
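
Selecting a processor by file extension, with Passthrough as the catch-all, might look roughly like the following. ProcessorLookupSketch and for_file/2 are hypothetical names used for illustration; the extension's real lookup may differ.

```elixir
defmodule ProcessorLookupSketch do
  # Hypothetical helper: picks a processor module for a post's source file.
  @default_processors %{md: TableauExcerptExtension.Processor.Markdown}

  def for_file(path, processors \\ @default_processors) do
    ext = path |> Path.extname() |> String.trim_leading(".")

    # Compare as strings so unknown extensions never create new atoms.
    Enum.find_value(processors, TableauExcerptExtension.Processor.Passthrough, fn {key, mod} ->
      if Atom.to_string(key) == ext, do: mod
    end)
  end
end

IO.inspect(ProcessorLookupSketch.for_file("_posts/notes.txt"))
# => TableauExcerptExtension.Processor.Passthrough
```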
Custom Processors
To support additional formats, implement the TableauExcerptExtension.Processor
behaviour and add the module to the :processors config:
    config :tableau, TableauExcerptExtension,
      processors: %{
        md: TableauExcerptExtension.Processor.Markdown,
        djot: MySite.DjotProcessor
      }
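
A custom processor such as MySite.DjotProcessor could be sketched as below. The argument and return shapes are assumptions, since only the callback names and arities are documented here: filter_paragraphs/1 is assumed to receive and return a list of paragraph strings, and clean/2 is assumed to receive the excerpt text plus a second argument that this sketch ignores. Check the behaviour's callback specs before relying on either.

```elixir
defmodule MySite.DjotProcessor do
  # Sketch only. In a project that depends on the extension, also declare:
  #   @behaviour TableauExcerptExtension.Processor

  # Assumption: receives a list of paragraph strings and returns those worth keeping.
  def filter_paragraphs(paragraphs) when is_list(paragraphs) do
    Enum.reject(paragraphs, fn paragraph ->
      trimmed = String.trim(paragraph)

      # Drop headings ("# Title") and thematic breaks ("***", "---", "* * *").
      String.match?(trimmed, ~r/^#+\s/) or String.match?(trimmed, ~r/^([-*][ \t]*){3,}$/)
    end)
  end

  # Assumption: receives the excerpt text and a second argument that is ignored here.
  def clean(excerpt, _context) when is_binary(excerpt) do
    excerpt
    # Strip footnote references such as [^1].
    |> String.replace(~r/\[\^[^\]]+\]/, "")
    |> String.trim()
  end
end

IO.inspect(MySite.DjotProcessor.filter_paragraphs(["# Title", "Body text.", "* * *"]))
# => ["Body text."]
```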