TableauExcerptExtension (TableauExcerptExtension v1.1.0)


Tableau extension to extract excerpts from posts.

Extraction Rules

  1. If the post frontmatter already has an :excerpt field, it is preserved unchanged;
  2. If the content contains range markers (default <!--excerpt:start--> and <!--excerpt:end-->), extract the content between them;
  3. If the content contains the split marker (default <!--more-->), extract everything before it;
  4. Otherwise, fall back to structural extraction, which takes leading paragraphs, sentences, or words from the content.

The extension adds an :excerpt field to the post frontmatter. Depending on configuration, it may also remove the split marker from the post body. Range markers are left in the body by default.
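
The precedence of the rules above can be illustrated with a minimal sketch. This is not the extension's implementation: the ExcerptSketch module and its function names are hypothetical, the frontmatter is assumed to be a plain map, and rule 4 is reduced to "take the first paragraph".

```elixir
defmodule ExcerptSketch do
  @moduledoc "Hypothetical illustration of the rule order; not the extension's code."

  # Regex sigils stand in for the string patterns shown in the configuration.
  defp range_re, do: ~r/<!--\s*excerpt:start\s*-->(.*?)<!--\s*excerpt:end\s*-->/s
  defp marker_re, do: ~r/<!--\s*more\s*-->/

  # Rule 1: an excerpt already present in the frontmatter is returned unchanged.
  def excerpt(%{excerpt: excerpt}, _body) when is_binary(excerpt), do: excerpt

  # Rule 2: content between the range markers, if both are present.
  def excerpt(_frontmatter, body) do
    case Regex.run(range_re(), body, capture: :all_but_first) do
      [inner] -> String.trim(inner)
      nil -> before_marker(body)
    end
  end

  # Rule 3: everything before the split marker.
  defp before_marker(body) do
    case Regex.split(marker_re(), body, parts: 2) do
      [before, _rest] -> String.trim(before)
      [_whole] -> first_paragraph(body)
    end
  end

  # Rule 4 (simplified assumption): the first blank-line-separated paragraph.
  defp first_paragraph(body) do
    body
    |> String.split(~r/\n\s*\n/, trim: true)
    |> List.first("")
    |> String.trim()
  end
end
```

For example, ExcerptSketch.excerpt(%{}, "Lead text.\n\n<!--more-->\n\nRest.") takes the rule 3 path and returns the text before the marker.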

Configuration

config :tableau, TableauExcerptExtension,
  enabled: true,
  range: %{
    start: "<!--\s*excerpt:start\s*-->",
    end: "<!--\s*excerpt:end\s*-->"
  },
  marker: %{
    pattern: "<!--\s*more\s*-->",
    remove: true
  },
  fallback: %{
    count: 1,
    more: "…",
    strategy: :paragraph
  },
  processors: %{
    md: TableauExcerptExtension.Processor.Markdown
  }

Configuration Options

  • :enabled (default false): Enable or disable the extension

  • :range: Range marker configuration. Set to false to disable range extraction

    • :start (default "<!--\s*excerpt:start\s*-->"): Pattern for the start marker
    • :end (default "<!--\s*excerpt:end\s*-->"): Pattern for the end marker
    • :remove (default false): Remove the markers from the post body (content between markers is preserved)
  • :marker: Split marker configuration. Set to false to disable marker matching

    • :pattern (default "<!--\s*more\s*-->"): A string converted into a Regex pattern for split marker matching

    • :remove (default true): Remove the marker from the post body

  • :fallback: Structural extraction strategy when no markers are found. Set to false to disable.

    • :more (default "…"): String appended when the :word strategy truncates the excerpt mid-sentence

    • :count: The count of paragraphs, sentences or words to extract; the default depends on the strategy selected

      strategy    default
      paragraph   1
      sentence    2
      word        25
    • :strategy (default :paragraph): The extraction strategy to use

      • :paragraph: Extract count complete paragraphs
      • :sentence: Extract count sentences, stopping at the first paragraph boundary
      • :word: Extract count words, stopping at the first paragraph boundary and appending the more string if mid-sentence
  • :processors: Map of file extensions (atoms) to processor modules. Processors handle format-specific filtering and cleaning. The default is %{md: TableauExcerptExtension.Processor.Markdown}; content without an explicit processor will be passed to the Passthrough processor.
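
As an illustration of how these options combine, the fragment below disables range extraction, keeps the split marker in the body, and switches the fallback to word-based extraction. The specific values (40 words, the " […]" suffix) are arbitrary examples, and every key is spelled out because this page does not say whether omitted keys fall back to their defaults.

```elixir
import Config

config :tableau, TableauExcerptExtension,
  enabled: true,
  # Skip rule 2 entirely: range markers are not searched for.
  range: false,
  marker: %{
    pattern: "<!--\s*more\s*-->",
    # Leave the split marker in the post body.
    remove: false
  },
  fallback: %{
    strategy: :word,
    # Example value; the documented default for :word is 25.
    count: 40,
    # Appended only when the excerpt is cut mid-sentence.
    more: " […]"
  }
```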

Format Processing

Excerpts are processed by format-specific processors based on the post's file extension. Processors implement the TableauExcerptExtension.Processor behaviour with two callbacks:

  • filter_paragraphs/1: Filters paragraph-like blocks (e.g., remove headings/rules). This is only called when using the fallback structural extraction.
  • clean/2: Cleans format-specific syntax from excerpts (e.g., footnotes, reference links). This is called for all extracted excerpts (excerpts already present in post frontmatter are ignored).

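Built-in Processors

  • TableauExcerptExtension.Processor.Markdown: Used for md files by default.
  • Passthrough: Used for content without an explicit processor.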

Custom Processors

To support additional formats, implement the TableauExcerptExtension.Processor behaviour and add the module to the :processors config:

config :tableau, TableauExcerptExtension,
  processors: %{
    md: TableauExcerptExtension.Processor.Markdown,
    djot: MySite.DjotProcessor
  }
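
A processor module such as MySite.DjotProcessor might look like the sketch below. Only the callback names and arities come from this page; the argument and return shapes are assumptions (a list of paragraph strings for filter_paragraphs/1, an excerpt string plus an options value for clean/2), so check the behaviour's callback specs before relying on them.

```elixir
defmodule MySite.DjotProcessor do
  # Compiling against the real behaviour requires the extension as a dependency.
  @behaviour TableauExcerptExtension.Processor

  # ASSUMPTION: receives and returns a list of paragraph strings.
  # Drops Djot headings ("# ...") and thematic breaks ("* * *", "---").
  @impl true
  def filter_paragraphs(paragraphs) do
    Enum.reject(paragraphs, fn paragraph ->
      trimmed = String.trim(paragraph)

      String.starts_with?(trimmed, "#") or
        Regex.match?(~r/^(\*\s*){3,}$|^(-\s*){3,}$/, trimmed)
    end)
  end

  # ASSUMPTION: receives the excerpt text and a second argument that is
  # ignored here, and returns the cleaned text.
  # Strips Djot footnote references such as "[^1]".
  @impl true
  def clean(excerpt, _opts) do
    excerpt
    |> String.replace(~r/\[\^[^\]]+\]/, "")
    |> String.trim()
  end
end
```

Keeping the two callbacks separate mirrors the split described under Format Processing: filtering applies only to the structural fallback, while cleaning applies to every extracted excerpt.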