rebar3_erli18n_po_meta (rebar3_erli18n v0.1.0)


Metadata-aware PO/POT serializer for .pot/.po catalogs.

erli18n_po is the parity-verified core: its entry() is ONLY {singular, Ctx, Msgid, Tr} or {plural, Ctx, Msgid, Plural, Forms}, and dump/1 emits ONLY the msgctxt/msgid/msgid_plural/msgstr block. By design it drops #, fuzzy (PSD-001) and #~ obsolete (PSD-007) on parse, and carries no #: references or #./# comments. Those are exactly the bytes a translator's tooling and a msgmerge workflow depend on.

This module is the NEW serializer layer that wraps erli18n_po for the translatable block and emits all the metadata itself:

  • # translator comments
  • #. extracted (programmer) comments
  • #: source references (file:line)
  • #, flags (fuzzy, c-format, ...)
  • #| previous msgid / msgctxt (the msgmerge fuzzy hint)
  • #~ obsolete entries (every line of the block is #~-prefixed)

It is OUTSIDE the GNU-gettext parity oracle's coverage (that oracle checks only erli18n_po's msgstr block), so it carries its own byte-level golden CT. When the msgmerge CLI is present it also runs a msgmerge parity oracle; when the CLI is absent, that oracle is skipped cleanly.

Serialization order

Within one entry the lines are emitted in canonical GNU order: translator comments, extracted comments, references, flags, previous-msgid, then the translatable block. Obsolete entries omit references/flags/prev-msgid (GNU emits only #~-prefixed block lines for obsoletes) and prefix every block line with #~.

Cost

Serialization is a single O(entries + references) streaming pass; each entry's metadata block is emitted independently as the entry is written, so cost grows linearly with entry-plus-reference count, never with file size or a catalog cross-product.

Summary

Types

  • body(): The translatable core of an entry, in the SAME shape erli18n_po:entry() uses so it can be handed to erli18n_po:dump/1 unchanged.
  • catalog(): A metadata-bearing catalog: a header (raw msgstr text, as erli18n_po keeps it) plus an ordered list of meta_entry().
  • meta_entry(): A full PO entry: its translatable body() plus the metadata lines that erli18n_po cannot represent.

Functions

  • dump/1: Serialize a metadata-bearing catalog() to PO/POT bytes.
  • dump_entry(Entry): Serialize one meta_entry() to its PO bytes, metadata first then the body.
  • msgid_equal(A, B): Compare two msgids for LOGICAL equality, ignoring PO line-wrapping.

Types

body()

-type body() ::
          {singular, Context :: undefined | binary(), Msgid :: binary(), Translation :: binary()} |
          {plural,
           Context :: undefined | binary(),
           Msgid :: binary(),
           MsgidPlural :: undefined | binary(),
           Forms :: [{non_neg_integer(), binary()}]}.

The translatable core of an entry, in the SAME shape erli18n_po:entry() uses so it can be handed to erli18n_po:dump/1 unchanged.

catalog()

-type catalog() :: #{header := binary(), entries := [meta_entry()]}.

A metadata-bearing catalog: a header (raw msgstr text, as erli18n_po keeps it) plus an ordered list of meta_entry().

meta_entry()

-type meta_entry() ::
          #{body := body(),
            comments => [binary()],
            extracted => [binary()],
            references => [{string() | binary(), pos_integer()}],
            flags => [atom() | binary()],
            previous =>
                undefined |
                {undefined | binary(), binary()} |
                {undefined | binary(), binary(), binary()},
            obsolete => boolean()}.

A full PO entry: its translatable body() plus the metadata lines that erli18n_po cannot represent.

  • comments: # translator comment lines (text only, no leading #).
  • extracted: #. extracted-comment lines (text only).
  • references: #: references as {File, Line}; emitted as file:line.
  • flags: #, flags as atoms/binaries (e.g. fuzzy); fuzzy is the one merge sets.
  • previous: #| previous-msgid hint: undefined, or {Ctx, Msgid} / {Ctx, Msgid, MsgidPlural} carried verbatim from the matched old entry.
  • obsolete: true emits the whole entry as a #~ block (GNU obsolete).
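
To make the map shape concrete, here is a minimal sketch of two meta_entry() values built from the keys listed above. The module name meta_entry_example, the file path, the line number, and all strings are invented for illustration; only the key names and tuple shapes come from the type definition.

```erlang
-module(meta_entry_example).
-export([fuzzy_entry/0, obsolete_entry/0]).

%% A fuzzy singular entry with every optional key populated.
%% All values are made up; only the keys follow the meta_entry() type.
fuzzy_entry() ->
    #{body => {singular, undefined, <<"Hello, ~s!">>, <<"Bonjour, ~s !">>},
      comments => [<<"Check the spacing before the exclamation mark.">>],
      extracted => [<<"Greeting shown on the landing page.">>],
      references => [{"src/greeter.erl", 42}],
      flags => [fuzzy],
      %% {Ctx, Msgid} of the old entry this one was fuzzy-matched against.
      previous => {undefined, <<"Hello ~s!">>},
      obsolete => false}.

%% body is the only required key; obsolete => true marks a #~ entry.
obsolete_entry() ->
    #{body => {singular, undefined, <<"Old string">>, <<"Ancien texte">>},
      obsolete => true}.
```

Because every key except body is optional in the type, the second value is about the smallest map that still describes an obsolete entry.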

Functions

dump/1

-spec dump(catalog()) -> binary().

Serialize a metadata-bearing catalog() to PO/POT bytes.

The header is emitted via erli18n_po:dump/1 (so it inherits the header fidelity), then each entry via dump_entry/1, separated by blank lines.

dump_entry(Entry)

-spec dump_entry(meta_entry()) -> binary().

Serialize one meta_entry() to its PO bytes, metadata first then the body.

A non-obsolete entry emits, in order: # comments, #. extracted comments, #: references, #, flags, #| previous-msgid, then the translatable block (delegated to erli18n_po:dump/1). An obsolete entry emits its comments then the whole block #~-prefixed, with no references/flags/prev-msgid (matching GNU msgmerge output for obsoletes). Every entry ends with one trailing blank line.
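
The ordering described above can be sketched as follows. This is not the module's implementation: po_meta_order_sketch and all of its helpers are hypothetical, the body serializer is a naive stand-in for erli18n_po:dump/1 (singular entries only, no escaping or wrapping), and placing all references on one #: line and all flags on one #, line is an assumption about layout rather than something the text above specifies.

```erlang
-module(po_meta_order_sketch).
-export([dump_entry/1]).

%% Sketch only: mirrors the documented line order, not the real module.
dump_entry(#{body := Body} = E) ->
    Comments = [[<<"# ">>, C] || C <- maps:get(comments, E, [])],
    Block = body_lines(Body),
    Lines =
        case maps:get(obsolete, E, false) of
            true ->
                %% Obsolete: comments, then every block line #~-prefixed;
                %% references, flags and previous-msgid are dropped.
                Comments ++ [[<<"#~ ">>, L] || L <- Block];
            false ->
                Comments
                ++ [[<<"#. ">>, X] || X <- maps:get(extracted, E, [])]
                ++ reference_lines(maps:get(references, E, []))
                ++ flag_lines(maps:get(flags, E, []))
                ++ previous_lines(maps:get(previous, E, undefined))
                ++ Block
        end,
    %% Newline after each line, then one trailing blank line.
    iolist_to_binary([[L, $\n] || L <- Lines] ++ [$\n]).

%% Assumption: all references share a single space-separated #: line.
reference_lines([]) -> [];
reference_lines(Refs) ->
    Items = [[File, $:, integer_to_binary(Line)] || {File, Line} <- Refs],
    [[<<"#: ">>, lists:join($\s, Items)]].

%% Assumption: all flags share a single comma-separated #, line.
flag_lines([]) -> [];
flag_lines(Flags) ->
    [[<<"#, ">>, lists:join(<<", ">>, [flag_text(F) || F <- Flags])]].

flag_text(F) when is_atom(F) -> atom_to_binary(F, utf8);
flag_text(F) when is_binary(F) -> F.

previous_lines(undefined) -> [];
previous_lines({Ctx, Msgid}) ->
    prev_ctx(Ctx) ++ [[<<"#| msgid ">>, quote(Msgid)]];
previous_lines({Ctx, Msgid, Plural}) ->
    previous_lines({Ctx, Msgid}) ++ [[<<"#| msgid_plural ">>, quote(Plural)]].

prev_ctx(undefined) -> [];
prev_ctx(Ctx) -> [[<<"#| msgctxt ">>, quote(Ctx)]].

%% Naive stand-in for the real body serializer: singular entries only.
body_lines({singular, Ctx, Msgid, Tr}) ->
    ctx_lines(Ctx) ++ [[<<"msgid ">>, quote(Msgid)], [<<"msgstr ">>, quote(Tr)]].

ctx_lines(undefined) -> [];
ctx_lines(Ctx) -> [[<<"msgctxt ">>, quote(Ctx)]].

%% No escaping or line wrapping; a real serializer must handle both.
quote(Bin) -> [$", Bin, $"].
```

Building the output as a list of lines and prefixing at the end keeps the obsolete case to a single comprehension over the block, which is one way to guarantee that every block line gets the #~ prefix.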

msgid_equal(A, B)

-spec msgid_equal(binary(), binary()) -> boolean().

Compare two msgids for LOGICAL equality, ignoring PO line-wrapping.

A .po may wrap a long msgid across multiple "..." continuation lines, or emit it on one line (as with --no-wrap); both decode to the same logical string. Since both operands here are already DECODED binaries (the parser joined continuations), this is plain binary equality. The function exists to name the contract at merge call sites and to keep the wrapping-equality guarantee explicit and testable.
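
A small sketch of that contract, under stated assumptions: msgid_equal_sketch is a hypothetical module, and join_continuations/1 is an invented, simplified stand-in for the parser's continuation joining (it strips the surrounding quotes and concatenates, with no escape handling). It only illustrates why a wrapped and an unwrapped msgid compare equal once decoded.

```erlang
-module(msgid_equal_sketch).
-export([msgid_equal/2, join_continuations/1]).

%% Both arguments are already-decoded binaries, so exact equality suffices.
msgid_equal(A, B) when is_binary(A), is_binary(B) ->
    A =:= B.

%% Hypothetical stand-in for the parser: each element is one quoted
%% "..." line as it appears in the .po file. Assumes well-formed input.
join_continuations(Lines) ->
    iolist_to_binary([strip_quotes(L) || L <- Lines]).

%% Drop the leading and trailing double quote of one line.
strip_quotes(<<$", Rest/binary>>) ->
    binary:part(Rest, 0, byte_size(Rest) - 1).
```

For example, the wrapped form `[<<"\"\"">>, <<"\"A long message \"">>, <<"\"split across lines\"">>]` and the single-line form `[<<"\"A long message split across lines\"">>]` join to the same binary, so msgid_equal/2 returns true for the pair.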