All notable changes to erli18n will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning 2.0.0.
Versioning policy
Per SemVer 2.0.0 §4, this project is in the 0.x.y initial-development phase:
- `0.x.y` → `0.x.y+1` (patch): backward-compatible bug fixes only.
- `0.x.y` → `0.x+1.0` (minor): may introduce backward-incompatible changes, announced in advance via CHANGELOG. Additive changes (new functions, new arities, new opt-in flags, new telemetry events) are the norm.
- Telemetry events are versioned per the schema policy documented in the `erli18n_telemetry` module `-moduledoc`; events marked `@stable` cannot change schema within the `0.x` series, events marked `@unstable` may.
Criteria for 1.0.0
The 1.0.0 release commits to API stability. Tag bumps to 1.0.0 only when all of the following hold:
- At least one external project uses `erli18n` in production for ≥ 6 months without reporting breaking issues.
- The Post-0.1.0 Roadmap items that affect public API surface (charset support, hot upgrade behavior, async load) are either implemented or formally rejected with rationale.
- Parity SUITE (`erli18n_parity_SUITE`) passes end-to-end against the real GNU `gettext` / `ngettext` CLI (`gettext-tools` ≥ 0.21) as oracle (currently 6 scenarios; target ≥ 20 covering the full PSD-001…009 semantics matrix).
- No `@unstable` telemetry events remain: every event is either promoted to `@stable` or removed.
- CHANGELOG documents zero behavioral changes for at least 2 consecutive minor releases.
[Unreleased]
No unreleased changes.
[0.3.0] — 2026-06-19
Phase 2: canonicalization-aware BCP-47 fallback chain + Accept-Language
negotiation (opt-in). This release is additive — a new module, four new
facade functions, one new application-env key defaulting to off, and one new
telemetry event under the existing opt-in flag. With the default configuration
every public function behaves exactly as in 0.2.0; the exact-match lookup hot
path is byte-for-byte unchanged and reads nothing extra. The minor bump follows
the 0.x SemVer policy above.
Added
- `erli18n_negotiate`: a pure, total, dependency-free engine for locale canonicalization, fallback-chain construction, and `Accept-Language` negotiation. Holds no state (no `gen_server`, ETS, or process dictionary) and is property-tested in isolation.
  - `canonicalize/1`: folds a BCP-47 / POSIX tag to the erli18n catalog-key shape (`<<"pt-BR">>` → `<<"pt_BR">>`): hyphen/underscore equivalence, RFC 5646 §2.1.1 positional casing (language lowercase, script Titlecase, region UPPERCASE), POSIX charset/modifier suffix stripping (`pt_BR.UTF-8`, `ca_ES@valencia`), and a closed legacy-language alias table (`in` → `id`, `iw` → `he`, `ji` → `yi`, `jw` → `jv`, `mo` → `ro`). Total and idempotent. Out of scope (documented non-goals): macrolanguage/script inference such as `zh_Hans` ⇄ `zh_CN` (needs the CLDR Add Likely Subtags algorithm) and grandfathered/irregular tags.
  - `fallback_chain/2`: the ordered RFC 4647 Lookup candidate list (`pt-BR` + default `en` → `[<<"pt_BR">>, <<"pt">>, <<"en">>]`), canonicalized, order-preserving-deduplicated, and bounded.
  - `parse_accept_language/1`: parses an HTTP `Accept-Language` header (RFC 9110 §12.5.4) into `[{Range, Q}]` with `Q` as an integer in milli-units (`0..1000`); absent `q` = 1000, well-formed `q=0` dropped, sorted by descending quality with a stable header-order tiebreak. Total and fail-soft; output shape matches cowlib's `cow_http_hd:parse_accept_language/1`.
  - `negotiate/2,3` and `best_match/3`: RFC 4647 Lookup of a preference list against an available-locale set, returning the first supported match (preserving the available entry's original casing), a default, or `error`.
- Façade additions on `erli18n`: `negotiate/2` (always returns a usable locale, defaulting to `default_locale/0` on no match), `parse_accept_language/1`, `canonicalize_locale/1`, and `set_locale_fallback/1`. None changes an existing arity.
- Opt-in lookup fallback chain: the four lookup families (`gettext` / `ngettext` / `pgettext` / `npgettext`, and so the interpolating `f`-family that delegates to them) consult the fallback chain only on an exact-match miss and only when enabled, so a `pt_BR` request resolves a loaded `pt` catalog instead of returning the raw `msgid`.
- Config `erli18n.locale_fallback` (env, default `off`):
  - `off`: exact match only (0.2.0 behavior; the hit path reads nothing extra).
  - `base_language`: RFC 4647 Lookup chain (`pt_BR` → `pt` → `default_locale`).
  - `{explicit, Map}`: `Map :: #{locale() => [locale()]}` override layer; an unlisted locale falls through to `base_language`.
- Telemetry `[erli18n, locale, fallback]`: emitted when a non-exact locale resolves a translation through the chain, with a `chain_depth` measurement and `requested_locale` / `resolved_locale` metadata. Opt-in under the existing `emit_lookup_telemetry` flag and kept entirely off the exact-hit path.
  - `event_locale_fallback/0` on `erli18n_telemetry`.
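
To illustrate the RFC 4647 Lookup order described for `fallback_chain/2`, here is a minimal standalone sketch. It is not the `erli18n_negotiate` implementation: the module name `chain_sketch`, the function `chain/2`, its argument order, and the `?MAX_CHAIN` macro are assumptions made for the example, and it deliberately skips positional casing, POSIX suffix stripping, the alias table, and singleton-subtag handling.

```erlang
-module(chain_sketch).
-export([chain/2]).

%% Hypothetical macro name; the value mirrors the chain cap (8) quoted in this release.
-define(MAX_CHAIN, 8).

%% Builds the candidate list for Tag, most specific first, ending with Default.
%% Default is assumed to be in catalog-key shape already.
-spec chain(binary(), binary()) -> [binary()].
chain(Tag, Default) ->
    %% Treat "-" and "_" as equivalent subtag separators.
    Subtags = binary:split(Tag, [<<"-">>, <<"_">>], [global, trim_all]),
    Candidates = truncations(Subtags) ++ [Default],
    lists:sublist(dedupe(Candidates), ?MAX_CHAIN).

%% Progressive right-truncation: [a, b, c] -> [a_b_c, a_b, a].
truncations([]) ->
    [];
truncations(Subtags) ->
    [join(Subtags) | truncations(lists:droplast(Subtags))].

join(Subtags) ->
    iolist_to_binary(lists:join(<<"_">>, Subtags)).

%% Order-preserving dedupe: the first occurrence of each candidate wins.
dedupe(List) ->
    dedupe(List, []).

dedupe([], Seen) ->
    lists:reverse(Seen);
dedupe([H | T], Seen) ->
    case lists:member(H, Seen) of
        true -> dedupe(T, Seen);
        false -> dedupe(T, [H | Seen])
    end.
```

With this sketch, `chain_sketch:chain(<<"pt-BR">>, <<"en">>)` yields `[<<"pt_BR">>, <<"pt">>, <<"en">>]`, and a request that already equals the default collapses to a single entry because of the dedupe step.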
Performance & safety
- Zero-overhead exact hit. All fallback work runs strictly in the post-miss branch and only when enabled; an exact hit remains a single `ets:lookup` with no extra allocation or config read (verified by a dedicated CT case). On a miss with fallback on, cost is O(chain length) extra reads, short-circuiting on the first hit.
- Total / fail-soft & anti-DoS. Parsing untrusted tags and headers never raises and never interns atoms (no `binary_to_atom`); bounded by per-tag (35 B), subtag (8), chain (8), header (4096 B), element (64), and range (32) caps. An invalid `locale_fallback` value degrades to `off` rather than breaking a lookup.
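
The fail-soft, bounded parsing described here can be sketched for the `Accept-Language` case as follows. This is an illustration under stated assumptions, not the library's parser: the module name `accept_language_sketch`, the macro names, and the choices to return `[]` for an oversized header, to drop elements with a malformed `q`, and to accept only a lowercase `q=` are all made up for the example. Only the cap values (4096 B header, 64 elements) and the milli-unit output shape come from the notes above.

```erlang
-module(accept_language_sketch).
-export([parse/1]).

%% Hypothetical macro names; values taken from the caps quoted above.
-define(MAX_HEADER_BYTES, 4096).
-define(MAX_ELEMENTS, 64).

%% Never raises: any input that cannot be parsed yields fewer (or no) entries.
-spec parse(term()) -> [{binary(), 0..1000}].
parse(Header) when is_binary(Header), byte_size(Header) =< ?MAX_HEADER_BYTES ->
    Raw = binary:split(Header, <<",">>, [global, trim_all]),
    Parsed = lists:filtermap(fun parse_element/1, lists:sublist(Raw, ?MAX_ELEMENTS)),
    %% lists:keysort/2 is stable, so equal qualities keep header order.
    Keyed = lists:keysort(1, [{-Q, Range} || {Range, Q} <- Parsed]),
    [{Range, -NegQ} || {NegQ, Range} <- Keyed];
parse(_OversizedOrNotBinary) ->
    [].

%% Returns {true, {Range, Q}} to keep an element, false to drop it.
parse_element(Element) ->
    case [string:trim(P) || P <- binary:split(Element, <<";">>, [global])] of
        [<<>> | _] -> false;
        [Range] -> {true, {Range, 1000}};
        [Range, <<"q=", V/binary>>] ->
            case qvalue(V) of
                {ok, 0} -> false;
                {ok, Q} -> {true, {Range, Q}};
                error -> false
            end;
        _ -> false
    end.

%% qvalue = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] ), in milli-units.
qvalue(<<"1">>) -> {ok, 1000};
qvalue(<<"1.", Zeros/binary>>) when byte_size(Zeros) =< 3 ->
    case all_in(Zeros, $0, $0) of
        true -> {ok, 1000};
        false -> error
    end;
qvalue(<<"0">>) -> {ok, 0};
qvalue(<<"0.", Digits/binary>>) when byte_size(Digits) =< 3 ->
    case all_in(Digits, $0, $9) of
        true ->
            Pad = binary:copy(<<"0">>, 3 - byte_size(Digits)),
            {ok, binary_to_integer(<<Digits/binary, Pad/binary>>)};
        false -> error
    end;
qvalue(_) -> error.

all_in(Bin, Lo, Hi) ->
    lists:all(fun(C) -> C >= Lo andalso C =< Hi end, binary_to_list(Bin)).
```

Ranges stay binaries throughout, so untrusted input never reaches the atom table, and every step either keeps or drops an element instead of raising.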
Caveats
- Likely-subtags inference is not performed. `zh-CN` canonicalizes to `zh_CN`, not `zh_Hans`; a script-only catalog (`zh_Hans`) is not matched by a region-only request (`zh_CN`) or vice versa. Load catalogs under the keys your clients send, or supply an `{explicit, Map}` mapping.
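
As a rough illustration of the `{explicit, Map}` workaround, a `sys.config` fragment might look like the one below. It assumes that locale keys are binaries in catalog-key shape and that each map entry reads "requested locale => candidates to try on a miss"; check the `erli18n` module documentation for the authoritative `locale()` type before relying on it.

```erlang
%% sys.config (sketch): route region-only Chinese requests to a script-only catalog.
%% Assumption: locale() values are binaries such as <<"zh_CN">>.
[
 {erli18n, [
   {locale_fallback, {explicit, #{
       <<"zh_CN">> => [<<"zh_Hans">>],
       <<"zh_TW">> => [<<"zh_Hant">>]
   }}}
 ]}
].
```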
[0.2.0] — 2026-06-16
Phase 1: named %{var} interpolation. This release is additive — every
change is a new function, type, or module; the existing gettext / ngettext /
pgettext / npgettext families (and their d / dc variants) are
behaviorally unchanged. The minor bump follows the 0.x SemVer policy above.
Added
- `erli18n_interp`: a pure, dependency-free substituter for named `%{name}` placeholders. `format/2` (lenient) is total and fail-soft: for any input and any bindings it returns a binary and never raises. `format/3` takes an `opts()` map whose single key, `on_missing`, selects the missing-binding policy (`lenient` | `strict`).
  - Named placeholders. `%{name}` decouples wording from argument order: a translator can move or repeat `%{name}` and the binding still resolves by name (atom keys). Values may be a binary, an iolist/string, an integer, a float, or an atom, and are coerced to UTF-8 text.
  - Escaping. A literal percent is `%%`; to emit a literal, un-substituted `%{name}`, write `%%{name}` (the `%%` collapses to `%`, leaving `{name}` untouched).
  - `lenient` vs `strict`. Lenient leaves an unbound `%{name}` in place literally; strict raises `{erli18n_interp, {missing_binding, Name}}`.
  - Anti-DoS caps. Output is bounded by `?MAX_OUTPUT_BYTES` (65536): every append (literal chunk, coerced bound value, literal placeholder) is size-checked in O(1) and the result is truncated to fit before scanning stops. Placeholder expansion is bounded by `?MAX_EXPANSIONS` (1024); past that, placeholders are emitted literally. Truncation/clamp paths use `binary:copy/1` so the returned binary does not pin a large parent binary.
- `bindings/0` type: `#{atom() => term()}`, exported from `erli18n_interp` (alongside `on_missing/0` and `opts/0`).
- Interpolating `f`-suffix façade family: 24 new functions on `erli18n` (`gettextf`, `ngettextf`, `pgettextf`, `npgettextf` and their `d` / `dc` domain-explicit variants), each with a process-locale and an explicit-locale arity. Every `f` function resolves the translation exactly like its non-`f` sibling, then splices `%{var}` values from a trailing `Bindings :: map()`. The façade `f` family is lenient (unbound placeholders stay literal; never raises); opt into `strict` by calling `erli18n_interp:format/3` directly.
- Plural count auto-bind. The `ngettextf` / `npgettextf` families auto-bind `count => N`, so `%{count}` is always available without passing it; a caller-supplied `count` wins.
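
The lenient substitution and escaping rules above can be sketched in a few lines. This is a simplified illustration, not the `erli18n_interp` source: the module name `interp_sketch` is hypothetical, values are assumed to be binaries already (no coercion), and the output and expansion caps are omitted.

```erlang
-module(interp_sketch).
-export([format/2]).

%% Lenient %{name} substitution. Assumes every bound value is a binary.
-spec format(binary(), #{atom() => binary()}) -> binary().
format(Template, Bindings) when is_binary(Template), is_map(Bindings) ->
    iolist_to_binary(scan(Template, Bindings)).

scan(<<>>, _Bindings) ->
    [];
%% "%%" collapses to a single literal "%", so "%%{name}" yields "%{name}".
scan(<<"%%", Rest/binary>>, Bindings) ->
    [<<"%">> | scan(Rest, Bindings)];
%% "%{" opens a placeholder only when a closing "}" follows.
scan(<<"%{", Rest/binary>>, Bindings) ->
    case binary:split(Rest, <<"}">>) of
        [Name, After] ->
            [resolve(Name, Bindings) | scan(After, Bindings)];
        [_NoClosingBrace] ->
            [<<"%{">> | scan(Rest, Bindings)]
    end;
scan(<<Byte, Rest/binary>>, Bindings) ->
    [Byte | scan(Rest, Bindings)].

%% Looks the name up without creating atoms; an unbound name stays literal.
resolve(Name, Bindings) ->
    Literal = <<"%{", Name/binary, "}">>,
    try binary_to_existing_atom(Name, utf8) of
        Key -> maps:get(Key, Bindings, Literal)
    catch
        error:_ -> Literal
    end.
```

For example, `interp_sketch:format(<<"Hi %{name}, 100%% of %{x}">>, #{name => <<"Ana">>})` returns `<<"Hi Ana, 100% of %{x}">>`: the bound name is spliced, the escaped percent collapses, and the unbound placeholder is left in place.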
Caveats
- Bidi / RTL. Interpolation does not auto-insert Unicode bidi isolation marks (U+2066–U+2069) around spliced values. Placing an RTL value into an LTR sentence (or the reverse) can reorder neighbouring punctuation under the Unicode Bidirectional Algorithm. Isolate mixed-direction values yourself until a future version offers opt-in isolation.
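
Until opt-in isolation exists, one way to isolate a value by hand is to wrap it in FIRST STRONG ISOLATE (U+2068) and POP DIRECTIONAL ISOLATE (U+2069) before passing it as a binding. The helper below is a caller-side sketch; `bidi_sketch` and `isolate/1` are hypothetical names, not part of erli18n.

```erlang
-module(bidi_sketch).
-export([isolate/1]).

%% Wraps a UTF-8 binary in U+2068 (FSI) ... U+2069 (PDI) so the Unicode
%% Bidirectional Algorithm treats it as an isolated run.
-spec isolate(binary()) -> binary().
isolate(Value) when is_binary(Value) ->
    <<16#2068/utf8, Value/binary, 16#2069/utf8>>.
```

The wrapped binary can then be supplied as a binding value to any of the `f`-family functions, for instance `#{name => bidi_sketch:isolate(Name)}`.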
[0.1.0] — 2026-06-14
Initial development release. The public API is functional but subject to backward-incompatible
changes on minor bumps per the 0.x SemVer policy.
Requires OTP 27 or newer. The public modules carry native `-doc` / `-moduledoc`
documentation attributes (EEP-59), which only compile on OTP 27+; OTP 25.3 and 26 reject
them at compile time with `attribute doc after function definitions`.
Added
- Core OTP application: `erli18n_app`, `erli18n_sup` (intensity `{5, 10}` hardcoded per AMB-002).
- `erli18n_server`: gen_server + ETS catalog store with anti-bottleneck pattern (hot path `lookup*` is lock-free direct ETS from caller process; writes serialized through `protected` table owner).
- `erli18n_po`: hand-written recursive-descent parser for GNU gettext `.po` format. Honors PSDs 001-009:
  - PSD-001: fuzzy entries dropped by default; opt-in via `#{include_fuzzy => true}`.
  - PSD-002: charset support restricted to UTF-8, Latin-1, US-ASCII (native to `unicode:characters_to_binary/3`).
  - PSD-003: empty `msgstr` preserved; fallback-to-msgid handled at lookup.
  - PSD-004: header `Plural-Forms` is runtime source of truth; CLDR consulted at load only for divergence warning.
  - PSD-005: UTF-8 BOM stripped silently.
  - PSD-006: msgctxt stored as a separate ETS key field, matching how GNU gettext keys contextual entries (`msgctxt` + EOT + `msgid`).
  - PSD-007: obsolete `#~` entries skipped.
  - PSD-008: degenerate plural (`nplurals=1`) accepted.
  - PSD-009: `nplurals` mismatch rejected with structured error.
- `erli18n_plural`: recursive-descent C-expression evaluator for the `Plural-Forms` header. CLDR data inlined for 49 locales. Bignum-clean.
- `erli18n_server:ensure_loaded/3,4` and `reload/3,4`: atomic catalog load (parse → compile plural → validate vs CLDR → insert), with idempotency fast-path (RISK-012 mitigation).
- `erli18n` (façade): the full GNU gettext C-macro API surface, comprising the `gettext` family (singular), `ngettext` family (plural), `pgettext` family (contextual), and `npgettext` family (contextual + plural), with `d` / `dc` aliases. Per-process locale via process dictionary; application-wide defaults via `application:get_env/2`.
- `erli18n_telemetry`: 7 `:telemetry` events as a first-class observability concern (catalog load/reload/unload spans; lookup miss/fuzzy_skip opt-in; plural divergence warning; memory warning rate-limited).
- `telemetry` declared as optional dep via `optional_applications` (OTP 24+).
- Test suite: 289 Common Test cases, green on OTP 27 and 28, spanning the façade API, gen_server / catalog, `.po` parser, plural evaluator, loader, and telemetry suites, plus PropEr properties (200 runs each) and fuzz scenarios (100–500 runs each). 6 of these are parity scenarios run against the real GNU `gettext` / `ngettext` CLI oracle; that suite skips cleanly when `gettext-tools` or the `pt_BR.UTF-8` / `ru_RU.UTF-8` locales are absent.
- Coverage: 100% of behaviorally reachable lines. Dead defensive code removed (no silent fallbacks for invariant violations; crashes are explicit via `function_clause` / `case_clause` / `badmatch`).
- Apache 2.0 license.
- GitHub Actions CI (`.github/workflows/ci.yml`): three jobs on pinned `ubuntu-24.04` runners: `lint` (fast quality gate on OTP 28), `test` (Common Test + coverage across OTP 27 and 28, with `gettext` installed and the `pt_BR.UTF-8` / `ru_RU.UTF-8` locales generated so `erli18n_parity_SUITE` exercises the oracle path), `dialyzer` (isolated job with PLT cache). CI runs automatically only on `main`; every other branch runs on demand via `workflow_dispatch`. Concurrency cancellation per ref, least-privilege `contents: read` token, rebar3 build cache keyed per OTP.
- Local CI emulation via `act` and a custom runner image (`Dockerfile.act-runner`): extends `ghcr.io/catthehacker/ubuntu:full-24.04` with ELP `2026-02-27` (SHA256-verified per SLSA v0.2). Reuses the workflow YAML unchanged; GitHub-hosted runners gracefully `[SKIP]` the ELP steps in real CI. Bootstrap is declarative in `compose.yml` (`act-toolcache` volume init + image build). `actionlint 1.7.12` pinned via `mise.toml` for static workflow analysis.
- Repo hygiene: `README.md` (with usage / install / compatibility / dev sections), `CONTRIBUTING.md`, `SECURITY.md`, `CODE_OF_CONDUCT.md` (Contributor Covenant 3.0), `.editorconfig`.
Architecture decisions
The design rationale is captured inline in the source: PO-semantics decisions
(PSD-001…PSD-009), risk mitigations (RISK-*), and ambiguity resolutions
(AMB-*) are referenced from the relevant module -moduledoc / -doc attributes
and code comments. The internal planning corpus that originally tracked them is not
part of the published package.