Space-efficient distinct counting for the BEAM, ported from Ertl 2024 (VLDB).

Hex.pm Hex Docs License

UltraLogLog is a probabilistic data structure for approximate distinct counting — same constant memory, constant-time inserts, and associative merge as HyperLogLog, but 24–28% less memory at the same accuracy. This package is a paper-faithful Elixir port, cross-validated bit-for-bit against the Hash4j Java reference (v0.17.0) on every estimator.

The algorithm comes from:

Otmar Ertl. UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting. PVLDB 17(7), 2024. [VLDB PDF]]paper · [[arXiv extended]

Quick start

Add to mix.exs:

def deps do
  [{:ultra_log_log, "~> 0.1.0"}]
end

Count distinct values in a stream:

ull = UltraLogLog.new(precision: 12)        # 4 KB, ~1.2% standard error
ull = UltraLogLog.add(ull, "session-abc")
ull = UltraLogLog.add(ull, "session-xyz")
{:ok, count} = UltraLogLog.cardinality(ull)
# count ≈ 2.0

Merge two sketches — merge/2 is commutative, associative, and idempotent:

a = UltraLogLog.new(precision: 12) |> UltraLogLog.add("x")
b = UltraLogLog.new(precision: 12) |> UltraLogLog.add("y")
merged = UltraLogLog.merge(a, b)
{:ok, count} = UltraLogLog.cardinality(merged)
# count ≈ 2.0

Trade memory for accuracy by raising the precision:

small = UltraLogLog.new(precision: 10)   #  1 KB, ~3.1% error
big   = UltraLogLog.new(precision: 14)   # 16 KB, ~0.6% error

Why UltraLogLog

HyperLogLog has been the standard answer for "how many distinct things have I seen?" since 2007: constant memory, constant-time inserts, and mergeable across shards or nodes. UltraLogLog keeps all of that.

What UltraLogLog adds is information density. Where HLL packs 6-bit registers, ULL uses byte-aligned 8-bit ones — two extra bits per slot, recovered many times over by a 2024 estimator family that extracts more signal from each register. The net is 24–28% less memory at the same standard error, depending on which estimator you choose. The byte alignment is also a BEAM win: registers map directly to plain binaries, no bit-packing on the hot path.

When not to use it: small N (just use a MapSet), or any case that requires exact counts. ULL is an approximate counter; the smallest useful precision (p=10, 1 KB) carries ~3.1% relative standard error.

Estimators

EstimatorStorage factorRel. std. errorUse when
:fgra (default)4.8950.782/√mDefault. Single-pass, no iteration.
:mle4.6310.761/√mTightest bound; secant solver runs ~5 iterations per query.
:martingale3.4660.658/√mSingle-stream sketch never merged; returns {:error, :invalidated_by_merge} after merge/2.
{:ok, count} = UltraLogLog.cardinality(ull, estimator: :mle)

Merge is a CRDT operation. Element-wise register merge under the ULL partial order is commutative, associative, and idempotent — so distributed cardinality is trivial: shards independently maintain sketches, merge on demand, and the answer is exactly as if every insert had hit a single sketch. No coordinator, no quorum, no consensus.

Empirical validation

Every estimator is cross-checked against Hash4j v0.17.0's reference implementation on the same 16 register snapshots (4 precisions × 4 checkpoints), then exercised statistically over 450 random trials. Full reports live under docs/measurements/ in the repository.

FGRA vs Hash4j (16 fixtures)

  • max relative error: 8.4e-16
  • mean: 2.6e-16

This is IEEE 754 noise — the implementations agree to the last bit of double precision.

MLE vs Hash4j (16 fixtures)

  • max relative error: 1.3e-16
  • mean: 8.4e-18
  • secant iterations: mean 4.06, max 6

Most fixtures bit-identical; the worst case is one trailing-bit flip in floating-point accumulation. The secant solver converges in 3–6 iterations on every fixture and across 100 additional random sketches per precision (300 total).

Martingale vs Hash4j (16 fixtures)

  • max relative error: 0.0
  • mean: 0.0

Bit-exact across all 16 fixtures, including the 100k-insert accumulation cases. The estimator shares Hash4j's branch-free integer formulation for the per-register state-change probability, so floating-point divergence has nowhere to come from.

Statistical correctness (15 cells × 30 trials, 450 estimates each)

All three estimators meet the paper's theoretical bounds with significant headroom on every (p, N) cell:

EstimatorWorst biasBoundWorst stddev ratioBound
:fgra+0.411% (p=10, N=10⁴)±1.339%1.134 (p=14, N=10⁵)1.5
:mle+0.404% (p=10, N=10⁶)±1.302%1.034 (p=14, N=10⁵)1.5
:martingale+0.566% (p=10, N=10⁶)±1.127%1.070 (p=14, N=100)1.5

Bias bounds are 3σ around the theoretical relative standard error; stddev bounds allow up to 50% above the theoretical figure. Run the suite locally with:

REPORT=1 mix test --include statistical

See fgra-v0.1.txt, mle-v0.1.txt, and martingale-v0.1.txt in the repository for the full per-cell tables.

Precision and memory

Precision p allocates 2^p 8-bit registers — state size is exactly 2^p bytes:

p=10   1 KB,  ~3.1% error
p=12   4 KB,  ~1.2% error   (default)
p=14  16 KB,  ~0.6% error
p=16  64 KB,  ~0.3% error

Pick the smallest precision whose error fits your use case. p=12 is a reasonable default.

Status and roadmap

  • v0.1 (current) — immutable sketch, FGRA / MLE / martingale estimators, merge, binary serialization, downsize (full implementation in v0.2), full validation against Hash4j v0.17.0.
  • v0.2 (planned) — lock-free :atomics-backed concurrent insert path; native 64-bit hash (xxhash3 NIF); benchmarks; GitHub Actions CI.
  • v0.3 (planned) — sharded inserts via PartitionSupervisor and cluster-wide merge over distributed Erlang.
  • v0.4 (potential) — ExaLogLog, the 2024 follow-up to ULL for exa-scale cardinalities.

Planned items are not promised timelines. Watch the repository or CHANGELOG.md for releases.

Citation

If you use UltraLogLog in academic work, please cite the underlying paper:

@article{ertl2024ultraloglog,
  author  = {Otmar Ertl},
  title   = {UltraLogLog: A Practical and More Space-Efficient
             Alternative to HyperLogLog for Approximate Distinct Counting},
  journal = {Proceedings of the VLDB Endowment},
  volume  = {17},
  number  = {7},
  year    = {2024},
  pages   = {1655--1668},
  url     = {https://www.vldb.org/pvldb/vol17/p1655-ertl.pdf}
}

If this package is useful in your production work, a star on the GitHub repository is appreciated.

Acknowledgements

  • Otmar Ertl (@oertl) and Dynatrace Research for both the paper and the Hash4j Java reference, which served as ground truth for every byte of register encoding and every digit of estimator output.
  • The Hash4j contributors at Dynatrace — their public source is what made paper-faithful porting realistic on a reasonable timeline.

License

Apache 2.0.