ExDataSketch.DataSketches.Murmur3 (ExDataSketch v0.9.0)

Minimal MurmurHash3_x64_128 implementation for DataSketches seed hash computation.

This module implements only the subset of MurmurHash3 needed to compute the 16-bit seed hash used by Apache DataSketches for compatibility verification. The seed hash identifies which hash function/seed was used to create a sketch, preventing merges between incompatible sketches.
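As an illustration of how a seed hash can guard a merge, here is a minimal sketch. The module `SeedHashGuard` and its `check/2` function are hypothetical and not part of ExDataSketch; the sketch assumes the caller has already read the 16-bit seed hash stored with a sketch and computed the expected one.

```elixir
defmodule SeedHashGuard do
  # Hypothetical helper for illustration only; not part of ExDataSketch.
  # Compares the seed hash stored with a sketch against the expected one.
  @spec check(non_neg_integer(), non_neg_integer()) ::
          :ok | {:error, {:seed_hash_mismatch, non_neg_integer(), non_neg_integer()}}
  def check(stored, expected) when stored == expected, do: :ok

  def check(stored, expected) do
    # Different seed hashes mean the sketches were built with different
    # seeds, so merging them would give meaningless results.
    {:error, {:seed_hash_mismatch, stored, expected}}
  end
end
```

A caller might pass `ExDataSketch.DataSketches.Murmur3.seed_hash(9001)` as `expected` and refuse to merge on any `{:error, _}` result.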

Seed Hash Computation

The seed hash is computed as:

  1. Hash the 8-byte little-endian encoding of the seed using MurmurHash3_x64_128 with hash seed 0.
  2. Take the lower 16 bits of the first 64-bit output word.

For the default DataSketches seed of 9001, this always yields the same constant, 0x93CC (37836).
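The two steps above can be sketched in plain Elixir. This is a minimal illustration, not the module's actual source: the module name `SeedHashSketch` and its private helpers are hypothetical, and it assumes the standard MurmurHash3_x64_128 constants and the tail-only path that applies to an 8-byte input (no full 16-byte block is processed).

```elixir
defmodule SeedHashSketch do
  # Illustrative only; names here are assumptions, not ExDataSketch internals.
  import Bitwise

  @mask64 0xFFFFFFFFFFFFFFFF
  # Standard MurmurHash3_x64_128 mixing constants.
  @c1 0x87C37B91114253D5
  @c2 0x4CF5AD432745937F

  @spec seed_hash(non_neg_integer()) :: non_neg_integer()
  def seed_hash(seed) when is_integer(seed) and seed >= 0 and seed <= @mask64 do
    # Step 1: the input is the 8-byte little-endian encoding of the seed.
    # With only 8 bytes there is no full block, so these bytes form the
    # tail word k1 (read back as a little-endian u64).
    <<k1::little-unsigned-64>> = <<seed::little-unsigned-64>>

    # Tail mixing; h1 and h2 both start at the hash seed, which is 0.
    k1 = k1 |> mul64(@c1) |> rotl64(31) |> mul64(@c2)
    h1 = bxor(0, k1)
    h2 = 0

    # Finalization: fold in the input length (8 bytes), then mix.
    h1 = bxor(h1, 8)
    h2 = bxor(h2, 8)
    h1 = add64(h1, h2)
    h2 = add64(h2, h1)
    h1 = fmix64(h1)
    h2 = fmix64(h2)
    h1 = add64(h1, h2)

    # Step 2: keep the lower 16 bits of the first 64-bit output word.
    h1 &&& 0xFFFF
  end

  # 64-bit wrapping arithmetic, since Elixir integers are unbounded.
  defp mul64(a, b), do: a * b &&& @mask64
  defp add64(a, b), do: a + b &&& @mask64
  defp rotl64(x, r), do: (x <<< r ||| x >>> (64 - r)) &&& @mask64

  # MurmurHash3 64-bit finalizer.
  defp fmix64(k) do
    k = bxor(k, k >>> 33)
    k = mul64(k, 0xFF51AFD7ED558CCD)
    k = bxor(k, k >>> 33)
    k = mul64(k, 0xC4CEB9FE1A85EC53)
    bxor(k, k >>> 33)
  end
end
```

Because Elixir integers have arbitrary precision, every multiplication, addition, and rotation is masked back to 64 bits to mimic the wrapping arithmetic of the reference implementations.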

Summary

Functions

seed_hash(seed)

Computes the DataSketches seed hash for a given seed value.

Functions

seed_hash(seed)

@spec seed_hash(non_neg_integer()) :: non_neg_integer()

Computes the DataSketches seed hash for a given seed value.

Returns a 16-bit unsigned integer matching the value produced by org.apache.datasketches.common.Util.computeSeedHash(seed) in Java. The computation hashes the seed as a little-endian u64 using MurmurHash3_x64_128 with hash seed 0, then takes the lower 16 bits of the first output word.

Examples

iex> h = ExDataSketch.DataSketches.Murmur3.seed_hash(9001)
iex> is_integer(h) and h >= 0 and h <= 0xFFFF
true