Stable 64-bit hash interface for ExDataSketch.
All sketch algorithms require a deterministic hash function that maps arbitrary Elixir terms to 64-bit unsigned integers. This module provides that interface with automatic backend selection and a pure-Elixir fallback.
Hash Properties
- Output range: 0..2^64-1 (unsigned 64-bit integer).
- Deterministic: the same input always produces the same output within the same runtime configuration.
- Uniform distribution: output bits are well-distributed for sketch accuracy.
Auto-detection
When no custom :hash_fn is provided, hash64/2 automatically selects the
best available hash implementation:
- XXHash3 (NIF): When the Rust NIF is loaded, hash64/2 uses XXHash3, which produces native 64-bit hashes with zero Elixir-side overhead. XXHash3 output is stable across platforms.
- phash2 + mix64 (pure): When the NIF is not available, hash64/2 falls back to :erlang.phash2/2 with a fixnum-safe 64-bit mixer. The mixer uses 16-bit partial products to avoid bigint heap allocations while preserving full 64-bit output quality.
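To make the shape of the pure fallback concrete, here is a minimal sketch that widens :erlang.phash2/2 output to 64 bits and runs it through a mixer. It is not the library's implementation. The module name FallbackHashSketch, the way the seed is folded in, and the choice of the splitmix64 finalizer are all assumptions made for illustration. This version also uses ordinary masked multiplication, so it may allocate bigints, which is exactly what the library's 16-bit partial-product mixer is described as avoiding.

```elixir
defmodule FallbackHashSketch do
  # Hypothetical module, for illustration only.
  import Bitwise

  @mask64 0xFFFFFFFFFFFFFFFF
  # :erlang.phash2/2 accepts a range of at most 2^32.
  @range32 4_294_967_296

  # Builds a 64-bit value from two 32-bit phash2 results, then mixes it.
  # Folding the seed into the hashed tuple is an assumption of this sketch.
  def hash64(term, seed \\ 0) do
    hi = :erlang.phash2({seed, term}, @range32)
    lo = :erlang.phash2({term, seed}, @range32)
    mix64(bor(hi <<< 32, lo))
  end

  # splitmix64 finalizer; each product is masked back to 64 bits.
  def mix64(z) do
    z = band(bxor(z, z >>> 30) * 0xBF58476D1CE4E5B9, @mask64)
    z = band(bxor(z, z >>> 27) * 0x94D049BB133111EB, @mask64)
    bxor(z, z >>> 31)
  end
end
```

Because the result is built on :erlang.phash2/2, it inherits the same caveat as the real fallback: values are only comparable within one OTP major version.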
The NIF availability check is performed once and cached in
:persistent_term for zero-cost subsequent lookups.
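The check-once-then-cache pattern described here can be sketched as follows. The module name NifCheckSketch, the cache key, and the caller-supplied probe function are hypothetical; how the library actually detects the NIF is not shown in this documentation.

```elixir
defmodule NifCheckSketch do
  # Hypothetical module, for illustration only.
  @key {__MODULE__, :available}

  # Runs `probe` at most once per VM and caches its boolean result.
  # The :unset sentinel lets a cached `false` be told apart from "not yet checked".
  def available?(probe) when is_function(probe, 0) do
    case :persistent_term.get(@key, :unset) do
      :unset ->
        result = probe.()
        :persistent_term.put(@key, result)
        result

      cached ->
        cached
    end
  end
end
```

Reads from :persistent_term do not copy the stored term, which is why repeated lookups are cheap. Writes are comparatively expensive, so the pattern suits values that are set once and never change.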
Pluggable Hash
Pass hash_fn: fn term -> non_neg_integer end to override the default.
The custom function must return values in 0..2^64-1.
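As one possible custom function, the sketch below truncates a SHA-256 digest to its first 8 bytes, which guarantees a result in 0..2^64-1. The module name CustomHashSketch is hypothetical, and encoding the term with :erlang.term_to_binary/1 is an assumption; any encoding works as long as it is used consistently.

```elixir
defmodule CustomHashSketch do
  # Hypothetical module, for illustration only. Requires the :crypto application.

  # Returns the first 64 bits of SHA-256 over the term's external encoding.
  def sha256_64(term) do
    <<h::unsigned-integer-size(64), _rest::binary>> =
      :crypto.hash(:sha256, :erlang.term_to_binary(term))

    h
  end
end

# It would then be passed through the :hash_fn option, for example:
#   ExDataSketch.Hash.hash64("hello", hash_fn: &CustomHashSketch.sha256_64/1)
```

A cryptographic digest is slower than XXHash3, so this trades throughput for an output that does not depend on the NIF being present.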
Stability
:erlang.phash2/2 output is not guaranteed stable across OTP major versions.
XXHash3 output is stable across platforms. For cross-version stability, use
the NIF build (XXHash3) or supply a custom :hash_fn.
Types
@type algorithm_info() :: %{ id: hash_strategy(), name: String.t(), output_bits: 64, has_seed: boolean(), available?: boolean(), stability: :stable | :otp_dependent | :runtime_dependent }
Static description of a hash algorithm. Returned by algorithm_info/1.
@type hash64() :: non_neg_integer()
@type hash_opt() :: {:seed, non_neg_integer()} | {:hash_fn, (term() -> hash64())} | {:hash_strategy, hash_strategy()}
@type hash_strategy() :: :phash2 | :xxhash3 | :murmur3 | :custom
@type opts() :: [hash_opt()]
Functions
@spec algorithm_info(hash_strategy()) :: algorithm_info()
Returns the static descriptor for a hash algorithm.
See the algorithm_info/0 type for the returned map shape.
Examples
iex> info = ExDataSketch.Hash.algorithm_info(:xxhash3)
iex> info.id
:xxhash3
iex> info.output_bits
64
iex> info = ExDataSketch.Hash.algorithm_info(:murmur3)
iex> info.has_seed
true
iex> info.stability
:stable
iex> info = ExDataSketch.Hash.algorithm_info(:phash2)
iex> info.stability
:otp_dependent
@spec default_algorithm() :: :xxhash3 | :phash2
Returns the default hash algorithm for new sketches.
This is the v0.8.0 successor to default_hash_strategy/0 and uses the
same selection logic. Prefer this name in new code; the old name is
retained for backward compatibility.
Examples
iex> ExDataSketch.Hash.default_algorithm() in [:xxhash3, :phash2]
true
@spec default_hash_strategy() :: :xxhash3 | :phash2
Returns the default hash strategy based on NIF availability.
Returns :xxhash3 when the NIF is loaded, :phash2 otherwise.
Hashes an arbitrary Elixir term to a 64-bit unsigned integer.
When no :hash_fn is provided, automatically uses XXHash3 via NIF if
available, otherwise falls back to phash2 with fixnum-safe bit mixing.
Options
- :seed - seed value for the hash (default: 0). Combined with the base hash.
- :hash_fn - custom hash function (term -> 0..2^64-1). When provided, :seed is ignored and the function is called directly.
Examples
iex> h = ExDataSketch.Hash.hash64("hello")
iex> is_integer(h) and h >= 0
true
iex> ExDataSketch.Hash.hash64("hello") == ExDataSketch.Hash.hash64("hello")
true
iex> ExDataSketch.Hash.hash64("hello") != ExDataSketch.Hash.hash64("world")
true
iex> ExDataSketch.Hash.hash64("test", seed: 42) != ExDataSketch.Hash.hash64("test", seed: 0)
true
Hashes a raw binary to a 64-bit unsigned integer.
Operates directly on binary bytes without term encoding overhead. Useful when the input is already binary data (e.g., from external sources).
When no :hash_fn is provided, automatically uses XXHash3 via NIF if
available, otherwise falls back to phash2 with fixnum-safe bit mixing.
Options
Same as hash64/2.
Examples
iex> h = ExDataSketch.Hash.hash64_binary(<<1, 2, 3>>)
iex> is_integer(h) and h >= 0
true
iex> ExDataSketch.Hash.hash64_binary(<<"abc">>) == ExDataSketch.Hash.hash64_binary(<<"abc">>)
true
@spec nif_available?() :: boolean()
Returns whether the NIF is available for hashing.
The result is computed once and cached in :persistent_term.
@spec resolve_strategy(keyword()) :: hash_strategy()
Resolves the effective hash strategy for a sketch given user options.
Resolution precedence:
- If :hash_fn is set → :custom (closure-based, never merge-compatible).
- If the caller passed :hash_strategy, that value is honored. Unknown values are rejected with ArgumentError.
- Otherwise default_algorithm/0 is used.
This is the single source of truth for sketch constructors. It exists to
let callers select :murmur3 (Apache DataSketches interop) or :phash2
(BEAM-only fallback) at sketch creation time without surprising the
default-choice machinery.
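The precedence rules above can be sketched as a small standalone function. This is an illustration of the documented order, not the library's source: the module name ResolveSketch is hypothetical, the default is passed in explicitly rather than detected, and treating only :phash2, :xxhash3, and :murmur3 as selectable values is an assumption.

```elixir
defmodule ResolveSketch do
  # Hypothetical module, for illustration only.
  @selectable [:phash2, :xxhash3, :murmur3]

  # Applies the documented order: :hash_fn first, then :hash_strategy, then the default.
  def resolve(opts, default) do
    cond do
      Keyword.has_key?(opts, :hash_fn) ->
        :custom

      Keyword.has_key?(opts, :hash_strategy) ->
        validate!(Keyword.fetch!(opts, :hash_strategy))

      true ->
        default
    end
  end

  defp validate!(strategy) when strategy in @selectable, do: strategy

  defp validate!(other) do
    raise(ArgumentError, "unknown hash strategy: #{inspect(other)}")
  end
end
```

Note that :hash_fn wins even when :hash_strategy is also present, which matches the first rule in the list.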
Examples
iex> ExDataSketch.Hash.resolve_strategy([])
ExDataSketch.Hash.default_algorithm()
iex> ExDataSketch.Hash.resolve_strategy(hash_strategy: :murmur3)
:murmur3
iex> ExDataSketch.Hash.resolve_strategy(hash_fn: fn _ -> 0 end)
:custom
iex> ExDataSketch.Hash.resolve_strategy(hash_strategy: :phash2)
:phash2
iex> try do
...> ExDataSketch.Hash.resolve_strategy(hash_strategy: :sha256)
...> rescue
...> ArgumentError -> :raised
...> end
:raised
@spec supported_algorithms() :: [hash_strategy()]
Returns the list of hash algorithm identifiers supported by this build.
:custom is included to indicate that user-supplied :hash_fn closures
are an accepted hash strategy, but they are NEVER returned by
default_algorithm/0 and are NEVER merge-compatible across sketches.
Examples
iex> algos = ExDataSketch.Hash.supported_algorithms()
iex> Enum.all?([:phash2, :xxhash3, :murmur3, :custom], &(&1 in algos))
true
Validates that two sets of sketch options have compatible hashing configuration.
Raises ExDataSketch.Errors.IncompatibleSketchesError if:
- Either sketch uses a custom :hash_fn (closures cannot be compared)
- Hash strategies differ (e.g. :xxhash3 vs :phash2)
- Seeds differ (default is 0)
This is a backward-compatible shim over
ExDataSketch.Hash.Validation.validate_options!/3. Prefer the new module
in new code; this function remains stable for all v0.x sketches.
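The three incompatibility conditions can be sketched as a pure check. This is a hedged illustration only: the module name MergeCheckSketch is hypothetical, it returns tagged tuples instead of raising ExDataSketch.Errors.IncompatibleSketchesError, and the default strategy is supplied by the caller rather than detected.

```elixir
defmodule MergeCheckSketch do
  # Hypothetical module, for illustration only.

  # Returns :ok when both option lists describe the same hashing setup,
  # or {:error, reason} naming the first condition that failed.
  def check(opts_a, opts_b, default_strategy) do
    cond do
      Keyword.has_key?(opts_a, :hash_fn) or Keyword.has_key?(opts_b, :hash_fn) ->
        {:error, :custom_hash_fn}

      strategy(opts_a, default_strategy) != strategy(opts_b, default_strategy) ->
        {:error, :strategy_mismatch}

      Keyword.get(opts_a, :seed, 0) != Keyword.get(opts_b, :seed, 0) ->
        {:error, :seed_mismatch}

      true ->
        :ok
    end
  end

  defp strategy(opts, default), do: Keyword.get(opts, :hash_strategy, default)
end
```

Custom functions are rejected before anything else is compared, because two closures cannot be shown to compute the same hash.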
Hashes a binary using XXHash3 (64-bit) via Rust NIF.
Returns a deterministic 64-bit hash that is stable across platforms and versions when the Rust NIF is available. Falls back to the phash2-based hash if the NIF is not loaded; the fallback is NOT stable across OTP major versions (see module docs).
This function operates on raw binary data. For Elixir terms, convert to
binary first (e.g., using :erlang.term_to_binary/1 or to_string/1).
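The two conversions mentioned above produce different bytes for the same term, so they also produce different hashes. The short script below shows the difference for the integer 42; whichever encoding is chosen needs to be applied consistently to every item fed into a sketch.

```elixir
term = 42

# Text form: the two ASCII characters "4" and "2".
as_text = to_string(term)

# External term format: version tag 131, small-integer tag 97, then the value.
as_etf = :erlang.term_to_binary(term)

IO.inspect(as_text)
IO.inspect(as_etf)
```

to_string/1 only works for terms that implement String.Chars, while :erlang.term_to_binary/1 accepts any term.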
Examples
iex> h = ExDataSketch.Hash.xxhash3_64("hello")
iex> is_integer(h) and h >= 0
true
iex> ExDataSketch.Hash.xxhash3_64("hello") == ExDataSketch.Hash.xxhash3_64("hello")
true
@spec xxhash3_64(binary(), non_neg_integer()) :: hash64()
Hashes a binary using XXHash3 (64-bit) with a seed via Rust NIF.
Falls back to the phash2-based hash if the NIF is not available.
Examples
iex> h = ExDataSketch.Hash.xxhash3_64("hello", 42)
iex> is_integer(h) and h >= 0
true
iex> ExDataSketch.Hash.xxhash3_64("hello", 0) != ExDataSketch.Hash.xxhash3_64("hello", 42)
true