ExDataSketch.Binary.Header (ExDataSketch v0.8.0)


EXSK v2 binary frame header.

This module owns the layout of the v2 frame used to wrap every persisted sketch from ex_data_sketch v0.8.0 onward. It is decoupled from the ExDataSketch.Codec historical entry point so that the Phase 2 frame format can evolve without disturbing the v1 reader and the existing golden vectors.

v2 Layout

All multi-byte integers are little-endian. CRC is computed over every byte preceding it (offsets 0 .. crc_off - 1).

off       size  field                          notes
  0         4   magic "EXSK"                   identical to v1
  4         1   serialization_version (u8)     = 2
  5         1   sketch_family (u8)             matches the EXSK sketch_id
  6         1   family_version (u8)            sketch-specific layout version
  7         1   flags (u8)                     reserved; must be 0 in v2
  8         2   header_size (u16 LE)           total bytes from offset 0 up to (not including) the payload
 10         M   hash_metadata block (variable) `ExDataSketch.Hash.Metadata`
10+M        4   payload_size (u32 LE)
14+M        N   payload                        sketch-specific (params + state encoding)
14+M+N      4   crc32c                         over bytes [0 .. 14+M+N - 1]

Total frame size: 18 + M + N bytes.

The header_size field equals 10 + M + 4, i.e. the offset of the payload. A reader can use it as a fast skip when only the hash metadata is needed. A mismatch between the declared and the actual header_size indicates corruption.
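The offsets above can be sketched with plain binary pattern matching. This is only an illustration under stated assumptions: `FrameLayoutSketch` and its `split/1` function are hypothetical names, not part of ExDataSketch; the metadata block is treated as opaque bytes, and the CRC is extracted but not verified.

```elixir
defmodule FrameLayoutSketch do
  # Hypothetical helper (not part of ExDataSketch): splits a v2 frame into
  # its regions using only the documented offsets.
  def split(<<"EXSK", 2, family::8, family_version::8, flags::8,
              header_size::little-16, rest::binary>>)
      when header_size >= 14 do
    # header_size = 10 + M + 4, so the metadata block is M = header_size - 14.
    meta_size = header_size - 14

    case rest do
      <<meta::binary-size(meta_size), payload_size::little-32,
        payload::binary-size(payload_size), crc::little-32>> ->
        {:ok,
         %{
           sketch_family: family,
           family_version: family_version,
           flags: flags,
           metadata_bytes: meta,
           payload: payload,
           crc32c: crc
         }}

      _ ->
        # Declared sizes do not add up to the actual frame length.
        {:error, :size_mismatch}
    end
  end

  def split(_other), do: {:error, :bad_header}
end
```

Because the final pattern must consume the binary exactly, a frame whose length differs from 18 + M + N falls through to `{:error, :size_mismatch}`.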

Forward compatibility

  • Wire bytes for serialization_version, magic, and the metadata-block algorithm/backend bytes are stable across all v0.x releases.
  • Bumping family_version is the per-sketch evolution lever and does not affect the frame parser.
  • Future flags bits will be defined here, under the rule "if a reader encounters an unrecognized flag bit it MUST reject the frame as incompatible". This is intentionally strict: silently ignoring unknown feature bits is a common source of binary-format bugs.
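The strict flag rule can be illustrated as a mask check. A minimal sketch, assuming a hypothetical `FlagCheckSketch` module and a hypothetical `@known_flags` mask that is 0 because v2 defines no flag bits; the real decoder may report the error differently.

```elixir
defmodule FlagCheckSketch do
  import Bitwise

  # Hypothetical mask of flag bits this reader understands.
  # v2 reserves every bit, so the mask is empty.
  @known_flags 0

  # Rejects any flags byte that carries a bit outside the known mask.
  def check(flags) when flags in 0..255 do
    case flags &&& bnot(@known_flags) &&& 0xFF do
      0 -> :ok
      unknown -> {:error, {:unrecognized_flag_bits, unknown}}
    end
  end
end
```

When a later version defines a bit, only the mask changes; readers built before that change keep rejecting frames that set it.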

See also

Summary

Functions

decode(bin)

Decodes an EXSK v2 frame.

encode(metadata, payload, opts \\ [])

Encodes an EXSK v2 frame around the given metadata and payload.

magic()

Returns the EXSK magic bytes.

version()

Returns the current frame version (2).

Types

frame()

@type frame() :: %{
  magic: binary(),
  serialization_version: 2,
  sketch_family: non_neg_integer(),
  family_version: non_neg_integer(),
  flags: non_neg_integer(),
  metadata: ExDataSketch.Hash.Metadata.t(),
  payload: binary()
}

Functions

decode(bin)

@spec decode(binary()) :: {:ok, frame()} | {:error, Exception.t()}

Decodes an EXSK v2 frame.

Returns {:ok, frame_map} on success, or {:error, %DeserializationError{}} on failure. It does not raise on malformed input; every parse path returns a structured error.

The CRC32C is verified against the recomputed checksum over all bytes preceding it; any mismatch is reported with reason "checksum mismatch ...".
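For illustration, the checksum step might look like the following. This is a sketch, not the library's implementation: `Crc32cSketch` is a hypothetical module, and it assumes the standard CRC-32C (Castagnoli) parameters (reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF) and a little-endian trailer, as the layout section states for all multi-byte integers. A bitwise loop is used for clarity rather than speed.

```elixir
defmodule Crc32cSketch do
  import Bitwise

  # Reflected Castagnoli polynomial (assumed standard CRC-32C parameters).
  @poly 0x82F63B78

  # Bit-at-a-time CRC-32C over a binary.
  def crc32c(data) when is_binary(data) do
    crc =
      for <<byte <- data>>, reduce: 0xFFFFFFFF do
        acc -> shift8(bxor(acc, byte), 8)
      end

    bxor(crc, 0xFFFFFFFF)
  end

  # Processes the eight bits of the byte just folded into the register.
  defp shift8(crc, 0), do: crc

  defp shift8(crc, n) do
    next =
      if (crc &&& 1) == 1, do: bxor(crc >>> 1, @poly), else: crc >>> 1

    shift8(next, n - 1)
  end

  # Recomputes the checksum over every byte preceding the 4-byte trailer.
  def verify(frame) when is_binary(frame) and byte_size(frame) >= 4 do
    body_size = byte_size(frame) - 4
    <<body::binary-size(body_size), stored::little-32>> = frame

    if crc32c(body) == stored, do: :ok, else: {:error, :checksum_mismatch}
  end

  def verify(_too_short), do: {:error, :truncated}
end
```

The conventional check value for these parameters is `crc32c("123456789") == 0xE3069283`, which is a quick way to confirm the loop before trusting it on frames.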

This decoder ONLY handles v2 frames. v1 frames are dispatched by ExDataSketch.Binary.decode/1, which sniffs the version byte and routes through the legacy ExDataSketch.Codec.
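The version sniffing described here could be sketched as a few function heads. `VersionSniffSketch` is a hypothetical module, and the sketch assumes that v1 frames carry the value 1 in the byte at offset 4, directly after the shared magic; the actual routing lives in ExDataSketch.Binary.decode/1 and may differ.

```elixir
defmodule VersionSniffSketch do
  # Classifies a frame by its magic and version byte without parsing further.
  # Assumption: v1 frames store 1 at offset 4, v2 frames store 2.
  def sniff(<<"EXSK", 1, _rest::binary>>), do: {:ok, :v1}
  def sniff(<<"EXSK", 2, _rest::binary>>), do: {:ok, :v2}
  def sniff(<<"EXSK", other, _rest::binary>>), do: {:error, {:unsupported_version, other}}
  def sniff(_bin), do: {:error, :bad_magic}
end
```

A dispatcher would then hand `:v1` frames to the legacy codec and `:v2` frames to this module's decoder.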

Examples

iex> meta = ExDataSketch.Hash.Metadata.new(:xxhash3, 0, 1, 1, :rust)
iex> bin = ExDataSketch.Binary.Header.encode(meta, <<1, 2, 3>>)
iex> {:ok, frame} = ExDataSketch.Binary.Header.decode(bin)
iex> frame.payload
<<1, 2, 3>>
iex> frame.serialization_version
2

encode(metadata, payload, opts \\ [])

@spec encode(ExDataSketch.Hash.Metadata.t(), binary(), keyword()) :: binary()

Encodes an EXSK v2 frame around the given metadata and payload.

The metadata's sketch_family and sketch_family_version are mirrored into the frame's sketch_family / family_version bytes so that a reader can validate sketch identity without first parsing the metadata.

Raises ArgumentError for malformed inputs (a payload too large for the u32 payload_size field, i.e. 4 GiB or more, invalid flags, etc.).
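The size limits follow from the field widths in the layout. As a rough sketch of that arithmetic, assuming a hypothetical `EncodeSizeSketch.sizes!/2` helper that takes the already-encoded metadata block as a binary (the real encode/3 takes a metadata struct, and its exact checks and messages are not shown in this page):

```elixir
defmodule EncodeSizeSketch do
  @max_u16 0xFFFF
  @max_u32 0xFFFFFFFF

  # Returns {header_size, total_frame_size} for an encoded metadata block of
  # M bytes and a payload of N bytes, or raises if a length field would overflow.
  def sizes!(meta_bytes, payload) when is_binary(meta_bytes) and is_binary(payload) do
    header_size = 10 + byte_size(meta_bytes) + 4
    payload_size = byte_size(payload)

    cond do
      header_size > @max_u16 ->
        raise ArgumentError, "header_size #{header_size} does not fit in a u16"

      payload_size > @max_u32 ->
        raise ArgumentError, "payload of #{payload_size} bytes does not fit in a u32"

      true ->
        # Total is header + payload + 4-byte CRC, i.e. 18 + M + N.
        {header_size, header_size + payload_size + 4}
    end
  end
end
```

For a 3-byte metadata block and a 2-byte payload this gives a header_size of 17 and a total of 23 bytes, matching 18 + M + N.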

Examples

iex> meta = ExDataSketch.Hash.Metadata.new(:xxhash3, 0, 1, 1, :rust)
iex> bin = ExDataSketch.Binary.Header.encode(meta, <<1, 2, 3>>)
iex> <<"EXSK", 2, _rest::binary>> = bin
iex> byte_size(bin) > 0
true

magic()

@spec magic() :: binary()

Returns the EXSK magic bytes.

Examples

iex> ExDataSketch.Binary.Header.magic()
"EXSK"

version()

@spec version() :: 2

Returns the current frame version (2).

Examples

iex> ExDataSketch.Binary.Header.version()
2