Ferricstore.Store.BlobStore (ferricstore v0.4.2)


Side-channel blob storage for large values.

New writes append payload records into a shard-local segment log under data_dir/blob/shard_N/segments/, while Bitcask stores the fixed-size BlobRef. Older content-addressed v1 refs remain readable so existing data can be served during the transition.

Summary

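Functions

file_ref(data_dir, shard_index, ref)
Returns a file ref for a blob after validating the file is regular and has the expected size.

file_refs_many(data_dir, shard_index, refs)
Returns file refs in input order while validating append-segment headers in batches.

get(data_dir, shard_index, ref)
Reads and validates a blob by ref.

get_many(data_dir, shard_index, refs)
Reads and validates refs in order.

put(data_dir, shard_index, payload)
Stores payload in the shard append segment and returns the small ref.

put_many(data_dir, shard_index, payloads)
Stores payloads in one append batch and fsyncs the segment once.

recover_shard(data_dir, shard_index)
Recovers append-segment files by truncating the first partial or corrupt tail.

sweep_unreferenced(data_dir, shard_index, live_refs)
Deletes blob files that are not present in live_refs.

verify(data_dir, shard_index, ref)
Verifies that an existing blob exactly matches its ref.

verify_many(data_dir, shard_index, refs)
Verifies a batch of refs, validating duplicate refs once.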

Types

protection_token()

@type protection_token() ::
  nil
  | {:blob_store_protection, binary(), non_neg_integer(), [binary()]}
  | [protection_token()]

reason()

@type reason() :: term()

Functions

file_ref(data_dir, shard_index, ref)

@spec file_ref(binary(), non_neg_integer(), Ferricstore.Store.BlobRef.t()) ::
  {:ok, {binary(), non_neg_integer(), non_neg_integer()}} | {:error, reason()}

Returns a file ref for a blob after validating the file is regular and has the expected size.

This is the hot streaming path. It intentionally does not hash the blob on every read; get/3 and verify/3 still verify materialized reads. Full checksum validation belongs in write-time validation and background scrub, not in every sendfile/file-stream GET.
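
As an illustration of how a caller might consume a file ref, here is a minimal sketch. It assumes the returned triple means `{path, offset, size}`; the spec only gives the types, so that reading is an assumption. `FileRefSketch` and `read_ref/1` are hypothetical names and not part of Ferricstore.

```elixir
defmodule FileRefSketch do
  # Hypothetical helper: reads the bytes described by a {path, offset, size}
  # triple. Assumes that is the meaning of the tuple returned by file_ref/3.
  def read_ref({path, offset, size}) do
    with {:ok, fd} <- :file.open(path, [:read, :raw, :binary]) do
      try do
        case :file.pread(fd, offset, size) do
          {:ok, data} when byte_size(data) == size -> {:ok, data}
          {:ok, _short} -> {:error, :short_read}
          :eof when size == 0 -> {:ok, <<>>}
          :eof -> {:error, :short_read}
          {:error, reason} -> {:error, reason}
        end
      after
        # Always release the descriptor, even if pread raises.
        :file.close(fd)
      end
    end
  end
end
```

A real streaming path would more likely hand the same triple to a sendfile-style call instead of reading the bytes into a BEAM binary; the sketch reads them only to keep the example observable.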

file_refs_many(data_dir, shard_index, refs)

@spec file_refs_many(binary(), non_neg_integer(), [Ferricstore.Store.BlobRef.t()]) ::
  [
    ok: {binary(), non_neg_integer(), non_neg_integer()},
    error: reason()
  ]

Returns file refs in input order while validating append-segment headers in batches.

This is the streaming read hot path for MGET/pipelined GET. Segment refs are grouped by path so a batch that points at one blob segment opens it once, but corruption and missing-file results stay isolated per requested ref.
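
The grouping idea can be sketched in a few lines. The example below is an assumption-laden stand-in, not Ferricstore's code: refs are hypothetical maps with `:path`, `:offset`, and `:size`, and a file-size bounds check takes the place of real segment header validation. It inspects each distinct path once while keeping results in input order and errors isolated per ref.

```elixir
defmodule GroupedRefsSketch do
  # Hypothetical ref shape: %{path: binary, offset: integer, size: integer}.
  # The bounds check below stands in for real segment header validation.
  def file_refs(refs) do
    # Stat each distinct path once, however many refs point into it.
    stats =
      refs
      |> Enum.map(& &1.path)
      |> Enum.uniq()
      |> Map.new(fn path -> {path, File.stat(path)} end)

    # Map over the original list so results stay in input order and a bad
    # ref only affects its own slot.
    Enum.map(refs, fn %{path: path, offset: offset, size: size} ->
      case Map.fetch!(stats, path) do
        {:ok, %File.Stat{type: :regular, size: file_size}}
        when offset + size <= file_size ->
          {:ok, {path, offset, size}}

        {:ok, %File.Stat{type: :regular}} ->
          {:error, :out_of_bounds}

        {:ok, %File.Stat{}} ->
          {:error, :not_regular}

        {:error, reason} ->
          {:error, reason}
      end
    end)
  end
end
```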

get(data_dir, shard_index, ref)

@spec get(binary(), non_neg_integer(), Ferricstore.Store.BlobRef.t()) ::
  {:ok, binary()} | {:error, reason()}

Reads and validates a blob by ref.

get_many(data_dir, shard_index, refs)

@spec get_many(binary(), non_neg_integer(), [Ferricstore.Store.BlobRef.t()]) :: [
  ok: binary(),
  error: reason()
]

Reads and validates refs in order.

Segment refs are grouped by append segment so batch reads open each segment once while still returning per-ref errors. Duplicate refs are loaded once and fanned back out to their original positions.
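
The "load once, fan back out" behaviour for duplicate refs can be illustrated with a small sketch. `FanOutSketch` and its `loader` argument are hypothetical; the loader stands in for whatever reads and validates a single ref.

```elixir
defmodule FanOutSketch do
  # Calls `loader` once per distinct ref, then returns results in the
  # original order, repeating the cached result for duplicate refs.
  def load_many(refs, loader) when is_function(loader, 1) do
    cache =
      refs
      |> Enum.uniq()
      |> Map.new(fn ref -> {ref, loader.(ref)} end)

    Enum.map(refs, &Map.fetch!(cache, &1))
  end
end
```

Because the cache stores the loader's full result, a failing ref yields the same `{:error, reason}` at every position where it appears, and the other positions are unaffected.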

get_range(data_dir, shard_index, ref, relative_offset, count)

@spec get_range(
  binary(),
  non_neg_integer(),
  Ferricstore.Store.BlobRef.t(),
  non_neg_integer(),
  non_neg_integer()
) :: {:ok, binary()} | {:error, reason()}

Reads a byte range from a blob ref.

This materialized range API is used by commands that return bytes through BEAM. Segment-backed partial ranges validate the record header and pread only the requested bytes; full-range reads still validate the full payload checksum. file_ref/3 remains the stat/header-validated streaming path for full large-value reads.

put(data_dir, shard_index, payload)

@spec put(binary(), non_neg_integer(), binary()) ::
  {:ok, Ferricstore.Store.BlobRef.t()} | {:error, reason()}

Stores payload in the shard append segment and returns the small ref.

put_many(data_dir, shard_index, payloads)

@spec put_many(binary(), non_neg_integer(), [binary()]) ::
  {:ok, [Ferricstore.Store.BlobRef.t()]} | {:error, reason()}

Stores payloads in one append batch and fsyncs the segment once.
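
To make "one append batch, one fsync" concrete, here is a minimal sketch. The record layout (4-byte length, 4-byte CRC32, payload) and the `{offset, size}` ref shape are assumptions made for illustration; this page does not describe Ferricstore's real segment format or the fields of `BlobRef`.

```elixir
defmodule AppendBatchSketch do
  @header_bytes 8

  # Hypothetical record layout: <<size::32, crc32::32>> followed by payload.
  # Returns {:ok, [{record_offset, payload_size}]} in input order.
  def append_batch(segment_path, payloads) do
    # A missing file is treated as empty; open/2 reports any real error.
    start_offset =
      case File.stat(segment_path) do
        {:ok, %File.Stat{size: size}} -> size
        {:error, _reason} -> 0
      end

    {records, refs, _end_offset} =
      Enum.reduce(payloads, {[], [], start_offset}, fn payload, {recs, refs, offset} ->
        size = byte_size(payload)
        record = [<<size::32, :erlang.crc32(payload)::32>>, payload]
        {[record | recs], [{offset, size} | refs], offset + @header_bytes + size}
      end)

    with {:ok, fd} <- :file.open(segment_path, [:append, :raw, :binary]) do
      try do
        # One write and one fsync cover the whole batch.
        with :ok <- :file.write(fd, Enum.reverse(records)),
             :ok <- :file.sync(fd) do
          {:ok, Enum.reverse(refs)}
        end
      after
        :file.close(fd)
      end
    end
  end
end
```

Batching this way amortizes the fsync cost across all payloads, at the price that the whole batch becomes durable (or not) together.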

recover_shard(data_dir, shard_index)

@spec recover_shard(binary(), non_neg_integer()) ::
  {:ok,
   %{
     segments: non_neg_integer(),
     truncated_segments: non_neg_integer(),
     truncated_bytes: non_neg_integer()
   }}
  | {:error, term()}

Recovers append-segment files by truncating the first partial or corrupt tail.

This is called lazily before the first append in a VM and is also public for startup/lifecycle tests. Older valid records before the bad tail remain readable.
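
Tail truncation can be sketched as follows, under a hypothetical record layout of `<<size::32, crc32::32>>` followed by the payload (the real segment format is not given on this page). The sketch scans from the start, stops at the first incomplete or checksum-failing record, and truncates the file there. It reads the whole file into memory for brevity, which a production implementation would likely avoid.

```elixir
defmodule TailRecoverySketch do
  # Hypothetical record layout: <<size::32, crc32::32>> then `size` bytes.
  def recover(segment_path) do
    with {:ok, data} <- File.read(segment_path) do
      keep = valid_prefix(data, 0)
      truncated = byte_size(data) - keep

      if truncated == 0 do
        {:ok, %{truncated_bytes: 0}}
      else
        truncate(segment_path, keep, truncated)
      end
    end
  end

  # Returns the byte length of the longest prefix made only of complete
  # records whose CRC matches their payload.
  defp valid_prefix(<<size::32, crc::32, payload::binary-size(size), rest::binary>>, acc) do
    if :erlang.crc32(payload) == crc do
      valid_prefix(rest, acc + 8 + size)
    else
      acc
    end
  end

  defp valid_prefix(_partial_or_empty, acc), do: acc

  defp truncate(path, keep, truncated) do
    with {:ok, fd} <- :file.open(path, [:read, :write, :raw, :binary]) do
      try do
        # Move to the end of the valid prefix, cut the file there, then sync.
        with {:ok, _pos} <- :file.position(fd, keep),
             :ok <- :file.truncate(fd),
             :ok <- :file.sync(fd) do
          {:ok, %{truncated_bytes: truncated}}
        end
      after
        :file.close(fd)
      end
    end
  end
end
```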

storage_stats(data_dir)

@spec storage_stats(binary()) ::
  {:ok,
   %{
     files: non_neg_integer(),
     bytes: non_neg_integer(),
     legacy_files: non_neg_integer(),
     legacy_bytes: non_neg_integer(),
     segment_files: non_neg_integer(),
     segment_bytes: non_neg_integer(),
     tmp_files: non_neg_integer(),
     tmp_bytes: non_neg_integer()
   }}
  | {:error, term()}

sweep_unreferenced(data_dir, shard_index, live_refs)

@spec sweep_unreferenced(binary(), non_neg_integer(), Enumerable.t()) ::
  {:ok,
   %{
     deleted_files: non_neg_integer(),
     deleted_bytes: non_neg_integer(),
     kept_files: non_neg_integer()
   }}
  | {:error, term()}

Deletes blob files that are not present in live_refs.

The caller is responsible for producing a complete live set. This function is deliberately conservative for append segments: a segment is kept while any live v2 ref points into it; prepared refs can register a short protection token until Raft apply finishes; and fresh dead segments are kept for a grace window as a final safety net. The shard must still guard Ra replay safety before calling this, because unreleased Ra log entries can contain older blob refs.
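
The three keep conditions described above can be expressed as a single predicate. The sketch below is illustrative only: the segment maps, the two `MapSet` arguments, and the `grace_seconds` parameter are hypothetical shapes chosen for the example, not Ferricstore's internal data structures.

```elixir
defmodule SweepSketch do
  # Hypothetical inputs, not Ferricstore's real data structures:
  #   segments:      [%{id: term, mtime: integer}] with mtime in seconds
  #   live_ids:      MapSet of segment ids that some live ref points into
  #   protected_ids: MapSet of segment ids covered by a protection token
  # Returns {keep, delete}.
  def plan(segments, live_ids, protected_ids, now, grace_seconds) do
    Enum.split_with(segments, fn %{id: id, mtime: mtime} ->
      MapSet.member?(live_ids, id) or
        MapSet.member?(protected_ids, id) or
        now - mtime < grace_seconds
    end)
  end
end
```

A segment lands in the delete list only when all three checks fail, which mirrors the conservative stance of the description: any single reason to keep a segment wins.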

verify(data_dir, shard_index, ref)

@spec verify(binary(), non_neg_integer(), Ferricstore.Store.BlobRef.t()) ::
  :ok | {:error, reason()}

Verifies that an existing blob exactly matches its ref.

This is intended for write/apply correctness boundaries where a ref-only command would otherwise acknowledge a pointer without proving the pointed bytes are intact. It hashes the file in chunks and does not materialize the full payload as a BEAM binary.
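
Hashing a file in chunks without materializing it can be sketched with Erlang's incremental hash API. SHA-256, the 64 KiB chunk size, and the `ChunkedHashSketch` module are assumptions for illustration; this page does not name the checksum the store actually uses.

```elixir
defmodule ChunkedHashSketch do
  @chunk_bytes 64 * 1024

  # Checks size and type first, then streams the file through an
  # incremental hash so memory use stays near @chunk_bytes.
  # SHA-256 is an assumption; the real checksum is not documented here.
  def verify_file(path, expected_size, expected_sha256) do
    case File.stat(path) do
      {:ok, %File.Stat{type: :regular, size: ^expected_size}} ->
        with {:ok, fd} <- :file.open(path, [:read, :raw, :binary]) do
          try do
            case hash_chunks(fd, :crypto.hash_init(:sha256)) do
              {:ok, ^expected_sha256} -> :ok
              {:ok, _other} -> {:error, :checksum_mismatch}
              {:error, reason} -> {:error, reason}
            end
          after
            :file.close(fd)
          end
        end

      {:ok, %File.Stat{}} ->
        {:error, :size_or_type_mismatch}

      {:error, reason} ->
        {:error, reason}
    end
  end

  defp hash_chunks(fd, state) do
    case :file.read(fd, @chunk_bytes) do
      {:ok, chunk} -> hash_chunks(fd, :crypto.hash_update(state, chunk))
      :eof -> {:ok, :crypto.hash_final(state)}
      {:error, reason} -> {:error, reason}
    end
  end
end
```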

verify_many(data_dir, shard_index, refs)

@spec verify_many(binary(), non_neg_integer(), [Ferricstore.Store.BlobRef.t()]) ::
  :ok | {:error, reason()}

Verifies a batch of refs, validating duplicate refs once.

This keeps apply-time blob ref checks fully checksummed while avoiding repeated disk reads when a batch intentionally fans out one payload to many keys.
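
A minimal sketch of the deduplication follows. It takes a one-argument `verifier` for brevity, whereas the four-argument variant of this function accepts a verifier of arity 3; stopping at the first failure is also an assumption, inferred only from the single `:ok | {:error, reason()}` return type.

```elixir
defmodule VerifyManySketch do
  # Runs `verifier` once per distinct ref and stops at the first failure.
  # Halting early is an assumption, not documented behaviour.
  def verify_many(refs, verifier) when is_function(verifier, 1) do
    refs
    |> Enum.uniq()
    |> Enum.reduce_while(:ok, fn ref, :ok ->
      case verifier.(ref) do
        :ok -> {:cont, :ok}
        {:error, reason} -> {:halt, {:error, reason}}
      end
    end)
  end
end
```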

verify_many(data_dir, shard_index, refs, verifier)

@spec verify_many(
  binary(),
  non_neg_integer(),
  [Ferricstore.Store.BlobRef.t()],
  (binary(), non_neg_integer(), Ferricstore.Store.BlobRef.t() ->
     :ok
     | {:error, reason()})
) ::
  :ok | {:error, reason()}