Ferricstore.Merge.Manifest (ferricstore v0.3.1)


Crash-safe manifest for in-progress merge operations.

Before a merge starts, the scheduler writes a manifest file to the shard's data directory describing the merge plan (which file IDs are being merged and the target output file ID). If the node crashes mid-merge, the next startup detects the manifest and cleans up the partial merge output.

Manifest file format

The manifest is a binary term file written atomically (write to .tmp then rename). It contains an Erlang term with the merge plan.
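The write-then-rename pattern can be sketched as follows. This is a minimal illustration, not the module's actual code: the `ManifestSketch` module and the `merge.manifest` file name are hypothetical, and the sketch assumes the term is encoded with `:erlang.term_to_binary/1`.

```elixir
defmodule ManifestSketch do
  # Hypothetical file name; the real manifest name is not documented here.
  @manifest "merge.manifest"

  # Encode the plan and publish it with a rename, so readers see either
  # the complete manifest or no manifest at all.
  def write(data_dir, plan) do
    final = Path.join(data_dir, @manifest)
    tmp = final <> ".tmp"

    with :ok <- File.write(tmp, :erlang.term_to_binary(plan), [:sync]),
         :ok <- File.rename(tmp, final) do
      :ok
    end
  end

  # A missing file is the normal state and maps to :none.
  def read(data_dir) do
    case File.read(Path.join(data_dir, @manifest)) do
      {:ok, bin} -> decode(bin)
      {:error, :enoent} -> :none
      {:error, reason} -> {:error, reason}
    end
  end

  # binary_to_term/1 raises ArgumentError on input that is not a valid term.
  defp decode(bin) do
    {:ok, :erlang.binary_to_term(bin)}
  rescue
    ArgumentError -> {:error, :corrupt_manifest}
  end
end
```

The rename is what makes the write atomic on a single filesystem: a crash before it leaves only the `.tmp` file, which a reader never treats as a manifest.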

Recovery protocol

On shard startup, if a manifest exists:

  1. Delete any partial output files (the new merged log/hint file).
  2. Leave the original input files intact (they are still valid).
  3. Delete the manifest file.
  4. The shard opens normally; the next merge cycle retries the merge.

This is safe because the Rust compact() function writes to a new file, and the old files are deleted only after the compaction succeeds and the keydir is updated. If the node crashes before the old files are deleted, they are still valid and the keydir is rebuilt from them on the next open.
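The recovery steps above could look roughly like the sketch below. Everything named here is an assumption for illustration: the `RecoverySketch` module, the `merge.manifest` file name, the `<id>.log` / `<id>.hint` naming of output files, and passing the output file ID as an argument rather than reading it from the manifest.

```elixir
defmodule RecoverySketch do
  require Logger

  # Hypothetical manifest file name, chosen only for this sketch.
  @manifest "merge.manifest"

  def recover(data_dir, output_file_id) do
    manifest = Path.join(data_dir, @manifest)

    if File.exists?(manifest) do
      Logger.warning("interrupted merge detected in #{data_dir}")

      # Step 1: remove partial output. A missing file is not an error here,
      # so the result of File.rm/1 is ignored.
      for ext <- [".log", ".hint"] do
        File.rm(Path.join(data_dir, "#{output_file_id}#{ext}"))
      end

      # Step 2 is implicit: input files are never touched.
      # Step 3: remove the manifest so the next startup sees a clean state.
      File.rm(manifest)
    else
      :ok
    end
  end
end
```

Deleting the manifest last keeps the procedure restartable: if the node crashes during cleanup, the manifest is still present and the next startup repeats the same steps.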

Types

merge_plan()

@type merge_plan() :: %{
  shard_index: non_neg_integer(),
  input_file_ids: [non_neg_integer()],
  started_at: integer()
}
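
A value of this type might be built as shown below. The file IDs are made up, and the unit of `started_at` is an assumption; the type only says it is an integer.

```elixir
plan = %{
  shard_index: 0,
  # Hypothetical IDs of the data files selected for merging.
  input_file_ids: [3, 4, 5],
  # Assumption: Unix time in milliseconds; the unit is not documented.
  started_at: System.os_time(:millisecond)
}
```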

Functions

delete(data_dir)

@spec delete(Path.t()) :: :ok | {:error, term()}

Removes the merge manifest file. Called after a merge completes successfully or after crash recovery cleanup.

exists?(data_dir)

@spec exists?(Path.t()) :: boolean()

Returns true if a merge manifest exists in the given data directory.

read(data_dir)

@spec read(Path.t()) :: {:ok, merge_plan()} | :none | {:error, term()}

Reads the merge manifest from the shard's data directory.

Returns {:ok, plan} if a manifest exists, :none if no manifest is present (the normal state, with no interrupted merge), or {:error, reason} if a manifest is present but cannot be read.
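
A caller typically needs one branch per return shape in the spec. The helper below is a hypothetical sketch (`ReadResultSketch` is not part of the library) that turns each shape into a startup decision.

```elixir
defmodule ReadResultSketch do
  # A manifest was found: a merge of these input files was interrupted.
  def interpret({:ok, plan}), do: {:interrupted, plan.input_file_ids}

  # No manifest: the previous run had no merge in flight.
  def interpret(:none), do: :clean

  # The manifest could not be read; surface the reason to the caller.
  def interpret({:error, reason}), do: {:unreadable, reason}
end

# Intended use (requires the real library):
#   data_dir |> Ferricstore.Merge.Manifest.read() |> ReadResultSketch.interpret()
```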

recover_if_needed(data_dir, shard_index)

@spec recover_if_needed(Path.t(), non_neg_integer()) :: :ok | {:error, term()}

Checks for and recovers from an interrupted merge on startup.

If a manifest exists, this function:

  1. Logs a warning about the interrupted merge.
  2. Removes any partial output files that may have been created.
  3. Deletes the manifest.

The original input files are left intact. The keydir will be rebuilt from them during normal startup, and the next merge cycle will re-merge them.

Parameters

  • data_dir -- path to the shard's data directory
  • shard_index -- the shard's index, used for logging

Returns

  • :ok if no manifest was found or recovery succeeded
  • {:error, reason} if cleanup fails

write(data_dir, plan)

@spec write(Path.t(), merge_plan()) :: :ok | {:error, term()}

Writes a merge manifest to the shard's data directory.

The manifest is written atomically: first to a .tmp file, then renamed to the final path. This ensures the manifest is either fully written or absent, never partially written.

Parameters

  • data_dir -- path to the shard's data directory
  • plan -- merge plan map with :shard_index and :input_file_ids (the merge_plan() type also requires :started_at)

Returns

  • :ok on success
  • {:error, reason} if the file cannot be written
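
Taken together, the functions on this page suggest a lifecycle along the following lines. This is a sketch under stated assumptions, not the scheduler's actual code: `MergeLifecycleSketch` and `merge_fun` are hypothetical, and the manifest module is passed as an argument only so the sketch can be exercised without the real library.

```elixir
defmodule MergeLifecycleSketch do
  # On shard startup: clean up after any interrupted merge.
  def on_startup(data_dir, shard_index, manifest \\ Ferricstore.Merge.Manifest) do
    manifest.recover_if_needed(data_dir, shard_index)
  end

  # Around a merge: write the manifest first, run the merge, and delete the
  # manifest only on success. `merge_fun` is a hypothetical callback that
  # performs the merge and returns :ok or {:error, reason}.
  def merge(data_dir, plan, merge_fun, manifest \\ Ferricstore.Merge.Manifest) do
    with :ok <- manifest.write(data_dir, plan),
         :ok <- merge_fun.(plan) do
      manifest.delete(data_dir)
    end
  end
end
```

If `merge_fun` fails or the node crashes, the manifest stays on disk, so the next call to `on_startup/3` finds it and runs the recovery path.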