Ferricstore.Merge.Manifest (ferricstore v0.4.3)


Crash-safe manifest for in-progress merge operations.

Before a merge starts, the scheduler writes a manifest file to the shard's data directory describing the merge plan (the shard index, the IDs of the input files being merged, and the start time; see merge_plan()). If the node crashes mid-merge, the next startup detects the manifest and cleans up the partial merge output.

Manifest file format

The manifest is a binary term file written atomically (write to .tmp then rename). It contains an Erlang term with the merge plan.
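The write-then-rename pattern described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the module name ManifestFormatSketch, the file name merge.manifest, and the use of :erlang.term_to_binary/1 for encoding are all assumptions, since these docs only say the file holds an Erlang term.

```elixir
defmodule ManifestFormatSketch do
  # Hypothetical file name; the real manifest name is not given in these docs.
  @manifest "merge.manifest"

  def path(data_dir), do: Path.join(data_dir, @manifest)

  # Write the encoded plan to a .tmp file, then rename it into place.
  # A rename within one directory replaces the target in a single step,
  # so readers see either the old state or the complete new file.
  def write(data_dir, plan) do
    final = path(data_dir)
    tmp = final <> ".tmp"

    with :ok <- File.write(tmp, :erlang.term_to_binary(plan)),
         :ok <- File.rename(tmp, final) do
      :ok
    end
  end

  # Mirrors the documented return shape: {:ok, plan} | :none | {:error, reason}.
  def read(data_dir) do
    case File.read(path(data_dir)) do
      {:ok, bin} -> decode(bin)
      {:error, :enoent} -> :none
      {:error, reason} -> {:error, reason}
    end
  end

  # binary_to_term raises ArgumentError on malformed input.
  # The :corrupt_manifest reason is an assumption of this sketch.
  defp decode(bin) do
    {:ok, :erlang.binary_to_term(bin, [:safe])}
  rescue
    ArgumentError -> {:error, :corrupt_manifest}
  end
end
```

The sketch does not fsync the file or the directory, so it illustrates atomic visibility only, not durability across power loss.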

Recovery protocol

On shard startup, if a manifest exists:

  1. Delete any partial compaction temp files.
  2. Leave the original input files intact (they are still valid).
  3. Delete the manifest file.
  4. The shard opens normally — the next merge cycle will retry.

This is safe because the shard compaction path writes to compact_*.log temporary files and only renames them over existing non-active inputs after the copy succeeds. Recovery must not delete ordinary numbered log files that are newer than the manifest inputs, because those can be the shard's active file after a crash between manifest write and compaction start.
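The recovery steps above might look roughly like the sketch below. It assumes a manifest file named merge.manifest (hypothetical) and relies only on the compact_*.log naming stated in this section; RecoverySketch is an illustrative module, not part of ferricstore.

```elixir
defmodule RecoverySketch do
  require Logger

  # Hypothetical file name; these docs do not state the real one.
  @manifest "merge.manifest"

  def recover(data_dir) do
    manifest = Path.join(data_dir, @manifest)

    if File.exists?(manifest) do
      Logger.warning("interrupted merge detected in #{data_dir}, cleaning up")

      # Only compact_*.log temp files are removed. Numbered log files
      # (the inputs and any newer active file) are never touched.
      temp_files = Path.wildcard(Path.join(data_dir, "compact_*.log"))

      with :ok <- remove_all(temp_files) do
        # The manifest is deleted last, so a crash during cleanup
        # leaves it in place and the cleanup is retried on next startup.
        File.rm(manifest)
      end
    else
      :ok
    end
  end

  # Stop at the first file that cannot be removed and return its error.
  defp remove_all(paths) do
    Enum.reduce_while(paths, :ok, fn path, :ok ->
      case File.rm(path) do
        :ok -> {:cont, :ok}
        {:error, reason} -> {:halt, {:error, reason}}
      end
    end)
  end
end
```

Deleting the manifest last is a choice made in this sketch to keep recovery idempotent; the order matches the numbered protocol above.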

Types

merge_plan()

@type merge_plan() :: %{
  shard_index: non_neg_integer(),
  input_file_ids: [non_neg_integer()],
  started_at: integer()
}
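A value of this type could be built as shown below. The concrete numbers are made up, and the time unit of :started_at is an assumption (milliseconds here), since the type only requires an integer.

```elixir
# Illustrative plan for shard 3 merging three input files.
plan = %{
  shard_index: 3,
  input_file_ids: [12, 13, 14],
  # Unit is an assumption; the type only says integer().
  started_at: System.system_time(:millisecond)
}
```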

Functions

delete(data_dir)

@spec delete(Path.t()) :: :ok | {:error, term()}

Removes the merge manifest file. Called after a merge completes successfully or after crash recovery cleanup.

exists?(data_dir)

@spec exists?(Path.t()) :: boolean()

Returns true if a merge manifest exists in the given data directory.

read(data_dir)

@spec read(Path.t()) :: {:ok, merge_plan()} | :none | {:error, term()}

Reads the merge manifest from the shard's data directory.

Returns {:ok, plan} if a manifest exists, :none if no manifest is present (the normal state, with no interrupted merge), or {:error, reason} if the manifest cannot be read.
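A caller would typically match on all three result shapes from the spec. The helper below is a hypothetical sketch (ReadResultSketch and its return atoms are not part of ferricstore) that turns a read/1 result into a short status.

```elixir
defmodule ReadResultSketch do
  # A manifest was found: a merge was interrupted for these inputs.
  def describe({:ok, plan}), do: {:interrupted, plan.input_file_ids}

  # No manifest: the normal state.
  def describe(:none), do: :clean

  # The manifest could not be read.
  def describe({:error, reason}), do: {:unreadable, reason}
end
```

With the library loaded, this could be used as `data_dir |> Ferricstore.Merge.Manifest.read() |> ReadResultSketch.describe()`.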

recover_if_needed(data_dir, shard_index)

@spec recover_if_needed(Path.t(), non_neg_integer()) :: :ok | {:error, term()}

Checks for and recovers from an interrupted merge on startup.

If a manifest exists, this function:

  1. Logs a warning about the interrupted merge.
  2. Removes any partial output files that may have been created.
  3. Deletes the manifest.

The original input files are left intact. The keydir will be rebuilt from them during normal startup, and the next merge cycle will re-merge them.

Parameters

  • data_dir -- path to the shard's data directory
  • shard_index -- index of the shard, used only for logging

Returns

  • :ok if no manifest was found or recovery succeeded
  • {:error, reason} if cleanup fails
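One way a shard startup path might call this function is sketched below. ShardStartupSketch, the returned state map, and the :merge_recovery_failed tag are hypothetical; the manifest module is passed as an argument only so the sketch can be exercised with a stub.

```elixir
defmodule ShardStartupSketch do
  # `manifest` defaults to the documented module; any module exporting
  # recover_if_needed/2 with the same return shape can be substituted.
  def open(data_dir, shard_index, manifest \\ Ferricstore.Merge.Manifest) do
    case manifest.recover_if_needed(data_dir, shard_index) do
      :ok ->
        # Recovery is done (or was not needed); continue with normal startup.
        {:ok, %{data_dir: data_dir, shard_index: shard_index}}

      {:error, reason} ->
        # Refuse to open the shard if cleanup failed.
        {:error, {:merge_recovery_failed, reason}}
    end
  end
end
```

Failing startup on a cleanup error is a design choice of this sketch, not documented behavior.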

write(data_dir, plan)

@spec write(Path.t(), merge_plan()) :: :ok | {:error, term()}

Writes a merge manifest to the shard's data directory.

The manifest is written atomically: first to a .tmp file, then renamed to the final path. This ensures the manifest is either fully written or absent — never partially written.

Parameters

  • data_dir -- path to the shard's data directory
  • plan -- merge plan map with :shard_index, :input_file_ids, and :started_at (see merge_plan())

Returns

  • :ok on success
  • {:error, reason} if the file cannot be written
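Taken together with delete/1, the manifest brackets a merge: write it first, run the merge, then remove it. The sketch below illustrates that ordering under assumptions: MergeLifecycleSketch and merge_fun are hypothetical, and the manifest module is injected so the flow can be tried with a stub.

```elixir
defmodule MergeLifecycleSketch do
  # `merge_fun` is a hypothetical callback that performs the merge and
  # returns :ok or {:error, reason}.
  def run(data_dir, plan, merge_fun, manifest \\ Ferricstore.Merge.Manifest) do
    with :ok <- manifest.write(data_dir, plan),
         :ok <- merge_fun.(plan),
         :ok <- manifest.delete(data_dir) do
      :ok
    end
  end
end
```

In this sketch a failed merge leaves the manifest on disk, so the next recover_if_needed/2 call cleans up; whether the real scheduler does the same is not stated in these docs.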