Crash-safe manifest for in-progress merge operations.
Before a merge starts, the scheduler writes a manifest file to the shard's data directory describing the merge plan (which file IDs are being merged, the target output file ID). If the node crashes mid-merge, the next startup detects the manifest and cleans up the partial merge output.
Manifest file format
The manifest is a binary term file written atomically (write to .tmp then
rename). It contains an Erlang term with the merge plan.
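As a rough illustration of the format, the sketch below round-trips a merge plan through the Erlang external term format. The module name `ManifestFormatSketch` and the `:invalid_manifest` error reason are hypothetical and are not part of the documented API. The use of the `:safe` decode option is likewise an assumption about how a reader might guard against corrupt files.

```elixir
defmodule ManifestFormatSketch do
  # Hypothetical helper module, not part of the documented API.

  # Serialise a merge plan map to the Erlang external term format.
  def encode(plan) when is_map(plan), do: :erlang.term_to_binary(plan)

  # Decode a manifest binary. The :safe option refuses to create new atoms,
  # and the rescue turns a corrupt binary into an error tuple.
  def decode(binary) when is_binary(binary) do
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    ArgumentError -> {:error, :invalid_manifest}
  end
end

plan = %{shard_index: 0, input_file_ids: [1, 2, 3], started_at: 1_700_000_000}
{:ok, ^plan} = plan |> ManifestFormatSketch.encode() |> ManifestFormatSketch.decode()
```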
Recovery protocol
On shard startup, if a manifest exists:
- Delete any partial compaction temp files.
- Leave the original input files intact (they are still valid).
- Delete the manifest file.
- The shard opens normally — the next merge cycle will retry.
This is safe because the shard compaction path writes to compact_*.log
temporary files and only renames them over existing non-active inputs after
the copy succeeds. Recovery must not delete ordinary numbered log files that
are newer than the manifest inputs, because those can be the shard's active
file after a crash between manifest write and compaction start.
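The protocol above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code. The manifest file name `merge_manifest.bin` and the module name `RecoverySketch` are assumptions. Keeping the manifest when a temp file cannot be removed is also a design choice of this sketch.

```elixir
defmodule RecoverySketch do
  require Logger

  # Assumed manifest file name; the real name is not given in this document.
  @manifest "merge_manifest.bin"

  def recover(data_dir, shard_index) do
    manifest = Path.join(data_dir, @manifest)

    if File.exists?(manifest) do
      Logger.warning("shard #{shard_index}: interrupted merge detected, cleaning up")

      # Only compact_*.log temp files are removed. Numbered log files are
      # never touched, since one of them may be the shard's active file.
      results =
        data_dir
        |> Path.join("compact_*.log")
        |> Path.wildcard()
        |> Enum.map(&File.rm/1)

      # Delete the manifest last, and only if every temp file was removed,
      # so that a failed cleanup is retried on the next startup.
      case Enum.find(results, &(&1 != :ok)) do
        nil -> File.rm(manifest)
        {:error, _reason} = error -> error
      end
    else
      :ok
    end
  end
end
```

Because the glob only matches `compact_*.log`, the input files and any newer numbered log files survive recovery untouched.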
Types
@type merge_plan() :: %{
  shard_index: non_neg_integer(),
  input_file_ids: [non_neg_integer()],
  started_at: integer()
}
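For illustration, a plan matching this type might be built as in the sketch below. The `PlanSketch.new/2` constructor is hypothetical. Using a millisecond wall-clock timestamp for `:started_at` is also an assumption, since the type only requires an integer.

```elixir
defmodule PlanSketch do
  # Hypothetical constructor; not part of the documented API.
  def new(shard_index, input_file_ids)
      when is_integer(shard_index) and shard_index >= 0 and is_list(input_file_ids) do
    %{
      shard_index: shard_index,
      input_file_ids: input_file_ids,
      # Assumption: wall-clock time in milliseconds. The type only
      # requires an integer.
      started_at: System.system_time(:millisecond)
    }
  end
end

plan = PlanSketch.new(2, [4, 5])
```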
Functions
Removes the merge manifest file. Called after a merge completes successfully or after crash recovery cleanup.
Returns true if a merge manifest exists in the given data directory.
@spec read(Path.t()) :: {:ok, merge_plan()} | :none | {:error, term()}
Reads the merge manifest from the shard's data directory.
Returns {:ok, plan} if a manifest exists, :none if no manifest is present
(the normal state, with no interrupted merge), or {:error, reason} if the
manifest cannot be read.
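One plausible way to produce these three return shapes is sketched below. The manifest file name `merge_manifest.bin`, the module name `ReadSketch`, and the `:invalid_manifest` reason are assumptions made for the example. The real implementation may map errors differently.

```elixir
defmodule ReadSketch do
  # Assumed manifest file name; the real name is not given in this document.
  @manifest "merge_manifest.bin"

  def read(data_dir) do
    case File.read(Path.join(data_dir, @manifest)) do
      {:ok, binary} -> decode(binary)
      # A missing file is the normal state, not an error.
      {:error, :enoent} -> :none
      {:error, reason} -> {:error, reason}
    end
  end

  # A corrupt or truncated binary becomes an error tuple instead of a crash.
  defp decode(binary) do
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    ArgumentError -> {:error, :invalid_manifest}
  end
end
```

Mapping `:enoent` to `:none` keeps the common case, a clean shutdown with no manifest, out of the error path.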
@spec recover_if_needed(Path.t(), non_neg_integer()) :: :ok | {:error, term()}
Checks for and recovers from an interrupted merge on startup.
If a manifest exists, this function:
- Logs a warning about the interrupted merge.
- Removes any partial output files that may have been created.
- Deletes the manifest.
The original input files are left intact. The keydir will be rebuilt from them during normal startup, and the next merge cycle will re-merge them.
Parameters
- data_dir -- path to the shard's data directory
- shard_index -- for logging purposes
Returns
- :ok if no manifest was found or recovery succeeded
- {:error, reason} if cleanup fails
@spec write(Path.t(), merge_plan()) :: :ok | {:error, term()}
Writes a merge manifest to the shard's data directory.
The manifest is written atomically: first to a .tmp file, then renamed
to the final path. This ensures the manifest is either fully written or
absent — never partially written.
Parameters
- data_dir -- path to the shard's data directory
- plan -- merge plan map with :shard_index, :input_file_ids
Returns
- :ok on success
- {:error, reason} if the file cannot be written
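The write-then-rename pattern can be sketched as below. This is a minimal illustration that assumes a manifest file name of `merge_manifest.bin` and a hypothetical module `WriteSketch`. It does not flush the file to disk before renaming, and this document does not say whether the real implementation does.

```elixir
defmodule WriteSketch do
  # Assumed manifest file name; the real name is not given in this document.
  @manifest "merge_manifest.bin"

  def write(data_dir, plan) do
    final = Path.join(data_dir, @manifest)
    tmp = final <> ".tmp"

    # Write the full contents to the temp file first, then rename it into
    # place. Readers see either the complete manifest or no manifest.
    with :ok <- File.write(tmp, :erlang.term_to_binary(plan)),
         :ok <- File.rename(tmp, final) do
      :ok
    end
  end
end
```

If either step fails, the `with` expression returns the `{:error, reason}` tuple unchanged, which matches the return shape in the spec.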