Crash-safe manifest for in-progress merge operations.
Before a merge starts, the scheduler writes a manifest file to the shard's data directory describing the merge plan (which file IDs are being merged, the target output file ID). If the node crashes mid-merge, the next startup detects the manifest and cleans up the partial merge output.
Manifest file format
The manifest is a binary term file written atomically (write to .tmp then
rename). It contains an Erlang term with the merge plan.
Recovery protocol
On shard startup, if a manifest exists:
- Delete any partial output files (the new merged log/hint file).
- Leave the original input files intact (they are still valid).
- Delete the manifest file.
- The shard opens normally — the next merge cycle will retry.
This is safe because the Rust compact() function writes to a NEW file and
the old files are only deleted AFTER the compaction succeeds and the keydir
is updated. If we crash before the old files are deleted, the old files are
still valid and the keydir is rebuilt from them on the next open.
Types
@type merge_plan() :: %{ shard_index: non_neg_integer(), input_file_ids: [non_neg_integer()], started_at: integer() }
Functions
Removes the merge manifest file. Called after a merge completes successfully or after crash recovery cleanup.
Returns true if a merge manifest exists in the given data directory.
@spec read(Path.t()) :: {:ok, merge_plan()} | :none | {:error, term()}
Reads the merge manifest from the shard's data directory.
Returns {:ok, plan} if a manifest exists, or :none if no manifest is
present (the normal state, meaning no merge was interrupted). Per the spec,
{:error, reason} is returned when the manifest cannot be read.
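The three return shapes can be illustrated with a minimal sketch. This is not the module's actual code: the module name `ManifestReadSketch`, the file name `merge.manifest`, and the `:corrupt_manifest` reason are all assumptions made for illustration, since these docs do not give the real names.

```elixir
defmodule ManifestReadSketch do
  # Hypothetical file name; the real manifest name is not stated in these docs.
  @manifest "merge.manifest"

  # Returns {:ok, plan} | :none | {:error, reason}, mirroring the documented spec.
  def read(data_dir) do
    path = Path.join(data_dir, @manifest)

    case File.read(path) do
      {:ok, binary} -> decode(binary)
      # A missing file is the normal state: no merge was interrupted.
      {:error, :enoent} -> :none
      {:error, reason} -> {:error, reason}
    end
  end

  defp decode(binary) do
    # :safe makes decoding refuse to create new atoms or function references.
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    # binary_to_term raises ArgumentError on malformed input.
    # The :corrupt_manifest reason is an assumption of this sketch.
    ArgumentError -> {:error, :corrupt_manifest}
  end
end
```

Treating `:enoent` as `:none` rather than as an error keeps the common startup path (no interrupted merge) free of error handling at the call site.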
@spec recover_if_needed(Path.t(), non_neg_integer()) :: :ok | {:error, term()}
Checks for and recovers from an interrupted merge on startup.
If a manifest exists, this function:
- Logs a warning about the interrupted merge.
- Removes any partial output files that may have been created.
- Deletes the manifest.
The original input files are left intact. The keydir will be rebuilt from them during normal startup, and the next merge cycle will re-merge them.
Parameters
- data_dir -- path to the shard's data directory
- shard_index -- for logging purposes
Returns
- :ok if no manifest was found or recovery succeeded
- {:error, reason} if cleanup fails
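The recovery steps can be sketched roughly as follows. Several details here are assumptions rather than the module's real behavior: the module name `MergeRecoverySketch`, the manifest file name `merge.manifest`, a hypothetical `:output_file_id` key in the plan, and the `<id>.log` / `<id>.hint` naming used to locate partial output files.

```elixir
defmodule MergeRecoverySketch do
  require Logger

  # Hypothetical file name; the real manifest name is not stated in these docs.
  @manifest "merge.manifest"

  def recover_if_needed(data_dir, shard_index) do
    manifest = Path.join(data_dir, @manifest)

    case File.read(manifest) do
      # No manifest: no merge was interrupted, nothing to do.
      {:error, :enoent} ->
        :ok

      {:ok, binary} ->
        # A corrupt manifest would raise here; a real implementation would need to handle that.
        plan = :erlang.binary_to_term(binary)
        Logger.warning("shard #{shard_index}: interrupted merge detected, cleaning up")

        # Partial outputs go first and the manifest last, so a crash during
        # recovery leaves the manifest in place and recovery simply runs again.
        with :ok <- remove_partial_outputs(data_dir, plan) do
          remove(manifest)
        end

      {:error, reason} ->
        {:error, reason}
    end
  end

  # Assumes a hypothetical :output_file_id key and "<id>.log" / "<id>.hint" file names.
  defp remove_partial_outputs(data_dir, plan) do
    case Map.fetch(plan, :output_file_id) do
      {:ok, id} ->
        ["#{id}.log", "#{id}.hint"]
        |> Enum.map(&Path.join(data_dir, &1))
        |> Enum.reduce_while(:ok, fn path, :ok ->
          case remove(path) do
            :ok -> {:cont, :ok}
            error -> {:halt, error}
          end
        end)

      :error ->
        :ok
    end
  end

  # A file that is already gone counts as success, which keeps recovery idempotent.
  defp remove(path) do
    case File.rm(path) do
      :ok -> :ok
      {:error, :enoent} -> :ok
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Note that the sketch never touches the input files listed in the plan, matching the documented guarantee that they stay intact.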
@spec write(Path.t(), merge_plan()) :: :ok | {:error, term()}
Writes a merge manifest to the shard's data directory.
The manifest is written atomically: first to a .tmp file, then renamed
to the final path. This ensures the manifest is either fully written or
absent — never partially written.
Parameters
- data_dir -- path to the shard's data directory
- plan -- merge plan map with :shard_index, :input_file_ids
Returns
- :ok on success
- {:error, reason} if the file cannot be written
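The write-then-rename pattern described above can be sketched as below. The module name `ManifestWriteSketch` and the file name `merge.manifest` are assumptions for illustration; only the `.tmp` suffix and the rename step come from the description.

```elixir
defmodule ManifestWriteSketch do
  # Hypothetical file name; the real manifest name is not stated in these docs.
  @manifest "merge.manifest"

  def write(data_dir, plan) do
    final = Path.join(data_dir, @manifest)
    tmp = final <> ".tmp"

    # Write the serialized plan to the temporary path first, then rename it
    # into place. A reader sees either no manifest or a complete one.
    with :ok <- File.write(tmp, :erlang.term_to_binary(plan)),
         :ok <- File.rename(tmp, final) do
      :ok
    end
  end
end
```

A call such as `ManifestWriteSketch.write(data_dir, %{shard_index: 0, input_file_ids: [1, 2], started_at: System.os_time(:second)})` would leave a single manifest file in `data_dir`. The unit used for `started_at` is an assumption; the type only says it is an integer.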