Orbis.GNSS.Data.Cache (Orbis v0.9.0)


Local, on-disk cache for decompressed GNSS products, with atomic writes, SHA-256 integrity, gzip decompression (with a bomb guard), and a JSON provenance sidecar.

The cache stores decompressed product files keyed by their canonical IGS long-name. The key is path-traversal-safe: the filename is validated to contain no path separators or .. sequences. A successful fetch is committed atomically: bytes are written to a temporary file in the same directory and then moved into place with File.rename/2, so a crashed or partial download can never leave a half-written file visible under its real name.

Alongside each cached file <name> a <name>.provenance.json sidecar records the source URL, the SHA-256 of both the compressed and decompressed bytes, the byte sizes, and the fetch timestamp. The sidecar is part of the commit contract: commit/3 returns {:ok, path} only if both the product and its sidecar are written (if the sidecar cannot be written the product is rolled back), so a committed file always carries its integrity hash. classify/2 uses that stored hash to verify a cache hit when the caller supplies no explicit checksum.

Summary

Functions

classify(path, expected_sha256 \\ nil)

Classify the cache entry at path for integrity.

commit(path, decompressed, provenance)

Atomically commit decompressed bytes and their provenance sidecar.

default_dir()

The default cache directory, :filename.basedir(:user_cache, "orbis/gnss").

default_max_decompressed_bytes()

The default gzip-bomb decompression cap, in bytes.

gunzip(compressed, max_bytes \\ 524_288_000)

Decompress a gzip byte buffer, capping the output at max_bytes.

path_for(cache_dir, filename)

Resolve the absolute path a product would occupy in cache_dir.

read_provenance(path)

Read and decode a product's provenance sidecar, if present.

sha256(bytes)

Compute the lowercase hex SHA-256 of a byte buffer.

Functions

classify(path, expected_sha256 \\ nil)

@spec classify(String.t(), String.t() | nil) ::
  {:hit, String.t()}
  | :absent
  | {:stale, term()}
  | :unverified
  | {:error, term()}

Classify the cache entry at path for integrity.

Returns one of:

  • {:hit, path} — present and verified: against expected_sha256 when the caller supplies one, otherwise against the provenance sidecar's stored decompressed SHA-256;
  • :absent — no cached file;
  • {:stale, {:checksum_mismatch, expected, got}} — present but failed verification (corrupt or stale);
  • :unverified — present but no checksum is available to verify it (no caller hash and no usable sidecar — e.g. a file placed by hand);
  • {:error, reason} — the file exists but could not be read.
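
The return values above can be illustrated with a minimal, standard-library-only sketch. It is not the module's implementation: ClassifySketch is a hypothetical name, the sidecar lookup is omitted (a missing caller hash maps straight to :unverified), and comparing hashes case-insensitively is an assumption.

```elixir
defmodule ClassifySketch do
  # Hypothetical stand-in for classify/2; the sidecar fallback is left out.
  def classify(path, expected_sha256 \\ nil) do
    case File.read(path) do
      {:error, :enoent} ->
        :absent

      {:error, reason} ->
        {:error, reason}

      # Assumption: with no caller hash and no sidecar lookup, nothing can be verified.
      {:ok, _bytes} when is_nil(expected_sha256) ->
        :unverified

      {:ok, bytes} ->
        got = :crypto.hash(:sha256, bytes) |> Base.encode16(case: :lower)
        expected = String.downcase(expected_sha256)

        if got == expected do
          {:hit, path}
        else
          {:stale, {:checksum_mismatch, expected, got}}
        end
    end
  end
end
```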

The caller decides what to do per mode. A {:hit, _} is always usable. Online, a {:stale, _} or :unverified entry should be re-downloaded. Offline, a {:stale, _} entry is terminal, while an :unverified one is the best available.
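
One way a caller might encode that per-mode policy is a small pattern-matching function. This is only a sketch: CachePolicySketch, the :online / :offline mode atoms, and the action atoms it returns are all hypothetical, and the handling of :absent and {:error, _} is an assumption, since the text above does not prescribe it.

```elixir
defmodule CachePolicySketch do
  # Maps a classify/2 result plus a mode to a hypothetical action.
  def next_action({:hit, path}, _mode), do: {:use, path}

  # Online: anything short of a verified hit is fetched again.
  def next_action({:stale, _detail}, :online), do: :download
  def next_action(:unverified, :online), do: :download
  def next_action(:absent, :online), do: :download

  # Offline: a stale entry is terminal, an unverified one is the best available.
  def next_action({:stale, detail}, :offline), do: {:fail, {:stale, detail}}
  def next_action(:unverified, :offline), do: :use_unverified
  def next_action(:absent, :offline), do: {:fail, :not_cached}

  # Assumption: a read error is treated as a failure in either mode.
  def next_action({:error, reason}, _mode), do: {:fail, reason}
end
```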

commit(path, decompressed, provenance)

@spec commit(String.t(), binary(), map()) :: {:ok, String.t()} | {:error, term()}

Atomically commit decompressed bytes and their provenance sidecar.

Both files are staged to unique temp files in the cache directory and then renamed into place (rename is atomic on POSIX). The sidecar is required: if it cannot be encoded or renamed, the just-committed product is removed and an error is returned, so an {:ok, path} result always has a matching sidecar carrying the decompressed SHA-256 that classify/2 later verifies against. Creates the cache directory if needed. Returns {:ok, path} or a typed error: {:error, {:cache_dir_not_writable, reason}}, {:error, {:provenance_write_failed, reason}}, or {:error, {:temp_file_error, reason}}.
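
The stage-then-rename pattern with sidecar rollback can be sketched with the standard library alone. This is an illustration under assumptions, not the module's code: CommitSketch is a hypothetical name, the sidecar is passed in as already-encoded bytes to avoid depending on a JSON library, the temp-file naming is invented, and only some of the documented error tags are reproduced.

```elixir
defmodule CommitSketch do
  # Hypothetical stand-in for commit/3; sidecar_bytes is pre-encoded JSON.
  def commit(path, bytes, sidecar_bytes) do
    sidecar = path <> ".provenance.json"

    with :ok <- File.mkdir_p(Path.dirname(path)),
         :ok <- atomic_write(path, bytes) do
      case atomic_write(sidecar, sidecar_bytes) do
        :ok ->
          {:ok, path}

        {:error, reason} ->
          # Roll back the product so it never exists without its sidecar.
          File.rm(path)
          {:error, {:provenance_write_failed, reason}}
      end
    end
  end

  # Write to a temp file in the same directory, then rename over the target.
  defp atomic_write(path, bytes) do
    # Unique within this VM only; a real implementation may need more.
    tmp = "#{path}.#{System.unique_integer([:positive])}.tmp"

    with :ok <- File.write(tmp, bytes),
         :ok <- File.rename(tmp, path) do
      :ok
    else
      {:error, reason} ->
        File.rm(tmp)
        {:error, reason}
    end
  end
end
```

Keeping the temp file in the target directory matters: a rename is only atomic when source and destination are on the same filesystem.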

default_dir()

@spec default_dir() :: String.t()

The default cache directory, :filename.basedir(:user_cache, "orbis/gnss").
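
To see where that resolves on a given machine, the underlying Erlang call can be evaluated directly. The location is platform-dependent; on Linux it is typically under $XDG_CACHE_HOME or ~/.cache, and on macOS under ~/Library/Caches. The to_string/1 call is a precaution in this sketch so the result is a binary whichever form the Erlang function returns.

```elixir
# Resolve the per-user cache directory for the "orbis/gnss" application name.
dir = :filename.basedir(:user_cache, "orbis/gnss") |> to_string()
IO.puts(dir)
```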

default_max_decompressed_bytes()

@spec default_max_decompressed_bytes() :: pos_integer()

The default gzip-bomb decompression cap, in bytes.

gunzip(compressed, max_bytes \\ 524_288_000)

@spec gunzip(binary(), pos_integer()) :: {:ok, binary()} | {:error, term()}

Decompress a gzip byte buffer, capping the output at max_bytes.

The cap protects against gzip bombs: decompression runs in a streaming inflate loop that feeds the compressed input in bounded slices, accumulates output chunk by chunk, and aborts the moment the running output size would exceed the limit. The remainder of a bomb is therefore never materialized, and peak memory stays bounded to roughly max_bytes. Returns {:ok, decompressed}, {:error, {:decompress_failed, reason}}, or {:error, {:decompress_size_exceeded, max_bytes, got}}.
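
A capped streaming inflate can be sketched with Erlang's :zlib module. This is a simplified illustration, not the module's implementation: GunzipSketch is a hypothetical name, the whole compressed buffer is queued at once rather than in bounded slices, and reporting the running size as the third element of the size error is an assumption. It relies on :zlib.safeInflate/2, which returns output in small chunks so the loop can stop early.

```elixir
defmodule GunzipSketch do
  # Hypothetical stand-in for gunzip/2.
  def gunzip(compressed, max_bytes) when is_binary(compressed) and max_bytes > 0 do
    z = :zlib.open()

    try do
      # Window bits 31 = 15 + 16 selects gzip framing.
      :ok = :zlib.inflateInit(z, 31)
      loop(z, :zlib.safeInflate(z, compressed), [], 0, max_bytes)
    rescue
      # Malformed or truncated input surfaces as an Erlang error such as :data_error.
      e in ErlangError -> {:error, {:decompress_failed, e.original}}
    after
      :zlib.close(z)
    end
  end

  defp loop(z, {status, chunk}, acc, size, max) do
    size = size + IO.iodata_length(chunk)

    cond do
      # Stop as soon as the running total passes the cap.
      size > max ->
        {:error, {:decompress_size_exceeded, max, size}}

      status == :finished ->
        # inflateEnd/1 raises if the end of the gzip stream was never reached.
        :ok = :zlib.inflateEnd(z)
        {:ok, IO.iodata_to_binary(Enum.reverse([chunk | acc]))}

      true ->
        # More output is pending; ask for the next chunk without new input.
        loop(z, :zlib.safeInflate(z, []), [chunk | acc], size, max)
    end
  end
end
```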

path_for(cache_dir, filename)

@spec path_for(String.t(), String.t()) :: {:ok, String.t()} | {:error, term()}

Resolve the absolute path a product would occupy in cache_dir.

The filename is the product's canonical long-name. The name is validated to contain no directory separators, no .., and no leading /, so a malformed product (or any future change to the catalog) can never escape the cache root.
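
The validation described above might look like the following sketch. PathSketch and the {:unsafe_filename, _} error shape are hypothetical (the text does not specify the error term), rejecting backslashes and empty names is an extra assumption, and the long-name in the usage lines is only an illustrative value.

```elixir
defmodule PathSketch do
  # Hypothetical stand-in for path_for/2.
  def path_for(cache_dir, filename) when is_binary(cache_dir) and is_binary(filename) do
    unsafe? =
      filename == "" or
        String.contains?(filename, ["/", "\\", ".."])

    if unsafe? do
      {:error, {:unsafe_filename, filename}}
    else
      # Safe to join: the name is a single path component.
      {:ok, Path.expand(Path.join(cache_dir, filename))}
    end
  end
end

PathSketch.path_for("/tmp/cache", "IGS0OPSFIN_20240010000_01D_15M_ORB.SP3")
PathSketch.path_for("/tmp/cache", "../etc/passwd")
```

Rejecting any separator already covers a leading /, and rejecting .. anywhere in the name is stricter than necessary but simple to reason about.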

read_provenance(path)

@spec read_provenance(String.t()) :: {:ok, map()} | :none | {:error, term()}

Read and decode a product's provenance sidecar, if present.

Returns {:ok, map}, :none when there is no sidecar, or {:error, reason} when the sidecar exists but cannot be decoded.
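
The three-way result can be sketched as follows. ProvenanceSketch is a hypothetical name, and the JSON decoder is injected as a function so the sketch stays independent of any particular JSON library; the sidecar naming follows the <name>.provenance.json convention described in the module overview.

```elixir
defmodule ProvenanceSketch do
  def sidecar_path(path), do: path <> ".provenance.json"

  # decode is any 1-arity function returning {:ok, term} | {:error, reason}.
  def read(path, decode) when is_function(decode, 1) do
    case File.read(sidecar_path(path)) do
      # No sidecar at all is not an error.
      {:error, :enoent} ->
        :none

      {:error, reason} ->
        {:error, reason}

      {:ok, json} ->
        case decode.(json) do
          {:ok, map} when is_map(map) -> {:ok, map}
          {:ok, other} -> {:error, {:not_a_map, other}}
          {:error, reason} -> {:error, reason}
        end
    end
  end
end
```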

sha256(bytes)

@spec sha256(binary()) :: String.t()

Compute the lowercase hex SHA-256 of a byte buffer.
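
An equivalent computation with the standard library is short enough to show in full. Sha256Sketch is a hypothetical name; the sketch assumes nothing beyond :crypto and Base.

```elixir
defmodule Sha256Sketch do
  # Lowercase hex SHA-256 of a binary.
  def sha256(bytes) when is_binary(bytes) do
    :crypto.hash(:sha256, bytes) |> Base.encode16(case: :lower)
  end
end

Sha256Sketch.sha256("abc")
# => "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
```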