Exgit.RepoHandle (exgit v0.1.0)

Opt-in process-based handle around a Repository.t() value.

The rest of Exgit exposes repositories as immutable values — you thread the return of every call through to the next. That's the right shape for single-threaded scripts and for call-sites that don't need sharing.

A RepoHandle is the right shape when:

  • Multiple processes need to share one repository's cache (e.g. one LiveView process per user, but all searching the same repo — you want the Promisor cache populated once, reused by all sessions).
  • You want to run background prefetch while foreground operations read the cache as it grows (see Exgit.FS.prefetch_async/3).
  • A single long-lived session progressively accumulates cache state across many calls and needs consistent read semantics.

It is not the right shape for short scripts, one-shot clones, or anywhere you'd rather pass a value than coordinate a process.

Concurrency model

Reads go directly to ETS (no GenServer call, no message copy of the repo). Each read takes about 1-2 µs and is safe under any number of concurrent readers.

Writes are serialized through the handle process. Two concurrent update/2 calls run one at a time; each sees the result of the previous. This means long-running update functions (e.g. an inline prefetch) will block other writes — if that matters, use Exgit.FS.prefetch_async/3, which does the network work outside the handle and only calls update/2 at the end to commit.
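The split between direct reads and serialized writes can be illustrated with plain OTP primitives. The sketch below is not Exgit's implementation: the SketchHandle module, its function names, and the :value key are hypothetical, and it stores an integer rather than a Repository.

```elixir
defmodule SketchHandle do
  # Hypothetical stand-in for the pattern described above; not Exgit's code.
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  # Ask the owner for its table once; later reads do not message the owner.
  def table(pid), do: GenServer.call(pid, :table)

  # Read path: a plain ETS lookup executed in the caller's process.
  def fetch(table) do
    case :ets.lookup(table, :value) do
      [{:value, value}] -> {:ok, value}
      [] -> {:error, :no_value}
    end
  end

  # Write path: every update goes through the owner's mailbox, one at a time.
  def update(pid, fun), do: GenServer.call(pid, {:update, fun})

  @impl true
  def init(initial) do
    table = :ets.new(:sketch_handle, [:set, :public, read_concurrency: true])
    :ets.insert(table, {:value, initial})
    {:ok, table}
  end

  @impl true
  def handle_call(:table, _from, table), do: {:reply, table, table}

  def handle_call({:update, fun}, _from, table) do
    [{:value, current}] = :ets.lookup(table, :value)
    :ets.insert(table, {:value, fun.(current)})
    {:reply, :ok, table}
  end
end

{:ok, pid} = SketchHandle.start_link(0)
table = SketchHandle.table(pid)

# 50 concurrent writers; the owner applies their functions one after another,
# so no increment is lost.
1..50
|> Enum.map(fn _ -> Task.async(fn -> SketchHandle.update(pid, &(&1 + 1)) end) end)
|> Enum.each(&Task.await/1)

{:ok, final} = SketchHandle.fetch(table)
```

Because each update function runs inside the owner, a slow function here would delay every other writer, which is the trade-off the paragraph above describes.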

Lifecycle

The handle owns its ETS table. When the process exits for any reason — normal stop, crash, Process.exit/2, supervisor shutdown — the table is automatically destroyed by the BEAM. Callers holding a dead handle get {:error, :dead_handle} from fetch/1.
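The table's dependence on its owner is standard ETS behaviour and can be observed without Exgit. In this sketch the owner is a bare spawned process and the stored value is a placeholder atom; both are illustrative assumptions rather than anything RepoHandle does internally.

```elixir
parent = self()

# A process that creates an ETS table, reports it, then waits.
owner =
  spawn(fn ->
    table = :ets.new(:lifetime_sketch, [:set, :public])
    :ets.insert(table, {:value, :repo_placeholder})
    send(parent, {:table, table})

    receive do
      :stop -> :ok
    end
  end)

table =
  receive do
    {:table, t} -> t
  end

# While the owner is alive, any process can read the public table.
alive_read = :ets.lookup(table, :value)

# Kill the owner and wait for the monitor to confirm the exit.
ref = Process.monitor(owner)
Process.exit(owner, :kill)

receive do
  {:DOWN, ^ref, :process, ^owner, _reason} -> :ok
end

# The table no longer exists once its owner is gone.
table_info = :ets.info(table)
```

A reader that still holds the old table reference has to treat a missing table as a normal outcome, which is what the error tuples from fetch/1 are for.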

Clients that want the handle to outlive a supervision tree are responsible for wiring it into the right supervisor themselves.

Example

{:ok, handle} = Exgit.RepoHandle.start_link(repo)

# Background prefetch — returns immediately.
{:ok, task} = Exgit.FS.prefetch_async(handle)

# Meanwhile, foreground reads work against the current snapshot.
repo_snapshot = Exgit.RepoHandle.fetch!(handle)
Exgit.FS.grep(repo_snapshot, "HEAD", "auth", max_count: 10)

# Wait for prefetch to finish, then do a full-repo search.
:ok = Exgit.FS.await_prefetch(task)
fresh_snapshot = Exgit.RepoHandle.fetch!(handle)
Exgit.FS.grep(fresh_snapshot, "HEAD", "auth") |> Enum.to_list()

Why ETS and not Agent.get/2

Agent.get/2 still sends a message to the agent process and copies the state back. For a Repository with a large Promisor cache that copy would be 10-100 MB per read — unacceptable for a LiveView that reads on every keystroke.

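ETS lookups on a :public table created with read_concurrency: true need no message round trip and are lock-free in the typical case. A lookup copies the stored term into the caller's heap, with the exception of binaries larger than 64 bytes, which are reference-counted and shared rather than copied. One lookup is ~1-2 µs; that figure presumably relies on the bulk of the cache being held in such binaries.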
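The two read paths can be compared side by side with a small stand-in payload. This is only a sketch: the payload is a map of short strings rather than a Repository, the table name is made up, and a single :timer.tc/1 sample is noisy, so the printed numbers should not be read as a benchmark.

```elixir
# Stand-in payload: 1_000 small entries (not a real Repository).
payload = Map.new(1..1_000, fn i -> {i, Integer.to_string(i)} end)

{:ok, agent} = Agent.start_link(fn -> payload end)

table = :ets.new(:read_path_sketch, [:set, :public, read_concurrency: true])
:ets.insert(table, {:value, payload})

# Agent read: a request message to the agent and a reply message back.
{agent_us, from_agent} = :timer.tc(fn -> Agent.get(agent, & &1) end)

# ETS read: the lookup runs in the calling process, no other process involved.
{ets_us, [{:value, from_ets}]} = :timer.tc(fn -> :ets.lookup(table, :value) end)

IO.puts("Agent.get: #{agent_us} µs, :ets.lookup: #{ets_us} µs")
```

Both paths return the same value; what differs is whether another process has to be scheduled to answer the read.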

Summary

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

fetch(handle)

Fetch the current repository value.

fetch!(handle)

Same as fetch/1, but returns the repo directly and raises ArgumentError if the handle is dead or if the table doesn't exist. Callers that want to tolerate dead handles should use fetch/1.

fetch_once(handle, key, fetch_fn, timeout \\ 300_000)

Run fetch_fn against the current repo, deduplicating concurrent callers with the same key.

put(handle, new_repo)

Replace the stored repository value wholesale.

start_link(initial_repo, opts \\ [])

Start a handle owning initial_repo.

stop(handle)

Stop the handle process and destroy its ETS table.

table(handle)

Get the ETS table reference for a handle. Exposed so very latency-sensitive callers can cache it across many reads.

update(handle, fun, timeout \\ 60000)

Apply fun to the current repo value and store the result.

Types

t()

@type t() :: pid() | atom()

update_result()

@type update_result() ::
  Exgit.Repository.t() | {:ok, Exgit.Repository.t()} | {:error, term()}

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

fetch(handle)

@spec fetch(t()) :: {:ok, Exgit.Repository.t()} | {:error, :dead_handle | :no_table}

Fetch the current repository value.

Fast-path ETS lookup: no message send to the handle process, no copy of the repo into this process's mailbox. Safe to call on every hot-loop iteration.

Returns {:ok, repo} on success or {:error, :dead_handle} / {:error, :no_table}. See fetch!/1 for a raising variant.
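One way a caller might branch on these return shapes is sketched below. FetchSketch and with_repo/2 are hypothetical names; fetch_fun stands in for fn -> Exgit.RepoHandle.fetch(handle) end so that the example runs without a real handle, and the {:handle_unavailable, reason} wrapper is an illustrative choice, not part of the library.

```elixir
defmodule FetchSketch do
  # Hypothetical helper. `fetch_fun` stands in for
  # `fn -> Exgit.RepoHandle.fetch(handle) end`.
  def with_repo(fetch_fun, on_repo)
      when is_function(fetch_fun, 0) and is_function(on_repo, 1) do
    case fetch_fun.() do
      {:ok, repo} ->
        # Work against the snapshot that was current at lookup time.
        {:ok, on_repo.(repo)}

      {:error, reason} when reason in [:dead_handle, :no_table] ->
        # Both documented errors mean there is nothing to read from.
        {:error, {:handle_unavailable, reason}}
    end
  end
end

# A list plays the role of the repo value here.
ok_result = FetchSketch.with_repo(fn -> {:ok, [1, 2, 3]} end, &length/1)
dead_result = FetchSketch.with_repo(fn -> {:error, :dead_handle} end, &length/1)
```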

fetch!(handle)

@spec fetch!(t()) :: Exgit.Repository.t()

Same as fetch/1, but returns the repo directly and raises ArgumentError if the handle is dead or if the table doesn't exist. Callers that want to tolerate dead handles should use fetch/1.

fetch_once(handle, key, fetch_fn, timeout \\ 300_000)

@spec fetch_once(
  t(),
  term(),
  (Exgit.Repository.t() -> {:ok, Exgit.Repository.t()} | {:error, term()}),
  timeout()
) :: {:ok, Exgit.Repository.t()} | {:error, term()}

Run fetch_fn against the current repo, deduplicating concurrent callers with the same key.

The canonical shape: multiple processes want to trigger the same expensive network fetch (e.g. prefetch commit history for blame). Without dedup, each caller fires its own identical network call — wasteful.

With fetch_once/4:

  • First caller for key runs fetch_fn(current_repo) OUTSIDE the handle (in a linked Task) so the handle stays responsive to other reads.
  • Subsequent concurrent callers with the same key do NOT re-run the fetch; they block waiting for the first caller's result.
  • When the task completes, its result is committed to the handle's ETS table and all waiters receive the same return value.

fetch_fn receives the current repo snapshot and must return {:ok, new_repo} or {:error, reason}.
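The deduplication steps listed above can be sketched with a GenServer that keeps a list of waiting callers per key. This is an illustration of the general technique under simplifying assumptions, not Exgit's code: SingleFlightSketch is a hypothetical module, the value lives in process state instead of ETS, and the crash handling described under Errors is left out (a raising fetch_fn would take this sketch down with it).

```elixir
defmodule SingleFlightSketch do
  # Hypothetical illustration of keyed deduplication; not Exgit's implementation.
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  def fetch_once(pid, key, fetch_fn, timeout \\ 5_000) do
    GenServer.call(pid, {:fetch_once, key, fetch_fn}, timeout)
  end

  @impl true
  def init(initial), do: {:ok, %{value: initial, waiters: %{}}}

  @impl true
  def handle_call({:fetch_once, key, fetch_fn}, from, state) do
    case state.waiters do
      %{^key => waiting} ->
        # A fetch for this key is already in flight: queue the caller, no new fetch.
        {:noreply, put_in(state.waiters[key], [from | waiting])}

      _ ->
        # First caller: run the fetch in a linked task so this process stays responsive.
        server = self()
        snapshot = state.value
        Task.start_link(fn -> send(server, {:fetched, key, fetch_fn.(snapshot)}) end)
        {:noreply, put_in(state.waiters[key], [from])}
    end
  end

  @impl true
  def handle_info({:fetched, key, result}, state) do
    # Hand the same result to every caller that was waiting on this key.
    {waiting, waiters} = Map.pop(state.waiters, key, [])
    Enum.each(waiting, &GenServer.reply(&1, result))

    # Commit only successful results; errors leave the stored value unchanged.
    value =
      case result do
        {:ok, new_value} -> new_value
        {:error, _reason} -> state.value
      end

    {:noreply, %{state | value: value, waiters: waiters}}
  end
end

{:ok, pid} = SingleFlightSketch.start_link(0)
runs = :counters.new(1, [])

# A slow "fetch" that records how many times it actually ran.
slow_fetch = fn value ->
  :counters.add(runs, 1, 1)
  Process.sleep(100)
  {:ok, value + 1}
end

results =
  1..3
  |> Enum.map(fn _ ->
    Task.async(fn -> SingleFlightSketch.fetch_once(pid, :history, slow_fetch) end)
  end)
  |> Enum.map(&Task.await/1)

run_count = :counters.get(runs, 1)
```

All three callers get the same tuple back while the counter shows that the slow function ran once.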

Example

# Three LiveView users trigger blame on the same file at once.
# Each tries to prefetch history. fetch_once ensures only ONE
# network fetch happens; the other two wait.
RepoHandle.fetch_once(handle, {:history, commit_sha}, fn repo ->
  Exgit.FS.prefetch_history(repo, "HEAD")
end)

Errors

Returns {:error, :dead_handle} if the handle isn't running. Errors returned by fetch_fn are propagated verbatim. If fetch_fn raises, throws, or exits, or if the fetch task is killed, all waiters receive {:error, {:fetch_crashed, reason}}.

put(handle, new_repo)

@spec put(t(), Exgit.Repository.t()) :: :ok

Replace the stored repository value wholesale.

Primarily a convenience for callers who've computed a new repo value outside the handle (e.g. an async prefetch task that finished) and want to commit it atomically without another round trip through the update function.

start_link(initial_repo, opts \\ [])

@spec start_link(
  Exgit.Repository.t(),
  keyword()
) :: GenServer.on_start()

Start a handle owning initial_repo.

Options are forwarded to GenServer.start_link/3. Common ones:

  • :name — register the handle under this name
  • :hibernate_after — hibernate when idle
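Since the options are passed on to GenServer.start_link/3, their effect can be shown with any GenServer. The sketch below uses a hypothetical OptsSketch module and a made-up name, :shared_repo_sketch, in place of a real handle; the 15-second hibernation interval is an arbitrary example value.

```elixir
defmodule OptsSketch do
  # Hypothetical minimal GenServer that forwards opts, as start_link/2 is documented to do.
  use GenServer

  def start_link(initial, opts \\ []) do
    GenServer.start_link(__MODULE__, initial, opts)
  end

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}
end

{:ok, pid} =
  OptsSketch.start_link(:repo_placeholder,
    # Register under a name so other processes need not pass the pid around.
    name: :shared_repo_sketch,
    # Hibernate after 15 seconds without messages.
    hibernate_after: 15_000
  )

# The registered name resolves to the same process and can be used in calls.
registered = Process.whereis(:shared_repo_sketch)
value = GenServer.call(:shared_repo_sketch, :get)
```

This is also why t() admits an atom: a handle started with :name can be addressed by that name instead of its pid.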

stop(handle)

@spec stop(t()) :: :ok

Stop the handle process and destroy its ETS table.

table(handle)

@spec table(t()) :: :ets.table()

Get the ETS table reference for a handle. Exposed so very latency-sensitive callers can cache it across many reads.

update(handle, fun, timeout \\ 60000)

@spec update(t(), (Exgit.Repository.t() -> update_result()), timeout()) ::
  :ok | {:error, term()}

Apply fun to the current repo value and store the result.

fun runs inside the handle process — keep it fast. If fun returns {:ok, new_repo} the handle is updated and :ok is returned. If fun returns {:error, reason} the handle is unchanged and the error is surfaced. Any other return is treated as the new repo value directly.
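The three return shapes can be summarised as a small pure function. This is a sketch of the rule as described, not the library's source: UpdateResultSketch and apply_update/2 are hypothetical, and the function returns a {reply, value_to_store} pair so the outcome is visible without a running handle.

```elixir
defmodule UpdateResultSketch do
  # Hypothetical helper: returns {reply_to_caller, value_to_store}.
  def apply_update(current, fun) when is_function(fun, 1) do
    case fun.(current) do
      # Explicit success: store the new value.
      {:ok, new_value} -> {:ok, new_value}
      # Explicit failure: surface the error and keep the old value.
      {:error, reason} -> {{:error, reason}, current}
      # Anything else is taken to be the new value itself.
      new_value -> {:ok, new_value}
    end
  end
end

# Integers stand in for repository values.
tagged = UpdateResultSketch.apply_update(1, fn v -> {:ok, v + 1} end)
failed = UpdateResultSketch.apply_update(1, fn _ -> {:error, :nope} end)
bare = UpdateResultSketch.apply_update(1, fn v -> v + 4 end)
```

One consequence of the catch-all clause is that a function that accidentally returns something other than a repo would have that value stored as the new repo, so returning {:ok, new_repo} explicitly is the safer habit.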

Raises on timeout (default 60s) to surface deadlocks rather than hide them.