Exgit.Blame (exgit v0.1.0)

Per-line authorship attribution for a file at a ref.

For each line of path at ref, blame/3 returns the commit that most recently introduced or modified that line, plus the commit's author metadata.

Semantics

Follows git blame --first-parent semantics:

  • Walks only the first-parent chain. A merge commit is traversed through its first parent, so a line that arrived via a merged branch is attributed to the merge commit itself, because that is where the line first appears on the first-parent chain.
  • No move/copy detection. Lines that moved or were copied between files are attributed to the commit that placed the line at its current path.
  • No rename following. If path was renamed at some commit in history, every line that predates the rename is attributed to the rename commit.
  • Lines are compared by exact byte equality. Whitespace changes count as changes.

This is the 80% version. Full git blame has ~15 years of heuristics (whitespace ignoring, --ignore-revs, patience diff, move and copy detection) that aren't implemented here. For agent workflows that ask "who introduced this line?" this is sufficient; for deep forensics, shell out to real git.
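
To make the first-parent walk concrete, here is a minimal, self-contained sketch over an in-memory history. It is not Exgit's implementation: the BlameSketch module, the {commit_id, lines} history shape, and the use of List.myers_difference/2 as the line diff are all assumptions made for illustration. Lines are compared by exact equality, and each line of the newest version is attributed to the most recent commit on the chain that inserted it.

```elixir
defmodule BlameSketch do
  # `history` is the first-parent chain, newest first: [{commit_id, lines}, ...].
  # (Hypothetical shape; a real repository would load blobs commit by commit.)
  def blame([{_tip, tip_lines} | _] = history) do
    # tracked: position in the version being examined => index in the tip version.
    tracked = for {_line, i} <- Enum.with_index(tip_lines), into: %{}, do: {i, i}
    owners = walk(history, tracked, %{})

    for {line, i} <- Enum.with_index(tip_lines) do
      %{line_number: i + 1, line: line, commit: Map.fetch!(owners, i)}
    end
  end

  # Root of the chain: every line still tracked was introduced here.
  defp walk([{root, _lines}], tracked, owners) do
    Enum.reduce(Map.values(tracked), owners, &Map.put(&2, &1, root))
  end

  defp walk([{child, child_lines}, {_id, parent_lines} = parent | rest], tracked, owners) do
    script = List.myers_difference(parent_lines, child_lines)

    {next, owners, _p, _c} =
      Enum.reduce(script, {%{}, owners, 0, 0}, fn {op, lines}, {next, owners, p, c} ->
        n = length(lines)

        case op do
          # Unchanged lines: keep tracking them at their position in the parent.
          :eq ->
            moved = for {k, t} <- tracked_in(tracked, c, lines), into: next, do: {p + k, t}
            {moved, owners, p + n, c + n}

          # Lines only in the parent: nothing in the tip depends on them.
          :del ->
            {next, owners, p + n, c}

          # Lines introduced by `child`: attribute them and stop tracking.
          :ins ->
            blamed = for {_k, t} <- tracked_in(tracked, c, lines), into: owners, do: {t, child}
            {next, blamed, p, c + n}
        end
      end)

    walk([parent | rest], next, owners)
  end

  # Tracked tip lines that fall inside a hunk starting at child position `c`.
  defp tracked_in(tracked, c, lines) do
    for {_line, k} <- Enum.with_index(lines), Map.has_key?(tracked, c + k) do
      {k, Map.fetch!(tracked, c + k)}
    end
  end
end

history = [
  {"c3", ["a", "b2", "c"]},
  {"c2", ["a", "b", "c"]},
  {"c1", ["a"]}
]

IO.inspect(BlameSketch.blame(history))
```

Because the sketch only ever diffs a commit against the next entry in the list, anything that happened on a merged side branch is invisible to it, which mirrors the first-parent behavior described above.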

API

{:ok, entries, repo} = Exgit.Blame.blame(repo, ref, path)

Each entry:

%{
  line_number: 1..N,
  line: "source text",
  commit_sha: <<20-byte raw sha>>,
  author_name: "Alice",
  author_email: "alice@example.com",
  author_time: 1_700_000_000,   # Unix seconds
  summary: "first line of commit message"
}
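
As a small illustration of working with this shape, the sketch below renders an entry as a single line loosely modeled on git blame output. The BlameFormat module and the sample entry are hypothetical; only the map keys come from the description above. Since commit_sha is a raw 20-byte binary, it is hex-encoded before display.

```elixir
defmodule BlameFormat do
  # Render one entry as "<short sha> (<author> <date> <line number>) <text>".
  def format_entry(entry) do
    short_sha =
      entry.commit_sha
      |> Base.encode16(case: :lower)
      |> binary_part(0, 8)

    # author_time is Unix seconds; show it as a UTC calendar date.
    date =
      entry.author_time
      |> DateTime.from_unix!()
      |> DateTime.to_date()
      |> Date.to_iso8601()

    "#{short_sha} (#{entry.author_name} #{date} #{entry.line_number}) #{entry.line}"
  end
end

# Made-up sample entry with a 20-byte raw SHA.
entry = %{
  line_number: 1,
  line: "defmodule Foo do",
  commit_sha: <<0xAB>> <> :binary.copy(<<0>>, 19),
  author_name: "Alice",
  author_email: "alice@example.com",
  author_time: 1_700_000_000,
  summary: "first line of commit message"
}

IO.puts(BlameFormat.format_entry(entry))
```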

On failure, returns {:error, :not_found} if path doesn't exist at ref, {:error, :not_a_blob} if path is a directory, and {:error, :unbounded_history} if the walk exceeds @max_commits_walked (a guard against hostile input).
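
One way a caller might branch on these results is sketched below. The BlameResult module and its messages are hypothetical; the tuples it matches are the ones listed above, plus a catch-all because the spec types the error reason as term().

```elixir
defmodule BlameResult do
  # Turn a blame return value into a short status message.
  def describe({:ok, entries, _repo}), do: "ok: #{length(entries)} lines attributed"
  def describe({:error, :not_found}), do: "error: path does not exist at ref"
  def describe({:error, :not_a_blob}), do: "error: path is a directory, not a file"
  def describe({:error, :unbounded_history}), do: "error: history walk exceeded the commit limit"
  # Any other reason is reported as-is rather than crashing.
  def describe({:error, other}), do: "error: #{inspect(other)}"
end

IO.puts(BlameResult.describe({:error, :not_found}))
```

The success tuple also carries a repository value. It presumably reflects any objects cached during the walk, so passing it to later calls instead of the original is likely the safer habit.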

Types

entry()

@type entry() :: %{
  line_number: pos_integer(),
  line: String.t(),
  commit_sha: binary(),
  author_name: String.t(),
  author_email: String.t(),
  author_time: integer(),
  summary: String.t()
}

Functions

blame(repo, reference, path, opts \\ [])

@spec blame(Exgit.Repository.t(), String.t() | binary(), String.t(), keyword()) ::
  {:ok, [entry()], Exgit.Repository.t()} | {:error, term()}

Produce per-line authorship attribution for path at reference.

Options

  • :auto_fetch (default true) - when the repo is a Promisor-backed lazy clone, blame needs commit history and historical blob versions that FS.prefetch/3 does not pull. With auto_fetch: true, blame transparently triggers a batched commit-graph fetch and a batched path-history blob fetch before starting the walk. The first blame call on a cold repo pays this one-time cost (typically 200-800 ms); subsequent calls are warm.

    With auto_fetch: false, blame does not trigger any network requests. If required objects aren't cached, blame truncates its walk at the first missing object and attributes the remaining lines to the current commit. This is useful when callers want predictable no-network behavior; for complete results they should call FS.prefetch_history/2 explicitly beforehand.

  • :on_handle - pass a RepoHandle pid. When set, auto-fetches route through RepoHandle.fetch_once/4, which deduplicates concurrent callers: N concurrent blames on the same file trigger one history fetch and one path-blob fetch instead of N of each. Without this option, each blame sees its own snapshot of the repo and triggers its own fetches, which is wasteful in shared-cache scenarios (LiveView, agent pools).
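
The two options might be passed as in the sketch below. The BlameOptions module and its function names are hypothetical. The blame function is injected as the last argument so the example can run without a repository; by default it points at Exgit.Blame.blame/4 as documented above. How a RepoHandle is started is not covered here, so the handle is simply assumed to be a pid.

```elixir
defmodule BlameOptions do
  # No-network blame: per the option description, the walk is truncated
  # at the first missing object instead of fetching it.
  def offline(repo, ref, path, blame \\ &Exgit.Blame.blame/4) do
    blame.(repo, ref, path, auto_fetch: false)
  end

  # Shared-cache blame: auto-fetches are routed through the given RepoHandle pid.
  def shared(repo, ref, path, handle, blame \\ &Exgit.Blame.blame/4) when is_pid(handle) do
    blame.(repo, ref, path, on_handle: handle)
  end
end

# Stub standing in for Exgit.Blame.blame/4; it just echoes the options it received.
stub = fn _repo, _ref, _path, opts -> {:ok, [], opts} end

IO.inspect(BlameOptions.offline(:fake_repo, "main", "README.md", stub))
IO.inspect(BlameOptions.shared(:fake_repo, "main", "README.md", self(), stub))
```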

Regardless of these options, every auto-fetch emits [:exgit, :blame, :auto_fetch, :start] and [:exgit, :blame, :auto_fetch, :stop] telemetry events, so that otherwise silent slowness is visible to operators.
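
A handler for these events could look like the following sketch. The BlameTelemetry module and the handler id are hypothetical, and the contents of the measurements and metadata maps are not documented here, so the handler only inspects them rather than assuming particular keys. Attaching requires the :telemetry package to be available at runtime.

```elixir
defmodule BlameTelemetry do
  @events [
    [:exgit, :blame, :auto_fetch, :start],
    [:exgit, :blame, :auto_fetch, :stop]
  ]

  # Build a log line without assuming which measurement or metadata keys exist.
  def format(event, measurements, metadata) do
    name = Enum.map_join(event, ".", &Atom.to_string/1)
    "#{name} measurements=#{inspect(measurements)} metadata=#{inspect(metadata)}"
  end

  # Handler with the four-argument shape that :telemetry expects.
  def handle_event(event, measurements, metadata, _config) do
    IO.puts(format(event, measurements, metadata))
  end

  # Call once at application start; needs the :telemetry dependency.
  def attach do
    :telemetry.attach_many("blame-auto-fetch-logger", @events, &__MODULE__.handle_event/4, nil)
  end
end

# The handler can be exercised directly, without :telemetry loaded.
BlameTelemetry.handle_event([:exgit, :blame, :auto_fetch, :start], %{}, %{}, nil)
```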