textmetrics/diff

Diff algorithms over arbitrary lists.

myers implements the O(ND) algorithm of Myers (1986) and produces an optimal edit script. patience implements Bram Cohen’s patience diff over List(String), which often yields more readable diffs for source code with moved blocks. to_unified renders an edit script of strings in the POSIX unified-diff format.

Types

Which name field rejected the input.

pub type NameField {
  OldName
  NewName
}

Constructors

  • OldName
  • NewName

Validated options for to_unified.

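Construct via unified_options or unified_options_checked; override fields through with_context_lines, with_old_name, and with_new_name (or the _checked variants of the name setters). Read fields via the old_name / new_name / context_lines accessors.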

pub opaque type UnifiedOptions

Returned by with_context_lines when given a negative argument, and by unified_options_checked / with_old_name_checked / with_new_name_checked when the name contains a byte that would corrupt the unified-diff header (\n, \r, \u{0000}, \t).

pub type UnifiedOptionsError {
  ContextLinesNegative(got: Int)
  NameContainsForbiddenBytes(field: NameField, value: String)
}

Constructors

  • ContextLinesNegative(got: Int)
  • NameContainsForbiddenBytes(field: NameField, value: String)

Values

pub fn context_lines(options: UnifiedOptions) -> Int

Read the context-line count.

pub fn myers(old: List(a), new: List(a)) -> List(edit.Edit(a))

Optimal edit script transforming old into new, computed by the Myers (1986) O(ND) algorithm.

The script’s cost (the number of Insert steps plus the number of Delete steps) equals the levenshtein_list distance between old and new when a substitution is counted as one insert plus one delete. When several optimal scripts exist, tie-breaking places deletions ahead of insertions, matching GNU diff(1).
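A minimal usage sketch, assuming the module is imported under the path shown in the page title (textmetrics/diff). Because myers is generic over the element type, the lists here hold Int values. The script is printed with string.inspect so the example does not depend on the exact constructor names of edit.Edit.

```gleam
import gleam/io
import gleam/string
import textmetrics/diff

pub fn main() {
  // myers accepts lists of any element type, not only lines of text.
  let script = diff.myers([1, 2, 3, 4], [1, 3, 4, 5])

  // Inspect the edit steps without assuming how edit.Edit is defined.
  io.println(string.inspect(script))
}
```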

pub fn new_name(options: UnifiedOptions) -> String

Read the new-file label.

pub fn old_name(options: UnifiedOptions) -> String

Read the old-file label.

pub fn patience(
  old: List(String),
  new: List(String),
) -> List(edit.Edit(String))

Patience diff (Bram Cohen). Identifies anchor lines that are unique in both inputs, computes the longest increasing subsequence of those anchors, and recursively diffs the surrounding segments. Falls back to myers at leaves where no unique anchors exist. Operates on List(String).
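As a rough illustration of the List(String) input, the sketch below splits two texts into lines before calling patience. The helper name diff_source is hypothetical and not part of this module.

```gleam
import gleam/string
import textmetrics/diff

// Hypothetical helper: diff two source texts line by line.
pub fn diff_source(old_text: String, new_text: String) {
  // patience works on lists of strings, so split each text on newlines first.
  let old_lines = string.split(old_text, on: "\n")
  let new_lines = string.split(new_text, on: "\n")

  diff.patience(old_lines, new_lines)
}
```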

pub fn to_unified(
  script: List(edit.Edit(String)),
  options: UnifiedOptions,
) -> String

Render an edit script of strings in POSIX unified-diff format.

When the script contains no Insert or Delete steps the output is exactly the empty string.
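The following sketch strings the pieces together: build options, compute a script, and render it. The file labels and sample lines are made up for illustration. The final check relies on the behaviour described above for scripts without Insert or Delete steps, which is what diffing a list against itself should produce.

```gleam
import gleam/io
import textmetrics/diff

pub fn main() {
  let old = ["alpha", "beta", "gamma"]
  let new = ["alpha", "delta", "gamma"]

  let options = diff.unified_options(old_name: "a.txt", new_name: "b.txt")

  // Render the Myers script as unified-diff text and print it.
  diff.myers(old, new)
  |> diff.to_unified(options)
  |> io.println

  // Identical inputs need no Insert or Delete steps, so the documented
  // result is the empty string.
  let assert "" = diff.to_unified(diff.myers(old, old), options)
  Nil
}
```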

pub fn unified_options(
  old_name old_name: String,
  new_name new_name: String,
) -> UnifiedOptions

Default constructor: context_lines = 3, matching the default of POSIX diff -u. Silently strips bytes that would corrupt the --- <old_name> / +++ <new_name> header (\n, \r, \u{0000}, \t); see unified_options_checked for the strict variant that surfaces those bytes as a typed error.
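A small sketch of the lenient behaviour, with a made-up path that contains a tab. The function name sanitised_label is hypothetical. Going by the description above, the label read back through old_name should no longer contain the tab.

```gleam
import textmetrics/diff

// Hypothetical helper: build options from a label containing a tab and
// read the stored label back.
pub fn sanitised_label() -> String {
  // The tab is one of the bytes the lenient builder is documented to strip.
  let options =
    diff.unified_options(old_name: "src/a\tb.txt", new_name: "src/c.txt")

  diff.old_name(options)
}
```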

pub fn unified_options_checked(
  old_name old_name: String,
  new_name new_name: String,
) -> Result(UnifiedOptions, UnifiedOptionsError)

Strict counterpart of unified_options. Returns Error(NameContainsForbiddenBytes(field, value)) when either old_name or new_name contains \n, \r, \u{0000}, or \t. The non-strict variant silently strips those bytes; callers passing user-supplied paths should reach for this builder so the bad input surfaces at the call site instead of producing a label that disagrees with what was passed in.
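One way to handle the strict builder's result is sketched below. The function name describe and the message strings are illustrative only. The ContextLinesNegative branch is there because the case expression must cover every constructor of UnifiedOptionsError, although this builder is only documented to return NameContainsForbiddenBytes.

```gleam
import gleam/string
import textmetrics/diff

// Hypothetical helper: turn the builder's result into a message.
pub fn describe(old_name: String, new_name: String) -> String {
  case diff.unified_options_checked(old_name: old_name, new_name: new_name) {
    Ok(options) ->
      "ok: " <> diff.old_name(options) <> " -> " <> diff.new_name(options)

    // The field tells the caller which of the two labels was rejected.
    Error(diff.NameContainsForbiddenBytes(field: diff.OldName, value: value)) ->
      "old_name rejected: " <> string.inspect(value)
    Error(diff.NameContainsForbiddenBytes(field: diff.NewName, value: value)) ->
      "new_name rejected: " <> string.inspect(value)

    // Needed for exhaustiveness; not documented as a result of this builder.
    Error(diff.ContextLinesNegative(got: _)) -> "unexpected error"
  }
}
```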

pub fn with_context_lines(
  options: UnifiedOptions,
  n: Int,
) -> Result(UnifiedOptions, UnifiedOptionsError)

Override the context-line count. Returns Error(ContextLinesNegative(n)) when n < 0.
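A short sketch of both outcomes, using made-up labels and hypothetical function names. Since with_context_lines takes the options as its first argument, it can be used with the pipe operator.

```gleam
import textmetrics/diff

// Hypothetical helper: default options widened to five context lines.
pub fn wide_context() -> Result(diff.UnifiedOptions, diff.UnifiedOptionsError) {
  diff.unified_options(old_name: "a.txt", new_name: "b.txt")
  |> diff.with_context_lines(5)
}

// Hypothetical check: a negative count should come back as the documented
// ContextLinesNegative error carrying the rejected value.
pub fn negative_is_rejected() -> Bool {
  let options = diff.unified_options(old_name: "a.txt", new_name: "b.txt")

  diff.with_context_lines(options, -1)
  == Error(diff.ContextLinesNegative(got: -1))
}
```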

pub fn with_new_name(
  options: UnifiedOptions,
  name: String,
) -> UnifiedOptions

Override the new-file label. Silently strips \n / \r / \u{0000} / \t — see with_new_name_checked for the strict variant.

pub fn with_new_name_checked(
  options: UnifiedOptions,
  name: String,
) -> Result(UnifiedOptions, UnifiedOptionsError)

Strict counterpart of with_new_name. Returns Error(NameContainsForbiddenBytes(NewName, value)) when name contains \n, \r, \u{0000}, or \t.

pub fn with_old_name(
  options: UnifiedOptions,
  name: String,
) -> UnifiedOptions

Override the old-file label. Silently strips \n / \r / \u{0000} / \t — see with_old_name_checked for the strict variant.

pub fn with_old_name_checked(
  options: UnifiedOptions,
  name: String,
) -> Result(UnifiedOptions, UnifiedOptionsError)

Strict counterpart of with_old_name. Returns Error(NameContainsForbiddenBytes(OldName, value)) when name contains \n, \r, \u{0000}, or \t.
