textmetrics/diff
Diff algorithms over arbitrary lists.
myers implements the O(ND) algorithm of Myers (1986)
and produces an optimal edit script.
patience implements Bram Cohen’s patience diff over
List(String), which often yields more readable diffs for source
code with moved blocks.
to_unified renders an edit script of strings in
the POSIX unified-diff format.
Types
Which name field rejected the input.
pub type NameField {
OldName
NewName
}
Constructors
- OldName
- NewName
Validated options for to_unified.
Construct via unified_options; override fields
through with_context_lines, with_old_name, and
with_new_name (or their _checked variants). Read fields via
the old_name / new_name /
context_lines accessors.
pub opaque type UnifiedOptions
Returned by with_context_lines when given
a negative argument, and by unified_options_checked /
with_old_name_checked /
with_new_name_checked when the name
contains a byte that would corrupt the unified-diff header
(\n, \r, \u{0000}, \t).
pub type UnifiedOptionsError {
ContextLinesNegative(got: Int)
NameContainsForbiddenBytes(field: NameField, value: String)
}
Constructors
- ContextLinesNegative(got: Int)
- NameContainsForbiddenBytes(field: NameField, value: String)
Values
pub fn myers(old: List(a), new: List(a)) -> List(edit.Edit(a))
Optimal edit script transforming old into new, computed by the
Myers (1986) O(ND) algorithm.
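The script’s cost (the number of Insert plus Delete steps) equals
levenshtein_list of old vs new when a substitution is counted as one
insert plus one delete, that is, length(old) + length(new) minus twice
the length of their longest common subsequence. Tie-breaking prefers
earlier deletions over insertions, matching GNU diff(1).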
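The greedy search described above can be sketched in Python. This is an illustrative, unoptimized version of the O(ND) idea and not the library's implementation; the function names and the ("keep" | "insert" | "delete", item) tuples are hypothetical stand-ins for edit.Edit values.

```python
def myers_script(old, new):
    """Return a shortest insert/delete script as (op, item) tuples (sketch)."""
    n, m = len(old), len(new)
    v = {1: 0}   # v[k] = furthest x reached so far on diagonal k = x - y
    trace = []   # copy of v taken at the start of each round, for backtracking
    for d in range(n + m + 1):
        trace.append(dict(v))
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # step down: an insertion
            else:
                x = v[k - 1] + 1    # step right: a deletion (wins ties)
            y = x - k
            # Follow the "snake": consume matching items for free.
            while x < n and y < m and old[x] == new[y]:
                x, y = x + 1, y + 1
            v[k] = x
            if x >= n and y >= m:
                return _backtrack(old, new, trace, d)
    return []  # not reached: d = n + m always suffices


def _backtrack(old, new, trace, d_final):
    """Replay the recorded rounds backwards to recover the edit script."""
    x, y = len(old), len(new)
    script = []
    for d in range(d_final, -1, -1):
        v = trace[d]
        k = x - y
        if k == -d or (k != d and v[k - 1] < v[k + 1]):
            prev_k = k + 1
        else:
            prev_k = k - 1
        prev_x = v[prev_k]
        prev_y = prev_x - prev_k
        while x > prev_x and y > prev_y:   # undo the snake
            x, y = x - 1, y - 1
            script.append(("keep", old[x]))
        if d > 0:
            if x == prev_x:
                script.append(("insert", new[prev_y]))
            else:
                script.append(("delete", old[prev_x]))
        x, y = prev_x, prev_y
    script.reverse()
    return script
```

Counting the non-"keep" steps of the returned script gives the insert-plus-delete cost discussed above. A production implementation would typically use a flat array for v and avoid copying it every round.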
pub fn patience(
old: List(String),
new: List(String),
) -> List(edit.Edit(String))
Patience diff (Bram Cohen). Identifies anchor lines that are unique
in both inputs, computes the longest increasing subsequence of
those anchors, and recursively diffs the surrounding segments.
Falls back to myers at leaves where no unique anchors
exist. Operates on List(String).
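The three steps above (unique anchors, longest increasing subsequence, recursion on the gaps) can be sketched in Python as follows. This is an assumption-laden illustration rather than the library's code: all names are hypothetical, the script uses ("keep" | "insert" | "delete", line) tuples, and the leaf fallback uses the standard library's difflib.SequenceMatcher as a stand-in for myers.

```python
from bisect import bisect_left
from collections import Counter
import difflib


def _fallback(old, new):
    """Stand-in for the Myers fallback; any correct line diff works here."""
    out = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out += [("keep", s) for s in old[i1:i2]]
        else:
            out += [("delete", s) for s in old[i1:i2]]
            out += [("insert", s) for s in new[j1:j2]]
    return out


def _unique_anchors(old, new):
    """Pairs (i, j) with old[i] == new[j] where the line is unique in both."""
    old_count, new_count = Counter(old), Counter(new)
    new_index = {s: j for j, s in enumerate(new) if new_count[s] == 1}
    return [(i, new_index[s]) for i, s in enumerate(old)
            if old_count[s] == 1 and s in new_index]


def _longest_increasing(pairs):
    """Longest run of pairs (already sorted by i) with increasing j."""
    tails, tail_idx, prev = [], [], [None] * len(pairs)
    for idx, (_, j) in enumerate(pairs):
        pos = bisect_left(tails, j)
        if pos == len(tails):
            tails.append(j)
            tail_idx.append(idx)
        else:
            tails[pos] = j
            tail_idx[pos] = idx
        prev[idx] = tail_idx[pos - 1] if pos > 0 else None
    chain = []
    idx = tail_idx[-1] if tail_idx else None
    while idx is not None:
        chain.append(pairs[idx])
        idx = prev[idx]
    return chain[::-1]


def patience(old, new):
    anchors = _longest_increasing(_unique_anchors(old, new))
    if not anchors:
        return _fallback(old, new)   # leaf: no unique anchors left
    out, oi, nj = [], 0, 0
    for i, j in anchors:
        out += patience(old[oi:i], new[nj:j])   # diff the gap before the anchor
        out.append(("keep", old[i]))
        oi, nj = i + 1, j + 1
    return out + patience(old[oi:], new[nj:])
```

Lines such as a lone closing brace tend to repeat, so they never become anchors, which is why the result usually aligns on distinctive lines instead. Real implementations often also trim common prefixes and suffixes first; that step is omitted here.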
pub fn to_unified(
script: List(edit.Edit(String)),
options: UnifiedOptions,
) -> String
Render an edit script of strings in POSIX unified-diff format.
When the script contains no Insert or Delete steps the output
is exactly the empty string.
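For readers unfamiliar with the format, here is a small Python illustration using the standard library's difflib.unified_diff as a stand-in. It is not this module's renderer, and details such as trailing newlines or hunk-header layout in to_unified's output may differ; the render helper and file names are hypothetical.

```python
import difflib


def render(old, new, old_name="a.txt", new_name="b.txt", context=3):
    """Render a unified diff of two line lists as a single string."""
    lines = difflib.unified_diff(
        old, new, fromfile=old_name, tofile=new_name, n=context, lineterm=""
    )
    return "\n".join(lines)


old = ["alpha", "beta", "gamma"]
new = ["alpha", "BETA", "gamma"]

print(render(old, new))
# --- a.txt
# +++ b.txt
# @@ -1,3 +1,3 @@
#  alpha
# -beta
# +BETA
#  gamma

# Identical inputs produce no header and no hunks at all.
print(repr(render(old, old)))  # ''
```

The second call mirrors the empty-string behaviour described above: with nothing inserted or deleted, not even the --- / +++ header is emitted.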
pub fn unified_options(
old_name old_name: String,
new_name new_name: String,
) -> UnifiedOptions
Default constructor: context_lines = 3, matching the default of POSIX
diff -u (equivalent to diff -U 3). Silently strips bytes that would corrupt the
--- <old_name> / +++ <new_name> header (\n, \r,
\u{0000}, \t); see unified_options_checked
for the strict variant that surfaces those bytes as a typed error.
pub fn unified_options_checked(
old_name old_name: String,
new_name new_name: String,
) -> Result(UnifiedOptions, UnifiedOptionsError)
Strict counterpart of unified_options.
Returns Error(NameContainsForbiddenBytes(field, value)) when
either old_name or new_name contains \n, \r, \u{0000},
or \t. The non-strict variant silently strips those bytes;
callers passing user-supplied paths should reach for this builder
so the bad input surfaces at the call site instead of producing a
label that disagrees with what was passed in.
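The difference between the lenient and strict builders can be sketched in Python. This is a hypothetical illustration of the two policies, not the library's code: strip_name, check_name, and the exception class are invented names, and the exception stands in for the Error(NameContainsForbiddenBytes(field, value)) result.

```python
FORBIDDEN = ("\n", "\r", "\x00", "\t")


class NameContainsForbiddenBytes(ValueError):
    """Stand-in for the typed error returned by the strict builders."""

    def __init__(self, field, value):
        super().__init__(f"{field} contains a forbidden character: {value!r}")
        self.field = field
        self.value = value


def strip_name(name):
    """Lenient policy: silently drop forbidden characters."""
    return "".join(ch for ch in name if ch not in FORBIDDEN)


def check_name(field, name):
    """Strict policy: reject the name and report which field was bad."""
    if any(ch in name for ch in FORBIDDEN):
        raise NameContainsForbiddenBytes(field, name)
    return name


user_path = "src/main\n.gleam"

# Lenient: the label no longer matches what the caller passed in.
print(strip_name(user_path))  # src/main.gleam

# Strict: the problem surfaces at the call site.
try:
    check_name("OldName", user_path)
except NameContainsForbiddenBytes as err:
    print(err.field, repr(err.value))
```

With the lenient policy the header stays well-formed but the caller never learns the path was altered, which is the trade-off the paragraph above warns about for user-supplied paths.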
pub fn with_context_lines(
options: UnifiedOptions,
n: Int,
) -> Result(UnifiedOptions, UnifiedOptionsError)
Override the context-line count. Returns
Error(ContextLinesNegative(n)) when n < 0.
pub fn with_new_name(
options: UnifiedOptions,
name: String,
) -> UnifiedOptions
Override the new-file label. Silently strips \n / \r /
\u{0000} / \t — see with_new_name_checked
for the strict variant.
pub fn with_new_name_checked(
options: UnifiedOptions,
name: String,
) -> Result(UnifiedOptions, UnifiedOptionsError)
Strict counterpart of with_new_name. Returns
Error(NameContainsForbiddenBytes(NewName, value)) when name
contains \n, \r, \u{0000}, or \t.
pub fn with_old_name(
options: UnifiedOptions,
name: String,
) -> UnifiedOptions
Override the old-file label. Silently strips \n / \r /
\u{0000} / \t — see with_old_name_checked
for the strict variant.
pub fn with_old_name_checked(
options: UnifiedOptions,
name: String,
) -> Result(UnifiedOptions, UnifiedOptionsError)
Strict counterpart of with_old_name. Returns
Error(NameContainsForbiddenBytes(OldName, value)) when name
contains \n, \r, \u{0000}, or \t.