Computes structural fingerprints (hashes) for AST subtrees.
Every subtree whose mass (node count) meets the threshold is hashed. Two normalized ASTs that produce the same hash are treated as structurally identical clones.
Each fragment also carries a set of lightweight sub-hashes from its child subtrees, computed during the same walk, for efficient Jaccard-based fuzzy candidate pruning in ExDNA.Detection.Fuzzy.
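To illustrate the pruning idea, the Jaccard similarity of two sub-hash sets is the size of their intersection divided by the size of their union. The sketch below is not taken from ExDNA.Detection.Fuzzy: the module name `JaccardSketch`, the functions `similarity/2` and `candidate?/3`, and the 0.5 cutoff are all assumptions made for illustration.

```elixir
defmodule JaccardSketch do
  # Jaccard similarity of two sub-hash sets: |A ∩ B| / |A ∪ B|.
  # Two empty sets are treated as identical (1.0) by convention in this sketch.
  def similarity(a, b) do
    union_size = MapSet.size(MapSet.union(a, b))

    if union_size == 0 do
      1.0
    else
      MapSet.size(MapSet.intersection(a, b)) / union_size
    end
  end

  # Hypothetical pruning predicate: keep a pair only if it clears the cutoff.
  def candidate?(a, b, cutoff \\ 0.5), do: similarity(a, b) >= cutoff
end

a = MapSet.new([1, 2, 3, 4])
b = MapSet.new([3, 4, 5, 6])

# Two shared sub-hashes out of six distinct ones, so the pair is pruned
# under the assumed 0.5 cutoff.
JaccardSketch.candidate?(a, b)
```

Comparing small integer sets this way is cheap, which is presumably why it can serve as a filter before any more expensive structural comparison.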
Sliding windows over sibling sequences in module bodies are fingerprinted to catch clones that span multiple adjacent statements.
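A sliding window over siblings can be sketched with `Enum.chunk_every/4`. This is only an illustration under stated assumptions: the module `WindowSketch`, the fixed window size, and the MD5-over-`term_to_binary` digest are hypothetical choices, not the library's actual windowing or hashing scheme.

```elixir
defmodule WindowSketch do
  # All contiguous windows of `size` sibling statements, stepping by one.
  # Trailing windows shorter than `size` are discarded.
  def windows(siblings, size) when is_integer(size) and size > 0 do
    Enum.chunk_every(siblings, size, 1, :discard)
  end

  # Fingerprint each window as a single unit (assumed hashing scheme).
  def window_hashes(siblings, size) do
    for window <- windows(siblings, size) do
      :erlang.md5(:erlang.term_to_binary(window))
    end
  end
end

# Atoms stand in for the normalized statements of a module body.
WindowSketch.windows([:s1, :s2, :s3, :s4], 2)
```

With four statements and a window of two, this yields three overlapping windows: `[:s1, :s2]`, `[:s2, :s3]`, and `[:s3, :s4]`. A clone made of two adjacent statements then matches as one unit even though no single subtree contains both.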
Types
@type fragment() :: %{
  hash: hash(),
  mass: pos_integer(),
  ast: Macro.t(),
  file: String.t(),
  line: pos_integer(),
  sub_hashes: MapSet.t(integer())
}
@type hash() :: binary()
Functions
Compute a deterministic hash for a normalized AST.
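One way to obtain a deterministic binary hash for an AST is to serialize the term and digest it. The sketch below is an assumption-laden illustration, not the library's implementation: `HashSketch`, `strip_meta/1`, and the choice of MD5 over `:erlang.term_to_binary/1` are hypothetical, and real normalization may do more than drop metadata.

```elixir
defmodule HashSketch do
  # Drop metadata (line numbers and similar) so source position does not
  # affect the hash. This stands in for whatever normalization is applied.
  def strip_meta(ast) do
    Macro.prewalk(ast, fn node ->
      Macro.update_meta(node, fn _meta -> [] end)
    end)
  end

  # Serialize the term and digest it; the result is a 16-byte binary.
  def hash(normalized_ast) do
    :erlang.md5(:erlang.term_to_binary(normalized_ast))
  end
end

# The same expression parsed at different line numbers.
one = Code.string_to_quoted!("foo(x) + 1")
two = Code.string_to_quoted!("\n\nfoo(x) + 1")

HashSketch.hash(HashSketch.strip_meta(one)) ==
  HashSketch.hash(HashSketch.strip_meta(two))
```

The two parses differ only in their `:line` metadata, so once metadata is stripped they serialize to the same bytes and hash identically.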
@spec fragments(Macro.t(), String.t(), pos_integer(), keyword()) :: [fragment()]
Walk an AST and return all subtree fragments that meet min_mass.
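The walk can be sketched as a pre-order traversal that keeps every three-element AST node whose mass reaches the threshold. Everything here is an assumption for illustration: the argument order (ast, file, min_mass, opts) is inferred from the spec's types, the `mass/1` and `hash/1` helpers are simplified stand-ins, `sub_hashes` is omitted, and mass is recomputed per node for brevity rather than accumulated during a single walk.

```elixir
defmodule FragmentSketch do
  # Assumed argument order: ast, file, min_mass, opts (opts unused here).
  def fragments(ast, file, min_mass, _opts \\ []) do
    {_ast, acc} =
      Macro.prewalk(ast, [], fn
        {_form, meta, _args} = node, acc when is_list(meta) ->
          m = mass(node)

          if m >= min_mass do
            frag = %{
              hash: hash(node),
              mass: m,
              ast: node,
              file: file,
              line: Keyword.get(meta, :line, 1)
            }

            {node, [frag | acc]}
          else
            {node, acc}
          end

        node, acc ->
          {node, acc}
      end)

    # Fragments were prepended, so reverse to restore pre-order.
    Enum.reverse(acc)
  end

  # Count every node visited by a pre-order walk.
  defp mass(ast) do
    {_ast, count} = Macro.prewalk(ast, 0, fn node, n -> {node, n + 1} end)
    count
  end

  # Strip metadata, then digest the serialized term (assumed scheme).
  defp hash(ast) do
    stripped = Macro.prewalk(ast, &Macro.update_meta(&1, fn _ -> [] end))
    :erlang.md5(:erlang.term_to_binary(stripped))
  end
end

ast = Code.string_to_quoted!("foo(a + b, c)")
frags = FragmentSketch.fragments(ast, "lib/a.ex", 3)
Enum.map(frags, & &1.mass)
```

With a threshold of 3, only the outer `foo(...)` call and the inner `a + b` qualify; the bare variables fall below the threshold and are skipped.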
@spec mass(Macro.t()) :: non_neg_integer()
Count the number of AST nodes in a tree (its "mass").
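As a rough sketch of what "mass" could mean, the snippet below counts every node that `Macro.prewalk/3` visits. Which nodes the real function counts is not stated in this page, so `MassSketch` and its counting rule are assumptions.

```elixir
defmodule MassSketch do
  # Count every node visited by a pre-order walk: calls, variables, literals.
  def mass(ast) do
    {_ast, count} = Macro.prewalk(ast, 0, fn node, n -> {node, n + 1} end)
    count
  end
end

# A lone variable is a single node.
MassSketch.mass(quote(do: x))

# The + call, a, the * call, b, and the literal 2.
MassSketch.mass(quote(do: a + b * 2))
```

Under this counting rule the second expression has a mass of 5, so it would be fingerprinted with a threshold of 5 or lower and ignored with a higher one.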