ExDNA.AST.Fingerprint (ExDNA v1.5.0)

Computes structural fingerprints (hashes) for AST subtrees.

Every subtree whose mass (node count) is at least the min_mass threshold is hashed. Two normalized ASTs with the same hash are treated as structurally identical clones.

Each fragment also carries a set of lightweight sub-hashes from its child subtrees, computed during the same walk, for efficient Jaccard-based fuzzy candidate pruning in ExDNA.Detection.Fuzzy.
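
As a rough illustration of Jaccard-based pruning, the sketch below compares two sets of sub-hashes and keeps a candidate pair only when their overlap reaches a threshold. The module `JaccardSketch`, its functions, and the threshold are hypothetical; this is not the code in ExDNA.Detection.Fuzzy, only the general technique.

```elixir
defmodule JaccardSketch do
  # Hypothetical helper, not part of ExDNA's documented API.
  # Jaccard similarity is |A ∩ B| / |A ∪ B|; two empty sets are
  # treated as identical (1.0) to avoid dividing by zero.
  def similarity(a, b) do
    union_size = MapSet.size(MapSet.union(a, b))

    if union_size == 0 do
      1.0
    else
      MapSet.size(MapSet.intersection(a, b)) / union_size
    end
  end

  # Keep a candidate pair only if its sub-hash sets overlap enough.
  def candidate?(a, b, threshold), do: similarity(a, b) >= threshold
end

left = MapSet.new([11, 22, 33, 44])
right = MapSet.new([22, 33, 44, 55])

# Three shared sub-hashes out of five distinct ones: similarity is 0.6.
IO.inspect(JaccardSketch.similarity(left, right))
```

Because set intersection and union are cheap compared with a full tree comparison, a check like this can discard most non-matching pairs before any expensive fuzzy comparison runs.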

Sliding windows over sibling sequences in module bodies are fingerprinted to catch clones that span multiple adjacent statements.
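
A minimal sketch of the sliding-window idea, assuming a fixed window size and a step of one statement. `WindowSketch` and the window size are hypothetical; the page does not say which sizes ExDNA uses or how it wraps each window before hashing.

```elixir
defmodule WindowSketch do
  # Hypothetical helper: every contiguous run of `size` sibling statements.
  # Runs shorter than `size` at the end of the body are discarded.
  def windows(statements, size) when is_integer(size) and size > 0 do
    Enum.chunk_every(statements, size, 1, :discard)
  end
end

# A quoted block with several expressions is a {:__block__, meta, statements} node.
{:__block__, _meta, body} =
  quote do
    a = 1
    b = 2
    c = a + b
    IO.puts(c)
  end

# Four statements yield three overlapping windows of size 2.
IO.inspect(length(WindowSketch.windows(body, 2)))
```

Each window could then be fingerprinted like any other subtree, which is how a clone spanning two or three adjacent statements becomes detectable even when no single statement is large enough on its own.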

Types

fragment()

@type fragment() :: %{
  hash: hash(),
  mass: pos_integer(),
  ast: Macro.t(),
  file: String.t(),
  line: pos_integer(),
  sub_hashes: MapSet.t(integer())
}

hash()

@type hash() :: binary()

Functions

compute_hash(normalized_ast)

@spec compute_hash(Macro.t()) :: hash()

Compute a deterministic hash for a normalized AST.
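
One way such a hash could be computed is to serialize the normalized term and digest the bytes. The sketch below is an assumption, not ExDNA's implementation: `HashSketch`, the choice of MD5, and the metadata-stripping step (a stand-in for whatever normalization ExDNA performs) are all hypothetical.

```elixir
defmodule HashSketch do
  # Stand-in normalization (assumption): drop metadata such as line
  # numbers from every call and variable node.
  def strip_meta(ast) do
    Macro.prewalk(ast, fn
      {form, _meta, args} -> {form, [], args}
      other -> other
    end)
  end

  # Serialize the term and digest it; :erlang.md5/1 returns a 16-byte binary.
  def hash(normalized_ast) do
    :erlang.md5(:erlang.term_to_binary(normalized_ast))
  end
end

# The same expression parsed on different lines differs only in metadata.
first = Code.string_to_quoted!("a + b")
second = Code.string_to_quoted!("\na + b")

# After stripping metadata both trees are the same term, so the hashes match.
IO.inspect(
  HashSketch.hash(HashSketch.strip_meta(first)) ==
    HashSketch.hash(HashSketch.strip_meta(second))
)
```

The example shows why the input must be normalized first: hashing the raw trees would give different results purely because of line metadata.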

fragments(ast, file, min_mass, opts \\ [])

@spec fragments(Macro.t(), String.t(), pos_integer(), keyword()) :: [fragment()]

Walk an AST and return all subtree fragments whose mass is at least min_mass.
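
The walk can be pictured with the simplified sketch below. It is an illustration under stated assumptions, not the library's code: `FragmentSketch` is hypothetical, mass is taken to be the number of nodes that `Macro.prewalk/3` visits, and the `hash` and `sub_hashes` fields of `fragment()` are left out.

```elixir
defmodule FragmentSketch do
  # Hypothetical mass: the number of nodes Macro.prewalk/3 visits.
  def mass(ast) do
    {_ast, count} = Macro.prewalk(ast, 0, fn node, acc -> {node, acc + 1} end)
    count
  end

  # Collect every call or variable subtree whose mass reaches min_mass,
  # in pre-order. Literals and other non-tuple nodes are skipped.
  def fragments(ast, file, min_mass) do
    {_ast, found} =
      Macro.prewalk(ast, [], fn
        {_form, meta, _args} = node, acc ->
          m = mass(node)

          if m >= min_mass do
            line = Keyword.get(meta, :line, 1)
            {node, [%{mass: m, ast: node, file: file, line: line} | acc]}
          else
            {node, acc}
          end

        node, acc ->
          {node, acc}
      end)

    Enum.reverse(found)
  end
end

ast = Code.string_to_quoted!("foo(a + b, c)")

# The whole call has mass 5 and `a + b` has mass 3; the bare variables
# have mass 1 and are filtered out, so this prints [5, 3].
IO.inspect(Enum.map(FragmentSketch.fragments(ast, "lib/example.ex", 3), & &1.mass))
```

This version recomputes the mass of every subtree and is therefore quadratic in the worst case; it favors clarity over speed, whereas a single bottom-up pass could carry mass and hashes upward together.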

mass(list)

@spec mass(Macro.t()) :: non_neg_integer()

Count the number of AST nodes in a tree (its "mass").