mix metastatic.detect_duplicates
(Metastatic v0.21.2)
Detects code duplication across source files using the unified MetaAST representation.
Parses each source file through the appropriate language adapter, abstracts it to MetaAST, and then detects structural code clones (Type I-III) across the resulting documents. Cross-language detection is enabled by default.
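As a rough illustration of the clone types mentioned above: in the usual clone-detection terminology, Type I clones differ only in layout and comments, Type II clones also differ in identifiers and literals, and Type III clones additionally have statements added, removed, or changed. The sketch below shows the first two on Elixir's own quoted AST. It is not Metastatic's algorithm and does not use MetaAST; the module CloneToy and its functions are hypothetical names chosen for this example.

```elixir
defmodule CloneToy do
  # Parse Elixir source and drop position metadata, so that layout and
  # comments no longer influence the comparison.
  def strip(source) do
    source
    |> Code.string_to_quoted!()
    |> Macro.prewalk(&Macro.update_meta(&1, fn _meta -> [] end))
  end

  # Additionally replace variables and number/string literals with
  # placeholders, so that renaming does not influence the comparison.
  def normalize(source) do
    source
    |> strip()
    |> Macro.prewalk(fn
      {name, _meta, ctx} when is_atom(name) and is_atom(ctx) -> {:var, [], nil}
      lit when is_number(lit) or is_binary(lit) -> :lit
      other -> other
    end)
  end

  # Type I: identical once layout and comments are ignored.
  def type1?(a, b), do: strip(a) == strip(b)

  # Type II: identical once identifiers and literals are also ignored.
  def type2?(a, b), do: normalize(a) == normalize(b)
end

original = "x = a + 1"
reformatted = "x   =   a + 1  # trailing comment"
renamed = "y = b + 2"

CloneToy.type1?(original, reformatted) # layout-only difference
CloneToy.type1?(original, renamed)     # names differ, so not Type I
CloneToy.type2?(original, renamed)     # same shape after normalization
```

Type III clones cannot be found by equality alone; they need a similarity score, which is what the threshold option controls.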
Workflow
flowchart TD
Input["Input: files or --dir"] --> Discover["Discover files<br/>filter by extension"]
Discover --> Parse["Parse each file"]
Parse --> Detect{"Language<br/>detection"}
Detect --> Adapter["Select adapter<br/>Python/Elixir/Ruby/..."]
Adapter --> M2["Source -> M1 -> M2<br/>Builder.from_source"]
M2 --> Docs["List of Documents"]
Docs --> Dup["Duplication.detect_in_list<br/>Type I-III clone detection"]
Dup --> Groups["Duplicate groups"]
Groups --> Report["Reporter.format_groups<br/>text / json / detailed"]
Report --> Output["stdout or --output file"]
Usage
mix metastatic.detect_duplicates FILE1 FILE2 [OPTIONS]
mix metastatic.detect_duplicates --dir PATH [OPTIONS]
Options
--format FORMAT - Output format: text (default), json, or detailed
--threshold FLOAT - Similarity threshold for Type III detection (default: 0.8)
--output PATH - Write output to file instead of stdout
--cross-language - Enable cross-language detection (default: true)
--dir PATH - Scan all supported files in directory recursively
--help - Display this help message
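To give a feel for what a similarity threshold does, here is a minimal sketch. This page does not say which similarity metric Metastatic uses, nor whether the comparison is inclusive, so the sketch swaps in Jaccard similarity over a set of node kinds and assumes an inclusive comparison. The module ThresholdSketch and the feature lists are hypothetical.

```elixir
defmodule ThresholdSketch do
  # Jaccard similarity: size of the intersection divided by size of the union.
  # Two empty feature sets are treated as identical (assumption).
  def similarity(a, b) do
    a = MapSet.new(a)
    b = MapSet.new(b)
    union = MapSet.size(MapSet.union(a, b))

    if union == 0 do
      1.0
    else
      MapSet.size(MapSet.intersection(a, b)) / union
    end
  end

  # A pair counts as a near-miss clone when its score reaches the threshold.
  # The default mirrors the documented default of 0.8.
  def clone?(a, b, threshold \\ 0.8), do: similarity(a, b) >= threshold
end

left = [:assign, :call, :conditional, :loop, :return]
right = [:assign, :call, :conditional, :loop, :return, :raise]

ThresholdSketch.similarity(left, right)   # 5 shared kinds out of 6 in total
ThresholdSketch.clone?(left, right)       # accepted at the default 0.8
ThresholdSketch.clone?(left, right, 0.85) # rejected at a stricter 0.85
```

Raising the threshold trades recall for precision: fewer near-miss pairs are reported, but those that remain are closer matches.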
Supported Languages
Language is auto-detected from file extension.
See Metastatic.Languages for the current list of supported languages
and file extensions.
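The discovery and detection steps can be pictured with a small sketch. The extension table below is a hypothetical stand-in; the authoritative mapping lives in Metastatic.Languages, and the module DiscoverSketch is not part of Metastatic.

```elixir
defmodule DiscoverSketch do
  # Hypothetical extension table, for illustration only.
  @extensions %{
    ".ex" => :elixir,
    ".exs" => :elixir,
    ".py" => :python,
    ".rb" => :ruby
  }

  # Guess the language from the file extension, ignoring case.
  # Returns {:ok, language} or :error for unknown extensions.
  def language_for(path) do
    ext = path |> Path.extname() |> String.downcase()
    Map.fetch(@extensions, ext)
  end

  # Walk `dir` recursively and keep only regular files whose extension
  # is known, pairing each path with its guessed language.
  def discover(dir) do
    dir
    |> Path.join("**/*")
    |> Path.wildcard()
    |> Enum.filter(&File.regular?/1)
    |> Enum.flat_map(fn path ->
      case language_for(path) do
        {:ok, language} -> [{path, language}]
        :error -> []
      end
    end)
  end
end

DiscoverSketch.language_for("lib/foo.ex")
DiscoverSketch.language_for("README.md")
```

Files with an unrecognized extension are silently skipped in this sketch; whether the task itself skips them or reports them is not stated here.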
Examples
# Detect duplicates between two files
mix metastatic.detect_duplicates lib/foo.ex lib/bar.ex
# Cross-language detection
mix metastatic.detect_duplicates lib/foo.ex src/foo.py
# Scan entire directory
mix metastatic.detect_duplicates --dir lib/
# Output as JSON with custom threshold
mix metastatic.detect_duplicates lib/foo.ex lib/bar.ex --format json --threshold 0.85
# Save detailed report to file
mix metastatic.detect_duplicates --dir lib/ --format detailed --output report.txt
Programmatic API
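alias Metastatic.{Document, Analysis.Duplication}
# ast1 and ast2 are MetaAST values produced by the language adapters
doc1 = Document.new(ast1, :elixir)
doc2 = Document.new(ast2, :python)
# Compare a single pair of documents
{:ok, result} = Duplication.detect(doc1, doc2)
# Compare every document in a list (doc3 is built the same way as doc1 and doc2)
{:ok, groups} = Duplication.detect_in_list([doc1, doc2, doc3])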