Downloads Hex.pm packages and indexes them into an Exograph index, backed by DuckDB/QuackDB by default.

mix exograph.index.hex
mix exograph.index.hex --mode top --limit 5000
mix exograph.index.hex --mode latest --concurrency 8
mix exograph.index.hex --mode latest --web --port 4200

Packages are downloaded as tarballs, extracted to a temp directory, indexed,
then cleaned up. Peak disk usage is proportional to --concurrency, not the
total number of packages.
Already-indexed packages (by name+version) are skipped by default.
Use --force to re-index everything.
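
The disk-usage claim above can be made concrete with a little arithmetic. The sketch below assumes each worker holds at most one extracted package at a time, so a rough upper bound is the worker count times the largest extracted package size. The helper name peak_temp_mb and the 50 MB per-package figure are illustrative assumptions, not values taken from Exograph.

```shell
# Hypothetical helper: estimate peak temp-directory usage in MB.
# $1 = value passed to --concurrency
# $2 = assumed size of the largest extracted package, in MB
peak_temp_mb() {
  echo $(( $1 * $2 ))
}

# Default concurrency (4) versus --concurrency 8, at an assumed 50 MB per package.
peak_temp_mb 4 50   # prints 200
peak_temp_mb 8 50   # prints 400
```

Doubling --concurrency doubles the estimate, while raising --limit leaves it unchanged. If --cache-tarballs is set, the cached tarballs presumably persist in that directory and would add to this figure.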
Options
--mode - latest (default), top, or all
--limit - max packages to index
--prefix - table prefix (default: hex)
--concurrency - parallel download+index workers (default: 4)
--duckdb-shards - shard count for DuckDB corpus indexing (recommended for large corpora)
--duckdb-threads - DuckDB execution threads per shard/server
--duckdb-recovery-mode - DuckDB managed-server recovery mode (no_wal_writes for rebuildable indexes)
--manifest-path - write a sharded DuckDB manifest to this path
--shard-dir - directory for managed DuckDB shard files
--min-mass - minimum fragment AST mass (default: 8)
--reach - include Reach call graph extraction
--force - re-index already-indexed packages
--no-bm25 - skip ParadeDB BM25 index creation
--mirror - tarball mirror URL (repeatable)
--cache-tarballs - directory to cache downloaded tarballs
--backend - duckdb (default) or postgres
--database-url - Postgres URL (or set EXOGRAPH_DATABASE_URL)
--postgres-maintenance-work-mem - session-local maintenance_work_mem during Postgres index builds
--postgres-max-parallel-maintenance-workers - session-local max_parallel_maintenance_workers during Postgres index builds
--postgres-unlogged - use UNLOGGED Postgres tables for rebuildable local indexes
--postgres-defer-indexes - build non-unique Postgres query indexes after corpus loading
--postgres-copy - use Postgres COPY for supported high-volume append tables
--quackdb-uri - QuackDB URI for DuckDB backend (or set QUACKDB_URI/QUACKDB_TEST_URI)
--quackdb-token - QuackDB token for DuckDB backend (or set QUACKDB_TOKEN/QUACKDB_TEST_TOKEN)
--duckdb-database - managed DuckDB database path when --quackdb-uri is omitted
--repo - Ecto repo module (uses built-in if omitted)
--timeout - per-package timeout in seconds (default: 300)
--web - start web UI with live progress dashboard
--port - web UI port (default: 4200, requires --web)
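
As an illustration of how the Postgres options might be combined for a rebuildable local index, the sketch below assembles a command line from flags listed above. The helper name build_index_cmd is hypothetical, and whether these three --postgres-* flags are best used together is an assumption rather than documented guidance. It also assumes EXOGRAPH_DATABASE_URL is already exported, so --database-url is omitted.

```shell
# Hypothetical helper: print a Postgres-backed indexing command.
# $1 = mode (latest, top, or all)
# $2 = optional package limit; omitted from the command when empty
build_index_cmd() {
  mode="$1"
  limit="${2:-}"
  cmd="mix exograph.index.hex --backend postgres --mode $mode"
  if [ -n "$limit" ]; then
    cmd="$cmd --limit $limit"
  fi
  # Bulk-load oriented flags for an index that can be rebuilt from Hex.pm.
  cmd="$cmd --postgres-unlogged --postgres-defer-indexes --postgres-copy"
  printf '%s\n' "$cmd"
}

# Print the command for the top 5000 packages; review it, then run it by hand.
build_index_cmd top 5000
```

Printing the command instead of running it keeps the sketch safe to try, and the output can be pasted into a terminal once the flags look right.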