Recollect.Maintenance.Reembed (recollect v0.5.1)


Re-embeds entries, chunks, and entities using a pluggable embedding function and tracks per-row provenance via embedding_model_id.

Options

  • :embedding_fn (text -> {:ok, vector, model_id} | {:ok, vector} | {:error, term}) called per row. Defaults to a function that delegates to the configured Recollect.EmbeddingProvider (existing behavior).
  • :progress_callback (progress_map -> :ok) invoked per batch with %{table:, processed:, total:, current_batch:}.
  • :batch_size — rows per batch (default: 100).
  • :concurrency — parallel embedding tasks per batch (default: 2).
  • :tables — which tables to re-embed (default: all three).
  • :scope — what to select for re-embedding:
    • :nil_only (default) — rows where embedding IS NULL
    • :all — every row in the table
    • {:stale_model, current_model_id} — rows whose stored embedding_model_id differs from current_model_id or is NULL
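As a rough illustration of the callback shapes listed above, the following sketch defines a stand-in embedding function and a progress callback, then collects them into an option list. The module name ReembedSketch, the toy vector, and the "toy-embedder-v1" model id are all hypothetical and exist only for this example; a real embedding function would call an actual model or service.

```elixir
defmodule ReembedSketch do
  # Hypothetical helper module; nothing here is part of Recollect itself.

  # Stand-in embedder: derives a tiny two-element vector from the text so the
  # example runs without a model. Returns the three-element success tuple so
  # the model id can be recorded as the row's embedding_model_id.
  def embed(text) when is_binary(text) and text != "" do
    vector = [String.length(text) * 1.0, byte_size(text) * 1.0]
    {:ok, vector, "toy-embedder-v1"}
  end

  # Anything else (empty string, nil, non-binary) is reported as an error.
  def embed(_other), do: {:error, :empty_or_invalid_text}

  # Progress callback: receives the documented progress map and returns :ok.
  # current_batch is printed with inspect/1 because its exact type is not
  # stated in the documentation.
  def report(%{table: table, processed: processed, total: total, current_batch: batch}) do
    IO.puts("#{table}: #{processed}/#{total} (batch #{inspect(batch)})")
    :ok
  end

  # Option list built only from keys documented above.
  def opts do
    [
      embedding_fn: &embed/1,
      progress_callback: &report/1,
      batch_size: 50,
      concurrency: 4,
      scope: {:stale_model, "toy-embedder-v1"}
    ]
  end
end
```

With Recollect available, these options would presumably be passed as Recollect.Maintenance.Reembed.run(ReembedSketch.opts()). The return value of run/1 is not described on this page, so the sketch does not assume one.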

Summary

Functions

run(opts \\ [])

Re-embeds rows in the specified tables according to the given scope.

Functions

run(opts \\ [])

Re-embeds rows in the specified tables according to the given scope.

See module documentation for available options.
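To make the scope rules concrete, here is a small sketch that applies the three documented :scope values to in-memory maps. It only illustrates the selection logic; the ScopeSketch module and its sample rows are hypothetical, and the library presumably applies equivalent filters in its database queries rather than in Elixir code like this.

```elixir
defmodule ScopeSketch do
  # Illustrative only: mirrors the documented scope rules over plain maps.

  # :nil_only selects rows that have no embedding yet.
  def selected?(row, :nil_only), do: is_nil(row.embedding)

  # :all selects every row.
  def selected?(_row, :all), do: true

  # {:stale_model, id} selects rows whose stored model id is missing or
  # differs from the current one.
  def selected?(row, {:stale_model, current_model_id}) do
    is_nil(row.embedding_model_id) or row.embedding_model_id != current_model_id
  end

  # Filters rows by scope, defaulting to :nil_only as the option does.
  def select(rows, scope \\ :nil_only), do: Enum.filter(rows, &selected?(&1, scope))

  # Made-up sample data for the example.
  def sample_rows do
    [
      %{id: 1, embedding: nil, embedding_model_id: nil},
      %{id: 2, embedding: [0.1], embedding_model_id: "old-model"},
      %{id: 3, embedding: [0.2], embedding_model_id: "new-model"}
    ]
  end
end
```

For the sample rows, the default scope selects only row 1, :all selects rows 1 through 3, and {:stale_model, "new-model"} selects rows 1 and 2.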