Recollect (recollect v0.5.1)

Copy Markdown View Source

Pluggable memory engine for Elixir applications.

Provides document ingestion with markdown-aware chunking, configurable vector embeddings (pgvector), LLM-powered entity/relation extraction, knowledge graph storage, hybrid search, pipeline tracking, and memory decay.

Two Tiers

Tier 1 — Full Pipeline: Collections → Documents → Chunks → Entities → Relations. For structured ingestion of documents with chunking, embedding, entity extraction, and graph queries.

Tier 2 — Lightweight Knowledge: Entries + Edges. For simple knowledge storage with embeddings, access tracking, and edge traversal.

Quick Start

# Configure in your app
config :recollect,
  repo: MyApp.Repo,
  embedding: [provider: Recollect.Embedding.OpenRouter, model: "google/text-embedding-004"],
  extraction: [llm_fn: &MyApp.LLM.chat/2]

# Lightweight API (Tier 2)
{:ok, entry} = Recollect.remember("The deploy script lives at scripts/deploy.sh",
  scope_id: workspace_id, owner_id: user_id, entry_type: "note")

{:ok, results} = Recollect.search("deploy script", scope_id: workspace_id)

# Full Pipeline API (Tier 1)
{:ok, doc} = Recollect.ingest("Meeting Notes", long_markdown, owner_id: user_id)
{:ok, _run} = Recollect.process(doc)
{:ok, context} = Recollect.search("project timeline", owner_id: user_id)

Summary

Functions

Format search results as text suitable for LLM system prompts.

Answer a question using memory search as LLM context.

Create an edge between two entries.

Archive stale entries based on access patterns.

Delete a knowledge entry.

Get the most recent session handoff.

Create a session handoff for continuing work across sessions.

Ingest content as a document with deduplication.

Run active invalidation to detect breaking changes and weaken related memories.

Run the learning pipeline to automatically extract knowledge from external sources.

Run the full pipeline on a document: chunk → embed → extract → sync.

Same as process/2 but runs asynchronously via TaskSupervisor.

Re-embed all entries/chunks with nil embeddings.

Store a knowledge entry with auto-embedding.

Hybrid search combining vector similarity and graph traversal.

Graph neighborhood expansion from an entity or entry.

Vector-only search across entries and/or chunks.

Functions

build_context(search_results)

Format search results as text suitable for LLM system prompts.

complete(question, opts \\ [])

Answer a question using memory search as LLM context.

Requires an :llm_fn callback that accepts a list of message maps and returns {:ok, answer_string} or {:error, reason}.

Options

  • :llm_fn (required) — fn messages -> {:ok, string} | {:error, reason} end

  • :system_prompt — Override the default system prompt
  • :owner_id, :scope_id — Scope the search
  • :limit — Max chunks (default: 10)
  • :hops — Graph depth (default: 2)

connect(source_id, target_id, relation, opts \\ [])

Create an edge between two entries.

Options

  • :weight — Float 0.0-1.0 (default: 1.0)
  • :metadata — Map of extra data

decay(opts \\ [])

Archive stale entries based on access patterns.

Default: entries not accessed in 90 days with < 3 accesses.

Options

  • :max_age_days — Days since last access (default: 90)
  • :min_access_count — Minimum accesses to survive (default: 3)

forget(entry_id)

Delete a knowledge entry.

get_handoff(scope_id)

Get the most recent session handoff.

handoff(scope_id, opts \\ [])

Create a session handoff for continuing work across sessions.

Options

  • :scope_id - Required. The scope
  • :what - What you were working on
  • :next - List of next steps
  • :artifacts - Files/links to continue with
  • :blockers - What's blocking progress

Example

Recollect.handoff(workspace_id,
  what: "Implementing user auth",
  next: ["Add login controller", "Create session middleware"]
)

ingest(title, content, opts \\ [])

Ingest content as a document with deduplication.

Returns {:ok, document} or {:ok, :unchanged} if content hash matches.

Options

  • :owner_id (required) — UUID of the owning user/workspace
  • :collection_name — Collection name (default: "default")
  • :source_type — "artifact", "conversation", "manual" (default: "manual")
  • :source_id — External ID for dedup tracking

invalidate(opts \\ [])

Run active invalidation to detect breaking changes and weaken related memories.

Options

  • :scope_id - Required. The scope to invalidate memories for
  • :days - Number of days to scan (default: 7)

Example

{:ok, result} = Recollect.invalidate(scope_id: workspace_id)

learn(opts \\ [])

Run the learning pipeline to automatically extract knowledge from external sources.

Options

  • :scope_id - Required. The scope to learn for
  • :sources - List of sources (default: all enabled)
  • :since - Learn from events since (default: "7 days ago")
  • :dry_run - Preview without creating entries

Example

{:ok, result} = Recollect.learn(scope_id: workspace_id)

process(document, opts \\ [])

Run the full pipeline on a document: chunk → embed → extract → sync.

Returns {:ok, pipeline_run} with tracking info.

process_async(document, opts \\ [])

Same as process/2 but runs asynchronously via TaskSupervisor.

reembed(opts \\ [])

Re-embed all entries/chunks with nil embeddings.

remember(content, opts \\ [])

Store a knowledge entry with auto-embedding.

Options

  • :scope_id — UUID scope (workspace, user, etc.)
  • :owner_id — UUID of the owner
  • :entry_type — String type (default: "note")
  • :summary — Brief summary
  • :source — "agent", "system", "user" (default: "system")
  • :metadata — Map of extra data
  • :confidence — Float 0.0-1.0 (default: 1.0)

search(query, opts \\ [])

Hybrid search combining vector similarity and graph traversal.

Searches both chunks (Tier 1) and entries (Tier 2) if available.

Options

  • :scope_id — Scope UUID (searches entries)
  • :owner_id — Owner UUID (searches chunks + entities)
  • :limit — Max results (default: 10)
  • :min_score — Minimum similarity score (default: 0.0)
  • :hops — Graph expansion depth (default: 1)
  • :tier:full, :lightweight, or :both (default: :both)

search_graph(entity_id, opts \\ [])

Graph neighborhood expansion from an entity or entry.

search_vectors(query, opts \\ [])

Vector-only search across entries and/or chunks.