AgentSea.Ingest.Pipeline (agentsea_ingest v0.1.0)


A Broadway pipeline that embeds chunk messages and upserts them into a vector store. Concurrency, batching, backpressure, and retries are all Broadway settings; there is no hand-rolled scheduler. (This is the design's "EvaluationPipeline parallelism bug is structurally impossible" point.)

Each message's data is a chunk map with id, text, and metadata keys (see AgentSea.Ingest.chunk_documents/2). Messages are collected into batches; each batch is embedded in a single call and then upserted.
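
The batch step described above can be sketched in plain Elixir. This is only an illustration of the embed-once-then-upsert shape, not the module's actual implementation: BatchSketch, embed_many/1, and upsert/2 are hypothetical stand-ins for the configured embedder and store, and the atom keys on the chunk map are an assumption.

```elixir
defmodule BatchSketch do
  # Hypothetical stand-in for an embedder: one call embeds every text in the
  # batch and returns one toy vector per text (character count, word count).
  def embed_many(texts) do
    Enum.map(texts, fn text ->
      [String.length(text) * 1.0, length(String.split(text)) * 1.0]
    end)
  end

  # Hypothetical stand-in for a vector store: a plain map keyed by chunk id.
  # Putting an existing id overwrites the old entry, which is the upsert behavior.
  def upsert(store, entries) do
    Enum.reduce(entries, store, fn {id, vector, metadata}, acc ->
      Map.put(acc, id, %{vector: vector, metadata: metadata})
    end)
  end

  # Embed the whole batch in a single call, pair each vector with its chunk,
  # then upsert all entries together.
  def handle_batch(chunks, store) do
    vectors = chunks |> Enum.map(& &1.text) |> embed_many()

    entries =
      chunks
      |> Enum.zip(vectors)
      |> Enum.map(fn {chunk, vector} -> {chunk.id, vector, chunk.metadata} end)

    upsert(store, entries)
  end
end

# Assumed chunk shape: atom keys :id, :text, :metadata.
chunks = [
  %{id: "doc1-0", text: "hello world", metadata: %{source: "doc1"}},
  %{id: "doc1-1", text: "broadway batches messages", metadata: %{source: "doc1"}}
]

store = BatchSketch.handle_batch(chunks, %{})
IO.inspect(Map.keys(store))
```

Embedding per batch rather than per message means one embedder call covers many chunks, which is the usual reason to do this work in a batch stage.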

Starting

AgentSea.Ingest.Pipeline.start_link(
  name: MyPipeline,
  embedder: AgentSea.Embedder.Hashing,
  store_mod: AgentSea.VectorStore.Memory,
  store: store_pid
)

The default producer is Broadway.DummyProducer (drive it with Broadway.test_message/3 or Broadway.test_batch/3). For production, pass a real :producer that emits chunk messages.
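
A minimal sketch of driving the default producer by hand, assuming the broadway and agentsea_ingest packages are loaded and that store_pid is an already-started store process. DriveSketch and DrivePipeline are hypothetical names, the {:ok, pid} return from start_link is assumed to follow Broadway's own start_link, and the chunk's atom keys are an assumption. Broadway.test_message/3 returns a reference, and the calling process later receives an {:ack, ref, successful, failed} message for it.

```elixir
defmodule DriveSketch do
  # store_pid is assumed to be a running AgentSea.VectorStore.Memory process.
  def run(store_pid) do
    # Assumption: start_link/1 returns {:ok, pid}, as Broadway.start_link/2 does.
    {:ok, _pid} =
      AgentSea.Ingest.Pipeline.start_link(
        name: DrivePipeline,
        embedder: AgentSea.Embedder.Hashing,
        store_mod: AgentSea.VectorStore.Memory,
        store: store_pid
      )

    # Assumed chunk shape: atom keys :id, :text, :metadata.
    chunk = %{id: "doc1-0", text: "hello world", metadata: %{source: "doc1"}}

    # Push one message through the pipeline; the ack is sent to this process.
    ref = Broadway.test_message(DrivePipeline, chunk)

    receive do
      {:ack, ^ref, successful, failed} -> {length(successful), length(failed)}
    after
      5_000 -> :timeout
    end
  end
end
```

In an ExUnit test, the receive block would typically become assert_receive {:ack, ^ref, [_message], []}.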

Summary

Functions

start_link(opts)