AgentSea. Ingest. Pipeline
(agentsea_ingest v0.1.0)
Copy Markdown
A Broadway pipeline that embeds chunk messages and upserts them into a
vector store. Concurrency, batching, backpressure, and retries are Broadway
settings — there is no hand-rolled scheduler (this is the design's
"EvaluationPipeline parallelism bug is structurally impossible" point).
Each message's data is a chunk %{id, text, metadata} (see
AgentSea.Ingest.chunk_documents/2). Messages are collected into batches,
embedded in a single call, then upserted.
Starting
AgentSea.Ingest.Pipeline.start_link(
name: MyPipeline,
embedder: AgentSea.Embedder.Hashing,
store_mod: AgentSea.VectorStore.Memory,
store: store_pid
)The default producer is Broadway.DummyProducer (drive it with
Broadway.test_message/3 or Broadway.test_batch/3). For production, pass a
real :producer that emits chunk messages.