Gralkor.Client.HTTP (gralkor_ex v2.0.6)

Real Gralkor.Client implementation over HTTP using Req.

Reads config from Application.get_env(:gralkor_ex, :client_http):

  • :url — required. Base URL of the Gralkor server (e.g. "http://127.0.0.1:4000").
  • :plug — optional Req.Test plug tuple for stubbing in tests. Unset in production so Req hits the network directly.
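The two keys above could be set as in the following minimal sketch, written for `config/runtime.exs`. The test URL `"http://gralkor.test"` and the use of `Gralkor.Client.HTTP` as the `Req.Test` stub name are assumptions made for illustration; any name registered with `Req.Test.stub/2` would serve.

```elixir
# config/runtime.exs (illustrative sketch, not the project's actual config)
import Config

if config_env() == :test do
  # In tests, route requests to a Req.Test stub instead of the network.
  # The stub name and the URL here are hypothetical.
  config :gralkor_ex, :client_http,
    url: "http://gralkor.test",
    plug: {Req.Test, Gralkor.Client.HTTP}
else
  # In production, leave :plug unset so Req hits the loopback server directly.
  config :gralkor_ex, :client_http,
    url: "http://127.0.0.1:4000"
end
```

With this in place, `Application.get_env(:gralkor_ex, :client_http)` returns a keyword list containing `:url` and, in the test environment only, `:plug`.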

No auth: Gralkor is expected to run under the consumer app's supervision tree, bound to loopback. The consumer owns the trust boundary.

Each endpoint has its own receive_timeout, calibrated to its workload. The two endpoints that call Gemini synchronously (/recall and /tools/memory_search) are sized to encompass the google-genai SDK's per-attempt timeout (10 s) and its bounded retry policy (2 attempts, 1–3 s backoff; see server/main.py and gralkor/TEST_TREES.md > Retry ownership). Under sustained Vertex throttling the SDK may still exceed these windows; the consumer's jido_gralkor plugin then degrades gracefully to a memory-less turn.

  • /health (2 s) — cheap liveness check; tight so Gralkor.Connection doesn't flap when the server is under LLM load.
  • /recall (25 s) — graph search (graphiti.search() — RRF, edges only, calls the embedder) plus interpret_facts LLM call. Two sequential L6 calls; each worst-case ~23 s under SDK retry. 25 s covers one full L6.5 retry cycle on the slower call.
  • /tools/memory_search (30 s) — slow graph search (graphiti.search_() with COMBINED_HYBRID_SEARCH_CROSS_ENCODER — cross-encoder reranking + BFS) plus interpret_facts. More upstream work per call than /recall; sized a few seconds higher.
  • /capture (5 s) — server returns 204 immediately after buffering. No synchronous LLM call here; the flush runs in the server-side capture buffer (its own retry schedule).
  • /session_end (5 s) — server returns 204 immediately after scheduling the flush.
  • /tools/memory_add (60 s) — Graphiti entity/edge extraction is slow; only reached from a background Task in the consumer, so the agent never waits.
  • /build-indices, /build-communities (:infinity) — admin operations that scan the whole graph; can run for minutes to hours on a populated database. The operator invokes them explicitly, so blocking the caller is fine.
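The timeout table above can be sketched as a plain lookup, together with the arithmetic behind the "~23 s" worst case (2 attempts at 10 s each, plus at most 3 s of backoff between them). The module and function names below (`TimeoutSketch`, `receive_timeout/1`, `request_options/2`, `worst_case_ms/3`) are hypothetical and are not taken from the library's source; only the millisecond values come from the list above.

```elixir
defmodule TimeoutSketch do
  @moduledoc false

  # Per-endpoint receive timeouts in milliseconds, copied from the list above.
  @timeouts %{
    "/health" => 2_000,
    "/recall" => 25_000,
    "/tools/memory_search" => 30_000,
    "/capture" => 5_000,
    "/session_end" => 5_000,
    "/tools/memory_add" => 60_000,
    "/build-indices" => :infinity,
    "/build-communities" => :infinity
  }

  # Raises KeyError for an unknown path rather than silently picking a default.
  def receive_timeout(path), do: Map.fetch!(@timeouts, path)

  # Builds a keyword list in the shape Req accepts
  # (:base_url, :url and :receive_timeout are Req options).
  def request_options(base_url, path) do
    [base_url: base_url, url: path, receive_timeout: receive_timeout(path)]
  end

  # Worst case for one upstream call under a bounded retry policy:
  # every attempt runs to its timeout, with the longest backoff between attempts.
  def worst_case_ms(attempts, per_attempt_ms, max_backoff_ms) do
    attempts * per_attempt_ms + (attempts - 1) * max_backoff_ms
  end
end
```

For example, `TimeoutSketch.worst_case_ms(2, 10_000, 3_000)` evaluates to `23_000`, which is why the 25 s window on /recall covers one full retry cycle on a single call but not two such cycles back to back.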

Returns {:error, reason} on non-2xx or transport failure; raises on missing config or blank session_id. Callers let those surface.
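The return contract above might look like the following minimal sketch. It assumes a Req-style result (`{:ok, response}` with `:status` and `:body` fields, or `{:error, exception}`); the module name `ResultSketch`, the `{:http_status, status, body}` error shape, and the choice of `ArgumentError` for a blank session_id are illustrative assumptions, not the library's documented behavior.

```elixir
defmodule ResultSketch do
  @moduledoc false

  # 2xx responses become {:ok, body}.
  def normalize({:ok, %{status: status, body: body}}) when status in 200..299 do
    {:ok, body}
  end

  # Any other status becomes an error tuple. This exact shape is an assumption.
  def normalize({:ok, %{status: status, body: body}}) do
    {:error, {:http_status, status, body}}
  end

  # Transport failures (timeouts, refused connections) pass through unchanged.
  def normalize({:error, reason}), do: {:error, reason}

  # Blank or non-binary session ids are programmer errors, so they raise
  # instead of returning a tuple. The exception type is an assumption.
  def require_session_id!(id) when is_binary(id) do
    if String.trim(id) == "", do: raise(ArgumentError, "session_id must not be blank")
    id
  end

  def require_session_id!(other) do
    raise ArgumentError, "session_id must be a string, got: #{inspect(other)}"
  end
end
```

A caller would then match on `{:ok, body}` and `{:error, reason}` for runtime failures, and let the raised exceptions crash the calling process so that misconfiguration surfaces early.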