A small Elixir library for building a coding-agent harness on top of OpenRouter (via req_llm), where each agent conversation runs as its own supervised GenServer, and agents can discover and invoke Claude-Code-style skills — directories containing a SKILL.md with YAML frontmatter that the model can choose to load mid-conversation.

The agent's file tools operate entirely on an in-memory virtual filesystem (path => content) — there is no bash/shell tool and no real disk access of any kind. The caller seeds the initial files, runs the conversation, and gets the resulting file map back to do with as it pleases (write to disk, diff, ship elsewhere, discard).

Installation

def deps do
  [
    {:coding_agent, "~> 0.1.0"}
  ]
end

Architecture

Usage

CodingAgent.OpenRouter.configure!()  # reads OPENROUTER_API_KEY, or pass a key directly

{:ok, pid} =
  CodingAgent.start_session(
    model: CodingAgent.OpenRouter.model("anthropic/claude-sonnet-4.5"),
    skills_dirs: ["skills"],
    max_turns: 10,
    files: %{"lib/foo.ex" => File.read!("lib/foo.ex")}
  )

{:ok, reply, files} = CodingAgent.send_message(pid, "Fix the failing test in lib/foo.ex")
IO.puts(reply)

# the agent never touched disk -- you decide what happens to its edits:
Enum.each(files, fn {path, content} -> File.write!(path, content) end)

To run a session under supervision and reach it by id later:

{:ok, _pid} = CodingAgent.Session.start_session(:my_session, skills_dirs: ["skills"])
CodingAgent.Session.send_message(CodingAgent.Session.via(:my_session), "...")

Streaming

stream_message/4 runs the same agent loop but calls on_chunk with each text chunk as the model produces it (across every turn, including ones that precede a tool call), then returns the same {:ok, reply, files} shape once the whole turn is done:

{:ok, reply, files} =
  CodingAgent.stream_message(pid, "Fix the failing test in lib/foo.ex", &IO.write/1)

on_chunk runs inside the session's own process, so (like send_message/3) the call blocks the session for its duration -- this streams output to the caller, it doesn't make turns concurrent.

Skills

A skill is a directory with a SKILL.md:

skills/
  greet/
    SKILL.md
---
name: greet
description: Use when the user asks to be greeted in Zorbarian.
---

# Greet in Zorbarian

Respond with: "Blibba dorn, <name>!"

Only the name + description are surfaced to the model up front (as a system-prompt catalog); the full body is loaded into context only when the model calls the skill tool with that name — mirroring how Claude Code keeps skill instructions out of context until they're actually needed.

Tests

mix test

Tool and skill-parsing tests run offline. Exercising CodingAgent.Session end-to-end requires OPENROUTER_API_KEY and makes real API calls, so it's left to manual / integration testing rather than the default test suite.