ptc_runner_mcp can run as an opt-in Streamable HTTP MCP server for
private-network deployments. The HTTP mode is intended for a trusted
agentic application or load balancer talking to one BEAM node over a
small network boundary; stdio remains the default local-client mode.
Quick Start
Generate a strong bearer token:
export PTC_RUNNER_MCP_HTTP_AUTH_TOKEN="$(openssl rand -base64 32)"
Start the release on loopback:
_build/prod/rel/ptc_runner_mcp/bin/ptc_runner_mcp start \
--http \
--http-auth-token "$PTC_RUNNER_MCP_HTTP_AUTH_TOKEN"
The MCP endpoint is POST /mcp on 127.0.0.1:7332 by default. In
HTTP mode stdio is not attached.
Required Headers
Every /mcp request needs:
Authorization: Bearer <token>The first request is initialize without an MCP-Session-Id. The
response includes:
MCP-Session-Id: <opaque id>Subsequent POST and DELETE requests send that session id. Clients should
also send MCP-Protocol-Version; if it is absent the server falls back
to the version stored on the HTTP session.
Health And Readiness
GET /health is liveness: it returns 200 while the process and
listener are alive.
GET /ready is load-balancer readiness: it returns 200 when the
session registry is accepting work, and 503 when the node is draining
or saturated.
Both endpoints are unauthenticated and expose only process status. Keep them on a private network or restrict them at the load balancer.
Security Posture
Use a private subnet, security group, firewall, or equivalent network boundary. The bearer token is the inner boundary, not the only boundary.
TLS should terminate at the load balancer or edge proxy. If traffic from the proxy to the BEAM node is plaintext, anything able to observe that link can see the bearer token.
Binding to a non-loopback address always requires a token.
--http-disable-auth is only permitted on loopback binds and cannot be
combined with --http-allow-unsafe-network.
The v1 auth model has one static token. All token holders are the same
owner, so an MCP-Session-Id is a bearer capability inside that trust
boundary. Session ids are random and are never logged raw.
Failed-auth rate limiting
Repeated failed bearer authentications from the same source are rate
limited to reduce brute-force risk (OPLANE_REQ-00080322). Both missing
and invalid bearer failures count. Once a source exceeds
--http-auth-rate-limit-max-failures within
--http-auth-rate-limit-window-ms, it is temporarily blocked for
--http-auth-rate-limit-block-ms; blocked requests receive 429 with a
Retry-After header and no www-authenticate challenge (so a blocked
caller is not told whether its token was missing or invalid). A
successful authentication immediately resets the source's failure state.
A misconfigured-but-legitimate client can briefly be blocked; it
self-heals after the failure window elapses (not just the block window
— if block_ms < window_ms, the failure count persists across block
expiry until the window resets, so the effective recovery time is
window_ms). The limiter is on by default and is a
no-op when no token is configured or when --http-auth-rate-limit is
false; if the limiter process is unavailable it fails open.
The source is keyed by the peer IP (conn.remote_ip). X-Forwarded-For
/ Forwarded headers are not trusted — behind a reverse proxy every
request appears to originate from the proxy, so deploy per-source
limiting at the proxy too (or in addition). Host/Origin rejections
are not bearer-auth failures and never count toward the limit. Block
decisions log a JSON line (http_auth_rate_limited) with the raw source
IP, which is operationally useful for acting on abuse; the bearer token
itself is never logged.
Token generation
The server requires a minimum of 32 bytes but does not programmatically verify entropy quality. Generate tokens with a cryptographically secure source:
# Recommended — 32 random bytes, base64-encoded (44 characters)
openssl rand -base64 32
# Alternative
python3 -c "import secrets; print(secrets.token_urlsafe(32))"
Do not use low-entropy strings (repeated characters, dictionary words, predictable patterns). Entropy quality is the caller's responsibility.
Redaction
The bearer token is registered with the credential redaction system at
startup. If the raw token value accidentally appears in a log line or
trace payload, the redactor replaces it with [REDACTED]. This is a
defense-in-depth measure — the primary protection is that request logs
and traces use hashed owner identifiers, never the raw token.
Observability
HTTP request logs are JSON lines on stderr. Each request log includes an
instance label, X-Request-Id, method, path, status, duration, and
hashed owner/session ids when known.
The server emits sanitized telemetry under [:ptc_lisp, :http, ...]
for request start/stop, session create/close, auth failures, failed-auth
rate-limit blocks ([:ptc_lisp, :http, :auth, :rate_limited]), limit
rejections, and cancellations.
When --trace-dir is enabled, HTTP tool-call trace events include
owner_hash, mcp_session_hash, and transport_request_id metadata.
Trace payload policy is still controlled by --trace-payloads.
Prometheus /metrics is not shipped in this PR. The config flags are
reserved, but scraping support remains a follow-up.
Rolling Deploy Drain
On application shutdown the registry enters drain mode, so /ready
returns 503 and new /mcp POSTs receive a JSON-RPC server draining
error. The process waits --http-shutdown-grace-ms, then cancels any
remaining in-flight HTTP workers before exit.
DELETE /mcp closes one protocol session immediately. Reaped or deleted
sessions return 404 on later use.
Important Flags
| Flag | Default | Meaning |
|---|---|---|
--http | false | Enable the HTTP listener. |
--http-host | 127.0.0.1 | Bind IP address or localhost. |
--http-port | 7332 | Bind port. |
--http-path | /mcp | MCP endpoint path. |
--http-auth-token | unset | Static bearer token; minimum 32 characters. |
--http-allowed-origin | unset | Exact allowed browser Origin; repeatable or comma-separated. |
--http-max-sessions | 256 | Global HTTP protocol-session cap. |
--http-max-sessions-per-owner | 32 | Per-owner protocol-session cap. |
--http-max-in-flight-per-session | 4 | Per-session executing request cap. |
--http-auth-rate-limit | true | Rate limit failed bearer auth per source. |
--http-auth-rate-limit-window-ms | 60000 | Window over which failures accumulate. |
--http-auth-rate-limit-max-failures | 5 | Failures per window before a source is blocked. |
--http-auth-rate-limit-block-ms | 60000 | Block duration after the threshold is exceeded (429 + Retry-After). |
--http-session-ttl-ms | 3600000 | Absolute protocol-session lifetime. |
--http-session-idle-timeout-ms | 900000 | Idle protocol-session timeout. |
--http-instance-label | hostname | Label stamped into HTTP logs/telemetry/traces. |
For loopback binds, the MCP endpoint rejects requests whose
Host/authority is not loopback. Missing Origin remains valid for
non-browser clients, but invalid browser Origin values are rejected.
POST requests with a present Content-Type must be JSON
(application/json or a +json media type).