ExAtlas.Providers.RunPod (ExAtlas v0.5.0)


ExAtlas.Provider implementation for RunPod.

Wraps three RunPod APIs through the single ExAtlas contract:

  • REST management — pod/endpoint/template/network-volume CRUD and pod lifecycle operations. Base URL https://rest.runpod.io/v1.
  • Serverless runtime — job submission, status, streaming against a specific endpoint. Base URL https://api.runpod.ai/v2/<endpoint_id>.
  • Legacy GraphQL — the only surface that exposes GPU catalog pricing. Base URL https://api.runpod.io/graphql.

All calls go through Req (see ExAtlas.Providers.RunPod.Client). The REST and serverless runtime APIs authenticate with an Authorization: Bearer <api_key> header; GraphQL authenticates with an ?api_key= query parameter. Every request emits a [:ex_atlas, :runpod, :request] telemetry event.
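
The telemetry event named above can be observed with a standard :telemetry handler. The following is a minimal sketch: the RunPodRequestLogger module is hypothetical and not part of ExAtlas, and because the contents of the measurements and metadata maps are not documented here, the handler only inspects them rather than assuming any keys.

```elixir
defmodule RunPodRequestLogger do
  # Hypothetical helper module; not part of ExAtlas.
  require Logger

  # Event name taken from the documentation above.
  @event [:ex_atlas, :runpod, :request]

  # Attach once, e.g. from your application's start/2 callback.
  # Requires the :telemetry package to be available at runtime.
  def attach do
    :telemetry.attach("runpod-request-logger", @event, &__MODULE__.handle_event/4, nil)
  end

  # The shape of measurements and metadata is an assumption-free pass-through:
  # both maps are logged as-is.
  def handle_event(@event, measurements, metadata, _config) do
    Logger.info("RunPod request: #{inspect(measurements)} #{inspect(metadata)}")
  end
end
```

A module function capture is used for the handler because :telemetry recommends it over anonymous functions.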

Capabilities

RunPod reports the following capability atoms:

[:spot, :serverless, :network_volumes, :http_proxy, :raw_tcp,
 :symmetric_ports, :webhooks, :global_networking]
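
One way to use this list is to check that a provider supports every feature a workload needs before spawning anything. The sketch below is illustrative only: CapabilityCheck is a hypothetical helper, and the capability list is hard-coded from the text above because the call that returns it at runtime is not shown in this section.

```elixir
defmodule CapabilityCheck do
  # Hypothetical helper; not part of ExAtlas.

  # Capability atoms copied from the documentation above.
  @runpod_capabilities [
    :spot, :serverless, :network_volumes, :http_proxy, :raw_tcp,
    :symmetric_ports, :webhooks, :global_networking
  ]

  def runpod_capabilities, do: @runpod_capabilities

  # Returns the required capabilities that the provider does not report.
  # An empty list means every requirement is satisfied.
  def missing(required, capabilities) do
    Enum.reject(required, &(&1 in capabilities))
  end
end

# Example: both requirements are in the RunPod list, so nothing is missing.
CapabilityCheck.missing([:serverless, :webhooks], CapabilityCheck.runpod_capabilities())
```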

Spawn example

{:ok, pod} =
  ExAtlas.spawn_compute(
    provider: :runpod,
    gpu: :h100,
    image: "pytorch/pytorch:2.5.0-cuda12.1-cudnn9-runtime",
    cloud_type: :secure,
    ports: [{8000, :http}],
    volume_gb: 50,
    auth: :bearer
  )

pod.ports
# [%{internal: 8000, external: nil, protocol: :http,
#    url: "https://abc123-8000.proxy.runpod.net"}]
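
Given the port map shape shown in the comment above, the proxy URL for a particular internal port can be looked up with a small helper. This is a sketch under that assumption: PortLookup is a hypothetical module, and the sample data simply repeats the example values from this page.

```elixir
defmodule PortLookup do
  # Hypothetical helper; not part of ExAtlas.

  # Finds the URL of the HTTP port mapping for the given internal port.
  # Assumes each entry is a map with :internal, :protocol and :url keys,
  # as in the example output above.
  def http_url(ports, internal_port) do
    match = Enum.find(ports, &(&1.internal == internal_port and &1.protocol == :http))

    case match do
      %{url: url} when is_binary(url) -> {:ok, url}
      _ -> :error
    end
  end
end

# Sample data copied from the example above.
ports = [
  %{internal: 8000, external: nil, protocol: :http,
    url: "https://abc123-8000.proxy.runpod.net"}
]

PortLookup.http_url(ports, 8000)
```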

Serverless example

{:ok, job} =
  ExAtlas.run_job(
    provider: :runpod,
    endpoint: "my-endpoint-id",
    input: %{prompt: "hello"},
    mode: :async
  )

{:ok, done} = ExAtlas.get_job(job.id, provider: :runpod, endpoint: "my-endpoint-id")
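
Because an :async job may not be finished when get_job is first called, callers will typically poll. The fields of the returned job struct are not documented in this section, so the sketch below stays generic: JobPoller is a hypothetical helper that takes a zero-arity fetch function and a caller-supplied predicate that decides when the job is done.

```elixir
defmodule JobPoller do
  # Hypothetical helper; not part of ExAtlas.

  # Calls fetch.() until done?.(job) returns true, the attempts run out,
  # or fetch returns an error tuple.
  def await(fetch, done?, attempts \\ 30, interval_ms \\ 1_000)

  def await(_fetch, _done?, 0, _interval_ms), do: {:error, :timeout}

  def await(fetch, done?, attempts, interval_ms) do
    case fetch.() do
      {:ok, job} ->
        if done?.(job) do
          {:ok, job}
        else
          Process.sleep(interval_ms)
          await(fetch, done?, attempts - 1, interval_ms)
        end

      {:error, _reason} = error ->
        error
    end
  end
end

# Possible usage with the calls from the example above. The :status field
# and its values are assumptions, not confirmed by this documentation:
#
#   JobPoller.await(
#     fn -> ExAtlas.get_job(job.id, provider: :runpod, endpoint: "my-endpoint-id") end,
#     fn job -> job.status in [:completed, :failed] end
#   )
```

Passing the predicate in keeps the helper independent of whatever status representation the job struct actually uses.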