ExAtlas is a composable, pluggable Elixir SDK for managing GPU and CPU compute
across multiple cloud providers (RunPod, Fly.io Machines, Lambda Labs, Vast.ai,
or any module you write that implements ExAtlas.Provider).
The top-level API is intentionally thin: it validates input, resolves the
provider, builds a ctx, and delegates to the provider module. That means you
can write the same call against RunPod today, Lambda Labs tomorrow, and your
own bare-metal backend the day after — only the :provider option changes.
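The delegation described above can be pictured with a simplified sketch. This is illustrative only — the module and function names below (resolve_provider/1, build_ctx/1, ExAtlas.Providers.RunPod) are assumptions, not ExAtlas internals:

```elixir
# Illustrative sketch of the thin dispatch layer; names are assumptions.
defmodule DispatchSketch do
  # Resolve the provider module from opts or app config, then delegate.
  def spawn_compute(opts) do
    provider = resolve_provider(opts[:provider])

    with {:ok, request} <- validate(opts) do
      provider.spawn_compute(request, build_ctx(provider))
    end
  end

  defp resolve_provider(nil),
    do: resolve_provider(Application.fetch_env!(:ex_atlas, :default_provider))

  # A built-in atom maps to a bundled provider module (hypothetical name);
  # any other atom is treated as a user-supplied provider module.
  defp resolve_provider(:runpod), do: ExAtlas.Providers.RunPod
  defp resolve_provider(module) when is_atom(module), do: module

  defp validate(opts), do: {:ok, Map.new(opts)}

  defp build_ctx(provider),
    do: %{provider: provider, config: Application.get_env(:ex_atlas, provider, [])}
end
```

Because the provider is resolved per call, swapping backends is a one-keyword change rather than a rewrite.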
Quick start
# 1. Configure
config :ex_atlas, default_provider: :runpod
config :ex_atlas, :runpod, api_key: System.get_env("RUNPOD_API_KEY")
# 2. Spawn a GPU pod
{:ok, compute} =
  ExAtlas.spawn_compute(
    gpu: :h100,
    image: "pytorch/pytorch:2.5.0-cuda12.1-cudnn9-runtime",
    ports: [{8000, :http}],
    auth: :bearer
  )
compute.ports
# [%{internal: 8000, external: nil, protocol: :http,
# url: "https://<pod_id>-8000.proxy.runpod.net"}]
compute.auth.header
# "Authorization: Bearer kX9fP..."
# 3. Your user's browser talks to the pod directly (bearer token guards access).
# 4. Shut it down when done
:ok = ExAtlas.terminate(compute.id)

Running a serverless inference job
{:ok, job} =
  ExAtlas.run_job(
    endpoint: "abc123",
    input: %{prompt: "a beautiful sunset"},
    mode: :async
  )
{:ok, done} = ExAtlas.get_job(job.id)
done.output

Stream a job
ExAtlas.stream_job(job.id) |> Enum.each(&IO.inspect/1)

Swapping providers
ExAtlas.spawn_compute(provider: :runpod, gpu: :h100, ...)
ExAtlas.spawn_compute(provider: :lambda_labs, gpu: :h100, ...) # v0.2
ExAtlas.spawn_compute(provider: MyInternalCloud.Provider, gpu: :h100, ...)

See ExAtlas.Provider for the behaviour contract and ExAtlas.Config for how
provider and API-key resolution work.
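A custom backend plugs in by implementing the behaviour. A minimal sketch — the callback names and arities here are inferred from the public API above, not copied from the behaviour definition, so treat ExAtlas.Provider as the authoritative contract:

```elixir
defmodule MyInternalCloud.Provider do
  # Sketch only: callback shapes are inferred from ExAtlas's public
  # functions, not from the actual behaviour definition.
  @behaviour ExAtlas.Provider

  @impl true
  def spawn_compute(_request, _ctx) do
    # Call your own control plane here and map its response into a
    # %ExAtlas.Spec.Compute{}-shaped result.
    {:error, :not_implemented}
  end

  @impl true
  def terminate(_id, _ctx), do: :ok
end
```

Once the module compiles, passing provider: MyInternalCloud.Provider routes every top-level call through it, exactly as with the built-in providers.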
Summary
Functions
Cancel an in-flight serverless job.
Return the capability atoms honored by a provider.
Fetch a compute resource by id.
Fetch a serverless job by id.
List compute resources, optionally filtered.
Return the provider's catalog of GPU types + pricing.
Submit a serverless inference job.
Spawn a compute resource.
Resume a stopped compute resource.
Stop a compute resource without destroying storage.
Stream partial results from a running job as a lazy Enumerable.
Terminate and destroy a compute resource.
Types
@type opts() :: keyword()
Functions
Cancel an in-flight serverless job.
Return the capability atoms honored by a provider.
@spec get_compute(String.t(), opts()) :: {:ok, ExAtlas.Spec.Compute.t()} | {:error, term()}
Fetch a compute resource by id.
@spec get_job(String.t(), opts()) :: {:ok, ExAtlas.Spec.Job.t()} | {:error, term()}
Fetch a serverless job by id.
@spec list_compute(opts()) :: {:ok, [ExAtlas.Spec.Compute.t()]} | {:error, term()}
List compute resources, optionally filtered.
@spec list_gpu_types(opts()) :: {:ok, [ExAtlas.Spec.GpuType.t()]} | {:error, term()}
Return the provider's catalog of GPU types + pricing.
@spec run_job(opts()) :: {:ok, ExAtlas.Spec.Job.t()} | {:error, term()}
Submit a serverless inference job.
@spec run_job(ExAtlas.Spec.JobRequest.t(), opts()) :: {:ok, ExAtlas.Spec.Job.t()} | {:error, term()}
@spec spawn_compute(opts()) :: {:ok, ExAtlas.Spec.Compute.t()} | {:error, term()}
Spawn a compute resource.
Accepts either a keyword list (convenience) or a pre-built
ExAtlas.Spec.ComputeRequest. See ExAtlas.Spec.ComputeRequest for the full
field list.
@spec spawn_compute(ExAtlas.Spec.ComputeRequest.t(), opts()) :: {:ok, ExAtlas.Spec.Compute.t()} | {:error, term()}
Resume a stopped compute resource.
Stop a compute resource without destroying storage.
@spec stream_job(String.t(), opts()) :: Enumerable.t()
Stream partial results from a running job as a lazy Enumerable.
Terminate and destroy a compute resource.