ExTorch.Export.Server (extorch v0.3.0)

A GenServer that wraps a model saved with torch.export.save for concurrent serving.

Uses the pure-Elixir ATen graph interpreter; no JIT, no AOTI, and no C++ ExportedProgram support is needed.

Telemetry Events

The server emits the following :telemetry events:

  • [:extorch, :export, :load, :start | :stop] - Model loading.

  • [:extorch, :export, :forward, :start | :stop | :exception] - Inference.

All events include %{path: String.t()} in metadata. Forward events also include %{input_count: integer()}.
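A handler for these events can be attached with the standard :telemetry API. The sketch below assumes the event names listed above; the handler id and the logging body are illustrative, not part of ExTorch.

```elixir
# Attach one handler to the load and forward events emitted by the server.
# "extorch-export-logger" is an arbitrary handler id chosen for this example.
:telemetry.attach_many(
  "extorch-export-logger",
  [
    [:extorch, :export, :load, :stop],
    [:extorch, :export, :forward, :stop],
    [:extorch, :export, :forward, :exception]
  ],
  fn event, measurements, metadata, _config ->
    # Every event carries the model path in its metadata.
    IO.inspect({event, measurements, metadata.path}, label: "extorch")
  end,
  nil
)
```

Detaching is done with `:telemetry.detach("extorch-export-logger")` when the handler is no longer needed.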

Example

{:ok, pid} = ExTorch.Export.Server.start_link(path: "model.pt2")
output = ExTorch.Export.Server.predict(pid, [input])

Named servers

{:ok, _} = ExTorch.Export.Server.start_link(path: "model.pt2", name: MyModel)
output = ExTorch.Export.Server.predict(MyModel, [input])

Summary

Functions

child_spec(init_arg) - Returns a specification to start this module under a supervisor.

info(server) - Get information about the loaded model.

predict(server, inputs, timeout \\ 30000) - Run inference on the model (synchronous).

start_link(opts) - Start an Export model server.

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.
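Since the module provides a child_spec/1, it can be placed directly in a supervision tree. The sketch below assumes child_spec/1 accepts the same keyword options as start_link/1, which is the usual convention for GenServer-based modules; MyApp.Model is an illustrative name.

```elixir
# Start the export server under an application supervisor.
children = [
  {ExTorch.Export.Server, path: "model.pt2", name: MyApp.Model}
]

Supervisor.start_link(children, strategy: :one_for_one)
```

With a registered name in place, callers can use `ExTorch.Export.Server.predict(MyApp.Model, inputs)` without tracking the pid.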

info(server)

@spec info(GenServer.server()) :: map()

Get information about the loaded model.

predict(server, inputs, timeout \\ 30000)

@spec predict(GenServer.server(), [ExTorch.Tensor.t()], timeout()) :: term()

Run inference on the model (synchronous).

Returns

The output tensor (or list of tensors for multi-output models).
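For models whose forward pass can exceed the default 30_000 ms call timeout, a longer timeout can be passed as the third argument. The value below is illustrative.

```elixir
# Allow up to two minutes for a slow model before the call exits with a timeout.
output = ExTorch.Export.Server.predict(pid, [input], 120_000)
```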

start_link(opts)

@spec start_link(keyword()) :: GenServer.on_start()

Start an Export model server.

Options

  • :path (required) - Path to the .pt2 archive from torch.export.save.
  • :name - Optional registered name.