ExTorch.JIT.Server
(extorch v0.4.0)
A GenServer that wraps a loaded TorchScript model for concurrent serving.
Provides process isolation, fault tolerance, and serialized access to model inference. Forward calls are serialized through the GenServer to ensure thread safety for models with mutable state (e.g., BatchNorm, Dropout).
Telemetry Events
The server emits the following :telemetry events:
- `[:extorch, :jit, :load, :start]` - When model loading begins.
  - Measurements: `%{system_time: integer}`
  - Metadata: `%{path: String.t(), device: atom()}`
- `[:extorch, :jit, :load, :stop]` - When model loading completes.
  - Measurements: `%{duration: native_time}`
  - Metadata: `%{path: String.t(), device: atom()}`
- `[:extorch, :jit, :load, :exception]` - When model loading fails.
  - Measurements: `%{duration: native_time}`
  - Metadata: `%{path: String.t(), device: atom(), kind: atom(), reason: term()}`
- `[:extorch, :jit, :forward, :start]` - When inference begins.
  - Measurements: `%{system_time: integer}`
  - Metadata: `%{path: String.t(), device: atom(), input_count: integer()}`
- `[:extorch, :jit, :forward, :stop]` - When inference completes.
  - Measurements: `%{duration: native_time}`
  - Metadata: `%{path: String.t(), device: atom(), input_count: integer()}`
- `[:extorch, :jit, :forward, :exception]` - When inference fails.
  - Measurements: `%{duration: native_time}`
  - Metadata: `%{path: String.t(), device: atom(), input_count: integer(), kind: atom(), reason: term()}`
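The events above can be consumed with a standard `:telemetry` handler. A minimal sketch that logs inference latency from the `forward :stop` event — the module name, handler id, and logging choice are illustrative, not part of ExTorch:

```elixir
defmodule MyApp.JITMetrics do
  require Logger

  def attach do
    # Attach to the forward :stop event emitted by ExTorch.JIT.Server.
    :telemetry.attach(
      "jit-forward-latency",
      [:extorch, :jit, :forward, :stop],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event(_event, %{duration: duration}, metadata, _config) do
    # :duration is in native time units; convert to milliseconds for logging.
    ms = System.convert_time_unit(duration, :native, :millisecond)
    Logger.info("JIT forward on #{metadata.device} took #{ms}ms")
  end
end
```

Call `MyApp.JITMetrics.attach/0` once at application start, before the first inference.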
Example
{:ok, pid} = ExTorch.JIT.Server.start_link(path: "model.pt", device: :cpu)
result = ExTorch.JIT.Server.predict(pid, [input_tensor])
Summary
Functions
Returns a specification to start this module under a supervisor.
Get information about the loaded model.
Run inference on the model (synchronous).
Start a model server.
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
@spec info(GenServer.server()) :: map()
Get information about the loaded model.
@spec predict(GenServer.server(), [ExTorch.Tensor.t()], timeout()) :: term()
Run inference on the model (synchronous).
Arguments
- `server` - PID or registered name of the model server.
- `inputs` - List of input tensors.
- `timeout` - Call timeout in milliseconds (default: `30_000`).
Returns
The model output.
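Because calls are serialized through the GenServer, a slow model can make queued callers exceed the default 30-second timeout; a longer timeout can be passed as the third argument. A sketch (the tensor variable is illustrative):

```elixir
# Allow up to two minutes for a slow model; timeout is in milliseconds.
output = ExTorch.JIT.Server.predict(pid, [input_tensor], 120_000)
```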
@spec start_link(keyword()) :: GenServer.on_start()
Start a model server.
Options
- `:path` (required) - Path to the `.pt` model file.
- `:device` - Device to load the model onto (default: `:cpu`).
- `:name` - Optional registered name for the server.
- `:eval` - Whether to set the model to eval mode on load (default: `true`).
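Since the module provides a `child_spec/1`, the server can also be started under a supervisor using the standard `{module, opts}` child tuple. A sketch, assuming the common one-for-one supervision pattern — the model path and registered name are illustrative:

```elixir
children = [
  {ExTorch.JIT.Server,
   path: "priv/models/resnet.pt", device: :cpu, name: MyApp.Model}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

# The registered name can then be used in place of a PID:
result = ExTorch.JIT.Server.predict(MyApp.Model, [input_tensor])
```

Supervised start gives the fault tolerance mentioned above: if the NIF call crashes the server process, the supervisor restarts it and reloads the model from `:path`.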