ExTorch.Export
(extorch v0.4.0)
Read and introspect PyTorch ExportedProgram .pt2 archives.
This module provides a pure-Elixir reader for .pt2 files produced by
torch.export.save(). It can extract the model graph, weight metadata,
and raw weight tensors without requiring Python or C++ ExportedProgram support.
Python export workflow
import torch
model = MyModel()
model.eval()
exported = torch.export.export(model, (example_input,))
torch.export.save(exported, "model.pt2")
Elixir usage
# Load and run inference directly
model = ExTorch.Export.load("model.pt2")
output = ExTorch.Export.forward(model, [input])
# Or read schema and weights separately
schema = ExTorch.Export.read_schema("model.pt2")
weights = ExTorch.Export.read_weights("model.pt2")
# Generate DSL source code
IO.puts(ExTorch.Export.to_elixir("model.pt2", "MyModel"))
Note
This reads .pt2 files from torch.export.save, NOT from
aoti_compile_and_package. AOTI-compiled .pt2 files don't contain
the graph or separable weights -- use ExTorch.AOTI for those.
Summary
Functions
Run inference on a loaded Export model.
Run inference using the pre-compiled graph executor.
Run inference using the native graph executor.
Run forward/2 with per-node timing instrumentation. Returns
{output, %{op_target => %{count: N, total_us: T}}}, aggregated by op
target so you can see which ops dominate inference time.
Load an exported .pt2 model for inference.
Read the model schema from an exported .pt2 archive.
Load weight tensors from an exported .pt2 archive.
Generate an ExTorch.NN.Module DSL definition from an exported .pt2 archive.
Functions
@spec forward(ExTorch.Export.Model.t(), [ExTorch.Tensor.t()]) :: ExTorch.Tensor.t() | [ExTorch.Tensor.t()]
Run inference on a loaded Export model.
Interprets the ATen computation graph, dispatching each operation to the corresponding ExTorch tensor function.
Args
model (ExTorch.Export.Model) - the loaded model.
inputs ([ExTorch.Tensor]) - input tensors, matching the model's user inputs.
Returns
The output tensor (or list of tensors for multi-output models).
Example
model = ExTorch.Export.load("model.pt2")
input = ExTorch.randn({1, 10})
output = ExTorch.Export.forward(model, [input])
@spec forward_compiled(ExTorch.Export.Model.t(), [ExTorch.Tensor.t()]) :: ExTorch.Tensor.t() | [ExTorch.Tensor.t()]
Run inference using the pre-compiled graph executor.
The fastest Export inference path. All op schemas were resolved and
argument templates pre-built at load/2 time. This function only
passes tensors to C++ and gets tensors back — zero encoding overhead.
Falls back to forward_native/2 if the graph couldn't be pre-compiled.
model = ExTorch.Export.load("model.pt2", device: :cuda)
output = ExTorch.Export.forward_compiled(model, [input])
@spec forward_native(ExTorch.Export.Model.t(), [ExTorch.Tensor.t()]) :: ExTorch.Tensor.t() | [ExTorch.Tensor.t()]
Run inference using the native graph executor.
Compiles the schema graph into an instruction stream and executes the
entire graph in a single NIF call via execute_graph, eliminating
per-node NIF boundary crossings. This is significantly faster than
forward/2 for high-node-count models (e.g., ViT with 430 nodes)
while still supporting all ops through the c10::Dispatcher.
Ops registered via ExTorch.Export.OpRegistry are handled here as well,
since they are also dispatched through the same C++ dispatcher.
model = ExTorch.Export.load("vit_b_16.pt2", device: :cuda)
input = ExTorch.Tensor.to(input, device: :cuda)
output = ExTorch.Export.forward_native(model, [input])
@spec forward_profiled(ExTorch.Export.Model.t(), [ExTorch.Tensor.t()]) :: {ExTorch.Tensor.t() | [ExTorch.Tensor.t()], map()}
Run forward/2 with per-node timing instrumentation. Returns
{output, %{op_target => %{count: N, total_us: T}}}, aggregated by op
target so you can see which ops dominate inference time.
Only meant for diagnostics. Adds ~1μs of measurement overhead per node
from :erlang.monotonic_time/1.
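Example
A minimal sketch of surfacing the most expensive ops from the returned timing map (the input tensor here is illustrative):
model = ExTorch.Export.load("model.pt2")
{_output, timings} = ExTorch.Export.forward_profiled(model, [input])
# Top 5 op targets by total time
timings
|> Enum.sort_by(fn {_op, %{total_us: t}} -> -t end)
|> Enum.take(5)
|> Enum.each(fn {op, %{count: n, total_us: t}} ->
  IO.puts("#{op}: #{n} calls, #{t} μs")
end)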
@spec load(String.t(), keyword()) :: ExTorch.Export.Model.t()
Load an exported .pt2 model for inference.
Reads the graph and weights, and prepares the model for forward/2.
Args
path (String) - path to the .pt2 file from torch.export.save.
opts (keyword) - optional:
  :device (:cpu | :cuda | {:cuda, index}) - device to place all weight tensors on. Defaults to :cpu. When set to :cuda, every loaded parameter/buffer is moved to the GPU at load time, so subsequent forward/2 calls run entirely on the GPU (as long as the user input is also on the GPU).
Returns
An %ExTorch.Export.Model{} struct.
Example
# CPU (default)
model = ExTorch.Export.load("model.pt2")
output = ExTorch.Export.forward(model, [input_tensor])
# GPU
model = ExTorch.Export.load("model.pt2", device: :cuda)
input = ExTorch.Tensor.to(cpu_input, device: :cuda)
output = ExTorch.Export.forward(model, [input])
Read the model schema from an exported .pt2 archive.
Returns a map with:
:graph - the computation graph as a list of node maps
:inputs - graph input names
:outputs - graph output names
:weights - weight metadata (name → shape, dtype, requires_grad)
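Example
A minimal sketch, assuming only the keys listed above:
schema = ExTorch.Export.read_schema("model.pt2")
IO.inspect(schema.inputs, label: "inputs")
IO.inspect(schema.outputs, label: "outputs")
IO.puts("#{length(schema.graph)} graph nodes, #{map_size(schema.weights)} weight entries")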
@spec read_weights(String.t()) :: %{required(String.t()) => ExTorch.Tensor.t()}
Load weight tensors from an exported .pt2 archive.
Returns a map of %{fqn => %ExTorch.Tensor{}}, keyed by each parameter's fully-qualified name (fqn).
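Example
A sketch that relies only on the return shape above, listing the loaded parameter names:
weights = ExTorch.Export.read_weights("model.pt2")
IO.puts("#{map_size(weights)} tensors loaded")
weights |> Map.keys() |> Enum.sort() |> Enum.each(&IO.puts/1)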
Generate an ExTorch.NN.Module DSL definition from an exported .pt2 archive.
Maps ATen operations in the graph to ExTorch NN layer types where possible.
Args
path - path to the .pt2 file.
module_name - name for the generated Elixir module.
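Example
A sketch of writing the generated source to a file, assuming the function returns the module source as a string (as the IO.puts usage above suggests); the output path is illustrative:
source = ExTorch.Export.to_elixir("model.pt2", "MyModel")
File.write!("lib/my_model.ex", source)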