Ortex.Serving (Ortex v0.1.5)
This is a lightweight wrapper for using the Nx.Serving behaviour with Ortex. jit and defn functions are not supported here; it is strictly for serving batches to an Ortex.Model for inference.
Examples
Inline/serverless workflow
To quickly create an Ortex.Serving and run it:
iex> model = Ortex.load("./models/resnet50.onnx")
iex> serving = Nx.Serving.new(Ortex.Serving, model)
iex> batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}])
iex> {result} = Nx.Serving.run(serving, batch)
iex> result |> Nx.backend_transfer() |> Nx.argmax(axis: 1)
#Nx.Tensor<
  s64[1]
  [499]
>
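
Several inputs can be stacked into one batch and served in a single call. The sketch below continues from the example above and assumes the same resnet50 model, which takes a single {3, 224, 224} tensor per entry; each tuple in the list becomes one row of the batch.

iex> t1 = Nx.broadcast(0.0, {3, 224, 224})
iex> t2 = Nx.broadcast(1.0, {3, 224, 224})
iex> batch = Nx.Batch.stack([{t1}, {t2}])
iex> {result} = Nx.Serving.run(serving, batch)
iex> Nx.axis_size(result, 0)
2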
Stateful/process workflow
An Ortex.Serving can also be started in your Application's supervision tree:
model = Ortex.load("./models/resnet50.onnx")

children = [
  {Nx.Serving,
   serving: Nx.Serving.new(Ortex.Serving, model),
   name: MyServing,
   batch_size: 10,
   batch_timeout: 100}
]

opts = [strategy: :one_for_one, name: OrtexServing.Supervisor]
Supervisor.start_link(children, opts)
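
For reference, here is a minimal sketch of how this could sit inside an Application module's start/2 callback (the OrtexServing.Application module name is a placeholder, not part of Ortex):

defmodule OrtexServing.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Load the ONNX model once and hand it to an Ortex.Serving-backed Nx.Serving
    model = Ortex.load("./models/resnet50.onnx")

    children = [
      {Nx.Serving,
       serving: Nx.Serving.new(Ortex.Serving, model),
       name: MyServing,
       batch_size: 10,
       batch_timeout: 100}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: OrtexServing.Supervisor)
  end
end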
With the application started, batches can now be sent to the Ortex.Serving process:
iex> Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}]))
{#Nx.Tensor<
   f32[1][1000]
   Ortex.Backend
   [
     [...]
   ]
 >}
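
As in the inline example, the returned tensor lives on Ortex.Backend and can be transferred and post-processed; a sketch of extracting the predicted class index:

iex> {result} = Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}]))
iex> result |> Nx.backend_transfer() |> Nx.argmax(axis: 1)
#Nx.Tensor<
  s64[1]
  [499]
>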