Ortex.Serving (Ortex v0.1.10)

Ortex.Serving Documentation

This is a lightweight wrapper for using the Nx.Serving behaviour with Ortex. Using jit and defn functions here is not supported; it is strictly for serving batches to an Ortex.Model for inference.

Examples

Inline/serverless workflow

To quickly create an Ortex.Serving and run it:

iex> model = Ortex.load("./models/resnet50.onnx")
iex> serving = Nx.Serving.new(Ortex.Serving, model)
iex> batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}])
iex> {result} = Nx.Serving.run(serving, batch)
iex> result |> Nx.backend_transfer |> Nx.argmax(axis: 1)
#Nx.Tensor<
  s64[1]
  [499]
>
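
Since Nx.Batch.stack/1 takes a list of containers, several inputs can be grouped into one batch and run in a single call. A minimal sketch, reusing the serving from above; the second tensor is just a hypothetical placeholder standing in for a real preprocessed image:

iex> inputs = [Nx.broadcast(0.0, {3, 224, 224}), Nx.broadcast(1.0, {3, 224, 224})]
iex> batch = inputs |> Enum.map(fn t -> {t} end) |> Nx.Batch.stack()
iex> {result} = Nx.Serving.run(serving, batch)
iex> result |> Nx.backend_transfer() |> Nx.argmax(axis: 1)

The final argmax returns one predicted class index per stacked input.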

Stateful/process workflow

An Ortex.Serving can also be started in your Application's supervision tree:

model = Ortex.load("./models/resnet50.onnx")
children = [
  {Nx.Serving,
   serving: Nx.Serving.new(Ortex.Serving, model),
   name: MyServing,
   batch_size: 10,
   batch_timeout: 100}
]
opts = [strategy: :one_for_one, name: OrtexServing.Supervisor]
Supervisor.start_link(children, opts)
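
In a real project this usually lives in your application's start/2 callback rather than being started manually. A minimal sketch, assuming a hypothetical MyApp.Application module and the same model path as above:

defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Load the ONNX model once at startup and hand it to a named Nx.Serving process
    model = Ortex.load("./models/resnet50.onnx")

    children = [
      {Nx.Serving,
       serving: Nx.Serving.new(Ortex.Serving, model),
       name: MyServing,
       batch_size: 10,
       batch_timeout: 100}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end

Here batch_size controls how many requests are grouped into a single inference call, and batch_timeout is how long (in milliseconds) the process waits to fill a batch before running it anyway.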

With the application started, batches can now be sent to the Ortex.Serving process:

iex> Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}]))
{#Nx.Tensor<
   f32[1][1000]
   Ortex.Backend
   [
     [...]
   ]
 >}
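
Callers elsewhere in the application can wrap Nx.Serving.batched_run/2 in a small helper. A minimal sketch, assuming a hypothetical MyApp.Classifier module whose classify/1 takes an input tensor already shaped {3, 224, 224}:

defmodule MyApp.Classifier do
  # Sends one input through the named serving process and returns the
  # predicted class index as an integer.
  def classify(input) do
    {output} = Nx.Serving.batched_run(MyServing, Nx.Batch.stack([{input}]))

    output
    |> Nx.backend_transfer()
    |> Nx.argmax(axis: 1)
    |> Nx.to_flat_list()
    |> hd()
  end
end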