YOLO.Model behaviour (YOLO v0.1.1)
Defines a behaviour for implementing YOLO object detection models.
This module provides the structure for loading and running YOLO models for object detection.
The default implementation is YOLO.Models.YoloV8, but you can create custom implementations for other YOLO variants.
Required Callbacks
To implement this behaviour, you need to define these functions:
preprocess/3: Prepares an input image for the model
- Takes a model struct, input image, and options
- Returns {preprocessed_tensor, scaling_config}
- See YOLO.Models.YoloV8 for an example implementation
postprocess/4: Processes the model's raw output into detected objects
- Takes a model struct, model output tensor, scaling config, and options
- Returns a list of detected objects as [cx, cy, w, h, prob, class_idx]
- Handles tasks like non-maximum suppression and coordinate scaling
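As a rough sketch of what a custom implementation looks like, the hypothetical module below (MyApp.YoloV5 is not part of this library) declares the behaviour and simply delegates both callbacks to YOLO.Models.YoloV8 to show the expected signatures; a real variant would replace the bodies with its own pre- and post-processing.

    defmodule MyApp.YoloV5 do
      @behaviour YOLO.Model

      @impl true
      def preprocess(model, image, options) do
        # Must return a {input_tensor, scaling_config} tuple for this variant.
        # Delegating to the default implementation here only illustrates the contract.
        YOLO.Models.YoloV8.preprocess(model, image, options)
      end

      @impl true
      def postprocess(model, model_output, scaling_config, options) do
        # Must return [[cx, cy, w, h, prob, class_idx], ...] after filtering, NMS and scaling.
        YOLO.Models.YoloV8.postprocess(model, model_output, scaling_config, options)
      end
    end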
Types
t(): The model struct containing:
- :ref - Reference to loaded ONNX model
- :model_impl - Module implementing this behaviour
- :shapes - Input/output tensor shapes
- :classes - Map of class indices to labels
detected_object(): Map containing detection results:
- :bbox - Bounding box coordinates (cx, cy, w, h)
- :class - Detected class name
- :class_idx - Class index
- :prob - Detection probability
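For illustration, a detected_object() could look like the map below. The values are made up, and representing :bbox as a map keyed by cx/cy/w/h is an assumption based on the field descriptions above, not a guarantee of the exact runtime shape.

    # Illustrative detected_object() (placeholder values; bbox-as-map is an assumption)
    %{
      bbox: %{cx: 320.0, cy: 240.0, w: 112.0, h: 96.0},
      class: "person",
      class_idx: 0,
      prob: 0.91
    }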
Summary
Callbacks
Post-processes the model's raw output to produce a list of detected objects.
Prepares input image tensors for the model.
Callbacks
@callback postprocess(model :: t(), model_output :: Nx.Tensor.t(), scaling_config :: ScalingConfig.t(), options :: Keyword.t()) :: [[float()]]
Post-processes the model's raw output to produce a list of detected objects.
The raw output from the model is a tensor containing bounding box coordinates and class probabilities for each candidate detection.
For example, YOLOv8 outputs a {1, 84, 8400} tensor where:
- 84 represents 4 bbox coordinates + 80 class probabilities
- 8400 represents the number of candidate detections
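As a small illustration of this layout (using a dummy tensor in place of real model output), transposing the raw output gives one row per candidate detection:

    # Stand-in for the raw model output
    output = Nx.iota({1, 84, 8400}, type: :f32)

    candidates =
      output
      |> Nx.squeeze(axes: [0])  # {84, 8400}
      |> Nx.transpose()         # {8400, 84}: one [cx, cy, w, h | 80 class scores] row per candidate

    Nx.shape(candidates)
    #=> {8400, 84}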
Returns a list of detections where each detection is a list of 6 elements:
[cx, cy, w, h, prob, class_idx]
where:
- cx, cy: center x,y coordinates of bounding box
- w, h: width and height of bounding box
- prob: detection probability
- class_idx: class index
The implementation should:
- Filter low probability detections
- Apply non-maximum suppression (NMS) to remove overlapping boxes
- Scale back the coordinates using the scaling_config and YOLO.FrameScaler module, since the detections are based on the model's input resolution rather than the original image size
See YOLO.Models.YoloV8.postprocess/4 for a reference implementation.
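A minimal sketch of the probability filtering and class-selection steps, assuming the output rows have already been transposed into [cx, cy, w, h | class_probs] lists; the threshold and the single fake row are illustrative, and NMS plus coordinate scaling are omitted:

    prob_threshold = 0.25

    # One fake candidate row: [cx, cy, w, h | class probabilities]
    rows = [[320.0, 240.0, 100.0, 80.0, 0.02, 0.91, 0.05]]

    rows
    |> Enum.flat_map(fn [cx, cy, w, h | class_probs] ->
      # Pick the most likely class for this candidate
      {prob, class_idx} =
        class_probs
        |> Enum.with_index()
        |> Enum.max_by(fn {p, _idx} -> p end)

      # Keep the detection only if it clears the probability threshold
      if prob >= prob_threshold, do: [[cx, cy, w, h, prob, class_idx]], else: []
    end)
    #=> [[320.0, 240.0, 100.0, 80.0, 0.91, 1]]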
@callback preprocess(model :: t(), image :: term(), options :: Keyword.t()) :: {Nx.Tensor.t(), ScalingConfig.t()}
Prepares input image tensors for the model.
Parameters
- model - The YOLO.Model struct containing model information
- image - Input image in implementation's native format (e.g. Evision.Mat)
- options - Keyword list of options:
  - :frame_scaler - Module implementing YOLO.FrameScaler behaviour (required)
Returns
A {input_tensor, scaling_config} tuple where:
- input_tensor is the preprocessed Nx tensor ready for model input, with shape {1, channels, height, width}
- scaling_config contains scaling/padding info for postprocessing
See YOLO.Models.YoloV8.preprocess/3 for a reference implementation.
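A hypothetical call shape, assuming model was loaded elsewhere, image is in the implementation's native format, and MyApp.FrameScaler is an application-provided YOLO.FrameScaler implementation (none of these names come from this library); the 640x640 shape in the comment is only an example:

    {input_tensor, scaling_config} =
      model.model_impl.preprocess(model, image, frame_scaler: MyApp.FrameScaler)

    Nx.shape(input_tensor)
    #=> {1, 3, 640, 640}  # {1, channels, height, width} for a 640x640 input (illustrative)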