YOLO.NMS (YOLO v0.2.0)

Elixir NMS (Non-Maximum Suppression)

Learn more about Non-Maximum Suppression (NMS) at: https://builtin.com/machine-learning/non-maximum-suppression

This implementation applies NMS independently for each output class. The following steps are executed for each class:

  1. Filter out all bounding boxes with maximum class probability below prob_threshold (default: 0.5).
  2. Select the bounding box with the highest probability.
  3. Remove any remaining bounding boxes whose IoU with the selected box is >= iou_threshold (default: 0.5).

Functions

filter_predictions(model_output, prob_threshold, transpose)

@spec filter_predictions(Nx.Tensor.t(), float(), boolean()) :: [[float()]]

Filters detections based on a confidence probability threshold.

Selects detections from model_output where the highest class confidence score exceeds prob_threshold.

Arguments

  • model_output: Input tensor. The standard expected shape is {detections, bbox+classes} (e.g., {8400, 84}). Inputs with a leading batch dimension (e.g., {1, 8400, 84}) are also accepted; the batch dimension is squeezed internally. If the input shape is {bbox+classes, detections} (e.g., {84, 8400}) or {1, bbox+classes, detections} (e.g., {1, 84, 8400}), set transpose?: true so that the tensor is transposed internally. The first 4 elements of the bbox+classes dimension must be the bounding box coordinates.

  • prob_threshold: Minimum confidence probability to keep a detection.

  • opts: Keyword list options:

    • :transpose? (boolean, default: false): If true, transpose the input model_output before processing.

Returns

A list of detections, each of the form [cx, cy, w, h, prob, class_idx], sorted in descending order of prob. Returns [] if no detections meet the threshold.
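The filtering step can be illustrated on plain Elixir lists. The following is only a minimal sketch, not the library's code: FilterSketch and filter_rows/2 are hypothetical names, the input is a list of rows rather than an Nx tensor, and class_idx is returned as an integer. It assumes a detection is dropped when its best class probability is below prob_threshold.

```elixir
defmodule FilterSketch do
  # Hypothetical helper, not part of YOLO.NMS.
  # Each row is assumed to be [cx, cy, w, h, p_class0, p_class1, ...].
  def filter_rows(rows, prob_threshold) do
    rows
    |> Enum.map(fn [cx, cy, w, h | class_probs] ->
      # Pick the highest class probability together with its index.
      {prob, class_idx} =
        class_probs
        |> Enum.with_index()
        |> Enum.max_by(fn {p, _idx} -> p end)

      [cx, cy, w, h, prob, class_idx]
    end)
    # Drop rows whose best class probability is below the threshold.
    |> Enum.filter(fn [_, _, _, _, prob, _] -> prob >= prob_threshold end)
    # Sort the survivors in descending order of probability.
    |> Enum.sort_by(fn [_, _, _, _, prob, _] -> prob end, :desc)
  end
end

rows = [
  [10.0, 10.0, 4.0, 4.0, 0.1, 0.9],
  [20.0, 20.0, 4.0, 4.0, 0.2, 0.3],
  [30.0, 30.0, 4.0, 4.0, 0.95, 0.05]
]

IO.inspect(FilterSketch.filter_rows(rows, 0.5))
# prints [[30.0, 30.0, 4.0, 4.0, 0.95, 0], [10.0, 10.0, 4.0, 4.0, 0.9, 1]]
```

The second row is discarded because neither of its class probabilities reaches 0.5.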

iou(list1, list2)

IoU (Intersection over Union) measures the ratio of the area of overlap between two bounding boxes to the area of their union.

IoU = area(A ∩ B) / area(A ∪ B)
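As an illustration of the ratio, here is a minimal sketch for two boxes. It assumes each box is a list starting with [cx, cy, w, h] (centre coordinates plus width and height, as in the detections returned by this module); the documentation above does not state the box format that iou/2 expects, so treat that as an assumption. IoUSketch is a hypothetical module name.

```elixir
defmodule IoUSketch do
  # Hypothetical helper, not the library's iou/2.
  # Boxes are assumed to start with [cx, cy, w, h]; extra elements are ignored.
  def iou([cx1, cy1, w1, h1 | _], [cx2, cy2, w2, h2 | _]) do
    # Convert centre/size to left, top, right, bottom edges.
    {l1, t1, r1, b1} = {cx1 - w1 / 2, cy1 - h1 / 2, cx1 + w1 / 2, cy1 + h1 / 2}
    {l2, t2, r2, b2} = {cx2 - w2 / 2, cy2 - h2 / 2, cx2 + w2 / 2, cy2 + h2 / 2}

    # The overlap is clamped at zero when the boxes do not intersect.
    inter_w = max(min(r1, r2) - max(l1, l2), 0.0)
    inter_h = max(min(b1, b2) - max(t1, t2), 0.0)
    intersection = inter_w * inter_h

    union = w1 * h1 + w2 * h2 - intersection

    if union > 0, do: intersection / union, else: 0.0
  end
end

a = [1.0, 1.0, 2.0, 2.0]
b = [2.0, 1.0, 2.0, 2.0]

# Intersection area is 2 and union area is 6, so this prints roughly 0.333.
IO.inspect(IoUSketch.iou(a, b))
```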

nms(bboxes, iou_threshold)

Applies Non-Maximum Suppression (NMS) for each class, discarding bounding boxes with an IoU exceeding the specified iou_threshold.
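A common way to implement per-class NMS is the greedy loop sketched below. This is an assumption about the general technique, not a copy of the library's implementation: NMSSketch and its helpers are hypothetical, detections are plain lists of the form [cx, cy, w, h, prob, class_idx], and a box is suppressed when its IoU with the selected box is >= iou_threshold.

```elixir
defmodule NMSSketch do
  # Hypothetical sketch of greedy per-class NMS, not the library's nms/2.
  def nms(detections, iou_threshold) do
    detections
    # Treat each class independently.
    |> Enum.group_by(fn [_, _, _, _, _prob, class_idx] -> class_idx end)
    |> Enum.flat_map(fn {_class_idx, boxes} ->
      boxes
      |> Enum.sort_by(fn [_, _, _, _, prob, _] -> prob end, :desc)
      |> suppress(iou_threshold, [])
    end)
  end

  defp suppress([], _iou_threshold, kept), do: Enum.reverse(kept)

  defp suppress([best | rest], iou_threshold, kept) do
    # Keep the highest-probability box and drop everything that overlaps it too much.
    remaining = Enum.reject(rest, fn box -> iou(best, box) >= iou_threshold end)
    suppress(remaining, iou_threshold, [best | kept])
  end

  # IoU for boxes in centre format [cx, cy, w, h | _].
  defp iou([cx1, cy1, w1, h1 | _], [cx2, cy2, w2, h2 | _]) do
    inter_w = max(min(cx1 + w1 / 2, cx2 + w2 / 2) - max(cx1 - w1 / 2, cx2 - w2 / 2), 0.0)
    inter_h = max(min(cy1 + h1 / 2, cy2 + h2 / 2) - max(cy1 - h1 / 2, cy2 - h2 / 2), 0.0)
    intersection = inter_w * inter_h
    union = w1 * h1 + w2 * h2 - intersection
    if union > 0, do: intersection / union, else: 0.0
  end
end

detections = [
  [10.0, 10.0, 4.0, 4.0, 0.9, 0],
  [10.5, 10.0, 4.0, 4.0, 0.8, 0],
  [10.0, 10.0, 4.0, 4.0, 0.7, 1],
  [50.0, 50.0, 4.0, 4.0, 0.6, 0]
]

IO.inspect(NMSSketch.nms(detections, 0.5))
```

In this example the second detection overlaps the first (same class, IoU of about 0.78) and is discarded, while the third survives because it belongs to a different class. The order of the returned list across classes depends on map iteration and should not be relied on.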

run(tensor, options \\ [])

@spec run(Nx.Tensor.t(), Keyword.t()) :: [[float()]]

Runs both filter_predictions/3 and nms/2.

  1. Filters out detections with a probability below prob_threshold (p < prob_threshold)
  2. Applies Non-Maximum Suppression (NMS) for each class, discarding bounding boxes with an IoU exceeding the specified iou_threshold.

The input tensor is the YOLO model output that can have different shapes depending on the model.

When transpose: false (default):

  • Shape {1, num_detections, bbox_coords + num_classes} (batch size 1) or {num_detections, bbox_coords + num_classes}
  • Example: {1, 8400, 84} or {8400, 84} (84 = 4 bounding box coordinates + 80 classes)

When transpose: true:

  • Shape {bbox_coords + num_classes, num_detections} or {1, bbox_coords + num_classes, num_detections}
  • Example: {84, 8400} or {1, 84, 8400} (84 = 4 bounding box coordinates + 80 classes)

Where:

  • num_detections: Number of candidate detections (varies by model)
  • bbox_coords: Number of bounding box coordinates (4 for Ultralytics YOLO)
  • num_classes: Number of classes (80 for Ultralytics YOLO trained on COCO dataset)
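The difference between the two layouts can be seen on a tiny nested list. The sketch below is purely illustrative: it uses plain lists instead of Nx tensors, LayoutSketch and to_rows/1 are hypothetical names, and the toy input has 4 box coordinates plus 2 classes for 3 detections, i.e. the equivalent of a {6, 3} shape turned into {3, 6}.

```elixir
defmodule LayoutSketch do
  # Hypothetical helper: turns a {bbox_coords + num_classes, num_detections}
  # nested list into {num_detections, bbox_coords + num_classes}.
  def to_rows(columns) do
    # zip_with/2 hands the function one list per detection, holding the
    # i-th element of every input list.
    Enum.zip_with(columns, fn values -> values end)
  end
end

# One inner list per attribute: cx, cy, w, h, p_class0, p_class1.
columns = [
  [10.0, 20.0, 30.0],
  [11.0, 21.0, 31.0],
  [4.0, 4.0, 4.0],
  [5.0, 5.0, 5.0],
  [0.9, 0.2, 0.1],
  [0.1, 0.8, 0.3]
]

IO.inspect(LayoutSketch.to_rows(columns))
# prints [[10.0, 11.0, 4.0, 5.0, 0.9, 0.1], [20.0, 21.0, 4.0, 5.0, 0.2, 0.8], [30.0, 31.0, 4.0, 5.0, 0.1, 0.3]]
```

After the transposition each inner list describes one detection, which is the layout expected when no transposition is requested.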

Returns a list of detections, each of the form [bbox_cx, bbox_cy, bbox_w, bbox_h, prob, class_idx].