Changelog
v1.0.0-rc.0 (2026-06-08)
This is the first release candidate for evision v1.0.0, the first major version. Two headline changes land together:
- evision now targets OpenCV 5.0.0. This is a breaking upgrade (C++17, the legacy C API removed, modules reorganised, new core element types, and a rewritten DNN engine); see OpenCV's release notes for the upstream details.
- `Evision.Backend` is now a working Nx backend, implementing the majority of the `Nx.Backend` behaviour with results verified against `Nx.BinaryBackend`.
Added
- `Evision.Backend` is now a working Nx backend (#48). 72 of the 85 `Nx.Backend` callbacks are implemented, each verified element-for-element against `Nx.BinaryBackend` across dtypes, ranks, axes, and broadcasting, so `cv::Mat` can back `Nx` tensors for most workflows. Newly implemented callbacks include:
  - Reductions: `sum`, `product`, `reduce_max`, `reduce_min`, `all`, `any`, and `argmax`/`argmin` (backed by `cv::reduceArgMax`/`reduceArgMin`, with Nx tie-break semantics).
  - Cumulative ops: `cumulative_sum`, `cumulative_product`, `cumulative_min`, `cumulative_max`.
  - Sorting: `sort` and `argsort` along any axis, stable in both directions, including the wide-integer depths (u32/s64/u64) that `cv::sort` rejects.
  - Indexing: `take`, `gather`, `indexed_add`, and `indexed_put`, with Nx-compatible bounds checking.
  - Elementwise unary math and `select`.
  - `init/1` (required by the `Nx.Backend` behaviour since Nx 0.7) and scalar (0-dimensional) tensors in `from_binary`.
- OpenCV 5.0's new native depths are mapped to Nx dtypes: `CV_64S` to `{:s, 64}`, `CV_32U` to `{:u, 32}`, `CV_64U` to `{:u, 64}`, and `CV_16BF` to `{:bf, 16}`. 64-bit values now round-trip through `cv::Mat` losslessly (the old s64-to-s32 downcast that truncated values above 2^31 is gone), and `Evision.Mat.at/2` returns full-width 64-bit values.
- Haar/HOG parity: `Evision.CascadeClassifier` and `Evision.HOGDescriptor` build again via the contrib `xobjdetect` module, where OpenCV 5.0 moved them.
- `mix evision.backend.bench`, a benchmark task for the Evision backend with an optional Torchx comparison.
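As a rough illustration of what the backend entry above enables, the following sketch runs the same reductions on `Evision.Backend` and on `Nx.BinaryBackend` and compares the results. It assumes a script context where `Mix.install/1` can fetch both packages; the version requirements are copied from this changelog and are an assumption about what is published, not a tested configuration.

```elixir
# Assumption: both packages resolve with these requirements on your machine.
Mix.install([
  {:nx, "~> 0.12.1"},
  {:evision, "~> 1.0.0-rc.0"}
])

data = [[1, 5, 3], [7, 2, 9]]

# Allocate the same tensor on each backend via Nx's standard :backend option.
on_cv = Nx.tensor(data, type: {:s, 64}, backend: Evision.Backend)
on_binary = Nx.tensor(data, type: {:s, 64}, backend: Nx.BinaryBackend)

# sum and argmax are among the callbacks listed as newly implemented.
for fun <- [&Nx.sum(&1, axes: [0]), &Nx.argmax(&1, axis: 1)] do
  cv_result = on_cv |> fun.() |> Nx.to_flat_list()
  reference = on_binary |> fun.() |> Nx.to_flat_list()

  # Print both results and whether they agree.
  IO.inspect({cv_result, reference, cv_result == reference})
end
```

Comparing against `Nx.BinaryBackend` this way mirrors, on a very small scale, the element-for-element verification described above.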
Changed
- Uses OpenCV 5.0.0.
- Requires Nx `~> 0.12.1`.
- Module reorganisation follows OpenCV 5.0. Most classes keep their `Evision.*` names, but a few feature detectors moved to the contrib `xfeatures2d` module and are now under `Evision.XFeatures2D.*`: `AKAZE`, `KAZE`, `AgastFeatureDetector`, and `BRISK`.
- Multi-channel `raw_type` codes changed. OpenCV 5.0 bumped `CV_CN_SHIFT` from 3 to 5, so a multi-channel `Mat`'s integer type code differs (for example `cv_8UC3` is now 64, was 16). `Evision.Constant.cv_8UC3/0` and friends compute the correct 5.0 values; code that hardcoded these numbers must be updated.
- `Evision.VideoWriter.write/2` now returns a boolean. OpenCV 5.0 changed `cv::VideoWriter::write` to return a success flag instead of `void`, so the call no longer returns the writer; write to the same writer handle in a loop.
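The `raw_type` change above follows from how OpenCV packs the channel count into the type code: `CV_MAKETYPE` computes `depth + ((channels - 1) << CV_CN_SHIFT)`. The sketch below reproduces that arithmetic in plain Elixir for both shift values; `TypeCodeSketch` and `make_type/3` are hypothetical names used only for illustration and are not part of evision.

```elixir
defmodule TypeCodeSketch do
  import Bitwise

  # Mirrors OpenCV's CV_MAKETYPE macro:
  # depth + ((channels - 1) << CV_CN_SHIFT)
  def make_type(depth, channels, cn_shift) do
    depth + ((channels - 1) <<< cn_shift)
  end
end

# Depth code of CV_8U.
cv_8u = 0

# CV_CN_SHIFT = 3 (OpenCV 4.x): prints 16
IO.inspect(TypeCodeSketch.make_type(cv_8u, 3, 3))

# CV_CN_SHIFT = 5 (OpenCV 5.0): prints 64
IO.inspect(TypeCodeSketch.make_type(cv_8u, 3, 5))
```

In application code, calling `Evision.Constant.cv_8UC3/0` rather than hardcoding either number avoids depending on the shift at all.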
Removed
- OpenCV 4.x support and the multi-version selection mechanism. evision targets OpenCV 5.0.x only.
- The DNN Darknet and Caffe importers (removed upstream in 5.0). Use `Evision.DNN.readNetFromONNX/1` or ONNX-converted models instead.
- The DNN Halide backend (removed upstream in 5.0).
Fixed
- `Evision.Backend` N-dimensional broadcasting now matches Nx semantics for rank-differing and multi-axis broadcasts (e.g. `{3, 4}` to `{2, 3, 4}`, `{2, 1, 4}` to `{2, 3, 4}`) and honours Nx's explicit `:axes`, so non right-aligned broadcasts are correct. Previously every elementwise binary op (add/subtract/divide/min/max/comparisons) could silently disagree with `Nx.BinaryBackend`, and on AArch64 a divide-by-zero in the tiling path returned garbage instead of trapping.
- `Evision.Backend` `logical_and`/`logical_or`/`logical_xor` now use truthiness semantics, so they are correct for non-boolean inputs (`logical_and(2, 1)` is `1`, not `0`) and `logical_xor` no longer raises on non-`u8` types.
- `Evision.Backend` integer scalar operands in `multiply`/`divide` no longer raise a `to_nx/2` clause error (OpenCV only treats a 1x1 operand as a scalar when it is `f64`).
- Building from source no longer re-runs the OpenCV install on every `mix compile`; the cmake-config gate now matches OpenCV 5.0's install path.
- 32-bit ARMv8 Nerves targets (rpi3/rpi3a/rpi02, Cortex-A53 built as armv7hf) compile again. OpenCV 5.0.0's `v_floor` NEON fast path uses the AArch64-only `vcvtmq_s32_f32` intrinsic under `#if __ARM_ARCH > 7`, which GCC's AArch32 `arm_neon.h` does not provide; the fast path is now gated on `__aarch64__` so AArch32 keeps the portable floor fallback.
- Building without contrib modules compiles again against OpenCV 5.0.0. The hand-written `cv::stereo::MatchQuasiDense` vector converter was gated on `HAVE_OPENCV_STEREO`, but in OpenCV 5.0 `stereo` is a new main module (so that macro is always defined) while the quasi-dense stereo types moved to the contrib `xstereo` module. The converter is now gated on `HAVE_OPENCV_XSTEREO`, so a build without contrib modules no longer references the absent `cv::stereo` namespace.
- Nerves ARM targets build again with OpenCV 5.0.0. Nerves cross toolchains set the C/C++ compiler but not the ASM compiler, so cmake's `enable_language(ASM)` falls back to the host x86 assembler and hands it OpenCV's bundled MLAS AArch64 `.S` GEMM kernels, which it cannot assemble (no such instruction: 'fmla v5.4s...'). This broke both the 32-bit boards (which report an arm64 processor name) and the 64-bit rpi5. MLAS is now skipped when it needs ASM and the build is cross-compiling, and the DNN module falls back to its built-in SGEMM.
- Building on FreeBSD compiles again with OpenCV 5.0.0. Intel IPP ICV's vendored safestring header declares a 3-argument `memset_s` that conflicts with FreeBSD libc's C11 4-argument `memset_s`, so the optional IPP accelerator is now disabled on FreeBSD.
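The broadcasting fix above depends on how input axes are paired with output axes. The sketch below models that rule in plain Elixir, as I understand Nx's documented semantics: without `:axes` the input shape is right-aligned against the output shape, and each input dimension must be 1 or equal to the output dimension it maps to. `BroadcastSketch` and its functions are hypothetical helpers for illustration, not evision or Nx code.

```elixir
defmodule BroadcastSketch do
  # Default mapping: input axes are right-aligned against the output shape.
  def default_axes(in_shape, out_shape)
      when tuple_size(in_shape) <= tuple_size(out_shape) do
    offset = tuple_size(out_shape) - tuple_size(in_shape)
    Enum.map(0..(tuple_size(in_shape) - 1)//1, &(&1 + offset))
  end

  # An input dimension is compatible with its output axis
  # when it is 1 or equals the output dimension on that axis.
  def compatible?(in_shape, out_shape, axes \\ nil) do
    axes = axes || default_axes(in_shape, out_shape)

    in_shape
    |> Tuple.to_list()
    |> Enum.zip(axes)
    |> Enum.all?(fn {dim, axis} -> dim == 1 or dim == elem(out_shape, axis) end)
  end
end

# Rank-differing and multi-axis cases from the entry above.
IO.inspect(BroadcastSketch.compatible?({3, 4}, {2, 3, 4}))
IO.inspect(BroadcastSketch.compatible?({2, 1, 4}, {2, 3, 4}))

# A non right-aligned broadcast only works with explicit axes.
IO.inspect(BroadcastSketch.compatible?({3}, {3, 2}))
IO.inspect(BroadcastSketch.compatible?({3}, {3, 2}, [0]))
```

The last two calls show why honouring explicit axes matters: `{3}` cannot be right-aligned into `{3, 2}`, but it fits once it is mapped to output axis 0.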
Performance
- `Evision.Backend` elementwise loops are parallelised with `cv::parallel_for_`, with stripe counts sized to the thread pool rather than the range length (a naive port dispatched one block per element and ran some ops slower in parallel than serially). Read-only NIF inputs are marked `INPUT_ONLY` so a cv-owned source `Mat` is shared instead of deep-copied; a no-op reshape of a 2 MB tensor drops from ~110us to ~3us.
- Reductions read their input in its native dtype and promote per element to the wide accumulator, dropping a separate cast pass, and a new leading-axis path avoids transposing first: `reduce_max` over axis 0 of a 512x1024 tensor drops from ~3.2ms to ~0.25ms.
- Conv gains an im2row + GEMM fast path for the common 2-D case (single batch group, no input dilation, `f32`/`f64`), reaching parity with the libtorch backend; other shapes fall back to the general N-d kernel.
- Scalar-operand elementwise ops (`add`/`subtract`/`multiply`/`divide`/`min`/`max`/comparisons/`pow`/`atan2`/`quotient`/`remainder`/shifts) take a fast path that casts the scalar to a single element instead of materialising a full broadcast array: scalar `add` on a 2048x2048 tensor drops from ~20ms to ~1.6ms.
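The stripe sizing mentioned in the first performance entry can be pictured with a small sketch. This is not evision's C++ code; it only shows the arithmetic of capping the number of contiguous stripes at the worker count instead of creating one stripe per element. `StripeSketch` and `stripes/2` are hypothetical names, and the worker count is passed in by hand as an assumption.

```elixir
defmodule StripeSketch do
  # Split the index range 0..n-1 into at most `workers` contiguous stripes
  # whose lengths differ by no more than one element.
  def stripes(n, workers) when n > 0 and workers > 0 do
    count = min(n, workers)
    base = div(n, count)
    extra = rem(n, count)

    {ranges, _next_start} =
      Enum.map_reduce(0..(count - 1), 0, fn i, start ->
        # The first `extra` stripes take one additional element.
        len = base + if(i < extra, do: 1, else: 0)
        {start..(start + len - 1), start + len}
      end)

    ranges
  end
end

# Ten elements on four workers: four stripes, not ten.
IO.inspect(StripeSketch.stripes(10, 4))

# Fewer elements than workers: one stripe per element at most.
IO.inspect(StripeSketch.stripes(3, 8))
```

With per-element stripes the scheduling overhead grows with the tensor size; bounding the stripe count by the pool size keeps it roughly constant, which is consistent with the slowdown the entry attributes to the naive port.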
Change logs for v0.2.x are in CHANGELOG.v0.2.md; v0.1.x is in CHANGELOG.v0.1.md.