Active and future work. Shipped milestones live in PLAN.md as the historical record; the current shape of the library is in ARCHITECTURE.md.

Goals

  • Correctness over performance at every layer. Every layer has its own oracle.
  • No synchronous C++ → BEAM calls and no NIF that blocks on a BEAM operation. No GenServer on the hot path.
  • Bumblebee-first. DistilBERT, Qwen3, ViT, Whisper, plus the Bumblebee 0.7 family (NomicBERT, SmolLM3, ModernBERT).
  • Shippable at every milestone. Backend-only mode is useful on its own; the Defn compiler is additive.

Non-goals

  • Ahead-of-time compilation (mlx::export / IREE-style). Complementary, separate effort.
  • Windows or non-Apple-Silicon Linux GPU. CPU-only Linux is a nice-to-have for CI.
  • Distributed training (mlx::distributed::*), a native optimizer library, FSDP / ring allreduce. Autodiff + small-scale training loops are in scope; large-scale distributed is not.
  • Drop-in replacement for EMLX. We borrow where it's clearly right, but we're not constrained by its API.
  • User-level GPU kernel JIT from Elixir (fast::metal_kernel / fast::cuda_kernel). Orthogonal to Emily's "Nx backend, not a framework" stance.

Deferred to post-1.0

Each line summarises a deferred milestone; the rationale and full revisit plan stays in PLAN.md so readers can find the exact scope that was deferred and why.

  • Typed exception hierarchy (Emily.ShapeError, Emily.DtypeError, Emily.MLXError). Re-evaluate at the 2.x line. See PLAN M19.
  • GPU interop pointers (from_pointer / to_pointer on Nx.Backend, plus a public include/emily.h for downstream NIFs). Revisit when a concrete downstream consumer asks. See PLAN M20.
  • mix emily.doctor extensions for source-build diagnostics. The Mix task itself shipped in 0.4.x for the precompiled-NIF path; the broader source-build probe set (Xcode CLT, CMake version skew, MLX source-tree state) is deferred until adoption surfaces a pattern of failures that elixir_make errors don't already explain. See PLAN M21.

In-roadmap MLX capability gaps

Catalogued from the 2026-04-22 audit against MLX 0.31.1+69. Items already shipped (einsum, SDPA sinks, microscaled quantization modes) are recorded in PLAN. The remaining open items:

#CapabilityStatusTrigger to revisit
B3Sparse / MoE matmuls: gather_qmm, gather_mm, block_masked_mm, segmented_mmDeferredFirst MoE model target (e.g. a Qwen3-MoE variant)
B4bFP8 dtype (to_fp8 / from_fp8)Blocked on Nx upstreamNx gains FP8, or M16 surfaces a concrete user story
B5ThreadLocalStream / set_default_streamInvestigativeSpike to confirm whether it simplifies the per-worker model

A defn-callable fallback for Emily.Fast.einsum/2 (currently eager-only) is also open if a user surfaces cross-backend composability needs — see PLAN M27.

1.0 release

Tracking checklist:

  • API docs and HexDocs reviewed for stale references — see issue #96.
  • CHANGELOG.md accumulated across releases (it is, since 0.3.0).
  • MAINTAINING.md reflects the precompiled-NIF release flow (it does, since 0.3.0).
  • Worked Bumblebee + quantized-Qwen3 examples in notebooks/ (present and grouped in the HexDocs Notebooks section).