MicrogradEx.Value (MicrogradEx v0.1.0)


A scalar value that remembers the expression graph that produced it.

Value is intentionally scalar, just like Andrej Karpathy's original micrograd. Vectors, neurons, layers, and MLPs are built by composing many scalar values. That keeps the math visible: every addition, multiplication, power, and ReLU contributes a small local derivative edge to the graph.

The important Elixir-specific difference is that grad is not mutated during backward/1. The field exists only as a convenient annotation for inspected values. The source of truth is the MicrogradEx.Gradients table returned by backward/1.

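Summary

Functions

add(left, right, opts \\ [])

Adds two values or numbers.

backward(output)

Runs reverse-mode autodiff and returns an immutable gradient table.

coerce(value)

Promotes plain numbers to values and leaves existing values unchanged.

div(left, right)

Alias for divide/2.

divide(left, right)

Divides the first value or number by the second.

grad(value, gradients)

Fetches this value's gradient from a gradient table.

mul(left, right, opts \\ [])

Multiplies two values or numbers.

neg(value, opts \\ [])

Negates a value or number.

new(data, opts \\ [])

Creates a new leaf value.

pow(value, exponent, opts \\ [])

Raises a value or number to a scalar exponent.

relu(value, opts \\ [])

Applies the rectified linear unit activation.

sub(left, right, opts \\ [])

Subtracts the second value or number from the first.

sum(values, initial \\ new(0.0))

Sums a list of values or numbers.

with_grad(value, gradients)

Returns a copy of value with its grad field filled from a gradient table.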
Types

t()

@type t() :: %MicrogradEx.Value{
  data: float(),
  grad: float(),
  graph: %{required(pos_integer()) => MicrogradEx.Value.Node.t()},
  id: pos_integer(),
  label: String.t() | nil
}

Functions

add(left, right, opts \\ [])

Adds two values or numbers.

The derivative of a + b with respect to each parent is 1, so the output node stores two parent edges with local gradient 1.0.

backward(output)

Runs reverse-mode autodiff and returns an immutable gradient table.
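To make the idea of an immutable gradient table concrete, here is a minimal, self-contained sketch of a reverse pass. It is not the library's implementation: the module name GradTableSketch, the plain-map graph of `{parent_id, local_gradient}` edges, and the assumption that every parent has a smaller id than its child are all illustrative choices made for this example.

```elixir
defmodule GradTableSketch do
  # Hypothetical graph shape: %{id => [{parent_id, local_gradient}]}.
  # Leaves map to an empty edge list. Assumes parents always have smaller
  # ids than their children, so descending id order is a valid reverse
  # topological order.
  def backward(graph, output_id) do
    graph
    |> Map.keys()
    |> Enum.sort(:desc)
    |> Enum.reduce(%{output_id => 1.0}, fn id, grads ->
      upstream = Map.get(grads, id, 0.0)

      Enum.reduce(Map.fetch!(graph, id), grads, fn {parent_id, local}, acc ->
        contribution = upstream * local
        # Accumulate into a new map instead of mutating any node.
        Map.update(acc, parent_id, contribution, &(&1 + contribution))
      end)
    end)
  end
end

# out = a * b + a, with a = 2.0 and b = 3.0
graph = %{
  1 => [],                     # a
  2 => [],                     # b
  3 => [{1, 3.0}, {2, 2.0}],   # a * b: local gradients are b and a
  4 => [{3, 1.0}, {1, 1.0}]    # (a * b) + a: both local gradients are 1.0
}

grads = GradTableSketch.backward(graph, 4)
IO.inspect(grads)
# prints %{1 => 4.0, 2 => 2.0, 3 => 1.0, 4 => 1.0}
```

The gradient for `a` is 4.0 because it reaches the output along two paths (through the product and directly through the sum), and the two contributions are added together in the table.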

coerce(value)

Promotes plain numbers to values and leaves existing values unchanged.

This helper is what lets the public arithmetic functions accept both %Value{} structs and numbers:

iex> x = MicrogradEx.Value.new(2.0)
iex> MicrogradEx.Value.mul(x, 3).data
6.0
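One plausible way to get this behaviour is a pair of function clauses that pattern-match on the argument. The sketch below is an assumption about the approach, not the library's source: CoerceSketch is a hypothetical stand-in struct that models only the data field.

```elixir
defmodule CoerceSketch do
  # Hypothetical stand-in for the real struct; only :data is modelled.
  defstruct data: 0.0

  # Existing values pass through untouched.
  def coerce(%__MODULE__{} = value), do: value

  # Plain numbers become leaf values; multiplying by 1.0 turns integers
  # into floats.
  def coerce(number) when is_number(number) do
    %__MODULE__{data: number * 1.0}
  end

  # An arithmetic function can then coerce both operands up front.
  def mul(left, right) do
    %__MODULE__{data: coerce(left).data * coerce(right).data}
  end
end

x = CoerceSketch.coerce(2.0)
product = CoerceSketch.mul(x, 3)
IO.inspect(product.data)
# prints 6.0
```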

div(left, right)

Alias for divide/2.

Kernel.div/2 performs integer division, so a bare div is easy to misread; longer examples generally read better with divide/2. This alias exists for users who expect the shorter name for the arithmetic operation.

divide(left, right)

Divides the first value or number by the second.

Division is represented as multiplication by right ** -1, the same identity used in the original Python source. Keeping it as composition means the graph naturally contains the reciprocal operation and then the multiplication.
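The composition can be checked by hand with the chain rule. The following sketch works on plain floats rather than graph nodes, and the module name DivideSketch and its return shape are hypothetical; it only illustrates that a reciprocal step followed by a multiplication yields the usual quotient derivatives.

```elixir
defmodule DivideSketch do
  # Reciprocal step: r = b ** -1, with local derivative dr/db = -1 * b ** -2.
  def reciprocal(b) do
    {:math.pow(b, -1), -1.0 * :math.pow(b, -2)}
  end

  # Quotient as a * r. The derivative with respect to b chains through
  # the reciprocal: d(a * r)/db = a * dr/db = -a / b ** 2.
  def divide(a, b) do
    {r, dr_db} = reciprocal(b)
    %{value: a * r, d_da: r, d_db: a * dr_db}
  end
end

result = DivideSketch.divide(6.0, 3.0)
IO.inspect(result)
```

For a = 6.0 and b = 3.0 the value is approximately 2.0, the derivative with respect to a is approximately 1/3, and the derivative with respect to b is approximately -6/9.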

grad(value, gradients)

Fetches this value's gradient from a gradient table.

Values that do not influence the output have gradient 0.0.

mul(left, right, opts \\ [])

Multiplies two values or numbers.

For a * b, the local derivative with respect to a is b, and the local derivative with respect to b is a. These parent data values are captured at graph-construction time, just like the closure in the Python version.
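As a rough illustration of what "captured at graph-construction time" means, the sketch below builds a product node whose parent edges already carry the other operand's data. The `{id, data}` operand pairs, the node shape, and the module name MulEdgeSketch are assumptions made for this example.

```elixir
defmodule MulEdgeSketch do
  # Each operand is a hypothetical {id, data} pair.
  # The edge to `a` stores b's data and the edge to `b` stores a's data,
  # so no later lookup is needed during the backward pass.
  def mul({a_id, a}, {b_id, b}) do
    %{data: a * b, parents: [{a_id, b}, {b_id, a}]}
  end
end

product_node = MulEdgeSketch.mul({1, 2.0}, {2, 3.0})
IO.inspect(product_node)
# prints %{data: 6.0, parents: [{1, 3.0}, {2, 2.0}]}
```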

neg(value, opts \\ [])

Negates a value or number.

This is implemented as its own operation instead of mul(value, -1) so the graph remains compact and the local derivative is explicit: d(-x)/dx = -1.

new(data, opts \\ [])

Creates a new leaf value.

A leaf is an input to a computation: a training example, a model parameter, a constant promoted into the graph, or any other scalar whose gradient may be interesting later.

The optional :label is never used for math. It exists for debugging, examples, and graph inspection.

pow(value, exponent, opts \\ [])

Raises a value or number to a scalar exponent.

The exponent must be a plain number, not another Value. That matches the original micrograd implementation and keeps the local derivative simple:

d(x ** n) / dx = n * x ** (n - 1)
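A quick way to gain confidence in a local derivative like this is to compare it against a central finite difference. The sketch below does that on plain floats; PowCheckSketch and the step size `h` are illustrative choices, not part of the library.

```elixir
defmodule PowCheckSketch do
  # Analytic local derivative from the power rule.
  def local_grad(x, n), do: n * :math.pow(x, n - 1)

  # Central finite difference as an independent numerical estimate.
  def numeric_grad(x, n, h \\ 1.0e-5) do
    (:math.pow(x + h, n) - :math.pow(x - h, n)) / (2 * h)
  end
end

analytic = PowCheckSketch.local_grad(3.0, 2)
numeric = PowCheckSketch.numeric_grad(3.0, 2)
IO.inspect({analytic, numeric})
```

For x = 3.0 and n = 2 the analytic result is 6.0, and the numerical estimate should agree with it to within floating-point rounding.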

relu(value, opts \\ [])

Applies the rectified linear unit activation.

ReLU keeps positive inputs and clamps negative inputs to zero. At exactly zero this port follows the original micrograd code: the gradient is 0.0 because the output is not greater than zero.
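The boundary behaviour is easiest to see with the output and local gradient side by side. This is a sketch on plain numbers under the rule described above (gradient 1.0 only when the output is greater than zero); ReluSketch is a hypothetical name.

```elixir
defmodule ReluSketch do
  # Returns {output, local_gradient}.
  # The gradient is 1.0 only when the output is strictly greater than zero.
  def relu(x) when x > 0, do: {x * 1.0, 1.0}
  def relu(_x), do: {0.0, 0.0}
end

results = Enum.map([-2.0, 0.0, 3.0], &ReluSketch.relu/1)
IO.inspect(results)
# prints [{0.0, 0.0}, {0.0, 0.0}, {3.0, 1.0}]
```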

sub(left, right, opts \\ [])

Subtracts the second value or number from the first.

The output stores local derivatives 1 for the left parent and -1 for the right parent, which is exactly the derivative of a - b.

sum(values, initial \\ new(0.0))

Sums a list of values or numbers.

This is useful when writing losses because Elixir's Enum.sum/1 only handles numbers, whereas Python's built-in sum works on micrograd's Value objects through operator overloading. The optional initial value defaults to a differentiable zero leaf.
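A sum over custom structs is naturally a fold. The sketch below shows one way this could look, assuming a simplified struct that tracks only data and accepts only structs (the real function also accepts plain numbers); SumSketch and its helpers are hypothetical.

```elixir
defmodule SumSketch do
  # Hypothetical stand-in for the real struct; only :data is modelled.
  defstruct data: 0.0

  def new(data), do: %__MODULE__{data: data * 1.0}

  def add(%__MODULE__{data: a}, %__MODULE__{data: b}) do
    %__MODULE__{data: a + b}
  end

  # Fold the list into the accumulator, starting from a zero leaf by default.
  def sum(values, initial \\ new(0.0)) do
    Enum.reduce(values, initial, fn value, acc -> add(acc, value) end)
  end
end

total = SumSketch.sum([SumSketch.new(1.5), SumSketch.new(2.5), SumSketch.new(-1.0)])
IO.inspect(total.data)
# prints 3.0
```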

with_grad(value, gradients)

Returns a copy of value with its grad field filled from a gradient table.

This is only for display or debugging. The returned struct is not special and does not mutate the original computation graph.
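The copy-and-annotate behaviour can be sketched with Elixir's struct update syntax. The example assumes the gradient table can be modelled as a plain map keyed by id, which may differ from the real MicrogradEx.Gradients structure; WithGradSketch is a hypothetical stand-in.

```elixir
defmodule WithGradSketch do
  # Hypothetical stand-in struct with just the fields needed here.
  defstruct id: nil, data: 0.0, grad: 0.0

  def new(id, data), do: %__MODULE__{id: id, data: data}

  # The gradient table is modelled as a plain map of id => gradient.
  # Ids missing from the table fall back to 0.0.
  def with_grad(%__MODULE__{id: id} = value, gradients) do
    %{value | grad: Map.get(gradients, id, 0.0)}
  end
end

gradients = %{1 => 4.0}
x = WithGradSketch.new(1, 2.0)
annotated = WithGradSketch.with_grad(x, gradients)
unused = WithGradSketch.with_grad(WithGradSketch.new(7, 5.0), gradients)

IO.inspect({x.grad, annotated.grad, unused.grad})
# prints {0.0, 4.0, 0.0}
```

The original `x` still reports a grad of 0.0 after the call, which is the sense in which the annotated struct is only a display copy.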