Conflict-free Replicated Tree CRDTs for Elixir.

Grove is a CRDT library for building collaborative editors, structured-document editors, and any application that needs real-time synchronization of tree-structured data (including ASTs) across distributed nodes.

Features

  • Operation-based Tree CRDT — efficient synchronization with minimal bandwidth
  • Eg-walker optimization — near-zero CRDT overhead for sequential editing (the common case)
  • Move operations — reorganize nodes without ever creating cycles
  • Undo/redo — built-in local undo/redo stacks for editor integration
  • Schema DSL — per-field CRDT types (LWW, MV, OR-Set, RGA list, counter) with migrations
  • Primitive CRDTs — counters, registers, sets, maps, and an RGA sequence, all with delta-state support
  • Phoenix LiveView integration — real-time collaboration in web applications
  • Distributed Erlang support — native :pg-based clustering on the BEAM
  • Plain-data conversion — keep plain maps/JSON in your database, use CRDTs only for sync

Architecture Overview

flowchart TB
    subgraph Database
        JSON[(Plain JSON/AST)]
    end

    subgraph "Grove Session"
        SM[Session Manager]
        Tree[CRDT Tree]
    end

    subgraph "LiveView Replicas"
        LV1[LiveView A]
        LV2[LiveView B]
        LV3[LiveView C]
    end

    JSON -->|from_data| Tree
    Tree -->|to_data| JSON

    LV1 <-->|ops| SM
    LV2 <-->|ops| SM
    LV3 <-->|ops| SM

    SM --- Tree

Installation

Add grove to your list of dependencies in mix.exs:

def deps do
  [
    {:grove, "~> 0.2.0"}
  ]
end

Grove requires Elixir ~> 1.20 and is tested against Erlang/OTP 29.

Quick Start

alias Grove.{Tree, Node}

# Create a tree for a replica (the replica id makes node ids globally unique)
tree = Tree.new("node_1")

# Build nodes; a node's parent is set with :parent_id
root  = Node.new("root", :function, attrs: %{name: "hello", arity: 1})
param = Node.new("param_1", :param, parent_id: "root", attrs: %{name: "name"})
body  = Node.new("body_1", :body, parent_id: "root")

# Insert them
tree =
  tree
  |> Tree.put_node(root)
  |> Tree.put_node(param)
  |> Tree.put_node(body)

# Move a node under a new parent (cycle-safe; returns {:error, reason} if invalid)
tree = Tree.move_node(tree, "param_1", "body_1")

# Query
Tree.get_node(tree, "param_1")
Tree.children(tree, "body_1")   # => ["param_1"]

Core Concepts

Plain Data Conversion

Grove separates CRDT state from application data. Store plain maps/JSON in your database and use Grove only for real-time sync:

# Load plain data and convert to a CRDT tree
plain = %{
  id: "form_1",
  type: :form,
  attrs: %{title: "Survey"},
  children: [
    %{id: "field_1", type: :text, attrs: %{label: "Name"}}
  ]
}

tree = Grove.from_data(plain, replica_id: "node_a")

# After editing, convert back to plain data for storage
data = Grove.to_data(tree)

# Round-trip guarantee when all nodes have explicit ids
^plain = Grove.to_data(Grove.from_data(plain, replica_id: "a"))

Replica IDs

Every node in a Grove cluster needs a unique replica id. It is used to generate globally unique node identifiers and to resolve concurrent operations deterministically.

tree = Tree.new(System.get_env("NODE_ID") || "node_1")

Node Structure

Each node in the tree has:

FieldDescription
idUnique identifier (string)
typeNode type (atom)
attrsNode attributes (map)
parent_idParent node reference
childrenOrdered list of child node ids
deleted?Tombstone flag

Operations & Sync

Grove is operation-based: each mutation records an operation and tracks it in the tree's pending_ops. To synchronize, you broadcast those operations to other replicas and apply incoming events with Grove.Tree.apply_remote/2. The Phoenix LiveView integration wires this up for you (broadcast on change, apply on receive), so most applications never touch the low-level event plumbing directly.

Batch Operations

Group multiple operations into a single atomic unit for undo and broadcast:

# Add a composite "address" group as one undoable action.
# Tree.batch/3 returns {updated_tree, delta}; the delta is what you broadcast.
{tree, _delta} =
  Tree.batch(tree, fn t ->
    addr   = Node.new("addr", :group, parent_id: "form_1", attrs: %{label: "Address"})
    street = Node.new("street", :text, parent_id: "addr", attrs: %{label: "Street"})
    city   = Node.new("city", :text, parent_id: "addr", attrs: %{label: "City"})

    t
    |> Tree.put_node(addr)
    |> Tree.put_node(street)
    |> Tree.put_node(city)
  end)

# A single undo reverts the whole batch
{:ok, tree} = Tree.undo(tree)

Operation Metadata

Attach context to operations for audit trails and version history:

tree =
  Tree.update_node(
    tree,
    "field_1",
    fn node -> Node.update_attrs(node, %{label: "Full name"}) end,
    meta: %{user_id: "u1", user_name: "Alice"}
  )

# Read history and pull metadata back out
tree
|> Tree.history(where: [user_id: "u1"])
|> Enum.map(fn entry ->
  meta = Tree.operation_meta(entry.operation)
  "#{meta[:user_name]} edited #{Enum.join(entry.node_ids, ", ")}"
end)

Query API

Find and traverse nodes in the tree:

# Find by id
Tree.get_node(tree, "field_1")

# Find by type
Tree.find_nodes(tree, type: :text)

# Find by attribute
Tree.find_nodes(tree, where: [required: true])

# Combine filters
Tree.find_nodes(tree, type: :text, where: [required: true])

# Path to a node (for breadcrumbs)
Tree.path_to(tree, "field_1")   # => ["form_1", "addr", "field_1"]

# Traversal
Tree.root(tree)
Tree.parent(tree, "field_1")
Tree.children(tree, "addr")
Tree.siblings(tree, "field_1")
Tree.ancestors(tree, "field_1")

Move Operation Semantics

The move operation is the most complex part of tree CRDTs. Grove implements the algorithm from "A highly-available move operation for replicated trees" (Kleppmann et al., 2021) to handle concurrent moves safely.

Cycle Prevention

When two replicas concurrently move nodes in conflicting ways, Grove guarantees no cycle is ever created:

# Replica A: move X under Y
tree_a = Tree.move_node(tree_a, "X", "Y")

# Replica B: move Y under X (concurrent)
tree_b = Tree.move_node(tree_b, "Y", "X")

# After sync, one move "wins" deterministically (by Lamport/HLC timestamp);
# the losing move is discarded so no cycle forms.

Undo / Redo

Grove maintains per-tree undo/redo stacks. Every mutation pushes its inverse onto the undo stack; a new operation clears the redo stack.

{:ok, tree} = Tree.undo(tree)   # or {:error, :empty_stack}
{:ok, tree} = Tree.redo(tree)   # or {:error, :empty_stack}

Tree.can_undo?(tree)   # => boolean
Tree.can_redo?(tree)   # => boolean

Undo/redo is informed by "A Generic Undo Support for State-Based CRDTs" (Yu et al., 2019).

Schema DSL

Define node types with per-field CRDT semantics. Each field can use a different CRDT type, and schemas support versioned migrations.

defmodule MyApp.FormSchema do
  use Grove.Schema

  version 1

  node :dropdown do
    field :label, :lww_register, default: "Untitled"
    field :options, :rga_list, default: []      # ordered list, concurrent inserts merge
    field :tags, :or_set, default: MapSet.new()  # set, adds/removes merge
  end

  node :text_field do
    field :label, :lww_register, default: ""
    field :placeholder, :lww_register, default: ""
  end
end

# Create a node whose attrs are backed by per-field CRDTs
node = MyApp.FormSchema.new_node(:dropdown, "node_1", %{label: "Country"})

# Update a single field (CRDT-aware)
node = MyApp.FormSchema.update_field(node, :label, "Region", "node_1")

# Introspection
MyApp.FormSchema.node_types()        # => [:dropdown, :text_field]
MyApp.FormSchema.fields(:dropdown)   # => field definitions

Schema migrations:

defmodule MyApp.FormSchemaV2 do
  use Grove.Schema

  version 2

  node :dropdown do
    field :label, :lww_register
    field :required, :lww_register, default: false  # new in v2
  end

  migrate 1 do
    add_field :dropdown, :required, :lww_register, default: false
  end
end

Field CRDT types:

TypeMerge BehaviorUse Case
:lww_registerLast writer winsSimple values (default)
:mv_registerKeep all concurrent valuesExpose conflicts to the user
:or_setAdd-wins setTags, categories
:rga_listInterleave concurrent insertsOrdered options
:counterSum of incrementsCounts, votes

Phoenix LiveView Integration

Grove provides first-class LiveView integration: mount_tree/3 subscribes the socket to a document topic, apply_and_broadcast/2 applies a local change and broadcasts the delta, and apply_remote/2 applies an incoming delta.

defmodule MyAppWeb.EditorLive do
  use MyAppWeb, :live_view

  def mount(%{"id" => doc_id}, _session, socket) do
    socket =
      Grove.LiveView.mount_tree(socket, doc_id,
        replica_id: socket.id,
        pubsub: MyApp.PubSub
      )

    {:ok, socket}
  end

  def handle_event("update_field", %{"node_id" => node_id, "value" => value}, socket) do
    socket =
      Grove.LiveView.apply_and_broadcast(socket, fn tree ->
        updated =
          Grove.Tree.update_node(tree, node_id, fn node ->
            Grove.Node.update_attrs(node, %{value: value})
          end)

        delta = {:update_attrs, node_id, %{value: value}}
        {updated, delta}
      end)

    {:noreply, socket}
  end

  # Incoming changes from other replicas
  def handle_info({:grove_delta, delta, from_pid}, socket) when from_pid != self() do
    {:noreply, Grove.LiveView.apply_remote(socket, delta)}
  end
end

The current tree is available with Grove.LiveView.get_tree/1, and local-only updates can be made with Grove.LiveView.update_tree/2.

Presence Tracking

When the optional phoenix_pubsub/phoenix_live_view dependencies are present, Grove tracks per-replica cursors, selections, and focus (backed by Phoenix.Tracker). The helpers operate on the socket:

socket = Grove.set_cursor(socket, "node_123", 4)
socket = Grove.set_selection(socket, "node_123", "node_124")
socket = Grove.set_focus(socket, "node_123")

Grove.get_presences(socket)     # all replicas' presence (with auto-assigned colors)
Grove.get_my_presence(socket)   # this replica's presence

Session Management

Grove.Session runs a document as a supervised process that owns the tree, fans out changes to subscribers, and persists periodically:

{:ok, session} = Grove.Session.get_or_start("doc_123", replica_id: "node_1")

Grove.Session.join(session, self())

# Apply a mutation to the shared tree
Grove.Session.mutate(session, fn tree ->
  Grove.Tree.put_node(tree, Grove.Node.new("n1", :text))
end)

Grove.Session.get_state(session)   # current Grove.Tree
Grove.Session.info(session)        # %{replicas: ..., last_activity: ..., ...}

Grove.Session.leave(session, self())

Session lifecycle:

  • Grace period — the session stays alive for a short window after the last replica leaves
  • Periodic persistence — auto-saves on a timer when the document is dirty

Distributed Erlang Integration

Grove starts a :pg-based membership process (Grove.Cluster.Membership) in its supervision tree, and an optional Grove.Replication.Server provides periodic delta sync between replicas over :pg or Phoenix PubSub. This lets Grove cluster natively across BEAM nodes without any external coordination service.

Persistence

Grove ships pluggable storage adapters and a document-storage strategy that combines snapshots with an event log:

# Snapshot the current tree and restore it later.
# create_snapshot/1 succeeds at a "critical" version, returning {:ok, snapshot}.
{:ok, snapshot} = Grove.Tree.create_snapshot(tree)
tree = Grove.Tree.from_snapshot(snapshot, "node_1")

Serialization uses a compact columnar encoding (RLE + zlib) for tree events (Grove.Tree.EventSerializer).

Performance

Grove is optimized for real-time collaborative editing:

OperationTime ComplexitySpace Complexity
InsertO(1) amortizedO(1)
MoveO(log n)O(1)
Apply OpO(1) amortizedO(1)
Find NodeO(1)
TraverseO(n)O(depth)

Benchmarks

Date: 2026-06-14
Elixir: 1.20.1, OTP: 29

Scenario                          ips        Average    Memory
Sequential (fast-path)            918        1.09 ms    1.36 MB
Collaborative (mixed)             781        1.28 ms    1.68 MB
High-concurrency (slow-path)      834        1.20 ms    1.51 MB

Local Operations                  ips        Average
get_node                        9.59 M       104 ns
put_node                        2.08 M       481 ns
update_node                     1.11 M       900 ns

Note: The collaborative and high-concurrency scenarios generate their workload with an unseeded Enum.shuffle, so those rows are not reproducible run-to-run — treat them as indicative, not exact. The fast-path/local/scalability figures are deterministic.

Key insight: For sequential editing (the common case), Eg-walker applies operations with near-zero CRDT overhead. The fast-path/slow-path gap grows with the degree of genuine concurrency in the workload.

See the benchmarks guide for detailed analysis.

Eg-walker Optimization

Grove implements the Eg-walker algorithm (EuroSys 2025 Best Paper) which optimizes for the common case: sequential editing.

  • Fast-path: when edits are sequential (one user or turn-taking), operations apply directly with near-zero CRDT overhead
  • Slow-path: when true concurrency occurs, a full CRDT merge ensures correctness

Most collaborative editing is sequential, so this optimization provides significant real-world performance benefits.

Testing

Grove.Testing provides utilities for verifying CRDT convergence:

defmodule MyApp.CrdtTest do
  use ExUnit.Case
  import Grove.Testing

  test "concurrent edits converge" do
    # Create N isolated replicas of a CRDT module
    [a, b] = create_replicas(Grove.Set.ORSet, 2)

    # Make concurrent edits on each replica
    a = Grove.Set.ORSet.add(a, "x")
    b = Grove.Set.ORSet.add(b, "y")

    # Sync all replicas and assert convergence
    [a, b] = sync_all([a, b])
    assert_convergent([a, b])
  end
end
FunctionDescription
create_replicas/3Create N isolated replicas of a CRDT module
sync/2Sync two replicas bidirectionally
sync_all/1Sync all replicas in a list
assert_convergent/1Assert all replicas have the same value
replica_values/1Materialize each replica's value
mutate_replica/3Apply a function to one replica by index
random_operations/3Generate random operations for fuzzing

Roadmap

The following are planned but not yet implemented. See guides/roadmap.md for the full list.

  • A public node-delete operation and tombstone garbage collection for tree nodes (today, structural GC exists only for OR-Set/OR-Map metadata via Grove.GC.Structural)
  • Selective undo of a specific past operation (only stack-based undo/redo exists today)
  • Elixir AST conversion helpers (from_elixir_ast / to_elixir_ast)
  • JSON serialization and an Ecto type for storing trees
  • Gossip / anti-entropy replication and version-vector full-state sync for new replicas

Research Background

Grove is based on peer-reviewed research in distributed systems and CRDTs:

Core Papers

Foundational CRDT Research

Undo/Redo

See the architecture guide for implementation details.

Comparison with Other Libraries

FeatureGroveAutomergeYjsLoro
LanguageElixirRust/JSJSRust/JS
Tree CRDT✅ First-class✅ JSON✅ XML✅ Movable tree
Move operation❌ (research only)❌ (removed)
Eg-walker optimization✅ (event-graph replay)
Undo/redo✅ local✅ local
Rich text
BEAM native
Phoenix integration

Contributing

Contributions are welcome!

# Clone the repository
git clone https://gitlab.com/tristanperalta/grove.git
cd grove

# Install dependencies
mix deps.get

# Run tests
mix test

# Run benchmarks
mix run benchmarks/eg_walker_bench.exs

License

Grove is released under the MIT License. See LICENSE for details.