Grove Architecture

View Source

A conflict-free replicated tree (CRDT) library for Elixir, designed for collaborative editing of hierarchical data structures like ASTs, documents, and form builders.

Design Philosophy

  1. Tree-First: Purpose-built for hierarchical data, not a generic CRDT library
  2. Operation-Based: Efficient sync via operations, not full-state transfer
  3. Move-Safe: Concurrent moves never create cycles
  4. BEAM-Native: Leverage processes, ETS, :pg, and OTP patterns
  5. Phoenix-Ready: First-class LiveView integration for real-time collaboration

Core Data Model

Tree Structure

Grove represents trees as a collection of nodes with parent-child relationships:

                    
                       root   
                    
           
            
       node_1      node_2      node_3  
            
                       
       node_1a                   node_3a 
                       

Node Structure

Each node contains:

FieldTypeDescription
idString.tUnique identifier ({replica_id}_{lamport_timestamp})
typeatomNode type (e.g., :function, :param, :section)
attrsmapNode attributes (LWW-Register semantics per field)
parent_idString.t | nilParent node reference
children[String.t]Ordered list of child node IDs
deleted?booleanTombstone flag
metamapOperation metadata (user_id, timestamp, etc.)
%Grove.Node{
  id: "replica_1_42",
  type: :dropdown,
  attrs: %{label: "Country", required: true},
  parent_id: "replica_1_12",
  children: ["replica_1_43", "replica_1_44"],
  deleted?: false,
  meta: %{updated_by: "user_123", updated_at: ~U[2024-01-15 10:30:00Z]}
}

Tree State

The tree maintains:

%Grove.Tree{
  replica_id: "replica_1",
  nodes: %{id => Node.t},           # All nodes by ID
  root_id: "root",                  # Root node ID
  clock: 42,                        # Lamport clock
  pending_ops: [],                  # Operations not yet flushed
  undo_stack: [],                   # For undo support
  redo_stack: []                    # For redo support
}

Operations

Grove uses operation-based CRDTs. Each mutation generates an operation that can be broadcast to other replicas.

Operation Types

OperationDescriptionConflict Resolution
InsertAdd new nodeUnique IDs prevent conflicts
DeleteMark node as tombstoneAdd-wins (concurrent insert + delete → node exists)
UpdateModify node attributesLWW per attribute field
MoveChange node's parentLamport timestamp tiebreaker, cycle prevention

Operation Structure

%Grove.Op.Insert{
  id: "replica_1_43",
  type: :option,
  attrs: %{value: "US", label: "United States"},
  parent_id: "replica_1_42",
  position: 0,
  meta: %{user_id: "user_123"}
}

%Grove.Op.Move{
  id: "replica_1_43",
  new_parent_id: "replica_1_50",
  position: 2,
  old_parent_id: "replica_1_42",   # For undo
  old_position: 0,                  # For undo
  timestamp: 156                    # Lamport timestamp for conflict resolution
}

%Grove.Op.Update{
  id: "replica_1_42",
  attrs: %{label: "Select Country"},
  old_attrs: %{label: "Country"},   # For undo
  timestamp: 157
}

%Grove.Op.Delete{
  id: "replica_1_43",
  timestamp: 158
}

%Grove.Op.Batch{
  id: "batch_replica_1_159",
  ops: [op1, op2, op3],             # Atomic group
  timestamp: 159
}

Move Operation Semantics

The move operation is the most complex part. Grove implements the algorithm from Kleppmann et al. (2021).

Cycle Prevention

When concurrent moves would create a cycle, one move is rejected based on Lamport timestamps:

Before:          Concurrent moves:           After sync:
    A                A: move BC               A
                    B: move CB               
    B                                          B
                                              
    C                                          C
                                         (higher timestamp wins)

Move vs Delete Conflicts

When a node is moved to a destination that's concurrently deleted:

# Replica A: move X under Y
# Replica B: delete Y (concurrent)

# Result: X is moved to Y's former parent (orphan rescue)

Implementation

defmodule Grove.Op.Move do
  def apply(tree, %__MODULE__{} = op) do
    cond do
      # Target deleted? Rescue to its former parent
      deleted?(tree, op.new_parent_id) ->
        rescue_to_ancestor(tree, op)

      # Would create cycle?
      creates_cycle?(tree, op.id, op.new_parent_id) ->
        # Compare timestamps, lower timestamp wins (its move is undone)
        resolve_cycle_conflict(tree, op)

      true ->
        do_move(tree, op)
    end
  end

  defp creates_cycle?(tree, node_id, new_parent_id) do
    # Check if new_parent_id is a descendant of node_id
    ancestors = get_ancestors(tree, new_parent_id)
    node_id in ancestors
  end
end

Plain Data Conversion

Grove separates CRDT state from application data. You store plain JSON/maps and only use Grove for real-time sync.

Converting To/From Plain Data

# Load plain data, convert to CRDT tree
plain_ast = load_from_database(doc_id)
tree = Grove.from_data(plain_ast, replica_id: socket.id)

# After editing, convert back to plain data for storage
plain_ast = Grove.to_data(tree)
save_to_database(doc_id, plain_ast)

# Round-trip guarantee
assert Grove.to_data(Grove.from_data(data, replica_id: "a")) == data

What Gets Stripped

PreservedStripped
Node IDsVector clocks
Tree structureTombstones
Attribute valuesPending operations
Node typesUndo/redo stacks
Child orderingReplica metadata

Batch Operations

Composite operations that should be treated as a single unit for undo and broadcast.

alias Grove.{Tree, Node}

# Adding an address field (4 nodes) as one atomic operation.
# Tree.batch/3 returns {updated_tree, compound_delta}.
{tree, _delta} =
  Tree.batch(tree, fn t ->
    addr   = Node.new("addr", :group, parent_id: "form", attrs: %{label: "Address"})
    street = Node.new("street", :text, parent_id: "addr", attrs: %{label: "Street"})
    city   = Node.new("city", :text, parent_id: "addr", attrs: %{label: "City"})
    zip    = Node.new("zip", :text, parent_id: "addr", attrs: %{label: "ZIP"})

    t
    |> Tree.put_node(addr)
    |> Tree.put_node(street)
    |> Tree.put_node(city)
    |> Tree.put_node(zip)
  end)

# The batch is recorded as a single compound operation on the history/undo stack,
# so a single undo reverts all four inserts.
{:ok, tree} = Tree.undo(tree)

Attribute-Level CRDT Types

Different fields can use different conflict resolution strategies:

defmodule MyApp.FormSchema do
  use Grove.Schema

  node :dropdown do
    field :label, :lww_register       # Last-writer-wins (default)
    field :options, :rga_list         # Ordered list, concurrent inserts merge
    field :tags, :or_set              # Unordered set, add-wins
    field :config, :mv_register       # Multi-value, expose conflicts
  end
end

# Merge behavior per field type:
# - :lww_register → Higher timestamp wins
# - :rga_list → Interleaved merge preserving all items
# - :or_set → Union of all added items
# - :mv_register → Returns list of concurrent values

Module Structure

grove/
 lib/
    grove.ex                      # Main API facade
    grove/
        tree.ex                   # Tree state and core operations
        node.ex                   # Node struct
       
        op/                       # Operations
           insert.ex
           delete.ex
           update.ex
           move.ex
           batch.ex
       
        conflict/                 # Conflict resolution
           move_resolver.ex      # Cycle prevention
           lww.ex                # Last-write-wins
           mv.ex                 # Multi-value
       
        query/                    # Tree queries
           find.ex               # find_nodes, get_node
           traverse.ex           # path_to, ancestors, descendants
           filter.ex             # Query DSL
       
        undo/                     # Undo/redo support
           stack.ex
           inverse.ex            # Generate inverse operations
       
        sync/                     # Synchronization
           operations.ex         # Operation encoding/decoding
           merge.ex              # Apply remote operations
       
        data/                     # Plain data conversion
           from_data.ex
           to_data.ex
       
        schema/                   # Schema DSL
           dsl.ex
           types.ex              # Field type registry
       
        session/                  # Document sessions
           manager.ex            # GenServer per document
           registry.ex           # Session lookup
           presence.ex           # Cursor/selection tracking
       
        live_view/                # Phoenix integration
           hooks.ex              # on_mount hooks
           helpers.ex            # mount_tree, apply_and_broadcast
       
        gc/                       # Garbage collection
           tombstone.ex
       
        testing/                  # Test utilities
            generators.ex         # StreamData generators
            helpers.ex            # sync, assert_convergent

 test/
     properties/
         convergence_test.exs
         move_test.exs
         cycle_test.exs

Session Management

Each document gets a managed session that handles replication and lifecycle.

defmodule Grove.Session.Manager do
  use GenServer

  defstruct [
    :doc_id,
    :tree,
    :replicas,            # Connected LiveView PIDs
    :last_activity,
    :persist_timer,
    :grace_timer
  ]

  # Lifecycle
  def join(doc_id, replica_pid)   # Register replica, return current tree
  def leave(doc_id, replica_pid)  # Unregister replica
  def info(doc_id)                # Get session info

  # Operations
  def apply_op(doc_id, op)        # Apply and broadcast
  def get_state(doc_id)           # Get current tree

  # Automatic behaviors:
  # - Grace period before shutdown when no replicas
  # - Periodic persistence
  # - Operation buffering for debouncing
end

LiveView Integration

Mount Hooks

defmodule Grove.LiveView do
  def on_mount(:subscribe, _params, _session, socket) do
    if connected?(socket) do
      topic = "grove:doc:#{socket.assigns.doc_id}"
      Phoenix.PubSub.subscribe(Grove.PubSub, topic)
      Grove.Session.Manager.join(socket.assigns.doc_id, self())
    end
    {:cont, socket}
  end
end

Usage

defmodule MyAppWeb.EditorLive do
  use Phoenix.LiveView

  def mount(%{"id" => id}, _session, socket) do
    socket = Grove.LiveView.mount_tree(socket, id, replica_id: socket.id, pubsub: MyApp.PubSub)
    {:ok, socket}
  end

  def handle_event("add_field", %{"id" => node_id}, socket) do
    socket =
      Grove.LiveView.apply_and_broadcast(socket, fn tree ->
        node = Grove.Node.new(node_id, :text)
        {Grove.Tree.put_node(tree, node), {:put_node, node_id, node}}
      end)

    {:noreply, socket}
  end

  def handle_info({:grove_delta, delta, from_pid}, socket) when from_pid != self() do
    {:noreply, Grove.LiveView.apply_remote(socket, delta)}
  end
end

Presence Tracking

Built-in cursor and selection tracking for collaborative UIs:

When the optional Phoenix dependencies are present, presence is tracked per-replica on the LiveView socket (backed by Phoenix.Tracker):

# Track focus / selection (operate on the socket)
socket = Grove.set_focus(socket, "node_123")
socket = Grove.set_selection(socket, "node_123", "node_124")

# Get all replica presence (with auto-assigned colors)
Grove.get_presences(socket)
# => %{
#   "replica_a" => %{focus: "node_123", selection: [], color: "#4CAF50"},
#   "replica_b" => %{focus: "node_456", selection: ["node_456"], color: "#2196F3"}
# }

Query API

Find and traverse nodes in the tree:

alias Grove.Tree

# Find by ID
Tree.get_node(tree, "node_123")

# Find by type
Tree.find_nodes(tree, type: :dropdown)

# Find by attribute
Tree.find_nodes(tree, where: [required: true])

# Traversal
Tree.parent(tree, "node_123")
Tree.children(tree, "node_123")
Tree.siblings(tree, "node_123")
Tree.ancestors(tree, "node_123")

# Path (for breadcrumbs)
Tree.path_to(tree, "node_123")
# => ["root", "section_1", "field_group", "node_123"]

Garbage Collection

Structural GC of OR-Set/OR-Map metadata is available today via Grove.GC.Structural. Causal-stability and time-based tombstone GC for tree nodes is planned (see roadmap); the intended design is:

Planned GC Strategy

  1. Track observation: Each replica tracks its vector clock
  2. Exchange clocks: Periodically gossip vector clocks via :pg
  3. Compute stability: Find operations observed by ALL replicas
  4. Prune: Remove tombstones in the stable causal past

Testing Utilities

Grove.Testing provides utilities for verifying CRDT convergence:

defmodule MyApp.CrdtTest do
  use ExUnit.Case
  import Grove.Testing

  test "concurrent edits converge" do
    # Create N isolated replicas of a CRDT module
    [a, b] = create_replicas(Grove.Set.ORSet, 2)

    # Make concurrent edits on each replica
    a = Grove.Set.ORSet.add(a, "x")
    b = Grove.Set.ORSet.add(b, "y")

    # Sync all replicas and assert convergence
    [a, b] = sync_all([a, b])
    assert_convergent([a, b])
  end
end

Performance

OperationTime ComplexitySpace Complexity
put_nodeO(1) amortizedO(1)
move_nodeO(log n)O(1)
update_nodeO(1)O(1)
get_nodeO(1)
find_nodesO(n)O(matches)
apply_remoteO(k)O(k)
path_toO(depth)O(depth)

Research Background

Grove implements algorithms from: