# Grove Architecture

A conflict-free replicated tree (CRDT) library for Elixir, designed for collaborative editing of hierarchical data structures like ASTs, documents, and form builders.

## Design Philosophy

1. **Tree-First**: Purpose-built for hierarchical data, not a generic CRDT library
2. **Operation-Based**: Efficient sync via operations, not full-state transfer
3. **Move-Safe**: Concurrent moves never create cycles
4. **BEAM-Native**: Leverage processes, ETS, `:pg`, and OTP patterns
5. **Phoenix-Ready**: First-class LiveView integration for real-time collaboration

---

## Core Data Model

### Tree Structure

Grove represents trees as a collection of nodes with parent-child relationships:

```
                    ┌──────────┐
                    │   root   │
                    └────┬─────┘
           ┌─────────────┼─────────────┐
      ┌────▼────┐   ┌────▼────┐   ┌────▼────┐
      │ node_1  │   │ node_2  │   │ node_3  │
      └────┬────┘   └─────────┘   └────┬────┘
      ┌────▼────┐                 ┌────▼────┐
      │ node_1a │                 │ node_3a │
      └─────────┘                 └─────────┘
```

### Node Structure

Each node contains:

| Field | Type | Description |
|-------|------|-------------|
| `id` | `String.t` | Unique identifier (`{replica_id}_{lamport_timestamp}`) |
| `type` | `atom` | Node type (e.g., `:function`, `:param`, `:section`) |
| `attrs` | `map` | Node attributes (LWW-Register semantics per field) |
| `parent_id` | `String.t \| nil` | Parent node reference |
| `children` | `[String.t]` | Ordered list of child node IDs |
| `deleted?` | `boolean` | Tombstone flag |
| `meta` | `map` | Operation metadata (user_id, timestamp, etc.) |

```elixir
%Grove.Node{
  id: "replica_1_42",
  type: :dropdown,
  attrs: %{label: "Country", required: true},
  parent_id: "replica_1_12",
  children: ["replica_1_43", "replica_1_44"],
  deleted?: false,
  meta: %{updated_by: "user_123", updated_at: ~U[2024-01-15 10:30:00Z]}
}
```

### Tree State

The tree maintains:

```elixir
%Grove.Tree{
  replica_id: "replica_1",
  nodes: %{id => Node.t},           # All nodes by ID
  root_id: "root",                  # Root node ID
  clock: 42,                        # Lamport clock
  pending_ops: [],                  # Operations not yet flushed
  undo_stack: [],                   # For undo support
  redo_stack: []                    # For redo support
}
```

---

## Operations

Grove uses operation-based CRDTs. Each mutation generates an operation that can be broadcast to other replicas.

### Operation Types

| Operation | Description | Conflict Resolution |
|-----------|-------------|---------------------|
| `Insert` | Add new node | Unique IDs prevent conflicts |
| `Delete` | Mark node as tombstone | Add-wins (concurrent insert + delete → node exists) |
| `Update` | Modify node attributes | LWW per attribute field |
| `Move` | Change node's parent | Lamport timestamp tiebreaker, cycle prevention |

### Operation Structure

```elixir
%Grove.Op.Insert{
  id: "replica_1_43",
  type: :option,
  attrs: %{value: "US", label: "United States"},
  parent_id: "replica_1_42",
  position: 0,
  meta: %{user_id: "user_123"}
}

%Grove.Op.Move{
  id: "replica_1_43",
  new_parent_id: "replica_1_50",
  position: 2,
  old_parent_id: "replica_1_42",   # For undo
  old_position: 0,                  # For undo
  timestamp: 156                    # Lamport timestamp for conflict resolution
}

%Grove.Op.Update{
  id: "replica_1_42",
  attrs: %{label: "Select Country"},
  old_attrs: %{label: "Country"},   # For undo
  timestamp: 157
}

%Grove.Op.Delete{
  id: "replica_1_43",
  timestamp: 158
}

%Grove.Op.Batch{
  id: "batch_replica_1_159",
  ops: [op1, op2, op3],             # Atomic group
  timestamp: 159
}
```

---

## Move Operation Semantics

The move operation is the most complex part. Grove implements the algorithm from Kleppmann et al. (2021).

### Cycle Prevention

When concurrent moves would create a cycle, one move is rejected based on Lamport timestamps:

```
Before:          Concurrent moves:           After sync:
    A                A: move B→C               A
    │                B: move C→B               │
    B                                          B
    │                                          │
    C                                          C
                                         (higher timestamp wins)
```

### Move vs Delete Conflicts

When a node is moved to a destination that's concurrently deleted:

```elixir
# Replica A: move X under Y
# Replica B: delete Y (concurrent)

# Result: X is moved to Y's former parent (orphan rescue)
```

### Implementation

```elixir
defmodule Grove.Op.Move do
  def apply(tree, %__MODULE__{} = op) do
    cond do
      # Target deleted? Rescue to its former parent
      deleted?(tree, op.new_parent_id) ->
        rescue_to_ancestor(tree, op)

      # Would create cycle?
      creates_cycle?(tree, op.id, op.new_parent_id) ->
        # Compare timestamps, lower timestamp wins (its move is undone)
        resolve_cycle_conflict(tree, op)

      true ->
        do_move(tree, op)
    end
  end

  defp creates_cycle?(tree, node_id, new_parent_id) do
    # Check if new_parent_id is a descendant of node_id
    ancestors = get_ancestors(tree, new_parent_id)
    node_id in ancestors
  end
end
```

---

## Plain Data Conversion

Grove separates CRDT state from application data. You store plain JSON/maps and only use Grove for real-time sync.

### Converting To/From Plain Data

```elixir
# Load plain data, convert to CRDT tree
plain_ast = load_from_database(doc_id)
tree = Grove.from_data(plain_ast, replica_id: socket.id)

# After editing, convert back to plain data for storage
plain_ast = Grove.to_data(tree)
save_to_database(doc_id, plain_ast)

# Round-trip guarantee
assert Grove.to_data(Grove.from_data(data, replica_id: "a")) == data
```

### What Gets Stripped

| Preserved | Stripped |
|-----------|----------|
| Node IDs | Vector clocks |
| Tree structure | Tombstones |
| Attribute values | Pending operations |
| Node types | Undo/redo stacks |
| Child ordering | Replica metadata |

---

## Batch Operations

Composite operations that should be treated as a single unit for undo and broadcast.

```elixir
alias Grove.{Tree, Node}

# Adding an address field (4 nodes) as one atomic operation.
# Tree.batch/3 returns {updated_tree, compound_delta}.
{tree, _delta} =
  Tree.batch(tree, fn t ->
    addr   = Node.new("addr", :group, parent_id: "form", attrs: %{label: "Address"})
    street = Node.new("street", :text, parent_id: "addr", attrs: %{label: "Street"})
    city   = Node.new("city", :text, parent_id: "addr", attrs: %{label: "City"})
    zip    = Node.new("zip", :text, parent_id: "addr", attrs: %{label: "ZIP"})

    t
    |> Tree.put_node(addr)
    |> Tree.put_node(street)
    |> Tree.put_node(city)
    |> Tree.put_node(zip)
  end)

# The batch is recorded as a single compound operation on the history/undo stack,
# so a single undo reverts all four inserts.
{:ok, tree} = Tree.undo(tree)
```

---

## Attribute-Level CRDT Types

Different fields can use different conflict resolution strategies:

```elixir
defmodule MyApp.FormSchema do
  use Grove.Schema

  node :dropdown do
    field :label, :lww_register       # Last-writer-wins (default)
    field :options, :rga_list         # Ordered list, concurrent inserts merge
    field :tags, :or_set              # Unordered set, add-wins
    field :config, :mv_register       # Multi-value, expose conflicts
  end
end

# Merge behavior per field type:
# - :lww_register → Higher timestamp wins
# - :rga_list → Interleaved merge preserving all items
# - :or_set → Union of all added items
# - :mv_register → Returns list of concurrent values
```

---

## Module Structure

```
grove/
├── lib/
│   ├── grove.ex                      # Main API facade
│   └── grove/
│       ├── tree.ex                   # Tree state and core operations
│       ├── node.ex                   # Node struct
│       │
│       ├── op/                       # Operations
│       │   ├── insert.ex
│       │   ├── delete.ex
│       │   ├── update.ex
│       │   ├── move.ex
│       │   └── batch.ex
│       │
│       ├── conflict/                 # Conflict resolution
│       │   ├── move_resolver.ex      # Cycle prevention
│       │   ├── lww.ex                # Last-write-wins
│       │   └── mv.ex                 # Multi-value
│       │
│       ├── query/                    # Tree queries
│       │   ├── find.ex               # find_nodes, get_node
│       │   ├── traverse.ex           # path_to, ancestors, descendants
│       │   └── filter.ex             # Query DSL
│       │
│       ├── undo/                     # Undo/redo support
│       │   ├── stack.ex
│       │   └── inverse.ex            # Generate inverse operations
│       │
│       ├── sync/                     # Synchronization
│       │   ├── operations.ex         # Operation encoding/decoding
│       │   └── merge.ex              # Apply remote operations
│       │
│       ├── data/                     # Plain data conversion
│       │   ├── from_data.ex
│       │   └── to_data.ex
│       │
│       ├── schema/                   # Schema DSL
│       │   ├── dsl.ex
│       │   └── types.ex              # Field type registry
│       │
│       ├── session/                  # Document sessions
│       │   ├── manager.ex            # GenServer per document
│       │   ├── registry.ex           # Session lookup
│       │   └── presence.ex           # Cursor/selection tracking
│       │
│       ├── live_view/                # Phoenix integration
│       │   ├── hooks.ex              # on_mount hooks
│       │   └── helpers.ex            # mount_tree, apply_and_broadcast
│       │
│       ├── gc/                       # Garbage collection
│       │   └── tombstone.ex
│       │
│       └── testing/                  # Test utilities
│           ├── generators.ex         # StreamData generators
│           └── helpers.ex            # sync, assert_convergent
│
└── test/
    └── properties/
        ├── convergence_test.exs
        ├── move_test.exs
        └── cycle_test.exs
```

---

## Session Management

Each document gets a managed session that handles replication and lifecycle.

```elixir
defmodule Grove.Session.Manager do
  use GenServer

  defstruct [
    :doc_id,
    :tree,
    :replicas,            # Connected LiveView PIDs
    :last_activity,
    :persist_timer,
    :grace_timer
  ]

  # Lifecycle
  def join(doc_id, replica_pid)   # Register replica, return current tree
  def leave(doc_id, replica_pid)  # Unregister replica
  def info(doc_id)                # Get session info

  # Operations
  def apply_op(doc_id, op)        # Apply and broadcast
  def get_state(doc_id)           # Get current tree

  # Automatic behaviors:
  # - Grace period before shutdown when no replicas
  # - Periodic persistence
  # - Operation buffering for debouncing
end
```

---

## LiveView Integration

### Mount Hooks

```elixir
defmodule Grove.LiveView do
  def on_mount(:subscribe, _params, _session, socket) do
    if connected?(socket) do
      topic = "grove:doc:#{socket.assigns.doc_id}"
      Phoenix.PubSub.subscribe(Grove.PubSub, topic)
      Grove.Session.Manager.join(socket.assigns.doc_id, self())
    end
    {:cont, socket}
  end
end
```

### Usage

```elixir
defmodule MyAppWeb.EditorLive do
  use Phoenix.LiveView

  def mount(%{"id" => id}, _session, socket) do
    socket = Grove.LiveView.mount_tree(socket, id, replica_id: socket.id, pubsub: MyApp.PubSub)
    {:ok, socket}
  end

  def handle_event("add_field", %{"id" => node_id}, socket) do
    socket =
      Grove.LiveView.apply_and_broadcast(socket, fn tree ->
        node = Grove.Node.new(node_id, :text)
        {Grove.Tree.put_node(tree, node), {:put_node, node_id, node}}
      end)

    {:noreply, socket}
  end

  def handle_info({:grove_delta, delta, from_pid}, socket) when from_pid != self() do
    {:noreply, Grove.LiveView.apply_remote(socket, delta)}
  end
end
```

---

## Presence Tracking

Built-in cursor and selection tracking for collaborative UIs:

When the optional Phoenix dependencies are present, presence is tracked per-replica on the LiveView socket (backed by `Phoenix.Tracker`):

```elixir
# Track focus / selection (operate on the socket)
socket = Grove.set_focus(socket, "node_123")
socket = Grove.set_selection(socket, "node_123", "node_124")

# Get all replica presence (with auto-assigned colors)
Grove.get_presences(socket)
# => %{
#   "replica_a" => %{focus: "node_123", selection: [], color: "#4CAF50"},
#   "replica_b" => %{focus: "node_456", selection: ["node_456"], color: "#2196F3"}
# }
```

---

## Query API

Find and traverse nodes in the tree:

```elixir
alias Grove.Tree

# Find by ID
Tree.get_node(tree, "node_123")

# Find by type
Tree.find_nodes(tree, type: :dropdown)

# Find by attribute
Tree.find_nodes(tree, where: [required: true])

# Traversal
Tree.parent(tree, "node_123")
Tree.children(tree, "node_123")
Tree.siblings(tree, "node_123")
Tree.ancestors(tree, "node_123")

# Path (for breadcrumbs)
Tree.path_to(tree, "node_123")
# => ["root", "section_1", "field_group", "node_123"]
```

---

## Garbage Collection

Structural GC of OR-Set/OR-Map metadata is available today via `Grove.GC.Structural`. Causal-stability and time-based **tombstone GC for tree nodes is planned** (see [roadmap](roadmap.md)); the intended design is:

### Planned GC Strategy

1. **Track observation**: Each replica tracks its vector clock
2. **Exchange clocks**: Periodically gossip vector clocks via `:pg`
3. **Compute stability**: Find operations observed by ALL replicas
4. **Prune**: Remove tombstones in the stable causal past

---

## Testing Utilities

`Grove.Testing` provides utilities for verifying CRDT convergence:

```elixir
defmodule MyApp.CrdtTest do
  use ExUnit.Case
  import Grove.Testing

  test "concurrent edits converge" do
    # Create N isolated replicas of a CRDT module
    [a, b] = create_replicas(Grove.Set.ORSet, 2)

    # Make concurrent edits on each replica
    a = Grove.Set.ORSet.add(a, "x")
    b = Grove.Set.ORSet.add(b, "y")

    # Sync all replicas and assert convergence
    [a, b] = sync_all([a, b])
    assert_convergent([a, b])
  end
end
```

---

## Performance

| Operation | Time Complexity | Space Complexity |
|-----------|-----------------|------------------|
| `put_node` | O(1) amortized | O(1) |
| `move_node` | O(log n) | O(1) |
| `update_node` | O(1) | O(1) |
| `get_node` | O(1) | — |
| `find_nodes` | O(n) | O(matches) |
| `apply_remote` | O(k) | O(k) |
| `path_to` | O(depth) | O(depth) |

---

## Research Background

Grove implements algorithms from:

- **Move operations**: [A highly-available move operation for replicated trees](https://martin.kleppmann.com/papers/move-op.pdf) — Kleppmann et al., 2021
- **Tree CRDTs**: [A coordination-free, convergent, and safe replicated tree](https://arxiv.org/abs/2103.04828) — Nair et al., 2021
- **Undo/Redo**: [A Generic Undo Support for State-Based CRDTs](https://drops.dagstuhl.de/opus/volltexte/2020/11800/pdf/LIPIcs-OPODIS-2019-14.pdf) — Yu et al., 2019
