Performance Benchmarks
View SourceGrove includes a comprehensive benchmark suite for measuring Eg-walker CRDT performance.
Running Benchmarks
# Run full benchmark suite
mix run benchmarks/eg_walker_bench.exs
# Run with production optimizations
MIX_ENV=prod mix run benchmarks/eg_walker_bench.exs
Benchmark Scenarios
| Scenario | Description | Tests |
|---|---|---|
| Sequential | Single replica, 1000 ops | Fast-path optimization |
| Collaborative | 5 replicas × 200 ops, 20% reorder | Mixed fast/slow path |
| High Concurrency | 10 replicas × 100 concurrent ops | Slow-path merging |
| Large Document | 100-10k nodes + 100 ops | Scalability |
| Move-heavy | 500-1k nodes, 500-1k moves | Operation log performance |
Current Results
Date: 2026-06-14
Elixir: 1.20.1
OTP: 29Reproducibility note: The collaborative and high-concurrency scenarios build their workloads with an unseeded
Enum.shuffle, so those rows vary run-to-run and should be read as indicative rather than exact. In this run the shuffled concurrent workload happened to stay light, so the fast-path/slow-path gap is much smaller than the algorithm's worst case. The deterministic suites (sequential, local operations, scalability) are the reliable signal — they show a consistent ~1.5–1.8x throughput gain from the Elixir 1.18→1.20 / OTP 25→29 upgrade with no code changes.
Eg-walker vs Traditional CRDT
Direct comparison using the same 1000 sequential operations:
| Implementation | ips | Average | Memory |
|---|---|---|---|
| Eg-walker (fast-path) | 956.71 | 1.05 ms | 1.36 MB |
| Traditional CRDT (slow-path) | 987.87 | 1.01 ms | 1.37 MB |
For pure sequential put_node operations: Only ~3% difference.
Why? Simple put_node operations don't have much merge overhead. The real benefit appears with:
- Concurrent operations requiring merge
- Move operations with undo-do-redo
- Long histories requiring traversal
See the high-concurrency benchmark below for the real difference.
apply_remote Performance
| Scenario | ips | Average | Median | 99th % | Memory |
|---|---|---|---|---|---|
| Sequential (fast-path) | 918 | 1.09 ms | 1.05 ms | 2.17 ms | 1.36 MB |
| Collaborative (mixed) | 781 | 1.28 ms | 1.24 ms | 2.27 ms | 1.68 MB |
| High-concurrency (slow-path) | 834 | 1.20 ms | 1.14 ms | 2.40 ms | 1.51 MB |
Key insight: The fast-path applies sequential edits with near-zero CRDT overhead. The fast-path/slow-path gap scales with how much genuine concurrency the workload contains; see the reproducibility note above for why the concurrent rows in this run stayed close to the fast-path.
The real Eg-walker benefit:
- Moving from sequential to genuinely concurrent workloads costs throughput and memory in proportion to the concurrency (the slow-path must build transient merge state and replay via undo-do-redo)
- This is where Eg-walker shines: sequential editing has near-zero CRDT overhead
Local Operations
| Operation | ips | Average | Memory |
|---|---|---|---|
| get_node | 9.59 M | 104 ns | 0 B |
| put_node | 2.08 M | 481 ns | 896 B |
| update_node | 1.11 M | 900 ns | 2960 B |
Scalability (Document Size)
| Document Size | ips | Average | Memory |
|---|---|---|---|
| 100 nodes + 100 ops | 9.22 K | 108 μs | 128 KB |
| 1k nodes + 100 ops | 10.20 K | 98 μs | 138 KB |
| 10k nodes + 100 ops | 9.59 K | 104 μs | 150 KB |
Key insight: Document size has minimal impact on operation time (O(1) node access via Map)
Fast-path vs Slow-path
The Eg-walker implementation optimizes for sequential editing (the common case):
Fast-path (Sequential Events)
- When: Events extend the frontier linearly (single user or turn-taking)
- Performance: 918 ips, 1.36 MB memory
- Overhead: Near-zero CRDT overhead
- How it works: Direct application, no transformation needed
Slow-path (Concurrent Events)
- When: Multiple replicas with divergent branches
- Overhead: Full CRDT merge with undo-do-redo
- How it works: Builds transient merge state, replays operations
- Cost: Throughput and memory degrade in proportion to the concurrency in the workload (see the reproducibility note above — the shuffled concurrent benchmark is not a fixed workload, so its absolute numbers vary per run)
Key Takeaway
For the same sequential workload:
- Eg-walker fast-path: 956 ips
- Traditional CRDT (forced slow-path): 987 ips
- Difference: Only ~3%
The benefit of the fast-path grows with the degree of genuine concurrency: a purely sequential workload pays almost no CRDT overhead, while a heavily concurrent one must build transient merge state and replay operations.
Optimization History
event_index Optimization (2026-01-02)
Added persistent event_index field to Tree struct for O(1) event lookups.
| Metric | Before | After | Improvement |
|---|---|---|---|
| Sequential throughput | 604 ips | 605 ips | +0.2% |
| Collaborative throughput | 122.5 ips | 140.5 ips | +14.7% |
| Slow-path throughput | 55.5 ips | 57.4 ips | +3.4% |
| Memory (collaborative) | 5.91 MB | 5.64 MB | -4.6% |
Implementation:
event_indexfield in Tree struct (lazy construction)ensure_index/1builds index from history when neededget_event/2for O(1) lookup by event ID- Incremental updates in
record_history/3andrecord_remote_event/2
Metrics Explained
- ips: Iterations per second (higher is better)
- Average: Mean time per operation
- Median: 50th percentile latency
- 99th %: Tail latency (important for UI responsiveness)
- Memory: Total bytes allocated per benchmark run
Future Optimizations
Remaining targets identified:
find_insertion_point- O(n) list scan → binary searchbuild_children_mapin EventGraph - O(n) rebuild → cache- Pass event_index to EventGraph functions to avoid rebuilding