# RustyJson Benchmarks

Benchmarks comparing RustyJson with Jason on synthetic and real-world datasets.

## Key Findings

1. **Encoding is where RustyJson shines** - 3-6x faster, 2-3x less memory
2. **Decoding is faster** (2-3x) but memory usage is similar (both produce identical Elixir terms)
3. **Larger payloads = bigger advantage** - Real-world 10MB files show better results than synthetic benchmarks
4. **BEAM scheduler load is dramatically reduced** - roughly 260-28,000x fewer reductions, depending on the dataset

## Test Environment

| Attribute | Value |
|-----------|-------|
| OS | macOS |
| CPU | Apple M1 Pro |
| Cores | 10 |
| Memory | 16 GB |
| Elixir | 1.19.4 |
| Erlang/OTP | 28.2 |

## Real-World Benchmarks: Amazon Settlement Reports

These are production JSON files from Amazon SP-API settlement reports, representing real-world API response patterns with nested objects, arrays of transactions, and mixed data types.

### Encoding Performance (Elixir → JSON)

| File Size | RustyJson | Jason | Speed | Memory |
|-----------|-----------|-------|-------|--------|
| 10.87 MB | 24 ms | 131 ms | **5.5x faster** | **2.7x less** |
| 9.79 MB | 21 ms | 124 ms | **5.9x faster** | **2-3x less** |
| 9.38 MB | 21 ms | 104 ms | **5.0x faster** | **2-3x less** |

### Decoding Performance (JSON → Elixir)

| File Size | RustyJson | Jason | Speed | Memory |
|-----------|-----------|-------|-------|--------|
| 10.87 MB | 61 ms | 152 ms | **2.5x faster** | similar |
| 9.79 MB | 55 ms | 134 ms | **2.4x faster** | similar |
| 9.38 MB | 50 ms | 119 ms | **2.4x faster** | similar |

### BEAM Reductions (Scheduler Load)

| File Size | RustyJson | Jason | Reduction |
|-----------|-----------|-------|-----------|
| 10.87 MB encode | 404 | 11,570,847 | **28,641x fewer** |

This is the most dramatic difference - RustyJson offloads virtually all work to native code.
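For readers who want to check reduction counts on their own payloads, the following is a minimal sketch of one way to do it; it is not necessarily how the numbers above were collected. `ReductionCount` is an illustrative name, and the function under test runs in a fresh process so that earlier work does not pollute the count.

```elixir
defmodule ReductionCount do
  # Returns the number of reductions charged to a fresh process while it runs `fun`.
  def measure(fun) when is_function(fun, 0) do
    task =
      Task.async(fn ->
        {:reductions, before} = Process.info(self(), :reductions)
        fun.()
        {:reductions, after_count} = Process.info(self(), :reductions)
        after_count - before
      end)

    Task.await(task, :infinity)
  end
end

# Pure-Elixir work is charged reductions as it runs:
IO.inspect(ReductionCount.measure(fn -> Enum.map(1..10_000, &(&1 * 2)) end))

# With both libraries installed and `data` bound to a decoded term, the
# comparison would look like this (left commented out so the sketch has no deps):
# ReductionCount.measure(fn -> RustyJson.encode!(data) end)
# ReductionCount.measure(fn -> Jason.encode!(data) end)
```

A NIF call is typically charged only a small number of reductions no matter how long the native code runs, which is why the RustyJson column stays so low.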

## Synthetic Benchmarks: nativejson-benchmark

Using standard datasets from [nativejson-benchmark](https://github.com/miloyip/nativejson-benchmark):

| Dataset | Size | Description |
|---------|------|-------------|
| canada.json | 2.1 MB | Geographic coordinates (number-heavy) |
| citm_catalog.json | 1.6 MB | Event catalog (mixed types) |
| twitter.json | 617 KB | Social media with CJK (unicode-heavy) |

### Roundtrip Performance (Decode + Encode)

| Input | RustyJson | Jason | Speedup |
|-------|-----------|-------|---------|
| canada.json | 14 ms | 48 ms | **3.4x faster** |
| citm_catalog.json | 6 ms | 14 ms | **2.5x faster** |
| twitter.json | 4 ms | 9 ms | **2.3x faster** |

### BEAM Reductions by Dataset

| Dataset | RustyJson | Jason | Ratio |
|---------|-----------|-------|-------|
| canada.json | ~3,500 | ~964,000 | **275x fewer** |
| citm_catalog.json | ~300 | ~621,000 | **2,000x fewer** |
| twitter.json | ~2,000 | ~511,000 | **260x fewer** |

## Why Encoding Shows Bigger Gains

### iolist Encoding Pattern (Pure Elixir)

```
encode(data)
  → allocate "{" binary
  → allocate "\"key\"" binary
  → allocate ":" binary
  → allocate "\"value\"" binary
  → allocate list cells to link them
  → return iolist (many BEAM allocations)
```

### RustyJson's Encoding Pattern (NIF)

```
encode(data)
  → [Rust: walk terms, write to single buffer]
  → copy buffer to BEAM binary
  → return binary (one BEAM allocation)
```

Pure-Elixir encoders create many small BEAM allocations. RustyJson creates one.
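To make the contrast concrete, here is a small sketch of the kind of iolist a pure-Elixir encoder might build for a single key/value pair, next to the flat binary form. It does not reproduce Jason's or RustyJson's internals; `IoListDemo` is an illustrative name and string escaping is deliberately left out.

```elixir
defmodule IoListDemo do
  # Nested list of small binaries: every fragment and every list cell is a
  # separate term that the BEAM has to allocate.
  def as_iolist(key, value) do
    ["{", ["\"", key, "\""], ":", ["\"", value, "\""], "}"]
  end

  # Flattening copies all fragments into one contiguous binary.
  def as_binary(key, value) do
    IO.iodata_to_binary(as_iolist(key, value))
  end
end

IO.inspect(IoListDemo.as_iolist("key", "value"))
IO.puts(IoListDemo.as_binary("key", "value"))
# prints {"key":"value"}
```

A native encoder skips the intermediate list entirely and only produces the final binary.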

### Why Decoding Memory is Similar

Both libraries produce identical Elixir data structures when decoding. The resulting maps, lists, and strings take the same space regardless of which library created them.

## Why Benchee Memory Measurements Don't Work for NIFs

**Important**: Benchee's `memory_time` option gives misleading results for NIF-based libraries.

### What Benchee Reports (Incorrect)

| Library | Memory |
|---------|--------|
| RustyJson | 0.00169 MB |
| Jason | 20.27 MB |

This suggests 12,000x less memory - which is wrong.

### Why This Happens

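Benchee measures memory by tracking heap growth and garbage collection in the process that runs the benchmarked function, so it only sees allocations on that process's BEAM heap. It does not count:
- Memory allocated by native code inside a NIF
- Large binaries, which the BEAM stores outside the process heap
- Memory used by other processes

RustyJson allocates its working memory in **Rust via mimalloc**, which is completely invisible to this tracking. The 0.00169 MB is just NIF call overhead.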

### How We Measure Instead

We use `:erlang.memory(:total)` delta in isolated spawned processes:

```elixir
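# `data` is the term to encode; it is copied into the spawned process.
spawn(fn ->
  :erlang.garbage_collect()
  before = :erlang.memory(:total)
  # Keep every result bound so nothing is collected before the second reading
  results = for _ <- 1..10, do: RustyJson.encode!(data)
  after_mem = :erlang.memory(:total)
  avg_bytes = (after_mem - before) / length(results)
  IO.puts("BEAM memory per encode: #{Float.round(avg_bytes / 1_048_576, 2)} MB")
end)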
```

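This captures BEAM allocations made during the operation, including off-heap binaries. Because `:erlang.memory(:total)` is VM-wide, run it on an otherwise idle node. For total system memory (including the NIF), we verified with RSS measurements that Rust adds only ~1-2 MB of temporary overhead.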
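The RSS check can be approximated from inside the VM. The sketch below is an assumption about how such a measurement could be taken, not the script used for the numbers in this document: it shells out to `ps` (available on macOS and most Linux systems) and reads the resident set size of the running BEAM OS process. `RssProbe` is an illustrative name.

```elixir
defmodule RssProbe do
  # Resident set size of the current BEAM OS process, in kilobytes.
  # Returns {:ok, kb} or {:error, reason} when `ps` is missing or unhelpful.
  def rss_kb do
    case System.find_executable("ps") do
      nil ->
        {:error, :ps_not_found}

      ps ->
        {out, status} = System.cmd(ps, ["-o", "rss=", "-p", System.pid()], stderr_to_stdout: true)

        with 0 <- status,
             {kb, _rest} <- Integer.parse(String.trim(out)) do
          {:ok, kb}
        else
          _ -> {:error, :unexpected_ps_output}
        end
    end
  end
end

IO.inspect(RssProbe.rss_kb())
```

Sampling this before and after a batch of encodes gives a rough upper bound on native allocations. RSS is noisy and allocators rarely return memory to the OS immediately, so treat single readings with caution.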

### Actual Memory Comparison

For a 10 MB settlement report encode:

| Metric | RustyJson | Jason |
|--------|-----------|-------|
| BEAM memory | 6.7 MB | 17.9 MB |
| NIF overhead | ~1-2 MB | N/A |
| **Total** | **~8 MB** | **~18 MB** |
| **Ratio** | **2-3x less** | baseline |

## Running Benchmarks

```bash
# 1. Download synthetic test data
mkdir -p bench/data && cd bench/data
curl -LO https://raw.githubusercontent.com/miloyip/nativejson-benchmark/master/data/canada.json
curl -LO https://raw.githubusercontent.com/miloyip/nativejson-benchmark/master/data/citm_catalog.json
curl -LO https://raw.githubusercontent.com/miloyip/nativejson-benchmark/master/data/twitter.json
cd ../..

# 2. Run memory benchmarks (no extra deps needed)
mix run bench/memory_bench.exs

# 3. (Optional) Run speed benchmarks with Benchee
# Add to mix.exs: {:benchee, "~> 1.0", only: :dev}
mix deps.get
mix run bench/stress_bench.exs
```

## Key Interning Benchmarks

The `keys: :intern` option provides significant speedups when decoding arrays of objects with repeated keys (common in API responses, database results, etc.).

### When Key Interning Helps: Homogeneous Arrays

Arrays where every object has the same keys:

```json
[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, ...]
```

| Scenario | Default | `keys: :intern` | Improvement |
|----------|---------|-----------------|-------------|
| 100 objects × 5 keys | 34.2 µs | 23.6 µs | **31% faster** |
| 100 objects × 10 keys | 67.5 µs | 44.8 µs | **34% faster** |
| 1,000 objects × 5 keys | 335 µs | 237 µs | **29% faster** |
| 1,000 objects × 10 keys | 688 µs | 463 µs | **33% faster** |
| 10,000 objects × 5 keys | 3.46 ms | 2.45 ms | **29% faster** |
| 10,000 objects × 10 keys | 6.92 ms | 4.88 ms | **29% faster** |
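Recomputing from the rounded timings, the Improvement column matches the share of decode time saved, not a throughput ratio. A small sketch of that arithmetic follows; `Improvement` is an illustrative helper, not part of RustyJson.

```elixir
defmodule Improvement do
  # Percentage of decode time saved: (default - interned) / default.
  def time_saved_pct(default, interned) do
    round((default - interned) / default * 100)
  end

  # Slowdown factor when interning costs more than it saves.
  def slowdown(default, interned) do
    Float.round(interned / default, 1)
  end
end

IO.inspect(Improvement.time_saved_pct(34.2, 23.6))  # 31 (100 objects × 5 keys)
IO.inspect(Improvement.time_saved_pct(67.5, 44.8))  # 34 (100 objects × 10 keys)
IO.inspect(Improvement.slowdown(260, 831))          # 3.2 (single object, 5,000 keys)
```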

### When Key Interning Hurts: Unique Keys

Single objects or heterogeneous arrays where keys aren't repeated:

| Scenario | Default | `keys: :intern` | Penalty |
|----------|---------|-----------------|---------|
| Single object, 100 keys | 5.1 µs | 13.6 µs | **2.6x slower** |
| Single object, 1,000 keys | 52 µs | 169 µs | **3.2x slower** |
| Single object, 5,000 keys | 260 µs | 831 µs | **3.2x slower** |
| Heterogeneous 100 objects | 35 µs | 96 µs | **2.7x slower** |
| Heterogeneous 500 objects | 186 µs | 475 µs | **2.5x slower** |
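To try both scenarios on your own machine, payloads of each shape can be generated without any dependencies. The key naming and integer values below are assumptions made for illustration, so the actual benchmark inputs may look different; `InternFixtures` is a hypothetical module name.

```elixir
defmodule InternFixtures do
  # JSON array of `objects` maps that all share the keys "key_1".."key_<keys>".
  def homogeneous(objects, keys) do
    array(for i <- 1..objects, do: object(for k <- 1..keys, do: {"key_#{k}", i}))
  end

  # JSON array of `objects` maps whose keys never repeat across objects.
  def heterogeneous(objects, keys) do
    array(for i <- 1..objects, do: object(for k <- 1..keys, do: {"key_#{i}_#{k}", i}))
  end

  # Keys generated here need no escaping, so they are written out directly.
  defp object(pairs) do
    fields =
      Enum.map(pairs, fn {key, value} -> ["\"", key, "\":", Integer.to_string(value)] end)

    ["{", Enum.intersperse(fields, ","), "}"]
  end

  defp array(items) do
    IO.iodata_to_binary(["[", Enum.intersperse(items, ","), "]"])
  end
end

IO.puts(InternFixtures.homogeneous(2, 2))
# [{"key_1":1,"key_2":1},{"key_1":2,"key_2":2}]
IO.puts(InternFixtures.heterogeneous(2, 1))
# [{"key_1_1":1},{"key_2_1":2}]
```

Feeding each payload to `RustyJson.decode!/2` with and without `keys: :intern` should show the same direction of effect as the tables above, though absolute timings will vary by machine.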

### Scaling: Benefit Increases with Object Count

With 5 keys per object, the benefit grows as more objects reuse the cached keys, then levels off at roughly 30% from about 100 objects onward:

| Objects | Default | `keys: :intern` | Improvement |
|---------|---------|-----------------|-------------|
| 10 | 3.5 µs | 3.0 µs | 13% faster |
| 50 | 17.1 µs | 12.5 µs | 27% faster |
| 100 | 33.8 µs | 23.8 µs | 30% faster |
| 500 | 170 µs | 119 µs | 30% faster |
| 1,000 | 339 µs | 242 µs | 29% faster |
| 5,000 | 1.81 ms | 1.24 ms | 31% faster |
| 10,000 | 3.47 ms | 2.49 ms | 28% faster |

### Usage Recommendation

```elixir
# API responses, database results, bulk data
RustyJson.decode!(json, keys: :intern)

# Config files, single objects, unknown schemas
RustyJson.decode!(json)  # default, no interning
```

**Rule of thumb**: Use `keys: :intern` when you know you're decoding arrays of 10+ objects with the same schema.

**Note**: Keys containing escape sequences (e.g., `"field\nname"`) are not interned because the raw JSON bytes differ from the decoded string. This is rare in practice and has negligible performance impact.
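The mismatch between raw and decoded bytes is easy to see in plain Elixir. This snippet only illustrates the byte difference described in the note; it says nothing about how RustyJson's cache is implemented.

```elixir
# Bytes as they appear between the quotes in the JSON text: backslash + "n".
raw_key = ~S(field\nname)

# The string a decoder has to return: a real newline character.
decoded_key = "field\nname"

IO.inspect(byte_size(raw_key))       # 11
IO.inspect(byte_size(decoded_key))   # 10
IO.inspect(raw_key == decoded_key)   # false
```

A cache keyed on the raw input bytes cannot hand back the decoded string for such a key without unescaping it first, which is why these keys bypass interning.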

## Summary

| Operation | Speed | Memory | Reductions |
|-----------|-------|--------|------------|
| **Encode (large)** | 5-6x faster | 2-3x less | 28,000x fewer |
| **Encode (medium)** | 2-3x faster | 2-3x less | 260-2,000x fewer |
| **Decode** | 2-3x faster | similar | — |
| **Decode (keys: :intern)** | +30% faster* | similar | — |

*For arrays of objects with repeated keys (API responses, DB results, etc.)

**Bottom line**: RustyJson's biggest advantage is encoding large payloads, where it's 5-6x faster with 2-3x less memory and dramatically reduced BEAM scheduler load. For decoding bulk data, enable `keys: :intern` for an additional 30% speedup.
