AshScylla.DataLayer.Compression (AshScylla v0.9.0)


Compression support for large payloads in ScyllaDB.

Provides:

  • Application-level compression/decompression for large text and binary fields
  • Table-level compression configuration CQL generation
  • Transparent compression for fields marked as compressible

Compression Algorithms

  • :lz4 — Fast compression/decompression, good for real-time workloads
  • :snappy — Google's Snappy, balanced speed/ratio
  • :deflate — zlib/gzip, better ratio but slower
  • :zstd — Zstandard, excellent ratio with good speed

Usage

# Compress a value before storing
compressed = AshScylla.DataLayer.Compression.compress(large_text, :zstd)

# Decompress after reading
original = AshScylla.DataLayer.Compression.decompress(compressed, :zstd)

# Generate table compression CQL
AshScylla.DataLayer.Compression.table_compression_cql(:lz4)

Summary

Functions

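chunk_length_cql(size_kb)

Generates CQL for chunk length configuration.

compress(value, algorithm)

Compresses a binary value using the specified algorithm.

compress_if_large(value, algorithm, threshold)

Compresses a value only if it exceeds the given threshold size.

compression_clause(algorithm, opts \\ [])

Generates the full compression clause for a CREATE TABLE statement.

crc_check_chance_cql(chance)

Generates CQL for CRC check chance configuration.

decompress(data, algorithm)

Decompresses a binary value using the algorithm indicated by the 1-byte marker prefix.

default_compression_cql(algorithm)

Generates CQL for the default compression class.

default_threshold()

Returns the default compression threshold in bytes.

estimated_size(value, algorithm)

Estimates the compressed size without actually compressing.

should_compress?(value, threshold)

Checks if a value should be compressed based on size threshold.

table_compression_cql(algorithm, opts \\ [])

Generates CQL for table-level compression configuration.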

Functions

chunk_length_cql(size_kb)

@spec chunk_length_cql(non_neg_integer()) :: String.t()

Generates CQL for chunk length configuration.

Returns a chunk_length_kb = N string suitable for inclusion in a compression clause.

Examples

iex> AshScylla.DataLayer.Compression.chunk_length_cql(64)
"chunk_length_kb = 64"

compress(value, algorithm)

@spec compress(binary(), atom()) :: binary()

Compresses a binary value using the specified algorithm.

The compressed output is prefixed with a 1-byte algorithm marker so that decompress/2 can identify which algorithm was used.

Examples

iex> AshScylla.DataLayer.Compression.compress("hello world", :deflate)
<<3, ...>>

iex> AshScylla.DataLayer.Compression.compress("hello world", :lz4)
<<1, ...>>

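The marker-prefix idea can be sketched independently of the library. The module below is hypothetical and handles only :deflate; it assumes marker byte 3 (taken from the example above) and uses Erlang's :zlib.compress/1, which may not be what AshScylla calls internally.

```elixir
defmodule MarkerSketch do
  # Hypothetical sketch, not the library's implementation.
  # Marker 3 for :deflate is taken from the doc example above.
  @deflate_marker 3

  # Compress with Erlang's :zlib and prepend the 1-byte algorithm marker.
  def compress(value, :deflate) when is_binary(value) do
    <<@deflate_marker, :zlib.compress(value)::binary>>
  end
end

tagged = MarkerSketch.compress("hello world", :deflate)

# The first byte identifies the algorithm; the rest is the compressed payload.
<<marker, _payload::binary>> = tagged
IO.inspect(marker, label: "marker")
```

Keeping the marker inside the stored value means a reader can tell how a column was encoded without consulting separate metadata.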
compress_if_large(value, algorithm, threshold)

@spec compress_if_large(binary(), atom(), non_neg_integer()) ::
  {:compressed, binary()} | {:ok, binary()}

Compresses a value only if it exceeds the given threshold size.

Returns {:compressed, compressed_binary} if the value exceeded the threshold and was compressed, or {:ok, original_binary} if it did not exceed the threshold.

Examples

iex> AshScylla.DataLayer.Compression.compress_if_large("small", :deflate, 1024)
{:ok, "small"}

iex> AshScylla.DataLayer.Compression.compress_if_large(String.duplicate("a", 2048), :deflate, 1024)
{:compressed, <<3, ...>>}
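A minimal sketch of the threshold logic, assuming the comparison is a strict "greater than" on byte size, as the description suggests. ThresholdSketch and its private deflate/1 helper are hypothetical stand-ins, not part of AshScylla.

```elixir
defmodule ThresholdSketch do
  # Hypothetical stand-in for compress/2: marker byte 3 (as in the :deflate
  # examples above) followed by :zlib output.
  defp deflate(value), do: <<3, :zlib.compress(value)::binary>>

  # Compress only when the byte size strictly exceeds the threshold.
  def compress_if_large(value, threshold) when is_binary(value) do
    if byte_size(value) > threshold do
      {:compressed, deflate(value)}
    else
      {:ok, value}
    end
  end
end

# Callers can branch on the tag to decide how to store the value.
case ThresholdSketch.compress_if_large(String.duplicate("a", 2048), 1024) do
  {:compressed, bin} -> IO.puts("store compressed, #{byte_size(bin)} bytes")
  {:ok, bin} -> IO.puts("store as-is, #{byte_size(bin)} bytes")
end
```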

compression_clause(algorithm, opts \\ [])

@spec compression_clause(
  atom(),
  keyword()
) :: String.t()

Generates the full compression clause for a CREATE TABLE statement.

Combines algorithm class, chunk length, and CRC check chance into a single WITH compression = {...} clause.

Options

  • :chunk_length_kb — Chunk size in kilobytes (positive integer)
  • :crc_check_chance — Probability of CRC check (float between 0.0 and 1.0)

Examples

iex> AshScylla.DataLayer.Compression.compression_clause(:lz4, chunk_length_kb: 64)
"WITH compression = {'class': 'LZ4Compressor', 'chunk_length_kb': 64}"

iex> AshScylla.DataLayer.Compression.compression_clause(:zstd, chunk_length_kb: 128, crc_check_chance: 0.75)
"WITH compression = {'class': 'ZstdCompressor', 'chunk_length_kb': 128, 'crc_check_chance': 0.75}"

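To show how such a clause might be assembled, here is a hedged sketch that reproduces the two example outputs above. ClauseSketch is hypothetical; the class names for :lz4 and :zstd come from the examples, while "DeflateCompressor" and "SnappyCompressor" are assumed to follow the same naming.

```elixir
defmodule ClauseSketch do
  # Hypothetical mapping from algorithm atom to compressor class name.
  # Only LZ4Compressor and ZstdCompressor appear in the examples above.
  @classes %{
    lz4: "LZ4Compressor",
    snappy: "SnappyCompressor",
    deflate: "DeflateCompressor",
    zstd: "ZstdCompressor"
  }

  def compression_clause(algorithm, opts \\ []) do
    class = Map.fetch!(@classes, algorithm)

    # Keep only the options that were actually passed, in a fixed order.
    extras =
      for key <- [:chunk_length_kb, :crc_check_chance],
          value = Keyword.get(opts, key) do
        "'#{key}': #{value}"
      end

    "WITH compression = {" <> Enum.join(["'class': '#{class}'" | extras], ", ") <> "}"
  end
end

IO.puts(ClauseSketch.compression_clause(:lz4, chunk_length_kb: 64))
```

Emitting the options in a fixed order keeps the generated CQL stable regardless of the order in which the caller passes them.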
crc_check_chance_cql(chance)

@spec crc_check_chance_cql(float()) :: String.t()

Generates CQL for CRC check chance configuration.

Returns a crc_check_chance = N string suitable for inclusion in a compression clause.

Examples

iex> AshScylla.DataLayer.Compression.crc_check_chance_cql(0.5)
"crc_check_chance = 0.5"

decompress(data, algorithm)

@spec decompress(binary(), atom()) :: binary()

Decompresses a binary value produced by compress/2, using the algorithm indicated by the 1-byte marker prefix.

Examples

iex> data = AshScylla.DataLayer.Compression.compress("hello world", :deflate)
iex> AshScylla.DataLayer.Compression.decompress(data, :deflate)
"hello world"

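Dispatching on the marker byte can be sketched with binary pattern matching. DecodeSketch is hypothetical: it recognises only marker 3 (the :deflate marker shown in the compress/2 examples), and the error tuple for unknown markers is an assumption about how a caller might want failures reported.

```elixir
defmodule DecodeSketch do
  # Marker 3 corresponds to :deflate in the compress/2 examples.
  def decode(<<3, payload::binary>>), do: {:ok, :zlib.uncompress(payload)}

  # Hypothetical fallback: report any marker this sketch does not recognise.
  def decode(<<marker, _rest::binary>>), do: {:error, {:unknown_marker, marker}}
end

# Build a tagged value by hand, then decode it.
tagged = <<3, :zlib.compress("hello world")::binary>>
IO.inspect(DecodeSketch.decode(tagged))
```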
default_compression_cql(algorithm)

@spec default_compression_cql(atom()) :: String.t()

Generates CQL for the default compression class.

Returns a WITH compression = {'class': '...'} clause.

Examples

iex> AshScylla.DataLayer.Compression.default_compression_cql(:lz4)
"WITH compression = {'class': 'LZ4Compressor'}"

default_threshold()

@spec default_threshold() :: non_neg_integer()

Returns the default compression threshold in bytes.

Values smaller than this threshold are not compressed by compress_if_large/3.

Examples

iex> AshScylla.DataLayer.Compression.default_threshold()
1024

estimated_size(value, algorithm)

@spec estimated_size(binary(), atom()) :: non_neg_integer()

Estimates the compressed size without actually compressing.

Uses a heuristic ratio based on the algorithm. This is useful for deciding whether compression is worthwhile before actually compressing.

Examples

iex> AshScylla.DataLayer.Compression.estimated_size(String.duplicate("a", 1000), :deflate)
350
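A size-only heuristic of this kind can differ considerably from the real compressed size. The sketch below is hypothetical: the 35% ratio for :deflate is inferred from the single example above, and the actual size is measured with Erlang's :zlib rather than with the library.

```elixir
defmodule EstimateSketch do
  # Assumed ratio: 35% for :deflate, inferred from the example above
  # (1000 bytes -> 350). The library's real ratios are not shown.
  def estimated_size(value, :deflate) when is_binary(value) do
    div(byte_size(value) * 35, 100)
  end

  # Actual compressed size using Erlang's :zlib, for comparison.
  def actual_size(value) when is_binary(value) do
    byte_size(:zlib.compress(value))
  end
end

input = String.duplicate("a", 1000)
estimate = EstimateSketch.estimated_size(input, :deflate)
actual = EstimateSketch.actual_size(input)

# Highly repetitive input compresses far better than the flat ratio predicts.
IO.puts("estimate=#{estimate} actual=#{actual}")
```

Because the estimate ignores content, it is best treated as a quick pre-check rather than a prediction.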

should_compress?(value, threshold)

@spec should_compress?(binary(), non_neg_integer()) :: boolean()

Checks if a value should be compressed based on size threshold.

Returns true if the byte size of the value exceeds the threshold.

Examples

iex> AshScylla.DataLayer.Compression.should_compress?("small", 1024)
false

iex> AshScylla.DataLayer.Compression.should_compress?(String.duplicate("a", 2048), 1024)
true
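Because the check is on byte size, multi-byte UTF-8 text can cross the threshold with fewer characters than the threshold value suggests. A small sketch with a hypothetical SizeCheckSketch module, assuming a strict "greater than" comparison:

```elixir
defmodule SizeCheckSketch do
  # Hypothetical re-statement of the rule: compare bytes, not characters.
  def should_compress?(value, threshold) when is_binary(value) do
    byte_size(value) > threshold
  end
end

# U+00E9 ("e" with acute accent) takes 2 bytes in UTF-8.
text = String.duplicate("\u00E9", 600)

IO.inspect(String.length(text), label: "characters")
IO.inspect(byte_size(text), label: "bytes")
IO.inspect(SizeCheckSketch.should_compress?(text, 1024), label: "compress?")
```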

table_compression_cql(algorithm, opts \\ [])

@spec table_compression_cql(
  atom(),
  keyword()
) :: String.t()

Generates CQL for table-level compression configuration.

Returns the compression = {...} clause value as a string.
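Since the returned fragment omits the WITH keyword, it can be joined with other table options. The sketch below is hypothetical: TableOptionsSketch, the table app.events, and the gc_grace_seconds option are illustrative, and the fragment is hard-coded in the shape this function is documented to return.

```elixir
defmodule TableOptionsSketch do
  # Joins table options with AND after a single WITH keyword.
  def with_options(create_stmt, options) when is_list(options) do
    create_stmt <> " WITH " <> Enum.join(options, " AND ")
  end
end

# Fragment in the shape documented for table_compression_cql(:lz4).
fragment = "compression = {'class': 'LZ4Compressor'}"

cql =
  TableOptionsSketch.with_options(
    "CREATE TABLE app.events (id uuid PRIMARY KEY, body blob)",
    [fragment, "gc_grace_seconds = 86400"]
  )

IO.puts(cql)
```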

Examples

iex> AshScylla.DataLayer.Compression.table_compression_cql(:lz4)
"compression = {'class': 'LZ4Compressor'}"

iex> AshScylla.DataLayer.Compression.table_compression_cql(:snappy, chunk_length_kb: 64)
"compression = {'class': 'SnappyCompressor', 'chunk_length_kb': 64}"