Compression support for large payloads in ScyllaDB.
Provides:
- Application-level compression/decompression for large text and binary fields
- Table-level compression configuration CQL generation
- Transparent compression for fields marked as compressible
Compression Algorithms
- :lz4 - Fast compression/decompression, good for real-time workloads
- :snappy - Google's Snappy, balanced speed/ratio
- :deflate - zlib/gzip, better ratio but slower
- :zstd - Zstandard, excellent ratio with good speed
Usage
# Compress a value before storing
compressed = AshScylla.DataLayer.Compression.compress(large_text, :zstd)
# Decompress after reading
original = AshScylla.DataLayer.Compression.decompress(compressed, :zstd)
# Generate table compression CQL
AshScylla.DataLayer.Compression.table_compression_cql(:lz4)
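To show where the generated table compression string might end up, here is a minimal sketch that splices a clause of the form compression = {'class': 'LZ4Compressor'} into a CREATE TABLE statement. The module name TableSketch, the function create_table_cql/2, and the table and column names are hypothetical and not part of this library.

```elixir
defmodule TableSketch do
  # Hypothetical helper: `clause` is a string shaped like
  # "compression = {'class': 'LZ4Compressor'}".
  # The column layout is illustrative only.
  def create_table_cql(table, clause) do
    "CREATE TABLE #{table} (id uuid PRIMARY KEY, payload blob) WITH #{clause};"
  end
end

clause = "compression = {'class': 'LZ4Compressor'}"
IO.puts(TableSketch.create_table_cql("app.events", clause))
```

Because the clause carries no leading WITH, the caller decides whether it opens the table options or follows other options joined by AND.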
Summary
Functions
- chunk_length_cql: Generates CQL for chunk length configuration.
- compress: Compresses a binary value using the specified algorithm.
- compress_if_large: Compresses a value only if it exceeds the given threshold size.
- compression_clause: Generates the full compression clause for a CREATE TABLE statement.
- crc_check_chance_cql: Generates CQL for CRC check chance configuration.
- decompress: Decompresses a binary value using the algorithm indicated by the 1-byte marker prefix.
- default_compression_cql: Generates CQL for the default compression class.
- default_threshold: Returns the default compression threshold in bytes.
- estimated_size: Estimates the compressed size without actually compressing.
- should_compress?: Checks if a value should be compressed based on size threshold.
- table_compression_cql: Generates CQL for table-level compression configuration.
Functions
@spec chunk_length_cql(non_neg_integer()) :: String.t()
Generates CQL for chunk length configuration.
Returns a chunk_length_kb = N string suitable for inclusion in a compression clause.
Examples
iex> AshScylla.DataLayer.Compression.chunk_length_cql(64)
"chunk_length_kb = 64"
Compresses a binary value using the specified algorithm.
The compressed output is prefixed with a 1-byte algorithm marker so that
decompress/2 can identify which algorithm was used.
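The marker-prefix scheme described above can be sketched with Erlang's built-in :zlib module. This is an illustration under stated assumptions, not the library's implementation: the module name MarkerSketch is hypothetical, only :deflate is handled, the marker value 3 is taken from the examples in this section, and the use of the zlib container format (rather than raw deflate or gzip) is a guess. The sketch's decompress/1 also takes no algorithm argument, unlike the documented decompress/2.

```elixir
defmodule MarkerSketch do
  # Marker byte for :deflate, as shown in this section's examples.
  @deflate_marker 3

  # Prefix the zlib-compressed payload with a single marker byte.
  def compress(value, :deflate) when is_binary(value) do
    <<@deflate_marker, :zlib.compress(value)::binary>>
  end

  # Dispatch on the leading byte to pick the matching decompressor.
  def decompress(<<@deflate_marker, payload::binary>>) do
    :zlib.uncompress(payload)
  end
end

packed = MarkerSketch.compress("hello world", :deflate)
IO.inspect(MarkerSketch.decompress(packed))
# => "hello world"
```

Storing the marker alongside the payload lets a reader decode a value even when the configured algorithm has changed since it was written.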
Examples
iex> AshScylla.DataLayer.Compression.compress("hello world", :deflate)
<<3, ...>>
iex> AshScylla.DataLayer.Compression.compress("hello world", :lz4)
<<1, ...>>
@spec compress_if_large(binary(), atom(), non_neg_integer()) :: {:compressed, binary()} | {:ok, binary()}
Compresses a value only if it exceeds the given threshold size.
Returns {:compressed, compressed_binary} if the value was compressed,
or {:ok, original_binary} if the value was below the threshold.
Examples
iex> AshScylla.DataLayer.Compression.compress_if_large("small", :deflate, 1024)
{:ok, "small"}
iex> AshScylla.DataLayer.Compression.compress_if_large(String.duplicate("a", 2048), :deflate, 1024)
{:compressed, <<3, ...>>}
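The threshold behaviour shown above can be sketched as two function clauses guarded on byte size. The module ThresholdSketch and its compress_fun parameter are hypothetical; the compressor is passed in so the sketch does not depend on this library. Treating a value whose size equals the threshold as "not compressed" is an assumption based on the word "exceeds".

```elixir
defmodule ThresholdSketch do
  # Compress only when the value is strictly larger than the threshold.
  # `compress_fun` is a hypothetical stand-in for the real compressor.
  def compress_if_large(value, threshold, compress_fun)
      when is_binary(value) and byte_size(value) > threshold do
    {:compressed, compress_fun.(value)}
  end

  # Small values pass through untouched.
  def compress_if_large(value, _threshold, _compress_fun) when is_binary(value) do
    {:ok, value}
  end
end

IO.inspect(ThresholdSketch.compress_if_large("small", 1024, &:zlib.compress/1))
# => {:ok, "small"}
```

Returning a tagged tuple lets the caller record whether the stored bytes need decompression on the way back out.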
Generates the full compression clause for a CREATE TABLE statement.
Combines algorithm class, chunk length, and CRC check chance into a single
WITH compression = {...} clause.
Options
- :chunk_length_kb - Chunk size in kilobytes (positive integer)
- :crc_check_chance - Probability of CRC check (float between 0.0 and 1.0)
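As a rough illustration of how the class and the two options could be combined into one clause, the sketch below builds the string by hand. ClauseSketch is a hypothetical module; the class names are the ones that appear in this module's examples, and the fixed option order (chunk length before CRC check chance) is an assumption.

```elixir
defmodule ClauseSketch do
  # Class names as they appear in this module's documented examples.
  @classes %{
    lz4: "LZ4Compressor",
    snappy: "SnappyCompressor",
    zstd: "ZstdCompressor"
  }

  def compression_clause(algorithm, opts \\ []) do
    class = Map.fetch!(@classes, algorithm)

    # Emit only the options that were given, in a fixed order,
    # so the generated CQL is deterministic.
    extras =
      for key <- [:chunk_length_kb, :crc_check_chance],
          Keyword.has_key?(opts, key) do
        "'#{key}': #{Keyword.fetch!(opts, key)}"
      end

    "WITH compression = {" <> Enum.join(["'class': '#{class}'" | extras], ", ") <> "}"
  end
end

IO.puts(ClauseSketch.compression_clause(:lz4, chunk_length_kb: 64))
# => WITH compression = {'class': 'LZ4Compressor', 'chunk_length_kb': 64}
```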
Examples
iex> AshScylla.DataLayer.Compression.compression_clause(:lz4, chunk_length_kb: 64)
"WITH compression = {'class': 'LZ4Compressor', 'chunk_length_kb': 64}"
iex> AshScylla.DataLayer.Compression.compression_clause(:zstd, chunk_length_kb: 128, crc_check_chance: 0.75)
"WITH compression = {'class': 'ZstdCompressor', 'chunk_length_kb': 128, 'crc_check_chance': 0.75}"
Generates CQL for CRC check chance configuration.
Returns a crc_check_chance = N string suitable for inclusion in a compression clause.
Examples
iex> AshScylla.DataLayer.Compression.crc_check_chance_cql(0.5)
"crc_check_chance = 0.5"
Decompresses a binary value using the algorithm indicated by the 1-byte marker prefix.
Examples
iex> data = AshScylla.DataLayer.Compression.compress("hello world", :deflate)
iex> AshScylla.DataLayer.Compression.decompress(data, :deflate)
"hello world"
Generates CQL for the default compression class.
Returns a WITH compression = {'class': '...'} clause.
Examples
iex> AshScylla.DataLayer.Compression.default_compression_cql(:lz4)
"WITH compression = {'class': 'LZ4Compressor'}"
@spec default_threshold() :: non_neg_integer()
Returns the default compression threshold in bytes.
Values smaller than this threshold are not compressed by compress_if_large/3.
Examples
iex> AshScylla.DataLayer.Compression.default_threshold()
1024
@spec estimated_size(binary(), atom()) :: non_neg_integer()
Estimates the compressed size without actually compressing.
Uses a heuristic ratio based on the algorithm, so the result is an approximation rather than a measurement. This is useful for deciding whether compression is worthwhile before paying the cost of compressing.
Examples
iex> AshScylla.DataLayer.Compression.estimated_size(String.duplicate("a", 1000), :deflate)
350
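A ratio-based estimate of this kind can be sketched in a few lines. EstimateSketch is a hypothetical module, and the 35 percent figure is only inferred from the single example above (1000 bytes estimated as 350); it is not a documented constant, and the ratios for the other algorithms are unknown, so only :deflate is handled.

```elixir
defmodule EstimateSketch do
  # Assumed ratio in percent, inferred from the example above.
  @deflate_percent 35

  # Integer arithmetic keeps the estimate exact and avoids float rounding.
  def estimated_size(value, :deflate) when is_binary(value) do
    div(byte_size(value) * @deflate_percent, 100)
  end
end

IO.inspect(EstimateSketch.estimated_size(String.duplicate("a", 1000), :deflate))
# => 350
```

The estimate depends only on input length, so highly repetitive input like the string above would compress far better in practice than the heuristic suggests.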
@spec should_compress?(binary(), non_neg_integer()) :: boolean()
Checks if a value should be compressed based on size threshold.
Returns true if the byte size of the value exceeds the threshold.
Examples
iex> AshScylla.DataLayer.Compression.should_compress?("small", 1024)
false
iex> AshScylla.DataLayer.Compression.should_compress?(String.duplicate("a", 2048), 1024)
true
Generates CQL for table-level compression configuration.
Returns the compression = {...} clause value as a string.
Examples
iex> AshScylla.DataLayer.Compression.table_compression_cql(:lz4)
"compression = {'class': 'LZ4Compressor'}"
iex> AshScylla.DataLayer.Compression.table_compression_cql(:snappy, chunk_length_kb: 64)
"compression = {'class': 'SnappyCompressor', 'chunk_length_kb': 64}"