rdb_parser v0.3.0 RdbParser

Parses a Redis rdb backup file and emits its entries as a stream, so they can be processed without reading the entire file into memory (which may be impossible for large backups).

Example: this creates a map of keys to values from the entries in the rdb file.

RdbParser.stream_entries("myredis.rdb")
|> Enum.reduce(%{}, fn
  {:entry, {key, value, _metadata}}, acc ->
    Map.put(acc, key, value)
  _ ->
    acc
end)

Summary

Functions

parse_length(arg1)
parse_length returns {length, rest} where length is the decoded length, and rest is the remaining part of the binary.

stream_entries(filename, opts \\ [])
Pass a filename and opts. The file is read in chunks and parsed, which avoids reading the entire backup file into memory.

Types

field_type()
field_type() :: :entry | :aux | :version | :resizedb | :selectdb | :eof
rdb_entry()
rdb_entry() ::
  {:version, version_number :: integer()}
  | {:resizedb, {:main | :expire, dbsize :: integer()}}
  | {:selectdb, db_number :: integer()}
  | {:aux, {key :: binary(), value :: redis_value()}}
  | {:entry, {key :: binary(), value :: redis_value(), Keyword.t()}}
  | {:eof, checksum :: binary()}
redis_value()
redis_value() :: binary() | MapSet.t() | list() | map()
stream_option()
stream_option() :: :chunk_size
stream_options()
stream_options() :: [stream_option()]

Functions

parse_length(arg1)

parse_length returns {length, rest} where length is the decoded length, and rest is the remaining part of the binary.
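To make the {length, rest} return shape concrete, here is a minimal sketch of decoding an rdb length prefix with binary pattern matching. It is not RdbParser's implementation: the LengthSketch module and its decode/1 function are hypothetical names, and the sketch assumes the standard rdb length encoding (6-bit, 14-bit, and the 0x80 and 0x81 markers for 32-bit and 64-bit big-endian lengths). It does not handle the "special encoding" prefixes (top two bits 11) and raises FunctionClauseError for them.

```elixir
defmodule LengthSketch do
  # Hypothetical stand-in for illustration; not RdbParser.parse_length itself.

  # Top two bits 00: the remaining 6 bits are the length.
  def decode(<<0b00::2, len::6, rest::binary>>), do: {len, rest}

  # Top two bits 01: the remaining 6 bits plus the next byte form a 14-bit length.
  def decode(<<0b01::2, len::14, rest::binary>>), do: {len, rest}

  # Marker byte 0x80: the next 4 bytes are a big-endian 32-bit length.
  def decode(<<0x80, len::32, rest::binary>>), do: {len, rest}

  # Marker byte 0x81: the next 8 bytes are a big-endian 64-bit length.
  def decode(<<0x81, len::64, rest::binary>>), do: {len, rest}
end

# A one-byte length of 5 followed by the remaining bytes.
{length, rest} = LengthSketch.decode(<<5, "hello", "tail">>)
```

Returning the undecoded remainder alongside the length lets a caller feed rest straight into the next parsing step.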

stream_entries(filename, opts \\ [])
stream_entries(binary(), stream_options()) :: Enumerable.t()

Pass a filename and opts. The file is read in chunks and parsed, which avoids reading the entire backup file into memory.

Options:

  • :chunk_size: The size of the chunks to read from the file at a time. This can be tuned based on expected sizes of the keys. Typically if you have larger keys, you should increase this. Default: 65,536 bytes.

The returned stream emits rdb_entry entries. Each is a tuple, with the first element reflecting the entry type.

  • {:version, version_number :: integer()}: The version of the rdb file.
  • {:resizedb, {:main, dbsize :: integer()}}: The number of keys in the database.
  • {:resizedb, {:expire, dbsize :: integer()}}: The number of keys with expirations.
  • {:selectdb, db_number :: integer()}: The database number that will be read.
  • {:aux, {key :: binary(), value :: redis_value()}}: A piece of metadata.
  • {:entry, {key :: binary(), value :: redis_value(), metadata :: Keyword.t()}}: A key/value pair. The metadata contains expiration information, if any.
  • {:eof, checksum :: binary()}: If the file is parsed fully, this will be the last entry.
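The entry shapes listed above can be handled with one function clause per tuple tag. The sketch below tallies a few facts about a stream of entries. EntrySummary and summarize/1 are hypothetical names, and the sample list is hand-written data that stands in for the output of stream_entries so the example runs without an rdb file; the values in it are illustrative only.

```elixir
defmodule EntrySummary do
  # Hypothetical helper: tallies what a stream of rdb_entry tuples contains.
  def summarize(entries) do
    initial = %{version: nil, dbs: [], keys: 0, complete: false}

    Enum.reduce(entries, initial, fn
      {:version, version}, acc -> %{acc | version: version}
      {:selectdb, db}, acc -> %{acc | dbs: acc.dbs ++ [db]}
      {:entry, {_key, _value, _metadata}}, acc -> %{acc | keys: acc.keys + 1}
      {:eof, _checksum}, acc -> %{acc | complete: true}
      # :resizedb and :aux entries are ignored in this sketch.
      _other, acc -> acc
    end)
  end
end

# Hand-written sample data (an assumption, not real parser output).
sample = [
  {:version, 9},
  {:aux, {"redis-ver", "6.2.0"}},
  {:selectdb, 0},
  {:resizedb, {:main, 2}},
  {:resizedb, {:expire, 0}},
  {:entry, {"a", "1", []}},
  {:entry, {"b", ["x", "y"], []}},
  {:eof, <<0, 0, 0, 0, 0, 0, 0, 0>>}
]

summary = EntrySummary.summarize(sample)
```

Because :eof is only emitted when the file is parsed fully, checking for it (the complete flag here) is one way to detect a truncated read.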

stream_entries returns a stream, so the result can be passed to Task.async_stream or Flow functions. Note that calling an Enum function starts the enumeration, so Enum.map builds the entire list of entries before any later step runs. For larger datasets, it’s recommended to use only Stream or Flow constructs in the pipeline and to call Enum.reduce only at the end of the function chain, to avoid running out of memory.
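As a rough illustration of that advice, the sketch below keeps every intermediate step lazy with Stream functions and uses a single Enum.reduce at the end. LazyPipelineSketch and count_by_prefix/1 are hypothetical names, the "prefix:rest" key layout is an assumption made for the example, and the sample stream stands in for the result of stream_entries.

```elixir
defmodule LazyPipelineSketch do
  # Counts keys per prefix (the text before the first ":") without
  # building an intermediate list of entries.
  def count_by_prefix(entries) do
    entries
    # Lazily keep only the keys of key/value entries; nothing runs yet.
    |> Stream.flat_map(fn
      {:entry, {key, _value, _metadata}} -> [key]
      _other -> []
    end)
    # Still lazy: derive the prefix of each key.
    |> Stream.map(fn key -> key |> String.split(":", parts: 2) |> hd() end)
    # The final Enum.reduce drives the stream one element at a time.
    |> Enum.reduce(%{}, fn prefix, acc -> Map.update(acc, prefix, 1, &(&1 + 1)) end)
  end
end

# Hand-written sample data wrapped in a lazy stream (an assumption, not parser output).
sample =
  Stream.map(
    [
      {:version, 9},
      {:selectdb, 0},
      {:entry, {"user:1", "alice", []}},
      {:entry, {"user:2", "bob", []}},
      {:entry, {"session:abc", "token", []}},
      {:eof, <<0, 0, 0, 0, 0, 0, 0, 0>>}
    ],
    & &1
  )

counts = LazyPipelineSketch.count_by_prefix(sample)
```

Only the accumulator map grows with this shape, so memory use depends on the number of distinct prefixes and not on the number of entries.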