FastAvro (fastavro v0.5.0)

This library implements fast Avro access functions to be used in conjunction with the avro_ex or schema_avro libraries.

It contains convenience functions that are useful when you have a large number of Avro records to process. It allows faster access than the pure Elixir libraries for use cases like:

  • You only need to read one or a few fields from the Avro data, without modifying it. For example, you just need to retrieve a time field to use as the partitioning value in your destination system.

  • You want to simplify the message by extracting some fields and re-encoding them with a different schema.

To obtain that speed gain, FastAvro uses a Rust wrapper around the apache-avro library for Rust. It only supports the 'record' type at the first level of the schema, and only the primitive types 'string', 'int', 'long' and 'double' as field types. For example:

{
  "type": "record",
  "name": "person",
  "fields" : [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "score", "type": "double"}
  ]
}
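As a rough sketch of how the functions documented on this page fit together with a schema like the one above, the module below parses the schema, encodes a map, and reads a single field back from the binary. The module name PersonDemo and its function names are hypothetical; only FastAvro.read_schema/1, FastAvro.encode_avro_datum/2 and FastAvro.get_raw_value/3 come from this page, and the sketch assumes the fastavro dependency is available.

```elixir
defmodule PersonDemo do
  # The record schema shown above, kept as a JSON string.
  @schema_json ~S"""
  {
    "type": "record",
    "name": "person",
    "fields": [
      {"name": "name", "type": "string"},
      {"name": "age", "type": "int"},
      {"name": "score", "type": "double"}
    ]
  }
  """

  def schema_json, do: @schema_json

  # Hypothetical helper: encodes `person` and reads one field back from the
  # raw binary. Any {:error, reason} tuple from FastAvro is returned as is.
  def encode_and_read(person, field) do
    with {:ok, schema} <- FastAvro.read_schema(@schema_json),
         {:ok, avro_data} <- FastAvro.encode_avro_datum(person, schema),
         {:ok, value} <- FastAvro.get_raw_value(avro_data, schema, field) do
      {:ok, value, avro_data}
    end
  end
end
```

Calling PersonDemo.encode_and_read(%{"name" => "John", "age" => 25, "score" => 5.6}, "age") should then yield an {:ok, value, binary} tuple, assuming the library behaves as documented here. In real code the schema would normally be parsed once and reused across records.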

Summary

Types

  • avro_record(): a decoded and schema-validated Avro record.
  • schema(): a precompiled and validated Avro schema.

Functions

  • create_msg/2: creates a new Avro record from a map of field names (as string binaries) to values compatible with the given schema.
  • decode_avro_datum/2: decodes Avro data given as a binary using the provided schema. Only raw data is decoded: no header, no schema and no fingerprint.
  • encode_avro_datum/2: encodes a map as Avro data using the provided schema. The output is raw data: no header, no schema and no fingerprint.
  • get_avro_value/2: gets the value associated with a field name from a given Avro record.
  • get_raw_value/3: gets the value associated with a field name from the given Avro data and schema.
  • get_raw_values/3: gets the values associated with a list of field names from the given Avro data and schema.
  • normalize_and_get_raw_value/3: gets the value associated with a field name from the given Avro data and schema, and normalizes the Avro data so that any unneeded bytes are removed.
  • normalized_avro_datum/2: decodes and normalizes Avro data given as a binary using the provided schema. Only raw data is decoded: no header, no schema and no fingerprint.
  • read_schema/1: parses and validates an Avro schema given as a JSON-encoded string.
  • schema_fields/1: returns the fields of a schema together with their types.
  • to_map/1: converts an avro_record() reference into an Elixir map.

Types

@type avro_record() :: reference()

A decoded and schema-validated Avro record.

@type schema() :: reference()

A precompiled and validated Avro schema.

Functions

create_msg(map, schema)

@spec create_msg(map(), schema()) :: {:ok, avro_record()} | {:error, atom()}

Creates a new avro record from a map of field names as string binaries and values compatible with the given schema.

All mandatory fields must be provided, and the associated values must be correctly typed.

Parameters

  • map: an Elixir map with the fields to be populated
  • schema: a schema() reference for the record format

Returns

  • {:ok, avro_record}: an avro_record() reference, already populated and ready to be encoded.
  • {:error, :wrong_type}: if the schema contains an unknown data type.

Examples

iex> {:ok, record} = FastAvro.create_msg(
       %{ "name" => "John", "age" => 25, "score" => 5.6 },
       schema
     )
{:ok, #Reference<0.2515214245.918683654.17019>}

decode_avro_datum(avro_data, schema)

@spec decode_avro_datum(binary(), schema()) :: {:ok, avro_record()} | {:error, atom()}

Decodes Avro data given as a binary using the provided schema. Only raw datum encoding is supported: the binary must not carry a header, an embedded schema, or a fingerprint.

Parameters

  • avro_data: valid Avro data as a binary
  • schema: a schema() reference for a record definition compatible with the data.

Returns

  • {:ok, avro_record()}: when the data is successfully decoded
  • {:error, :incompatible_avro_schema}: when the schema cannot be used to decode the data
  • {:error, :all_data_not_read}: if decoding did not consume the whole binary

Examples

iex> FastAvro.decode_avro_datum(avro_data, schema)
{:ok, #Reference<0.2887345315.2965241864.83696>}

encode_avro_datum(avro_map, schema)

@spec encode_avro_datum(map(), schema()) :: {:ok, binary()} | {:error, atom()}

Encodes a map as Avro data using the provided schema. The output is raw datum encoding, without a header, an embedded schema, or a fingerprint.

Parameters

  • avro_map: an Elixir map with the field names and values to encode
  • schema: a schema() reference compatible with the fields and values in the map.

Returns

  • {:ok, binary}: the binary contains the Avro representation of the data in the map, as described by the schema.
  • {:error, :wrong_type}: the schema contains an unsupported data type
  • {:error, :incompatible_avro_schema}: the schema does not match the map contents
  • {:error, :field_not_found}: if a field in the map is missing from the schema

Examples

iex> FastAvro.encode_avro_datum(
  %{
    "tac" => 1432,
    "from" => "2023-01-25 00:45:52",
    "to" => "2023-01-25 01:00:00"
  },
  new_schema
)
{:ok, <<176, 22, 38, 50, 48, 50, 51, 45, 48, 49, 45, 50, 53, 32, 48, 48, 58,
52, 53, 58, 53, 50, 38, 50, 48, 50, 51, 45, 48, 49, 45, 50, 53, 32, 48,
49, 58, 48, 48, 58, 48, 48>>}
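
To make "raw encoding without any headers" concrete, here is a small pure-Elixir sketch of the Avro binary encoding rules for the types involved: integers are zigzag-encoded and written as variable-length bytes, strings are written as a length followed by the UTF-8 bytes, and record fields are concatenated in schema order. It does not use FastAvro. The module RawAvroSketch is hypothetical, and it assumes new_schema declares "tac" as an int or long followed by the strings "from" and "to", in that order, which the original example does not show.

```elixir
defmodule RawAvroSketch do
  import Bitwise

  # Zigzag maps signed integers to unsigned ones so that small magnitudes
  # stay short: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
  def zigzag(n) when n >= 0, do: n <<< 1
  def zigzag(n), do: ((-n) <<< 1) - 1

  # Variable-length encoding: 7 bits per byte, least significant group first,
  # with the high bit set on every byte except the last one.
  def varint(n) when n < 128, do: <<n>>

  def varint(n) do
    byte = (n &&& 0x7F) ||| 0x80
    <<byte>> <> varint(n >>> 7)
  end

  # Avro int and long share the same wire format.
  def encode_long(n), do: n |> zigzag() |> varint()

  # A string is its byte length (encoded as a long) followed by its bytes.
  def encode_string(s), do: encode_long(byte_size(s)) <> s

  # Assumed field order: tac, from, to. A record adds no framing of its own.
  def encode_event(%{"tac" => tac, "from" => from, "to" => to}) do
    encode_long(tac) <> encode_string(from) <> encode_string(to)
  end
end
```

Under those assumptions, RawAvroSketch.encode_event/1 applied to the map from the example above produces the same 42 bytes: 176, 22 is the zigzag varint for 1432, and each 38 is the encoded length (19) of the timestamp string that follows it.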

get_avro_value(msg, name)

@spec get_avro_value(avro_record(), String.t()) :: {:ok, term()} | {:error, atom()}

Gets the value associated with a field name from a given Avro record.

Parameters

  • msg: an avro_record() reference, already decoded
  • name: the name of the field to look up, as a string

Returns

  • {:ok, term}: a term representing the value of the field in the Avro record
  • {:error, :field_not_found}: if the field does not exist in the Avro record
  • {:error, :not_a_record}: if the given reference is not an Avro record

Examples

iex> FastAvro.get_avro_value(msg, "Dest_TAC")
{:ok, "TAC: 1142"}

get_raw_value(avro_binary, schema, name)

@spec get_raw_value(binary(), schema(), String.t()) ::
  {:ok, term()} | {:error, atom()}

Gets the value associated with a field name from the given Avro data and schema.

Parameters

  • avro_binary: valid Avro data as a binary
  • schema: a schema() reference compatible with that Avro data.
  • name: the name of the field to look up, as a string

Returns

  • {:ok, term}: a term representing the value of the field in the Avro record
  • {:error, :field_not_found}: if the field does not exist in the Avro record
  • {:error, :not_a_record}: if the binary is not an Avro record
  • {:error, :incompatible_avro_schema}: if the schema is not compatible with the binary
  • {:error, :all_data_not_read}: if decoding did not consume the whole binary

Examples

iex> FastAvro.get_raw_value(avro_binary, schema, "Dest_TAC")
{:ok, "TAC: 1142"}

get_raw_values(avro_binary, schm, names)

@spec get_raw_values(binary(), schema(), [String.t()]) ::
  {:ok, map()} | {:error, atom()}

Gets the values associated with a list of field names from the given Avro data and schema.

Parameters

  • avro_binary: valid Avro data as a binary
  • schm: a schema() reference compatible with that Avro data.
  • names: a list of field names to look up, as strings

Returns

  • {:ok, map}: a map with the field names and values extracted from avro_binary
  • {:error, :not_a_record}: if avro_binary is not an Avro record
  • {:error, :field_not_found}: if a name in names is not in the schema

If any requested field does not exist in the Avro record, the call returns {:error, :field_not_found}.

Examples

iex> FastAvro.get_raw_values(avro_data, schema, [
  "Dest_TAC",
  "Event_Start",
  "Event_Stop"
])
{:ok, %{
  "Dest_TAC" => "TAC: 1142",
  "Event_Start" => "20200914 18:03:03.174",
  "Event_Stop" => "20200914 18:03:03.224"
}}
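
The "extract some fields and re-encode with a different schema" use case can be sketched by combining get_raw_values/3 with encode_avro_datum/2. The module SlimEvent, the rename table, and the target field names "tac", "from" and "to" are hypothetical. The sketch also assumes the target schema declares all three fields as strings, since the extracted values here are strings.

```elixir
defmodule SlimEvent do
  # Hypothetical mapping from source field names to target field names.
  @renames %{
    "Dest_TAC" => "tac",
    "Event_Start" => "from",
    "Event_Stop" => "to"
  }

  # Reads only the needed fields from the raw binary and encodes them with
  # `new_schema`. Error tuples from either FastAvro call are returned as is.
  def reencode(avro_binary, schema, new_schema) do
    with {:ok, values} <- FastAvro.get_raw_values(avro_binary, schema, Map.keys(@renames)) do
      values
      |> rename()
      |> FastAvro.encode_avro_datum(new_schema)
    end
  end

  # Pure helper: swaps each source key for its target name.
  def rename(values) do
    Map.new(values, fn {name, value} -> {Map.fetch!(@renames, name), value} end)
  end
end
```

Keeping the renaming step as a separate pure function makes it easy to check without any Avro data at hand.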

normalize_and_get_raw_value(avro_binary, schema, name)

@spec normalize_and_get_raw_value(binary(), schema(), String.t()) ::
  {:ok, {term(), binary()}} | {:error, atom()}

Gets the value associated with a field name from the given Avro data and schema, and normalizes the Avro data so that any unneeded bytes are removed.

Parameters

  • avro_binary: the Avro data to be read and normalized, as a binary
  • schema: a schema() reference compatible with that Avro data.
  • name: the name of the field to look up, as a string

Returns

  • {:ok, {term, binary}}: a tuple with the term representing the value of the field in the Avro record and the normalized binary for that record.
  • {:error, :field_not_found}: if the field does not exist in the Avro record
  • {:error, :not_a_record}: if the binary is not an Avro record
  • {:error, :incompatible_avro_schema}: if the schema is not compatible with the binary
  • {:error, :all_data_not_read}: if decoding did not consume the whole binary

Examples

iex> FastAvro.normalize_and_get_raw_value(avro_binary, schema, "Dest_TAC")
{:ok, {"TAC: 1142", <<...>>}}

normalized_avro_datum(avro_data, schema)

@spec normalized_avro_datum(binary(), schema()) ::
  {:ok, avro_record()} | {:error, atom()}

Decodes and normalizes Avro data given as a binary using the provided schema. Only raw datum encoding is supported: the binary must not carry a header, an embedded schema, or a fingerprint.

Parameters

  • avro_data: valid Avro data as a binary
  • schema: a schema() reference for a record definition compatible with the data.

Returns

  • {:ok, avro_record()}: when the data is successfully decoded
  • {:error, :incompatible_avro_schema}: when the schema cannot be used to decode the data
  • {:error, :all_data_not_read}: if decoding did not consume the whole binary

Examples

iex> FastAvro.normalized_avro_datum(avro_data, schema)
{:ok, #Reference<0.2887345315.2965241864.83696>}

read_schema(json)

@spec read_schema(String.t()) :: {:ok, schema()} | {:error, atom()}

This function parses and validates an Avro schema given as a JSON-encoded string.

It creates an internal representation of the schema, ready to be used with FastAvro.create_msg/2 or FastAvro.decode_avro_datum/2.

Parameters

  • json: a string containing the JSON-encoded schema definition.

Returns

  • {:ok, schema}
  • {:error, reason}

Examples

iex> {:ok, schm} = File.read!("bench/lte_202210.avsc") |> FastAvro.read_schema()
{:ok, #Reference<0.3029127103.749076481.152983>}

In order to interoperate with the rest of the module the schema must define a 'record' with only primitive 'string', 'int', 'long' and 'double' fields.

schema_fields(schema)

@spec schema_fields(schema()) :: list()

Returns the fields of a schema together with their types.

Parameters

  • schema: a schema() reference.

Returns

A map with field names and types as string binaries.

Examples

iex> FastAvro.schema_fields(schema)
%{
  "S1_Attach_Attempt" => "Int",
  "Report_Reason" => "Int",
  "Dest_Cell_Id" => "String",
  "NRN_llamante" => "String",
  "Dest_SAC" => "String"
}

This is useful if you need to introspect the schema.
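
One way to use this kind of introspection is to check, before processing a batch, that the schema really contains the fields you intend to read. The sketch below is a hypothetical helper (SchemaCheck is not part of FastAvro) and assumes schema_fields/1 returns a map keyed by field name, as in the example above.

```elixir
defmodule SchemaCheck do
  # Pure helper: `fields` is assumed to be the map of field names to type
  # names returned by FastAvro.schema_fields/1.
  def missing_fields(fields, wanted) when is_map(fields) do
    Enum.reject(wanted, &Map.has_key?(fields, &1))
  end

  # Returns :ok when every wanted field is declared in the schema, or an
  # error tuple listing the absent ones.
  def ensure_fields(schema, wanted) do
    case missing_fields(FastAvro.schema_fields(schema), wanted) do
      [] -> :ok
      missing -> {:error, {:missing_fields, missing}}
    end
  end
end
```

Running such a check once per schema avoids discovering a :field_not_found error only after many records have already been read.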

to_map(avro_record)

@spec to_map(avro_record()) :: map()

Converts an avro_record() reference into an Elixir map.

Parameters

  • avro_record: an avro_record() reference to convert.

Returns

An Elixir map with Avro field names as keys and Avro field values as values.

Examples

iex> FastAvro.to_map(msg)
%{
  "S1_Attach_Attempt" => 0,
  "Report_Reason" => 19,
  "Dest_Cell_Id" => "",
  "NRN_llamante" => "",
  "Dest_SAC" => "TAC: 1142"
}