Adding Custom Field Types

In this example we will build a simplified version of the IP address type to use in the defformat macro. Any data type can be added to BinFormat by adding a new implementation of the BinFormat.Field type and a macro to add it to the list of fields in defformat.

Like the standard ip_addr implementation, we will use the IP address format used by the Erlang :inet modules in the struct and a 32 bit integer in the binary. This example will ignore any options such as little endian support for simplicity but you can refer to the standard implementation to see how it can be handled.
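As a quick illustration of the two representations mentioned above, the following sketch uses only Erlang and Elixir built-ins, not BinFormat. It shows the four-element tuple produced by :inet and the same address packed into 32 bits. The variable names are chosen for this example only.

```elixir
# Illustration only: the :inet tuple form and its 32-bit binary encoding.
{:ok, ip} = :inet.parse_address(~c"192.168.1.1")

# The struct side of the field holds a tuple like this one.
{a, b, c, d} = ip

# The binary side holds the same four octets, one byte each.
packed = <<a, b, c, d>>

# Read as a single big-endian unsigned integer, the four bytes are 32 bits wide.
<<as_int::32>> = packed
```

The field type built in this guide never needs the integer form directly, because matching four single-byte segments covers the same 32 bits.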

Defining the Module

The first step is to define the module that will hold our implementation. This can be part of your existing application and can have any name, but it is easier if the last section of the module name is a camel case version of the macro you want to use in defformat. We will call the module for this example MyIPAddr and the macro my_ip_addr.

To start with, our module needs to contain a struct. Two fields are needed to build the IP address type: a name and a default. The name is used for generating variables and the default is used for the struct definition. We give nil as the default for both fields because we expect the values to be fully populated by our own code every time a struct is created.

We write our initial module code as follows:

defmodule MyIPAddr do
  defstruct name: nil, default: nil
end

Implementing the Field protocol

The defformat macro builds the patterns and functions by retrieving the AST snippets it needs through the BinFormat.Field protocol. For our new module to be useful we must define an implementation. Normally this can be done after the module definition in the same file as the module is quite short. After adding the protocol implementation the file should look like this:

defmodule MyIPAddr do
  defstruct name: nil, default: nil
end

defimpl BinFormat.Field, for: MyIPAddr do
end

There are five main functions in this protocol:

  • struct_definition
  • struct_match_pattern
  • struct_build_pattern
  • bin_match_pattern
  • bin_build_pattern

struct_definition

The struct definition is easy to implement because it is the same for almost every type. Either the field does not appear in the struct (for example a constant) and the function returns :undefined or it is declared with the supplied name and default.

BinFormat.FieldType.Util.standard_struct_def will return the correct code for normal field types. Under the hood it escapes the value of the default and returns name: default in a tuple tagged with :ok to indicate that the field exists.

The module argument is provided in case any variables or functions need to be accessed in their correct environment. That is normally not needed in the struct definition, so we will mark the argument with an underscore, but in the rest of the functions it will be used as an argument to Macro.var. See the Elixir documentation on macros for more information.

The code for struct_definition looks like this and goes in the defimpl’s do block:

def struct_definition(%MyIPAddr{name: name, default: default}, _module) do
  BinFormat.FieldType.Util.standard_struct_def(name, default)
end

struct_match_pattern

The struct_match_pattern returns the code needed to extract the data from a struct with pattern matching and store it in a variable. Like with the struct_definition function, there is a standard function in the BinFormat.FieldType.Util module but in this case we need to write our own. The standard function takes the name of the field and extracts the member of the struct with that name and stores it in a variable with (a prefixed version of) the same name.

The variable we are trying to match is a tuple with four terms and the name in the struct is exactly the same as the name supplied, so we need to match something like this:

address: {a, b, c, d}

Internally, the fields in a struct match are just a keyword list, so that becomes a tuple with the name as an atom on the left and our IP address pattern on the right:

{:address, {a, b, c, d}}
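The shape of this pair can be inspected with plain quote, independently of BinFormat. The sketch below quotes a map literal rather than a struct, because a struct literal would need a defined module; a struct literal wraps the same `%{}` node. The variable names are for illustration only.

```elixir
# Quote a map pattern and take it apart to see the keyword list inside.
{:%{}, _meta, pairs} = quote do: %{address: {a, b, c, d}}

# The fields are a keyword list: one {name, value_ast} pair per field.
[{:address, tuple_ast}] = pairs

# A four-element tuple is quoted as a :{} node holding its elements.
{:{}, _tuple_meta, vars} = tuple_ast

# Each element is a variable node; its first element is the variable name.
var_names = Enum.map(vars, &elem(&1, 0))
```

This is why the functions in this guide return a two-element tuple of the field name and the quoted value pattern.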

The four variables we are creating need to be picked up by the code that builds the binary later, so we need to give them unique but predictable names incorporating the field name and the supplied prefix (in this case "pfix_"):

{:address, {ip_a_pfix_address, ip_b_pfix_address, ip_c_pfix_address, 
              ip_d_pfix_address}}

The prefix argument should be inserted just before the user supplied part of any variable name used in the code snippet returned to allow BinFormat to avoid naming conflicts in generated code.

We now have the code we want to insert, so we need to quote it to return the AST. We have the base name as an atom and the prefix as a string, and we know the custom part we want to put at the start of each variable name, so we can build each name separately. We will define a helper function called var_name and add it to the defimpl block with defp to keep it private. The Elixir Macro module provides the var function to turn an atom into a variable reference, so we will build up an atom for the full name first and then return a variable that can be inserted directly into the code. The module argument is required for giving the variable the correct scope.

defp var_name(name, part, prefix, module) do
  # Order matters: the custom part, then the prefix, then the field name,
  # giving names such as ip_a_pfix_address.
  full_name = String.to_atom(part <> prefix <> Atom.to_string(name))
  Macro.var(full_name, module)
end
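The naming scheme can be checked outside BinFormat. The sketch below copies the helper into a throwaway module, here called VarNameSketch (a name made up for this example), as a public function so that it can be called directly. MyFormat is only used as a context atom and does not need to exist.

```elixir
defmodule VarNameSketch do
  # Public copy of the helper, for experimentation only.
  def var_name(name, part, prefix, module) do
    full_name = String.to_atom(part <> prefix <> Atom.to_string(name))
    Macro.var(full_name, module)
  end
end

# Macro.var/2 returns the AST node for a variable: {name, meta, context}.
var = VarNameSketch.var_name(:address, "ip_a_", "pfix_", MyFormat)

# Macro.to_string/1 renders the node as the source code it stands for.
rendered = Macro.to_string(var)
```

The context (the third element of the node) is what ties the variable to the module that defformat is generating code for.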

We can now build up our code block by inserting the names with unquote. This is all we need to build the implementation of struct_match_pattern. The result is returned inside a tuple tagged with :ok to indicate that code needs to be inserted.

def struct_match_pattern(%MyIPAddr{name: name}, module, prefix) do
  a_name = var_name(name, "ip_a_", prefix, module)
  b_name = var_name(name, "ip_b_", prefix, module)
  c_name = var_name(name, "ip_c_", prefix, module)
  d_name = var_name(name, "ip_d_", prefix, module)

  pattern = quote do
    {unquote(name), {unquote(a_name), unquote(b_name), unquote(c_name), 
                      unquote(d_name)}}
  end

  {:ok, pattern}
end

struct_build_pattern

The struct build pattern is like struct_match_pattern, but it is used when building structs from existing local variables defined by code in bin_match_pattern or struct_match_pattern. Normally the code snippet returned by this function will be inserted after bin_match_pattern but all returned code should use the same set of variables to make the field type future proof.

In this case the code snippet needed is the same as the one returned by struct_match_pattern as we will set up bin_match_pattern to initialise the same variables.

We don’t want to rewrite the code so we will copy the contents of the struct_match_pattern to a private helper function and call it from both struct_match_pattern and struct_build_pattern:

defp struct_pattern(%MyIPAddr{name: name}, module, prefix) do
  a_name = var_name(name, "ip_a_", prefix, module)
  b_name = var_name(name, "ip_b_", prefix, module)
  c_name = var_name(name, "ip_c_", prefix, module)
  d_name = var_name(name, "ip_d_", prefix, module)

  pattern = quote do
    {unquote(name), {unquote(a_name), unquote(b_name), unquote(c_name), 
                      unquote(d_name)}}
  end

  {:ok, pattern}
end

def struct_match_pattern(field, module, prefix) do
  struct_pattern(field, module, prefix)
end

def struct_build_pattern(field, module, prefix) do
  struct_pattern(field, module, prefix)
end
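Sharing one snippet works because the same quoted tuple is valid both as a pattern and as an expression. The following sketch demonstrates this with Code.eval_quoted, without using BinFormat. It assumes variables with a nil context so that they show up in the binding returned by the evaluation; the real field type uses the module context instead.

```elixir
# Four variables named like the ones in this guide, with a nil context.
vars =
  for part <- [:a, :b, :c, :d] do
    Macro.var(:"ip_#{part}_pfix_address", nil)
  end

# One quoted snippet: a four-element tuple of those variables.
pattern = quote do: {unquote_splicing(vars)}

# On the left of =, the snippet is a pattern that binds the variables.
match_ast =
  quote do
    unquote(pattern) = {127, 0, 0, 1}
  end

{_value, binding} = Code.eval_quoted(match_ast)

# Evaluated as an expression, the same snippet rebuilds the tuple.
{rebuilt, _binding} = Code.eval_quoted(pattern, binding)
```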

It is not always possible to use the same pattern for building and matching, particularly if any logic needs to be applied when building the value. For example, the built-in lookup field type uses the standard functions from BinFormat.FieldType.Util for matching but custom functions that generate a pattern incorporating the case statement for building the values.

We could have used the standard versions of the match functions and then used function calls to extract the relevant data from the tuple or binary in the build functions but it is better to use pattern matching where possible.

bin_match_pattern

The bin_match_pattern function is the equivalent of struct_match_pattern for matching against binaries. We will use the same approach.

The pattern we want to match (with a name of address and a prefix of "pfix_"):

<<... ip_a_pfix_address, ip_b_pfix_address, ip_c_pfix_address, ip_d_pfix_address, ...>>

As binary subexpressions are valid as terms in binary matches, this is equivalent to the following:

<<... <<ip_a_pfix_address, ip_b_pfix_address, ip_c_pfix_address, ip_d_pfix_address>>, ...>>

This is easier to generate so it is what we will use.
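The equivalence of the flat and nested forms can be tried directly in plain Elixir. The sample bytes and variable names below are made up for this example.

```elixir
# One leading byte, four address bytes, then two trailing bytes.
data = <<1, 10, 0, 0, 1, 2, 3>>

# Flat form: every segment listed directly in the outer pattern.
<<head, a, b, c, d, tail::binary>> = data

# Nested form: the four address bytes grouped in a sub-pattern.
<<head2, <<a2, b2, c2, d2>>, tail2::binary>> = data

flat = {head, {a, b, c, d}, tail}
nested = {head2, {a2, b2, c2, d2}, tail2}
```

Because the nested form is a self-contained expression, each field can return its own `<<...>>` snippet and leave it to the caller to place the snippets side by side.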

We can use the same var_name function as before, so the function becomes:

def bin_match_pattern(%MyIPAddr{name: name}, module, prefix) do
  a_name = var_name(name, "ip_a_", prefix, module)
  b_name = var_name(name, "ip_b_", prefix, module)
  c_name = var_name(name, "ip_c_", prefix, module)
  d_name = var_name(name, "ip_d_", prefix, module)

  pattern = quote do
    <<unquote(a_name), unquote(b_name), unquote(c_name), unquote(d_name)>>
  end

  {:ok, pattern}
end

As before, we generate the snippet with a quote expression and return it tagged with the atom :ok.

bin_build_pattern

This function generates the representation of the binary when the variables are already declared in a match pattern. Like struct_build_pattern, we can reuse the bin_match_pattern function for bin_build_pattern by putting the logic in a private function.

defp bin_pattern(%MyIPAddr{name: name}, module, prefix) do
  a_name = var_name(name, "ip_a_", prefix, module)
  b_name = var_name(name, "ip_b_", prefix, module)
  c_name = var_name(name, "ip_c_", prefix, module)
  d_name = var_name(name, "ip_d_", prefix, module)

  pattern = quote do
    <<unquote(a_name), unquote(b_name), unquote(c_name), unquote(d_name)>>
  end

  {:ok, pattern}
end

def bin_match_pattern(field, module, prefix) do
  bin_pattern(field, module, prefix)
end

def bin_build_pattern(field, module, prefix) do
  bin_pattern(field, module, prefix)
end
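The reason both sides must agree on variable names is that the generated code matches one representation and builds the other from the same variables. The sketch below imitates that idea with a hand-written module; it is not taken from BinFormat's source, and RoundTripSketch, encode and decode are names invented for this example.

```elixir
defmodule RoundTripSketch do
  # The four shared variables, scoped to this module.
  vars =
    for part <- [:a, :b, :c, :d] do
      Macro.var(:"ip_#{part}_pfix_address", __MODULE__)
    end

  # The struct-side snippet (just the tuple here) and the binary-side snippet.
  tuple_pattern = quote do: {unquote_splicing(vars)}
  bin_pattern = quote do: <<unquote_splicing(vars)>>

  # Match the tuple, build the binary from the variables it bound.
  def encode(unquote(tuple_pattern)), do: unquote(bin_pattern)

  # Match the binary, build the tuple from the variables it bound.
  def decode(unquote(bin_pattern)), do: unquote(tuple_pattern)
end

encoded = RoundTripSketch.encode({127, 0, 0, 1})
decoded = RoundTripSketch.decode(<<192, 168, 1, 1>>)
```

If either snippet used a different naming scheme, the build side would refer to variables that the match side never bound and the module would not compile.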

defformat Body Macro

We now have a working implementation of the BinFormat.Field protocol, but to use it we need to be able to add instances to the field list. To do this we need to define a macro we can call in the body of defformat.

The macro generates a quoted snippet of code that builds an instance of the struct from its arguments and passes it to the BinFormat.FieldType.Util.add_field/1 function, which generates the correct code to insert into the defformat block. A quote block containing the struct literal with each field set to the relevant unquoted argument will generate the correct snippet. The quote block can be piped directly into the add_field function. Passing the struct to add_field without quoting it will not work.

defmacro my_ip_addr(name, default) do
  quote do
    %MyIPAddr{name: unquote(name), default: unquote(default)}
  end
  |> BinFormat.FieldType.Util.add_field()
end

The arguments to the macro should match the fields in the struct. Sometimes a module may have more than one macro. For example, the built-in types share an implementation, but integer :a, 2, 8 is easier to understand than builtin :integer, :a, 2, 8. If it makes sense to put this kind of logic in your macros, you should.

The macro can be defined anywhere but it should normally go in the same module as the struct.

Using the New Type

The definition of our new type is now complete and we are ready to use it in defformat. The macro will not be picked up automatically but it can be referred to by its full name.

defmodule MyFormat do
  use BinFormat
  
  defformat do
    MyIPAddr.my_ip_addr :address, {127, 0, 0, 1}
  end
end

You can import the module where it is defined to access it more easily if needed.

defmodule MyFormat do
  use BinFormat

  defformat do
    import MyIPAddr

    my_ip_addr :address1, {127, 0, 0, 1}
    my_ip_addr :address2, {192, 168, 1, 1}
  end
end

For more advanced uses the defformat macro can be wrapped by a macro that imports the needed modules at the start of the do block.

Conclusion

We have now built and used a simplified version of the standard ip_addr type. You can take a look at the source code of the standard version on GitHub to see how the little endian option is implemented, and at the other types for ideas on how to implement what you need.

If you have any problems following this guide or getting your types to work, please raise an issue on this project’s GitHub Repository.