BitwiseIp (bitwise_ip v1.1.0)

A struct representing an IP address encoded as an integer.

The Internet Protocol defines computer network addresses using fixed-width integers, which are efficient for both transmission and the implementation of logic using bitwise operations. IPv4 uses 32-bit integers, providing a space of 4,294,967,296 unique addresses. Due to the growing size of the internet, IPv6 uses 128-bit integers to provide an absurdly large address space of 2^128 (roughly 3.4 × 10^38) addresses.

These integers, however, are hard for humans to read. Therefore, we've adopted customary notations that are a little easier to digest. IPv4 uses a dotted octet notation, where each of the four bytes is written in decimal and separated by a period (.), as in 127.0.0.1. IPv6 is similar, but writes each of eight 16-bit hextets in hexadecimal, separated by a colon (:), as in a:1:b:2:c:3:d:4.
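To make the relationship between the notation and the underlying integer concrete, here is a minimal sketch that folds the parts of an address into one number. The module name NotationSketch and its to_integer/2 function are made up for illustration and are not part of BitwiseIp.

```elixir
defmodule NotationSketch do
  import Bitwise

  # Hypothetical helper: folds a list of fixed-width parts (most significant
  # first) into one integer. `width` is the number of bits per part:
  # 8 for IPv4 octets, 16 for IPv6 hextets.
  def to_integer(parts, width) do
    Enum.reduce(parts, 0, fn part, acc -> (acc <<< width) ||| part end)
  end
end

# 127.0.0.1 as a 32-bit integer
NotationSketch.to_integer([127, 0, 0, 1], 8)
# => 2130706433

# a:1:b:2:c:3:d:4 as a 128-bit integer, shown in hexadecimal
NotationSketch.to_integer([0xA, 1, 0xB, 2, 0xC, 3, 0xD, 4], 16)
|> Integer.to_string(16)
# => "A0001000B0002000C0003000D0004"
```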

As such, representations for IP addresses in modern software have drifted away from fixed-width integers. :inet represents IP addresses as tuples like {127, 0, 0, 1} for IPv4 and {0xA, 1, 0xB, 2, 0xC, 3, 0xD, 4} for IPv6. These are less efficient in both the space to store the addresses and the time it takes to perform various operations. For example, whereas comparing two 32-bit IPv4 addresses is typically one machine instruction, comparing two tuples involves memory indirection for the tuple layout and 4 separate integer comparisons. This could be even worse if you represent IPs as strings in their human-readable format.
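For reference, the tuple representation can be seen by parsing the two example addresses with :inet.parse_address/1 from the Erlang standard library. This only illustrates the shapes involved; it makes no claim about relative performance.

```elixir
# :inet works with charlists, so convert the Elixir strings first.
{:ok, v4} = :inet.parse_address(String.to_charlist("127.0.0.1"))
{:ok, v6} = :inet.parse_address(String.to_charlist("a:1:b:2:c:3:d:4"))

v4
# => {127, 0, 0, 1}

v6
# => {10, 1, 11, 2, 12, 3, 13, 4}

# Comparing tuples means comparing their elements one by one,
# whereas the integer encoding is compared as a single number.
v4 == {127, 0, 0, 1}
# => true
2130706433 == 2130706433
# => true
```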

The difference is probably negligible for your application. In fact, Elixir & Erlang don't have great support for fixed-width integer representations (see t/0 for details). But in the interest of getting back to basics, BitwiseIp provides the missing interface for manipulating IP addresses as the integers they were designed to be. This makes certain logic much easier to express and improves micro-benchmarks compared to tuple-based libraries (for whatever that's worth). The most useful functionality is in BitwiseIp.Block, which represents a CIDR block. However, BitwiseIp is the fundamental structure that BitwiseIp.Block is built on.
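As an illustration of the kind of logic that becomes easier with integers, the following sketch checks whether an IPv4 address falls inside a CIDR block using a bitmask. It works on plain integers; MaskSketch and member_v4?/3 are hypothetical names, and this is not necessarily how BitwiseIp.Block implements membership.

```elixir
defmodule MaskSketch do
  import Bitwise

  # Hypothetical helper: true when the 32-bit integer `addr` lies inside the
  # block whose 32-bit `prefix` has `len` leading network bits.
  def member_v4?(addr, prefix, len) when len in 0..32 do
    # A mask with `len` leading ones, e.g. len = 8 gives 0xFF000000.
    mask = (0xFFFFFFFF <<< (32 - len)) &&& 0xFFFFFFFF
    (addr &&& mask) == (prefix &&& mask)
  end
end

# Is 127.0.0.1 (2130706433) inside 127.0.0.0/8 (prefix 2130706432)?
MaskSketch.member_v4?(2130706433, 2130706432, 8)
# => true

# 128.0.0.1 (2147483649) is outside that block.
MaskSketch.member_v4?(2147483649, 2130706432, 8)
# => false
```

With tuples, the same check would need to handle each octet separately, including prefix lengths that do not fall on an octet boundary.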

Summary

Types

t()

An integer-encoded IP address.

v4()

An IPv4 address.

v6()

An IPv6 address.

Functions

decode/1

Decodes a bitwise IP into an :inet-style tuple.

encode/1

Encodes an :inet-style tuple as a bitwise IP.

parse/1

Parses a string into a bitwise IP.

parse!/1

An error-raising variant of parse/1.

Types

@type t() :: v4() | v6()

An integer-encoded IP address.

This type takes on two different shapes depending on the IP protocol. The supported protocols are IPv4 and IPv6.

Normally, the distinction would be down to the number of bits in a fixed-width integer representation. However, the Erlang VM doesn't support fixed-width integers, so there's no way to tell IPv4 addresses apart from IPv6 addresses using just a number. Therefore, this type is a struct with two fields:

  • :proto - the protocol, either :v4 or :v6
  • :addr - the integer encoding of the address
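The ambiguity can be sketched with a stand-in struct that has the same two fields. IpSketch and bits/1 are hypothetical names used only for this illustration; the real struct is BitwiseIp.

```elixir
defmodule IpSketch do
  # Hypothetical stand-in with the same two fields as described above.
  defstruct [:proto, :addr]

  # Width of the address space in bits, chosen by the protocol tag.
  def bits(%IpSketch{proto: :v4}), do: 32
  def bits(%IpSketch{proto: :v6}), do: 128
end

# 0.0.0.1 and ::1 both encode to the integer 1.
v4 = %IpSketch{proto: :v4, addr: 1}
v6 = %IpSketch{proto: :v6, addr: 1}

# The numbers alone are indistinguishable...
v4.addr == v6.addr
# => true

# ...but the :proto field keeps the two addresses apart.
v4 == v6
# => false
IpSketch.bits(v6)
# => 128
```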

But again, the VM does not support fixed-width integers for the :addr. In the Erlang runtime system, the smallest unit of memory is a word: 4 bytes on a 32-bit architecture, 8 bytes on a 64-bit architecture. Data is stored using tagged pointers, where one word has 4 bits reserved as a tag enumerating type information. One pattern of 4 bits says "I'm a float", another pattern says "I'm an integer", and so on. When the data is small enough to fit in the remaining bits of the word (28 bits or 60 bits, depending on the architecture), it is stored as an immediate value. Otherwise, it is boxed and the word instead contains a pointer to a section of memory on the heap, which can basically be arbitrarily large. Read more in A staged tag scheme for Erlang by Mikael Pettersson.

What this means for us is that :addr may or may not spill onto the heap. On a 32-bit machine, only IP addresses in the range of 0 to 2^28 fit as immediate values. That is at most 2^28 of the 2^32 IPv4 addresses, or one sixteenth of the IPv4 range, and a vanishingly small portion of the IPv6 range. 64-bit machines have 60 bits to play with, which comfortably fits any IPv4 address, but still requires boxing for all but the smallest IPv6 addresses. According to the Erlang efficiency guide, large integers are stored across at least 3 words. What's more, because we have to distinguish between integers using the struct with the :proto field, each IP address requires an additional map allocation, which carries some overhead.
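One way to observe the difference between immediate and boxed integers is :erts_debug.flat_size/1, which reports how many heap words a term occupies. Note that :erts_debug is an internal debugging module rather than a stable API, so treat this as an exploratory sketch; the exact numbers depend on the word size of the VM.

```elixir
import Bitwise

# A small integer is an immediate value and needs no heap space.
:erts_debug.flat_size(1)
# => 0

# The largest IPv6 address cannot fit in a single word, so it is boxed.
max_v6 = (1 <<< 128) - 1
:erts_debug.flat_size(max_v6)
# On a 64-bit VM this is expected to be 3 words:
# one header word plus two 64-bit words holding the digits.
```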

So this isn't going to be a maximally compact representation of an IP address. Such a thing isn't really possible on the Erlang VM. However, storing the bulk of it as a single integer still lets us perform efficient bitwise operations with less overhead than, say, :inet-style tuples of multiple integers.

@type v4() :: %BitwiseIp{addr: integer(), proto: :v4}

An IPv4 address.

The :addr is an unsigned integer between 0 and 2^32 - 1. See t/0 for discussion about the in-memory representation.

@type v6() :: %BitwiseIp{addr: integer(), proto: :v6}

An IPv6 address.

The :addr is an unsigned integer between 0 and 2^128 - 1. See t/0 for discussion about the in-memory representation.

Functions

@spec decode(v4()) :: :inet.ip4_address()
@spec decode(v6()) :: :inet.ip6_address()

Decodes a bitwise IP into an :inet-style tuple.

The Erlang standard library represents IP addresses as tuples of integers: 4 octet values for IPv4, 8 hextet values for IPv6. This function decodes the single number from a BitwiseIp struct into its constituent parts. This can be undone with encode/1.

Beware of redundant usage in performance-critical paths. Because of the overhead in decoding the integer, excessive translation back & forth between the formats may outweigh any benefits gained from other operations on the single-integer representation.
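To give an idea of what the decoding step involves, here is a minimal sketch using binary pattern matching on plain integers. DecodeSketch is a hypothetical module; the library's actual implementation may differ.

```elixir
defmodule DecodeSketch do
  # Splits a 32-bit integer into four 8-bit octets.
  def v4(addr) do
    <<a, b, c, d>> = <<addr::32>>
    {a, b, c, d}
  end

  # Splits a 128-bit integer into eight 16-bit hextets.
  def v6(addr) do
    <<a::16, b::16, c::16, d::16, e::16, f::16, g::16, h::16>> = <<addr::128>>
    {a, b, c, d, e, f, g, h}
  end
end

DecodeSketch.v4(2130706433)
# => {127, 0, 0, 1}

DecodeSketch.v6(1)
# => {0, 0, 0, 0, 0, 0, 0, 1}
```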

Examples

iex> BitwiseIp.decode(%BitwiseIp{proto: :v4, addr: 2130706433})
{127, 0, 0, 1}

iex> BitwiseIp.decode(%BitwiseIp{proto: :v6, addr: 1})
{0, 0, 0, 0, 0, 0, 0, 1}

@spec encode(:inet.ip4_address()) :: v4()
@spec encode(:inet.ip6_address()) :: v6()

Encodes an :inet-style tuple as a bitwise IP.

The Erlang standard library represents IP addresses as tuples of integers: 4 octet values for IPv4, 8 hextet values for IPv6. This function encodes the separate values as a single number, which gets wrapped into a BitwiseIp struct. This can be undone with decode/1.

Beware of redundant usage in performance-critical paths. Because of the overhead in encoding the integer, excessive translation back & forth between the formats may outweigh any benefits gained from other operations on the single-integer representation.
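The encoding step can be sketched in the same spirit, by packing the tuple elements into a binary and reading it back as one integer. EncodeSketch is a hypothetical module that returns the bare integer rather than a BitwiseIp struct; the library's actual implementation may differ.

```elixir
defmodule EncodeSketch do
  # Packs four 8-bit octets into a single 32-bit integer.
  def v4({a, b, c, d}) do
    <<addr::32>> = <<a, b, c, d>>
    addr
  end

  # Packs eight 16-bit hextets into a single 128-bit integer.
  def v6({a, b, c, d, e, f, g, h}) do
    <<addr::128>> = <<a::16, b::16, c::16, d::16, e::16, f::16, g::16, h::16>>
    addr
  end
end

EncodeSketch.v4({127, 0, 0, 1})
# => 2130706433

EncodeSketch.v6({0, 0, 0, 0, 0, 0, 0, 1})
# => 1
```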

Examples

iex> BitwiseIp.encode({127, 0, 0, 1})
%BitwiseIp{proto: :v4, addr: 2130706433}

iex> BitwiseIp.encode({0, 0, 0, 0, 0, 0, 0, 1})
%BitwiseIp{proto: :v6, addr: 1}

@spec parse(String.t()) :: {:ok, t()} | {:error, String.t()}

Parses a string into a bitwise IP.

This function parses IPv4 and IPv6 strings in their respective notations and produces an encoded BitwiseIp struct. This is done in an error-safe way by returning a tagged tuple. To raise an error, use parse!/1 instead.

BitwiseIp implements the String.Chars protocol, so parsing can be undone using to_string/1.
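The tagged-tuple contract lets callers branch on the result instead of rescuing an exception. The sketch below mirrors that contract with a hypothetical ParseSketch module built on :inet.parse_address/1; it returns an :inet-style tuple rather than a BitwiseIp struct, and its error message format is copied from the examples in this section.

```elixir
defmodule ParseSketch do
  # Hypothetical stand-in for the {:ok, value} | {:error, message} contract.
  def parse(string) when is_binary(string) do
    case :inet.parse_address(String.to_charlist(string)) do
      {:ok, ip} -> {:ok, ip}
      {:error, _reason} -> {:error, "Invalid IP address #{inspect(string)}"}
    end
  end

  # A caller handles both outcomes explicitly.
  def describe(string) do
    case parse(string) do
      {:ok, ip} -> "parsed #{inspect(ip)}"
      {:error, message} -> "rejected: #{message}"
    end
  end
end

ParseSketch.describe("::1")
# => "parsed {0, 0, 0, 0, 0, 0, 0, 1}"

ParseSketch.describe("not an ip")
# => "rejected: Invalid IP address \"not an ip\""
```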

Examples

iex> BitwiseIp.parse("127.0.0.1")
{:ok, %BitwiseIp{proto: :v4, addr: 2130706433}}

iex> BitwiseIp.parse("::1")
{:ok, %BitwiseIp{proto: :v6, addr: 1}}

iex> BitwiseIp.parse("not an ip")
{:error, "Invalid IP address \"not an ip\""}

iex> BitwiseIp.parse("192.168.0.1") |> elem(1) |> to_string()
"192.168.0.1"

iex> BitwiseIp.parse("fc00::") |> elem(1) |> to_string()
"fc00::"

@spec parse!(String.t()) :: t()

An error-raising variant of parse/1.

This function parses IPv4 and IPv6 strings in their respective notations and produces an encoded BitwiseIp struct. If the string is invalid, it raises an ArgumentError.

BitwiseIp implements the String.Chars protocol, so parsing can be undone using to_string/1.
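A raising variant is commonly written as a thin wrapper around the tagged-tuple function. The following sketch shows that pattern with a hypothetical BangSketch module built on :inet.parse_address/1; it is an assumption about the general shape, not the library's source.

```elixir
defmodule BangSketch do
  # Hypothetical tagged-tuple parser, standing in for parse/1.
  def parse(string) when is_binary(string) do
    case :inet.parse_address(String.to_charlist(string)) do
      {:ok, ip} -> {:ok, ip}
      {:error, _reason} -> {:error, "Invalid IP address #{inspect(string)}"}
    end
  end

  # The raising variant unwraps the success case and raises on failure.
  def parse!(string) do
    case parse(string) do
      {:ok, ip} -> ip
      {:error, message} -> raise ArgumentError, message
    end
  end
end

# Callers that want to recover can rescue the exception.
try do
  BangSketch.parse!("not an ip")
rescue
  e in ArgumentError -> Exception.message(e)
end
# => "Invalid IP address \"not an ip\""
```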

Examples

iex> BitwiseIp.parse!("127.0.0.1")
%BitwiseIp{proto: :v4, addr: 2130706433}

iex> BitwiseIp.parse!("::1")
%BitwiseIp{proto: :v6, addr: 1}

iex> BitwiseIp.parse!("not an ip")
** (ArgumentError) Invalid IP address "not an ip"

iex> BitwiseIp.parse!("192.168.0.1") |> to_string()
"192.168.0.1"

iex> BitwiseIp.parse!("fc00::") |> to_string()
"fc00::"