BitwiseIp (bitwise_ip v1.1.0)
A struct representing an IP address encoded as an integer.
The Internet Protocol defines computer network addresses using fixed-width integers, which are efficient for both transmission and the implementation of logic using bitwise operations. IPv4 uses 32-bit integers, providing a space of 4,294,967,296 unique addresses. Due to the growing size of the internet, IPv6 uses 128-bit integers to provide an absurdly large address space.
These integers, however, are hard for humans to read. Therefore, we've adopted customary notations that are a little easier to digest. IPv4 uses a dotted octet notation, where each of the four bytes is written in decimal notation and separated by `.`, as in `127.0.0.1`. IPv6 is similar, but uses hexadecimal notation on each of eight hextets separated by `:`, as in `a:1:b:2:c:3:d:4`.
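As a small illustration of how the dotted notation relates to the underlying integer, the sketch below reinterprets the four octets of `127.0.0.1` as one 32-bit number using a binary pattern match. It relies only on core Elixir and is not taken from the library.

```elixir
# The four decimal octets are simply the four bytes of a 32-bit integer,
# most significant byte first.
<<addr::32>> = <<127, 0, 0, 1>>
IO.inspect(addr)
# => 2130706433

# Hexadecimal makes the byte boundaries visible: 7F 00 00 01.
IO.inspect(Integer.to_string(addr, 16))
# => "7F000001"
```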
As such, representations for IP addresses in modern software have drifted away from fixed-width integers. `:inet` represents IP addresses as tuples like `{127, 0, 0, 1}` for IPv4 and `{0xA, 1, 0xB, 2, 0xC, 3, 0xD, 4}` for IPv6. These are less efficient in both the space needed to store the addresses and the time it takes to perform various operations. For example, whereas comparing two 32-bit IPv4 addresses is typically one machine instruction, comparing two tuples involves memory indirection for the tuple layout and 4 separate integer comparisons. This could be even worse if you represent IPs as strings in their human-readable format.
The difference is probably negligible for your application. In fact, Elixir & Erlang don't have great support for fixed-width integer representations (see `t/0` for details). But in the interest of getting back to basics, `BitwiseIp` provides the missing interface for manipulating IP addresses as the integers they were designed to be. This makes certain logic much easier to express and improves micro-benchmarks compared to tuple-based libraries (for whatever that's worth). The most useful functionality is in `BitwiseIp.Block`, which represents a CIDR block. However, `BitwiseIp` is the fundamental structure that `BitwiseIp.Block` is built on.
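To give a feel for the kind of logic that becomes easier with integers, here is a minimal sketch of IPv4 CIDR membership using a bitmask. The module `CidrSketch` and its functions are hypothetical names invented for this illustration; they are not the `BitwiseIp.Block` API.

```elixir
defmodule CidrSketch do
  # Hypothetical helper module, not part of bitwise_ip.
  import Bitwise

  # Builds a 32-bit mask with the top `prefix_len` bits set.
  def mask(prefix_len) when prefix_len in 0..32 do
    0xFFFFFFFF &&& (0xFFFFFFFF <<< (32 - prefix_len))
  end

  # True when `addr` shares the network prefix of `network`.
  def member?(addr, network, prefix_len) do
    m = mask(prefix_len)
    (addr &&& m) == (network &&& m)
  end
end

# 127.0.0.0/8 is the integer 2130706432 with an 8-bit prefix.
IO.inspect(CidrSketch.member?(2130706433, 2130706432, 8))
# => true  (127.0.0.1)
IO.inspect(CidrSketch.member?(2147483649, 2130706432, 8))
# => false (128.0.0.1)
```

With tuples, the same check would need to compare elements one at a time and handle prefixes that end in the middle of an octet.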
Types
@type t() :: v4() | v6()
An integer-encoded IP address.
This type takes on two different shapes depending on the IP protocol. The supported protocols are IPv4 and IPv6.
Normally, the distinction would be down to the number of bits in a fixed-width integer representation. However, the Erlang VM doesn't support fixed-width integers, so there's no way to tell IPv4 addresses apart from IPv6 addresses using just a number. Therefore, this type is a struct with two fields:
* `:proto` - the protocol, either `:v4` or `:v6`
* `:addr` - the integer encoding of the address
But again, the VM does not support fixed-width integers for the `:addr`. In the Erlang runtime system, the smallest unit of memory is a word: 4 bytes on a 32-bit architecture, 8 bytes on a 64-bit architecture. Data is stored using tagged pointers, where one word has 4 bits reserved as a tag enumerating type information. One pattern of 4 bits says "I'm a float", another pattern says "I'm an integer", and so on. When the data is small enough to fit in the remaining bits of the word (28 bits or 60 bits, depending on the architecture), it is stored as an immediate value. Otherwise, it is boxed and the word instead contains a pointer to a section of memory on the heap, which can basically be arbitrarily large. Read more in "A staged tag scheme for Erlang" by Mikael Pettersson.
What this means for us is that `:addr` may or may not spill onto the heap. On a 32-bit machine, only IP addresses in the range of 0 to 2^28 - 1 fit as immediate values. This covers just one sixteenth of the IPv4 range (2^28 out of 2^32 addresses) and a vanishingly small portion of the IPv6 range. 64-bit machines have 60 bits to play with, which comfortably fits any IPv4 address but still requires boxing IPv6 addresses. According to the Erlang efficiency guide, large integers are stored across at least 3 words. What's more, because we have to distinguish between integers using the struct with the `:proto` field, each IP address requires an additional map allocation, which carries some overhead.
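One way to observe the difference between immediate and boxed values is to ask the VM how many heap words a term occupies. The sketch below uses `:erts_debug.flat_size/1`, an unsupported debugging function in Erlang/OTP, so treat it as an exploratory aid rather than a stable API. The exact numbers depend on the architecture, so no output is shown.

```elixir
# 127.0.0.1 as an integer; fits in 60 bits, so it should be an immediate
# value (0 heap words) on a 64-bit VM.
small = 2130706433

# An IPv6-sized integer that needs all 128 bits, so it has to be boxed.
large = Bitwise.bsl(1, 127)

# The :inet-style tuple always lives on the heap.
tuple = {127, 0, 0, 1}

# flat_size/1 reports the number of heap words used by a term.
IO.inspect(:erts_debug.flat_size(small), label: "IPv4-sized integer")
IO.inspect(:erts_debug.flat_size(large), label: "IPv6-sized integer")
IO.inspect(:erts_debug.flat_size(tuple), label: "IPv4 tuple")
```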
So this isn't going to be a maximally compact representation of an IP address. Such a thing isn't really possible on the Erlang VM. However, storing the bulk of it as a single integer still lets us perform efficient bitwise operations with less overhead than, say, `:inet`-style tuples of multiple integers.
@type v4() :: %BitwiseIp{addr: integer(), proto: :v4}
An IPv4 address.
The `:addr` is an unsigned integer between 0 and 2^32 - 1. See `t/0` for discussion about the in-memory representation.
@type v6() :: %BitwiseIp{addr: integer(), proto: :v6}
An IPv6 address.
The `:addr` is an unsigned integer between 0 and 2^128 - 1. See `t/0` for discussion about the in-memory representation.
Functions
@spec decode(v4()) :: :inet.ip4_address()
@spec decode(v6()) :: :inet.ip6_address()
Decodes a bitwise IP into an `:inet`-style tuple.
The Erlang standard library represents IP addresses as tuples of integers: 4 octet values for IPv4, 8 hextet values for IPv6. This function decodes the single number from a `BitwiseIp` struct into its constituent parts. This can be undone with `encode/1`.
Beware of redundant usage in performance-critical paths. Because of the overhead in decoding the integer, excessive translation back & forth between the formats may outweigh any benefits gained from other operations on the single-integer representation.
Examples
iex> BitwiseIp.decode(%BitwiseIp{proto: :v4, addr: 2130706433})
{127, 0, 0, 1}
iex> BitwiseIp.decode(%BitwiseIp{proto: :v6, addr: 1})
{0, 0, 0, 0, 0, 0, 0, 1}
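For intuition, the decoding step can be sketched with shifts and masks. This is an illustrative reimplementation under assumptions, not the library's source: `DecodeSketch` is a hypothetical module, and it takes a plain map with `:proto` and `:addr` keys rather than the `BitwiseIp` struct so that it runs without the package installed.

```elixir
defmodule DecodeSketch do
  # Hypothetical module for illustration only.
  import Bitwise

  # IPv4: four parts of 8 bits each. IPv6: eight parts of 16 bits each.
  def decode(%{proto: :v4, addr: addr}), do: split(addr, 8, 4)
  def decode(%{proto: :v6, addr: addr}), do: split(addr, 16, 8)

  # Extracts `count` parts of `width` bits, most significant part first.
  defp split(addr, width, count) do
    mask = (1 <<< width) - 1

    0..(count - 1)
    |> Enum.reverse()
    |> Enum.map(fn i -> (addr >>> (i * width)) &&& mask end)
    |> List.to_tuple()
  end
end

IO.inspect(DecodeSketch.decode(%{proto: :v4, addr: 2130706433}))
# => {127, 0, 0, 1}
```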
@spec encode(:inet.ip4_address()) :: v4()
@spec encode(:inet.ip6_address()) :: v6()
Encodes an `:inet`-style tuple as a bitwise IP.
The Erlang standard library represents IP addresses as tuples of integers: 4 octet values for IPv4, 8 hextet values for IPv6. This function encodes the separate values as a single number, which gets wrapped into a `BitwiseIp` struct. This can be undone with `decode/1`.
Beware of redundant usage in performance-critical paths. Because of the overhead in encoding the integer, excessive translation back & forth between the formats may outweigh any benefits gained from other operations on the single-integer representation.
Examples
iex> BitwiseIp.encode({127, 0, 0, 1})
%BitwiseIp{proto: :v4, addr: 2130706433}
iex> BitwiseIp.encode({0, 0, 0, 0, 0, 0, 0, 1})
%BitwiseIp{proto: :v6, addr: 1}
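The encoding direction can be sketched as a fold that shifts the accumulator left and ORs in each part. As with any illustration here, `EncodeSketch` is a hypothetical module, and it returns a plain map instead of the `BitwiseIp` struct so the example stays self-contained. How the library itself implements `encode/1` is not shown in this document.

```elixir
defmodule EncodeSketch do
  # Hypothetical module for illustration only.
  import Bitwise

  # A 4-tuple is treated as IPv4 octets, an 8-tuple as IPv6 hextets.
  def encode({_, _, _, _} = octets), do: %{proto: :v4, addr: pack(octets, 8)}

  def encode({_, _, _, _, _, _, _, _} = hextets),
    do: %{proto: :v6, addr: pack(hextets, 16)}

  # Shifts the running total left by `width` bits and ORs in the next part.
  defp pack(tuple, width) do
    tuple
    |> Tuple.to_list()
    |> Enum.reduce(0, fn part, acc -> (acc <<< width) ||| part end)
  end
end

IO.inspect(EncodeSketch.encode({127, 0, 0, 1}))
# => %{addr: 2130706433, proto: :v4}
```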
@spec parse(String.t()) :: {:ok, t()} | {:error, String.t()}
Parses a string into a bitwise IP.
This function parses IPv4 and IPv6 strings in their respective notations and produces an encoded `BitwiseIp` struct. This is done in an error-safe way by returning a tagged tuple. To raise an error, use `parse!/1` instead.
`BitwiseIp` implements the `String.Chars` protocol, so parsing can be undone using `to_string/1`.
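The tagged-tuple style of parsing can be approximated with the Erlang standard library alone. The sketch below wraps `:inet.parse_address/1` and stops at the `:inet`-style tuple instead of producing a `BitwiseIp` struct. `ParseSketch` is a hypothetical module, and whether the library parses strings this way internally is an assumption, not something this document states.

```elixir
defmodule ParseSketch do
  # Hypothetical module for illustration only.

  # Returns {:ok, tuple} for a valid IPv4 or IPv6 string and
  # {:error, message} otherwise, instead of raising.
  def parse(string) when is_binary(string) do
    case :inet.parse_address(String.to_charlist(string)) do
      {:ok, tuple} -> {:ok, tuple}
      {:error, :einval} -> {:error, "Invalid IP address #{inspect(string)}"}
    end
  end
end

IO.inspect(ParseSketch.parse("127.0.0.1"))
# => {:ok, {127, 0, 0, 1}}
IO.inspect(ParseSketch.parse("not an ip"))
# => {:error, "Invalid IP address \"not an ip\""}
```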
Examples
iex> BitwiseIp.parse("127.0.0.1")
{:ok, %BitwiseIp{proto: :v4, addr: 2130706433}}
iex> BitwiseIp.parse("::1")
{:ok, %BitwiseIp{proto: :v6, addr: 1}}
iex> BitwiseIp.parse("not an ip")
{:error, "Invalid IP address \"not an ip\""}
iex> BitwiseIp.parse("192.168.0.1") |> elem(1) |> to_string()
"192.168.0.1"
iex> BitwiseIp.parse("fc00::") |> elem(1) |> to_string()
"fc00::"
@spec parse!(String.t()) :: t()
An error-raising variant of `parse/1`.
This function parses IPv4 and IPv6 strings in their respective notations and produces an encoded `BitwiseIp` struct. If the string is invalid, it raises an `ArgumentError`.
`BitwiseIp` implements the `String.Chars` protocol, so parsing can be undone using `to_string/1`.
Examples
iex> BitwiseIp.parse!("127.0.0.1")
%BitwiseIp{proto: :v4, addr: 2130706433}
iex> BitwiseIp.parse!("::1")
%BitwiseIp{proto: :v6, addr: 1}
iex> BitwiseIp.parse!("not an ip")
** (ArgumentError) Invalid IP address "not an ip"
iex> BitwiseIp.parse!("192.168.0.1") |> to_string()
"192.168.0.1"
iex> BitwiseIp.parse!("fc00::") |> to_string()
"fc00::"