glazer (glazer v0.3.0)
Fast JSON encoding and decoding using the glaze C++ library.
By default JSON null is represented as the atom null. To change it
application-wide, set the null env key in your config:
{glazer, [{null, nil}]}.
See also https://github.com/stephenberry/glaze
Types
-type csv_decode_opt() :: {delimiter, char()} | headers | {keys, atom | existing_atom | binary}.
-type csv_decode_opts() :: [csv_decode_opt()].
CSV decode options:
{delimiter, Char} - field delimiter (default $,)
headers - treat the first row as column names and decode each subsequent row as a map keyed by those names, instead of returning every row as a list of fields
{keys, atom} - with headers, decode column names as atoms
{keys, existing_atom} - with headers, decode column names as existing atoms, falling back to binaries for unknown atoms
{keys, binary} - with headers, decode column names as binaries (default)
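As a rough illustration of how these options combine, the sketch below decodes a semicolon-delimited input that starts with a header row. The module name csv_opts_demo and the sample data are invented for this example; only csv_decode/2 and the options listed above come from this page, and the assumption that field values remain binaries under headers is not confirmed here.

```erlang
-module(csv_opts_demo).
-export([decode_people/0]).

%% Hypothetical sample input: ';' as the delimiter, first row holds column names.
decode_people() ->
    Csv = <<"name;age\nada;36\nlinus;54\n">>,
    %% headers        - decode each data row as a map keyed by the column names
    %% {keys, binary} - keep column names as binaries (the documented default)
    glazer:csv_decode(Csv, [{delimiter, $;}, headers, {keys, binary}]).
```

Switching to {keys, existing_atom} would avoid creating new atoms from untrusted column names while still yielding atom keys where the atoms already exist.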
-type csv_encode_opt() :: {delimiter, char()} | headers | {line_ending, lf | crlf}.
-type csv_encode_opts() :: [csv_encode_opt()].
CSV encode options:
{delimiter, Char} - field delimiter (default $,)
headers - input is a list of maps; the first map's keys become the header row, and subsequent maps are encoded as rows in that column order (missing keys produce empty fields)
{line_ending, lf | crlf} - line terminator (default crlf, per RFC 4180)
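A minimal sketch of the encode options in use. The module name csv_encode_demo and the sample rows are made up; whether map keys may be binaries (as assumed here) rather than atoms is not stated on this page, so treat that part as an assumption.

```erlang
-module(csv_encode_demo).
-export([encode_rows/0, encode_maps/0]).

%% Tab-separated output with bare LF line endings instead of the CRLF default.
encode_rows() ->
    Rows = [[<<"id">>, <<"note">>],
            [1, <<"plain">>],
            [2, <<"second row">>]],
    glazer:csv_encode(Rows, [{delimiter, $\t}, {line_ending, lf}]).

%% With headers, the first map's keys become the header row.
%% The second map lacks <<"note">>, which is documented to yield an empty field.
encode_maps() ->
    Maps = [#{<<"id">> => 1, <<"note">> => <<"first">>},
            #{<<"id">> => 2}],
    glazer:csv_encode(Maps, [headers]).
```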
-opaque csv_stream_decoder()
-type decode_opt() :: object_as_tuple | use_nil | {null_term, atom()} | {keys, atom | existing_atom | binary} | dedupe_keys.
-type decode_opts() :: [decode_opt()].
Decode options:
object_as_tuple - decode JSON objects as {[{K, V}]} proplists rather than maps
use_nil - use the atom nil for JSON null
{null_term, Atom} - use Atom for JSON null
{keys, atom} - decode object keys as atoms
{keys, existing_atom} - decode keys as existing atoms, fall back to binary
{keys, binary} - decode keys as binaries (default)
dedupe_keys - with object_as_tuple, eliminate duplicate object keys from the resulting proplist, keeping the last occurrence's value (and position). Has no effect when objects are decoded as maps (the default) or with {keys, atom | existing_atom}: a JSON object with duplicate keys is always deduped (last value wins) when decoded to a map, since maps cannot represent duplicate keys.
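The following sketch runs the same document through several option sets so the differences can be compared side by side. The module name json_decode_demo and the input are illustrative only; the comments restate the behaviour described above rather than output that was captured from a run, and they assume the application-wide null setting has not been changed.

```erlang
-module(json_decode_demo).
-export([run/0]).

run() ->
    %% Note the duplicate "id" key and the null value.
    Json = <<"{\"id\": 1, \"tag\": null, \"id\": 2}">>,
    %% Default: a map with binary keys; the last "id" wins, null stays the atom null.
    AsMap = glazer:json_decode(Json, []),
    %% use_nil: same shape, but JSON null becomes the atom nil.
    NilMap = glazer:json_decode(Json, [use_nil]),
    %% object_as_tuple: a {[{K, V}]} proplist, duplicates preserved.
    Proplist = glazer:json_decode(Json, [object_as_tuple]),
    %% dedupe_keys: only the last occurrence of "id" is kept in the proplist.
    Deduped = glazer:json_decode(Json, [object_as_tuple, dedupe_keys]),
    #{map => AsMap, nil_map => NilMap, proplist => Proplist, deduped => Deduped}.
```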
-type encode_opt() :: pretty | uescape | force_utf8 | use_nil | {null_term, atom()}.
-type encode_opts() :: [encode_opt()].
Encode options:
pretty - pretty-print the JSON output
uescape - escape non-ASCII characters as \uXXXX sequences
force_utf8 - fix invalid UTF-8 sequences before encoding
use_nil - encode the atom nil as JSON null
{null_term, Atom} - encode Atom as JSON null
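A small sketch of combining encode options. The module name json_encode_demo and the sample term are hypothetical; the choice of undefined as the null term is just an example of {null_term, Atom}.

```erlang
-module(json_encode_demo).
-export([run/0]).

run() ->
    Term = #{<<"city">> => <<"Zürich"/utf8>>, <<"zip">> => undefined},
    %% Compact output; the atom undefined is emitted as JSON null.
    Compact = glazer:json_encode(Term, [{null_term, undefined}]),
    %% Pretty-printed, with the non-ASCII character written as a \uXXXX escape.
    Pretty = glazer:json_encode(Term, [pretty, uescape, {null_term, undefined}]),
    {Compact, Pretty}.
```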
-opaque json_stream_decoder()
-type scan_state() :: tuple().
-type yaml_decode_opt() :: use_nil | {null_term, atom()} | {keys, atom | existing_atom | binary} | yaml_1_1_bools.
-type yaml_decode_opts() :: [yaml_decode_opt()].
YAML decode options:
use_nil - use the atom nil for YAML null/~/empty values
{null_term, Atom} - use Atom for YAML null/~/empty values
{keys, atom} - decode mapping keys as atoms
{keys, existing_atom} - decode mapping keys as existing atoms, fall back to binary
{keys, binary} - decode mapping keys as binaries (default)
yaml_1_1_bools - additionally treat yes/no/on/off (and case variants) as booleans, per the YAML 1.1 core schema. By default (YAML 1.2 core schema) only true/false are recognized as booleans.
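To make the effect of yaml_1_1_bools and the null options concrete, here is a sketch that decodes one document twice. The module name yaml_decode_demo and the input are invented; the comments paraphrase the option descriptions above and are not captured output.

```erlang
-module(yaml_decode_demo).
-export([run/0]).

run() ->
    Yaml = <<"enabled: yes\nowner: ~\n">>,
    %% Defaults: binary keys, "yes" is not treated as a boolean (YAML 1.2),
    %% and ~ decodes to the configured null term.
    Default = glazer:yaml_decode(Yaml, []),
    %% YAML 1.1 booleans, nil for null, atom keys.
    Legacy = glazer:yaml_decode(Yaml, [yaml_1_1_bools, use_nil, {keys, atom}]),
    {Default, Legacy}.
```

Because {keys, atom} creates atoms from input, {keys, existing_atom} is the safer choice for documents from untrusted sources.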
-type yaml_encode_opt() :: use_nil | {null_term, atom()}.
-type yaml_encode_opts() :: [yaml_encode_opt()].
YAML encode options:
use_nil - treat the atom nil as YAML null
{null_term, Atom} - treat Atom as YAML null
Functions
Decode a CSV binary or iolist to a list of rows.
By default each row is a list of binary fields. With the headers option,
the first row is used as column names and each subsequent row is decoded
as a map. Raises unterminated_quoted_field or duplicate_header on
invalid input.
-spec csv_decode(binary() | iolist(), csv_decode_opts()) -> [[binary()]] | [map()].
Decode a CSV binary or iolist to a list of rows, with options.
Raises an atom (unterminated_quoted_field or duplicate_header) on
invalid input.
Encode a list of rows to a CSV binary.
Each row is a list of fields (binaries, atoms, integers, or floats). Fields containing the delimiter, a double quote, or a line break are quoted per RFC 4180, with embedded quotes doubled.
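The quoting rule can be sketched in plain Erlang. This is an illustration of the RFC 4180 rule as described above, not glazer's implementation (which lives in the native library); the module and function names are hypothetical.

```erlang
-module(csv_quote_sketch).
-export([quote_field/2]).

%% Quote Field only if it contains the delimiter, a double quote, CR, or LF.
%% Embedded double quotes are doubled, then the whole field is wrapped in quotes.
quote_field(Field, Delim) when is_binary(Field), is_integer(Delim) ->
    case binary:match(Field, [<<Delim>>, <<"\"">>, <<"\r">>, <<"\n">>]) of
        nomatch ->
            Field;
        _Found ->
            Escaped = binary:replace(Field, <<"\"">>, <<"\"\"">>, [global]),
            <<"\"", Escaped/binary, "\"">>
    end.
```

For example, quote_field(<<"a,b">>, $,) yields <<"\"a,b\"">>, while a field without special characters is returned unchanged.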
-spec csv_encode([[term()]] | [map()], csv_encode_opts()) -> binary().
Encode a list of rows to a CSV binary, with options.
With the headers option, Data is a list of maps: the first map's keys
become the header row (in iteration order), and each map is encoded as a
row in that column order.
-spec csv_stream_decoder() -> csv_stream_decoder().
Create a new incremental decoder for feeding CSV in chunks (e.g. from a socket or file), useful when the whole input isn't available up front.
Each complete row is decoded as soon as its terminating line break is seen,
via csv_decode/2 on that single row. Only the row
boundary detection is incremental — a small byte-scanner tracks whether
the cursor is inside a quoted field across chunks, so that \n/\r\n
inside quoted fields doesn't end a row.
With the headers option, the first complete row is captured as the header
and used to decode every subsequent row as a map; no row is emitted for the
header itself.
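The row boundary detection described above can be approximated in a few lines of plain Erlang. This is a simplified sketch, not glazer's scanner: it rescans the unterminated remainder on each call instead of carrying the in-quote flag across chunks, and the module name csv_row_scan is hypothetical.

```erlang
-module(csv_row_scan).
-export([split_rows/1]).

%% Split Bin into complete raw rows plus the unterminated remainder.
%% A line break ends a row only when the cursor is outside a quoted field.
split_rows(Bin) when is_binary(Bin) ->
    split_rows(Bin, 0, 0, false, []).

split_rows(Bin, Start, Pos, _InQuotes, Acc) when Pos >= byte_size(Bin) ->
    Rest = binary:part(Bin, Start, byte_size(Bin) - Start),
    {lists:reverse(Acc), Rest};
split_rows(Bin, Start, Pos, InQuotes, Acc) ->
    case binary:at(Bin, Pos) of
        $" ->
            %% Toggling on every quote also handles doubled quotes ("").
            split_rows(Bin, Start, Pos + 1, not InQuotes, Acc);
        $\n when not InQuotes ->
            Row = strip_cr(binary:part(Bin, Start, Pos - Start)),
            split_rows(Bin, Pos + 1, Pos + 1, false, [Row | Acc]);
        _Other ->
            split_rows(Bin, Start, Pos + 1, InQuotes, Acc)
    end.

%% Drop a trailing CR so that CRLF-terminated rows match LF-terminated ones.
strip_cr(Row) ->
    Size = byte_size(Row),
    case Size > 0 andalso binary:last(Row) =:= $\r of
        true -> binary:part(Row, 0, Size - 1);
        false -> Row
    end.
```

A caller would prepend the returned remainder to the next chunk before scanning again; each raw row could then be handed to a whole-row decoder.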
Example
1> D0 = glazer:csv_stream_decoder(),
2> {Rows1, D1} = glazer:csv_stream_feed(D0, <<"a,b\n1,2\n3,">>),
3> Rows1.
[[<<"a">>,<<"b">>],[<<"1">>,<<"2">>]]
4> {Rows2, D2} = glazer:csv_stream_feed(D1, <<"4\n">>),
5> Rows2.
[[<<"3">>,<<"4">>]]
6> glazer:csv_stream_eof(D2).
{ok, []}
-spec csv_stream_decoder(csv_decode_opts()) -> csv_stream_decoder().
Create a new incremental CSV decoder, passing Opts through to every
csv_decode/2 call.
-spec csv_stream_eof(csv_stream_decoder()) -> {ok, [[binary()]] | [map()]} | {error, term()}.
Signal end-of-stream: decode any remaining buffered bytes as a final row (useful when the input doesn't end with a trailing line break).
Returns {ok, Rows} with zero or one trailing row, or {error, Reason} if
the remaining bytes don't form a valid row.
Example
1> D0 = glazer:csv_stream_decoder(),
2> {Rows1, D1} = glazer:csv_stream_feed(D0, <<"a,b\n1,2">>),
3> Rows1.
[[<<"a">>,<<"b">>]]
4> glazer:csv_stream_eof(D1).
{ok, [[<<"1">>,<<"2">>]]}
-spec csv_stream_feed(csv_stream_decoder(), binary() | iolist()) -> {[[binary()]] | [map()], csv_stream_decoder()}.
Feed a chunk of bytes into the decoder, returning any complete CSV rows found so far (in order) along with the updated decoder.
Raises the same exceptions as csv_decode/2 if a row that
the scanner deemed complete fails to decode.
Example
loop(Socket, D0) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Chunk} ->
            {Rows, D1} = glazer:csv_stream_feed(D0, Chunk),
            handle_rows(Rows),
            loop(Socket, D1);
        {error, closed} ->
            case glazer:csv_stream_eof(D0) of
                {ok, Trailing} -> handle_rows(Trailing);
                {error, Reason} -> handle_truncated_stream(Reason)
            end
    end.
Decode a CSV binary or iolist, returning {ok, Rows} or
{error, Reason} instead of raising, where Reason is
unterminated_quoted_field or duplicate_header.
-spec csv_try_decode(binary() | iolist(), csv_decode_opts()) -> {ok, [[binary()]] | [map()]} | {error, atom()}.
Decode a CSV binary or iolist with options, returning {ok, Rows} or
{error, Reason} instead of raising, where Reason is
unterminated_quoted_field or duplicate_header.
Decode a JSON number string to an integer.
Raises invalid_number_format on invalid input.
Encode an integer to its JSON string representation.
Raises badarg if Int is not an integer.
Decode a JSON binary or iolist to an Erlang term. JSON objects are returned as
maps (default). Raises {parse_error, Msg} on invalid input.
-spec json_decode(binary() | iolist(), decode_opts()) -> term().
Decode a JSON binary or iolist to an Erlang term with options.
Raises {parse_error, Reason} on invalid input.
Encode an Erlang term to a JSON binary.
-spec json_encode(term(), encode_opts()) -> binary().
Encode an Erlang term to a JSON binary with options.
Minify a JSON binary or iolist, removing all unnecessary whitespace.
Pretty-print a JSON binary or iolist with two-space indentation.
-spec json_scan(binary() | iolist()) -> {complete, non_neg_integer()} | {incomplete, scan_state()}.
Locate the end of the next complete top-level JSON value in Bin, without
decoding it.
Returns:
{complete, EndOffset} - a complete value spans binary:part(Bin, 0, EndOffset); the rest of Bin (if any) is left over for the next call
{incomplete, ScanState} - Bin doesn't yet contain a complete value; feed more data via json_scan/2 once it's available, passing the entire unconsumed remainder (this Bin, with new bytes appended) plus ScanState
This is the low-level primitive behind json_stream_feed/2;
most callers should use the stream_* API instead.
Example
Slicing off complete values from a buffer of concatenated JSON:
1> Buf0 = <<"{\"a\":1} {\"b\":2}">>,
2> {complete, End1} = glazer:json_scan(Buf0).
{complete, 7}
3> <<Val1:End1/binary, Buf1/binary>> = Buf0,
4> Val1.
<<"{\"a\":1}">>
5> Buf1.
<<" {\"b\":2}">>
6> {complete, End2} = glazer:json_scan(Buf1).
{complete, 8}
Resuming a scan once more bytes arrive:
1> {incomplete, S0} = glazer:json_scan(<<"{\"a\":">>).
{incomplete, {6,1,false,false,true,false}}
2> glazer:json_scan(<<"{\"a\":1}">>, S0).
{complete, 7}
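Building on the shell sessions above, a helper that slices every complete value off a buffer might look like the sketch below. The module name json_scan_demo is hypothetical, and the loop assumes that json_scan/1 reports {incomplete, _} once only whitespace or nothing is left, which this page does not state explicitly.

```erlang
-module(json_scan_demo).
-export([split_values/1]).

%% Slice every complete top-level value off the front of Buf.
%% Returns {RawValues, Rest}; Rest holds the bytes that do not yet form a value.
split_values(Buf) ->
    split_values(iolist_to_binary(Buf), []).

split_values(Buf, Acc) ->
    case glazer:json_scan(Buf) of
        {complete, End} ->
            <<Raw:End/binary, Rest/binary>> = Buf,
            split_values(Rest, [Raw | Acc]);
        {incomplete, _ScanState} ->
            %% The scan state is dropped here for brevity; keeping it and
            %% calling json_scan/2 later avoids rescanning Rest from the start.
            {lists:reverse(Acc), Buf}
    end.
```

The raw slices are still undecoded binaries, so each one can be passed to json_decode/2 or inspected further as needed.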
-spec json_scan(binary() | iolist(), scan_state()) -> {complete, non_neg_integer()} | {incomplete, scan_state()}.
Resume scanning Bin (the unconsumed remainder plus newly-appended bytes)
from ScanState.
-spec json_stream_decoder() -> json_stream_decoder().
Create a new incremental decoder for feeding JSON in chunks (e.g. from a socket or file), useful when a complete document isn't available up front or when a stream contains a sequence of concatenated/whitespace-separated JSON values (e.g. newline-delimited JSON).
Decoding itself is not incremental — each complete top-level value is
still decoded in a single pass via json_decode/2 using the
library's fast whole-buffer decoder. Only the boundary detection (finding
where one value ends and the next begins) is incremental, via a small
byte-scanner that tracks nesting/string state across chunks.
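To show what tracking nesting and string state involves, here is a deliberately simplified boundary scanner in plain Erlang. It is a sketch and not glazer's scanner: it handles only top-level objects and arrays (no bare scalars), does not resume from saved state, and does not validate the JSON. The module name json_boundary_sketch is hypothetical.

```erlang
-module(json_boundary_sketch).
-export([value_end/1]).

%% Return {complete, EndOffset} for the first top-level object or array in
%% Bin, or incomplete if its closing bracket has not arrived yet.
value_end(Bin) when is_binary(Bin) ->
    scan(Bin, 0, 0, outside).

scan(Bin, Pos, _Depth, _Mode) when Pos >= byte_size(Bin) ->
    incomplete;
scan(Bin, Pos, Depth, escaped) ->
    %% The byte after a backslash never terminates the string.
    scan(Bin, Pos + 1, Depth, in_string);
scan(Bin, Pos, Depth, in_string) ->
    case binary:at(Bin, Pos) of
        $\\ -> scan(Bin, Pos + 1, Depth, escaped);
        $" -> scan(Bin, Pos + 1, Depth, outside);
        _ -> scan(Bin, Pos + 1, Depth, in_string)
    end;
scan(Bin, Pos, Depth, outside) ->
    case binary:at(Bin, Pos) of
        $" ->
            scan(Bin, Pos + 1, Depth, in_string);
        C when C =:= ${; C =:= $[ ->
            scan(Bin, Pos + 1, Depth + 1, outside);
        C when (C =:= $} orelse C =:= $]), Depth =:= 1 ->
            %% Closing the outermost bracket completes the value.
            {complete, Pos + 1};
        C when C =:= $}; C =:= $] ->
            scan(Bin, Pos + 1, Depth - 1, outside);
        _ ->
            scan(Bin, Pos + 1, Depth, outside)
    end.
```

Brackets inside strings are ignored because the scanner only counts nesting while in the outside mode, which is the key property a chunk-spanning scanner has to preserve.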
Example
1> D0 = glazer:json_stream_decoder(),
2> {Vals1, D1} = glazer:json_stream_feed(D0, <<"{\"a\":1} {\"b\":">>),
3> Vals1.
[#{<<"a">> => 1}]
4> {Vals2, _D2} = glazer:json_stream_feed(D1, <<"2}">>),
5> Vals2.
[#{<<"b">> => 2}]
-spec json_stream_decoder(decode_opts()) -> json_stream_decoder().
Create a new incremental decoder, passing Opts through to every
json_decode/2 call.
-spec json_stream_eof(json_stream_decoder()) -> {ok, [term()]} | {error, term()}.
Signal end-of-stream: decode any remaining buffered bytes as a final value
(useful for a trailing bare scalar, e.g. a lone number or true/null,
which the scanner can't otherwise distinguish from a value that is still
arriving and was cut off mid-chunk).
Returns {ok, [Term]} with zero or one trailing value, or {error, Reason} if the remaining bytes don't form a complete value.
Example
1> D0 = glazer:json_stream_decoder(),
2> {Vals1, D1} = glazer:json_stream_feed(D0, <<"123">>),
3> Vals1.
[]
4> glazer:json_stream_eof(D1).
{ok, [123]}
A stream that ends mid-value (e.g. a dropped connection) yields an error instead of silently dropping the partial data:
1> D0 = glazer:json_stream_decoder(),
2> {Vals1, D1} = glazer:json_stream_feed(D0, <<"{\"a\":1, \"b\":">>),
3> Vals1.
[]
4> glazer:json_stream_eof(D1).
{error, _Reason}
-spec json_stream_feed(json_stream_decoder(), binary() | iolist()) -> {[term()], json_stream_decoder()}.
Feed a chunk of bytes into the decoder, returning any complete JSON values found so far (in order) along with the updated decoder.
Raises the same exceptions as json_decode/2 (e.g.
{parse_error, Reason}) if a value that the scanner deemed complete fails
to decode.
Example
Call json_stream_feed/2 for each chunk received from the source while more
data may still arrive, and json_stream_eof/1 once the source
is exhausted to flush any trailing value:
loop(Socket, D0) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Chunk} ->
            {Vals, D1} = glazer:json_stream_feed(D0, Chunk),
            handle_values(Vals),
            loop(Socket, D1);
        {error, closed} ->
            case glazer:json_stream_eof(D0) of
                {ok, Trailing} -> handle_values(Trailing);
                {error, Reason} -> handle_truncated_stream(Reason)
            end
    end.
The same decoder fits naturally into a gen_server driving an
active-mode socket: keep the json_stream_decoder() in the process state,
feed it from handle_info({tcp, ...}), and flush it on
{tcp_closed, ...}:
-module(json_conn).
-behaviour(gen_server).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {socket, decoder}).

%% The caller is expected to hand the socket over to the new process
%% with gen_tcp:controlling_process/2 so that it receives the tcp messages.
start_link(Socket) ->
    gen_server:start_link(?MODULE, Socket, []).

init(Socket) ->
    inet:setopts(Socket, [{active, once}]),
    {ok, #state{socket = Socket, decoder = glazer:json_stream_decoder()}}.

%% Feed each packet to the decoder, then re-arm the socket for the next one.
handle_info({tcp, Socket, Data}, #state{socket = Socket, decoder = D0} = State) ->
    {Vals, D1} = glazer:json_stream_feed(D0, Data),
    lists:foreach(fun handle_value/1, Vals),
    inet:setopts(Socket, [{active, once}]),
    {noreply, State#state{decoder = D1}};
%% Flush any trailing value once the peer closes the connection.
handle_info({tcp_closed, Socket}, #state{socket = Socket, decoder = D0} = State) ->
    case glazer:json_stream_eof(D0) of
        {ok, Trailing} -> lists:foreach(fun handle_value/1, Trailing);
        {error, Reason} -> handle_truncated_stream(Reason)
    end,
    {stop, normal, State};
handle_info({tcp_error, Socket, Reason}, #state{socket = Socket} = State) ->
    {stop, Reason, State}.

handle_call(_Request, _From, State) -> {reply, ok, State}.

handle_cast(_Request, State) -> {noreply, State}.

handle_value(Val) ->
    io:format("received: ~p~n", [Val]).

%% Placeholder: report a stream that ended in the middle of a value.
handle_truncated_stream(Reason) ->
    io:format("stream ended mid-value: ~p~n", [Reason]).
Decode a JSON binary or iolist, returning {ok, Term} or
{error, Reason} instead of raising.
-spec json_try_decode(binary() | iolist(), decode_opts()) -> {ok, term()} | {error, binary()}.
Decode a JSON binary or iolist with options, returning {ok, Term} or
{error, Reason} instead of raising.
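A typical use of the non-raising variant is validating untrusted input with a plain case expression. The sketch below is illustrative: the module name json_try_demo, the function parse_config/1, and the error tuples it returns are all hypothetical; only json_try_decode/2 and its {ok, Term} | {error, Reason} result shape come from this page.

```erlang
-module(json_try_demo).
-export([parse_config/1]).

%% Accept only JSON objects; turn every failure into a tagged error tuple.
parse_config(Bin) ->
    case glazer:json_try_decode(Bin, [{keys, existing_atom}]) of
        {ok, Config} when is_map(Config) ->
            {ok, Config};
        {ok, Other} ->
            %% Valid JSON, but not an object (e.g. a number or an array).
            {error, {not_an_object, Other}};
        {error, Reason} ->
            {error, {invalid_json, Reason}}
    end.
```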
Decode a JSON number string to an integer, returning {ok, Int} or
{error, invalid_number_format} instead of raising.
Decode a YAML binary or iolist to an Erlang term. YAML mappings are returned
as maps (default). Raises {parse_error, Reason} on invalid input.
-spec yaml_decode(binary() | iolist(), yaml_decode_opts()) -> term().
Decode a YAML binary or iolist to an Erlang term with options.
Raises {parse_error, Msg} on invalid input.
Encode an Erlang term to a YAML binary in block style (2-space indentation, sequences at the same indentation as the mapping key that owns them).
-spec yaml_encode(term(), yaml_encode_opts()) -> binary().
Encode an Erlang term to a YAML binary in block style with options.
Decode a YAML binary or iolist, returning {ok, Term} or
{error, Msg} instead of raising.
-spec yaml_try_decode(binary() | iolist(), yaml_decode_opts()) -> {ok, term()} | {error, binary()}.
Decode a YAML binary or iolist with options, returning {ok, Term} or
{error, Msg} instead of raising.