glazer (glazer v0.3.2)
Fast JSON encoding and decoding using the glaze C++ library.
By default, JSON null is represented as the atom null. To change it
application-wide, set the null env key in your config:
{glazer, [{null, nil}]}.
See also https://github.com/stephenberry/glaze
Types
-type csv_decode_opt() :: {delimiter, char()} | headers | {keys, atom | existing_atom | binary}.
-type csv_decode_opts() :: [csv_decode_opt()].
CSV decode options:
{delimiter, Char} - field delimiter (default $,)
headers - treat the first row as column names and decode each subsequent row as a map keyed by those names, instead of returning every row as a list of fields
{keys, atom} - with headers, decode column names as atoms
{keys, existing_atom} - with headers, decode column names as existing atoms, falling back to binaries for unknown atoms
{keys, binary} - with headers, decode column names as binaries (default)
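As a rough illustration of how these options combine, the sketch below decodes a semicolon-delimited document with a header row. The module name, function name and sample data are made up for the example; only glazer:csv_decode/2 and the options listed above come from this page.

```erlang
-module(csv_decode_opts_example).
-export([decode_people/0]).

%% Hypothetical helper: decodes an inline sample that has a header row.
decode_people() ->
    Csv = <<"name;age\nalice;30\nbob;25\n">>,
    %% {delimiter, $;}        - fields are separated by semicolons
    %% headers                - the first row supplies the map keys
    %% {keys, existing_atom}  - keys become atoms only if the atom already
    %%                          exists, otherwise they stay binaries
    glazer:csv_decode(Csv, [{delimiter, $;}, headers, {keys, existing_atom}]).
```

Going by the option descriptions, the call should return one map per data row, with no row emitted for the header itself.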
-type csv_encode_opt() :: {delimiter, char()} | headers | {line_ending, lf | crlf}.
-type csv_encode_opts() :: [csv_encode_opt()].
CSV encode options:
{delimiter, Char} - field delimiter (default $,)
headers - input is a list of maps; the first map's keys become the header row, and subsequent maps are encoded as rows in that column order (missing keys produce empty fields)
{line_ending, lf | crlf} - line terminator (default crlf, per RFC 4180)
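A minimal sketch of the encode options in use, assuming that binary map keys are accepted as header names (this page does not say which key types are allowed). The module and function names are hypothetical.

```erlang
-module(csv_encode_opts_example).
-export([encode_people/0]).

%% Hypothetical helper: encodes two maps as tab-separated rows.
encode_people() ->
    Rows = [
        #{<<"name">> => <<"alice">>, <<"age">> => 30},
        %% No <<"age">> key here; per the headers option this should
        %% produce an empty field in that column.
        #{<<"name">> => <<"bob">>}
    ],
    %% Tab-delimited output with LF line endings instead of the CRLF default.
    glazer:csv_encode(Rows, [headers, {delimiter, $\t}, {line_ending, lf}]).
```

Column order follows the first map's key iteration order, so callers that need a fixed column order may prefer to pass rows as lists and emit the header row themselves.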
-opaque csv_stream_decoder()
-type decode_opt() :: object_as_tuple | use_nil | {null_term, atom()} | {keys, atom | existing_atom | binary} | dedupe_keys.
-type decode_opts() :: [decode_opt()].
Decode options:
object_as_tuple - decode JSON objects as {[{K, V}]} proplists rather than maps
use_nil - use the atom nil for JSON null
{null_term, Atom} - use Atom for JSON null
{keys, atom} - decode object keys as atoms
{keys, existing_atom} - decode keys as existing atoms, fall back to binary
{keys, binary} - decode keys as binaries (default)
dedupe_keys - with object_as_tuple, eliminate duplicate object keys from the resulting proplist, keeping the last occurrence's value (and position). Has no effect when objects are decoded as maps (the default) or with {keys, atom | existing_atom}: a JSON object with duplicate keys is always deduped (last value wins) when decoded to a map, since maps cannot represent duplicate keys.
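The following sketch decodes the same document twice to contrast the map and proplist representations. It relies only on json_decode/2 and the options above; the module name, function name and input are invented for illustration.

```erlang
-module(json_decode_opts_example).
-export([decode_both/0]).

%% Hypothetical helper: decodes one document with two option sets.
decode_both() ->
    Json = <<"{\"id\": 1, \"tag\": null}">>,
    %% Map with atom keys; JSON null becomes the atom nil.
    AsMap = glazer:json_decode(Json, [{keys, atom}, use_nil]),
    %% {[{K, V}]} proplist with binary keys; JSON null becomes undefined.
    AsTuple = glazer:json_decode(Json, [object_as_tuple, {null_term, undefined}]),
    {AsMap, AsTuple}.
```

Note that {keys, atom} creates atoms from untrusted input, so {keys, existing_atom} is usually the safer choice for data received over the network.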
-type encode_opt() :: pretty | uescape | force_utf8 | use_nil | {null_term, atom()}.
-type encode_opts() :: [encode_opt()].
Encode options:
pretty - pretty-print the JSON output
uescape - escape non-ASCII characters as \uXXXX sequences
force_utf8 - fix invalid UTF-8 sequences before encoding
use_nil - encode the atom nil as JSON null
{null_term, Atom} - encode Atom as JSON null
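A short sketch combining several encode options. The module name, function name and sample term are hypothetical; the exact layout of the pretty-printed output is not specified here, so no output is shown.

```erlang
-module(json_encode_opts_example).
-export([encode_pretty/0]).

%% Hypothetical helper: encodes a map containing non-ASCII text and nil.
encode_pretty() ->
    Term = #{<<"city">> => <<"Zürich"/utf8>>, <<"zip">> => nil},
    %% pretty  - indented output
    %% uescape - the non-ASCII character is written as a \uXXXX escape
    %% use_nil - the atom nil is written as JSON null
    glazer:json_encode(Term, [pretty, uescape, use_nil]).
```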
-opaque json_stream_decoder()
-type scan_state() :: tuple().
-type yaml_decode_opt() :: use_nil | {null_term, atom()} | {keys, atom | existing_atom | binary} | yaml_1_1_bools.
-type yaml_decode_opts() :: [yaml_decode_opt()].
YAML decode options:
use_nil - use the atom nil for YAML null/~/empty values
{null_term, Atom} - use Atom for YAML null/~/empty values
{keys, atom} - decode mapping keys as atoms
{keys, existing_atom} - decode mapping keys as existing atoms, fall back to binary
{keys, binary} - decode mapping keys as binaries (default)
yaml_1_1_bools - additionally treat yes/no/on/off (and case variants) as booleans, per the YAML 1.1 core schema. By default (YAML 1.2 core schema) only true/false are recognized as booleans.
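To make the difference between the two boolean schemas concrete, here is a sketch that decodes one document with and without yaml_1_1_bools. Module name, function name and input are assumptions made for the example.

```erlang
-module(yaml_decode_opts_example).
-export([decode_flags/0]).

%% Hypothetical helper: decodes the same YAML under both schemas.
decode_flags() ->
    Yaml = <<"enabled: yes\nowner: ~\n">>,
    %% Default (YAML 1.2): "yes" is not a boolean, "~" is null.
    Yaml12 = glazer:yaml_decode(Yaml, []),
    %% YAML 1.1 booleans, nil for null values, atom keys.
    Yaml11 = glazer:yaml_decode(Yaml, [yaml_1_1_bools, use_nil, {keys, atom}]),
    {Yaml12, Yaml11}.
```

Per the option descriptions, the second result should carry true for enabled and nil for owner, while the first leaves yes as a plain scalar.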
-type yaml_encode_opt() :: use_nil | {null_term, atom()}.
-type yaml_encode_opts() :: [yaml_encode_opt()].
YAML encode options:
use_nil - treat the atom nil as YAML null
{null_term, Atom} - treat Atom as YAML null
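A small sketch of {null_term, Atom} on the encode side, assuming binary map keys are accepted by the encoder. The module name, function name and term are hypothetical.

```erlang
-module(yaml_encode_opts_example).
-export([encode_service/0]).

%% Hypothetical helper: encodes a map where undefined stands for "no value".
encode_service() ->
    Term = #{<<"name">> => <<"demo">>,
             <<"ports">> => [8080, 8443],
             <<"owner">> => undefined},
    %% The atom undefined is emitted as YAML null rather than as a string.
    glazer:yaml_encode(Term, [{null_term, undefined}]).
```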
Functions
Decode a CSV binary or iolist to a list of rows.
By default each row is a list of binary fields. With the headers option,
the first row is used as column names and each subsequent row is decoded
as a map. Raises unterminated_quoted_field or duplicate_header on
invalid input.
-spec csv_decode(binary() | iolist(), csv_decode_opts()) -> [[binary()]] | [map()].
Decode a CSV binary or iolist to a list of rows, with options.
Raises Reason::atom() (unterminated_quoted_field or duplicate_header)
on invalid input.
Encode a list of rows to a CSV binary.
Each row is a list of fields (binaries, atoms, integers, or floats). Fields containing the delimiter, a double quote, or a line break are quoted per RFC 4180, with embedded quotes doubled.
-spec csv_encode([[term()]] | [map()], csv_encode_opts()) -> binary().
Encode a list of rows to a CSV binary, with options.
With the headers option, Data is a list of maps: the first map's keys
become the header row (in iteration order), and each map is encoded as a
row in that column order.
-spec csv_stream_decoder() -> csv_stream_decoder().
Create a new incremental decoder for feeding CSV in chunks (e.g. from a socket or file), useful when the whole input isn't available up front.
Each complete row is decoded as soon as its terminating line break is seen,
via csv_decode/2 on that single row. Only the row
boundary detection is incremental — a small byte-scanner tracks whether
the cursor is inside a quoted field across chunks, so that \n/\r\n
inside quoted fields doesn't end a row.
With the headers option, the first complete row is captured as the header
and used to decode every subsequent row as a map; no row is emitted for the
header itself.
Example
1> D0 = glazer:csv_stream_decoder(),
2> {Rows1, D1} = glazer:csv_stream_feed(D0, <<"a,b\n1,2\n3,">>),
3> Rows1.
[[<<"a">>,<<"b">>],[<<"1">>,<<"2">>]]
4> {Rows2, D2} = glazer:csv_stream_feed(D1, <<"4\n">>),
5> Rows2.
[[<<"3">>,<<"4">>]]
6> glazer:csv_stream_eof(D2).
{ok, []}
-spec csv_stream_decoder(csv_decode_opts()) -> csv_stream_decoder().
Create a new incremental CSV decoder, passing Opts through to every
csv_decode/2 call.
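One possible way to use a decoder created with options is to stream a file through it in fixed-size chunks. The sketch below is an assumption about usage rather than a recommended pattern; the module name, function names and the 64 KiB chunk size are arbitrary.

```erlang
-module(csv_file_stream_example).
-export([read_rows/1]).

%% Hypothetical helper: streams a CSV file through the incremental decoder
%% and returns every row as a map keyed by the header row.
read_rows(Path) ->
    {ok, Fd} = file:open(Path, [read, raw, binary]),
    try
        loop(Fd, glazer:csv_stream_decoder([headers, {keys, binary}]), [])
    after
        file:close(Fd)
    end.

loop(Fd, D0, Acc) ->
    case file:read(Fd, 65536) of
        {ok, Chunk} ->
            {Rows, D1} = glazer:csv_stream_feed(D0, Chunk),
            loop(Fd, D1, [Rows | Acc]);
        eof ->
            %% Flush a final row that has no trailing line break.
            case glazer:csv_stream_eof(D0) of
                {ok, Trailing} ->
                    {ok, lists:append(lists:reverse([Trailing | Acc]))};
                {error, Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

Collecting all rows in memory defeats part of the purpose of streaming; a real caller would more likely hand each batch of rows to a callback as it arrives.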
-spec csv_stream_eof(csv_stream_decoder()) -> {ok, [[binary()]] | [map()]} | {error, term()}.
Signal end-of-stream: decode any remaining buffered bytes as a final row (useful when the input doesn't end with a trailing line break).
Returns {ok, Rows} with zero or one trailing row, or {error, Reason} if
the remaining bytes don't form a valid row.
Example
1> D0 = glazer:csv_stream_decoder(),
2> {Rows1, D1} = glazer:csv_stream_feed(D0, <<"a,b\n1,2">>),
3> Rows1.
[[<<"a">>,<<"b">>]]
4> glazer:csv_stream_eof(D1).
{ok, [[<<"1">>,<<"2">>]]}
-spec csv_stream_feed(csv_stream_decoder(), binary() | iolist()) -> {[[binary()]] | [map()], csv_stream_decoder()}.
Feed a chunk of bytes into the decoder, returning any complete CSV rows found so far (in order) along with the updated decoder.
Raises the same exceptions as csv_decode/2 if a row that
the scanner deemed complete fails to decode.
Example
loop(Socket, D0) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Chunk} ->
            {Rows, D1} = glazer:csv_stream_feed(D0, Chunk),
            handle_rows(Rows),
            loop(Socket, D1);
        {error, closed} ->
            case glazer:csv_stream_eof(D0) of
                {ok, Trailing} -> handle_rows(Trailing);
                {error, Reason} -> handle_truncated_stream(Reason)
            end
    end.
Decode a CSV binary or iolist, returning {ok, Rows} or
{error, Reason} instead of raising, where Reason is
unterminated_quoted_field or duplicate_header.
-spec csv_try_decode(binary() | iolist(), csv_decode_opts()) -> {ok, [[binary()]] | [map()]} | {error, atom()}.
Decode a CSV binary or iolist with options, returning {ok, Rows} or
{error, Reason} instead of raising, where Reason is
unterminated_quoted_field or duplicate_header.
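Because the error reasons are plain atoms, callers can match on them directly. The sketch below shows one way to turn them into messages; the module name, function name and message texts are hypothetical.

```erlang
-module(csv_try_decode_example).
-export([parse/1]).

%% Hypothetical helper: decodes CSV with a header row and maps the two
%% documented error reasons to human-readable messages.
parse(Bin) ->
    case glazer:csv_try_decode(Bin, [headers]) of
        {ok, Rows} ->
            {ok, Rows};
        {error, unterminated_quoted_field} ->
            {error, <<"a quoted field is never closed">>};
        {error, duplicate_header} ->
            {error, <<"the header row repeats a column name">>}
    end.
```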
Decode a JSON number string to an integer.
Raises invalid_number_format on invalid input.
Encode an integer to its JSON string representation.
Raises badarg if Int is not an integer.
Decode a JSON binary or iolist to an Erlang term. JSON objects are returned as
maps (default). Raises {parse_error, Msg} on invalid input.
-spec json_decode(binary() | iolist(), decode_opts()) -> term().
Decode a JSON binary or iolist to an Erlang term with options.
Raises {parse_error, Reason} on invalid input.
Encode an Erlang term to a JSON binary.
-spec json_encode(term(), encode_opts()) -> binary().
Encode an Erlang term to a JSON binary with options.
Minify a JSON binary or iolist, removing all unnecessary whitespace.
Pretty-print a JSON binary or iolist with two-space indentation.
-spec json_query(binary() | iolist(), binary() | iolist()) -> {ok, [term()]} | {error, json_query_reason()}.
Run a jq Filter program against a JSON binary or
iolist Input, returning one Erlang term per value produced by the filter
(in the order they are emitted by jq).
Requires glazer to have been built against libjq; if libjq was not
available at build time, this returns {error, jq_not_available}.
A runtime error raised by the filter itself (e.g. via jq's error/0,1) is
returned as {error, Msg} where Msg is the binary message produced by jq.
1> glazer:json_query(<<"{\"a\":[1,2,3]}">>, <<".a[]">>).
{ok,[1,2,3]}
2> glazer:json_query(<<"{\"a\":1}">>, <<".b">>).
{ok,[null]}
3> glazer:json_query(<<"not json">>, <<".">>).
{error, invalid_input}
-spec json_query(binary() | iolist(), binary() | iolist(), decode_opts()) -> {ok, [term()]} | {error, json_query_reason()}.
Like json_query/2, but decodes each result term using DecodeOpts
(see json_decode/2).
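A sketch of json_query/3 with decode options, separating the documented jq_not_available case from other errors. The module name, function name, filter and the query_failed wrapper are assumptions for the example.

```erlang
-module(json_query_opts_example).
-export([user_names/1]).

%% Hypothetical helper: projects each user down to its name field and
%% decodes the resulting objects with atom keys.
user_names(Json) ->
    case glazer:json_query(Json, <<".users[] | {name}">>, [{keys, atom}]) of
        {ok, Objects} ->
            {ok, Objects};
        {error, jq_not_available} ->
            %% glazer was built without libjq.
            {error, jq_not_available};
        {error, Reason} ->
            {error, {query_failed, Reason}}
    end.
```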
-spec json_scan(binary() | iolist()) -> {complete, non_neg_integer()} | {incomplete, scan_state()}.
Locate the end of the next complete top-level JSON value in Bin, without
decoding it.
Returns:
{complete, EndOffset} - a complete value spans binary:part(Bin, 0, EndOffset); the rest of Bin (if any) is left over for the next call
{incomplete, ScanState} - Bin doesn't yet contain a complete value; feed more data via json_scan/2 once it's available, passing the entire unconsumed remainder (this Bin, with new bytes appended) plus ScanState
This is the low-level primitive behind json_stream_feed/2;
most callers should use the json_stream_* API instead.
Example
Slicing off complete values from a buffer of concatenated JSON:
1> Buf0 = <<"{\"a\":1} {\"b\":2}">>,
2> {complete, End1} = glazer:json_scan(Buf0).
{complete, 7}
3> <<Val1:End1/binary, Buf1/binary>> = Buf0,
4> Val1.
<<"{\"a\":1}">>
5> Buf1.
<<" {\"b\":2}">>
6> {complete, End2} = glazer:json_scan(Buf1).
{complete, 8}
Resuming a scan once more bytes arrive:
1> {incomplete, S0} = glazer:json_scan(<<"{\"a\":">>).
{incomplete, {6,1,false,false,true,false}}
2> glazer:json_scan(<<"{\"a\":1}">>, S0).
{complete, 7}
-spec json_scan(binary() | iolist(), scan_state()) -> {complete, non_neg_integer()} | {incomplete, scan_state()}.
Resume scanning Bin (the unconsumed remainder plus newly-appended bytes)
from ScanState.
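Putting json_scan/1 and json_scan/2 together, a caller can keep a buffer plus the last scan state and slice off raw values as chunks arrive. The sketch below is one way to do that under the contract described above; the module, its function names and the fresh marker are hypothetical, and it returns undecoded binaries rather than terms.

```erlang
-module(json_scan_buffer).
-export([new/0, push/2]).

%% State is {UnconsumedBytes, ScanState}; the atom fresh means that no
%% scan of the current buffer has been started yet.
new() -> {<<>>, fresh}.

%% Append Chunk and return {CompleteValues, NewState}.
push({Buf0, State}, Chunk) ->
    Buf = <<Buf0/binary, (iolist_to_binary(Chunk))/binary>>,
    drain(scan(Buf, State), Buf, []).

scan(Buf, fresh) -> glazer:json_scan(Buf);
scan(Buf, State) -> glazer:json_scan(Buf, State).

drain({complete, End}, Buf, Acc) ->
    <<Val:End/binary, Rest/binary>> = Buf,
    case Rest of
        <<>> -> {lists:reverse([Val | Acc]), new()};
        _ -> drain(scan(Rest, fresh), Rest, [Val | Acc])
    end;
drain({incomplete, State}, Buf, Acc) ->
    %% Keep the whole remainder; json_scan/2 expects it back with the
    %% new bytes appended.
    {lists:reverse(Acc), {Buf, State}}.
```

Each returned binary may carry leading whitespace, as in the example above where the second slice is <<" {\"b\":2}">>.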
-spec json_stream_decoder() -> json_stream_decoder().
Create a new incremental decoder for feeding JSON in chunks (e.g. from a socket or file), useful when a complete document isn't available up front or when a stream contains a sequence of concatenated/whitespace-separated JSON values (e.g. newline-delimited JSON).
Decoding itself is not incremental — each complete top-level value is
still decoded in a single pass via json_decode/2 using the
library's fast whole-buffer decoder. Only the boundary detection (finding
where one value ends and the next begins) is incremental, via a small
byte-scanner that tracks nesting/string state across chunks.
Example
1> D0 = glazer:json_stream_decoder(),
2> {Vals1, D1} = glazer:json_stream_feed(D0, <<"{\"a\":1} {\"b\":">>),
3> Vals1.
[#{<<"a">> => 1}]
4> {Vals2, _D2} = glazer:json_stream_feed(D1, <<"2}">>),
5> Vals2.
[#{<<"b">> => 2}]
-spec json_stream_decoder(decode_opts()) -> json_stream_decoder().
Create a new incremental decoder, passing Opts through to every
json_decode/2 call.
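As a sketch of a decoder created with options, the function below folds a list of chunks through json_stream_feed/2 and then flushes with json_stream_eof/1. The module and function names are hypothetical.

```erlang
-module(json_stream_opts_example).
-export([decode_chunks/1]).

%% Hypothetical helper: decodes every value found in a list of chunks,
%% using atom keys for all objects.
decode_chunks(Chunks) ->
    D0 = glazer:json_stream_decoder([{keys, atom}]),
    {Vals, D1} = lists:foldl(
        fun(Chunk, {Acc, D}) ->
            {New, DNext} = glazer:json_stream_feed(D, Chunk),
            {Acc ++ New, DNext}
        end,
        {[], D0},
        Chunks),
    %% Flush a trailing bare scalar, or report a truncated stream.
    case glazer:json_stream_eof(D1) of
        {ok, Trailing} -> {ok, Vals ++ Trailing};
        {error, Reason} -> {error, Reason}
    end.
```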
-spec json_stream_eof(json_stream_decoder()) -> {ok, [term()]} | {error, term()}.
Signal end-of-stream: decode any remaining buffered bytes as a final value
(useful for a trailing bare scalar, e.g. a lone number or true/null,
which the scanner can't otherwise distinguish from a value that's still
being written to mid-chunk).
Returns {ok, [Term]} with zero or one trailing value, or {error, Reason} if the remaining bytes don't form a complete value.
Example
1> D0 = glazer:json_stream_decoder(),
2> {Vals1, D1} = glazer:json_stream_feed(D0, <<"123">>),
3> Vals1.
[]
4> glazer:json_stream_eof(D1).
{ok, [123]}
A stream that ends mid-value (e.g. a dropped connection) yields an error instead of silently dropping the partial data:
1> D0 = glazer:json_stream_decoder(),
2> {Vals1, D1} = glazer:json_stream_feed(D0, <<"{\"a\":1, \"b\":">>),
3> Vals1.
[]
4> glazer:json_stream_eof(D1).
{error, _Reason}
-spec json_stream_feed(json_stream_decoder(), binary() | iolist()) -> {[term()], json_stream_decoder()}.
Feed a chunk of bytes into the decoder, returning any complete JSON values found so far (in order) along with the updated decoder.
Raises the same exceptions as json_decode/2 (e.g.
{parse_error, Reason}) if a value that the scanner deemed complete fails
to decode.
Example
Call json_stream_feed/2 for each chunk received from the source while more
data may still arrive, and json_stream_eof/1 once the source
is exhausted to flush any trailing value:
loop(Socket, D0) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Chunk} ->
            {Vals, D1} = glazer:json_stream_feed(D0, Chunk),
            handle_values(Vals),
            loop(Socket, D1);
        {error, closed} ->
            case glazer:json_stream_eof(D0) of
                {ok, Trailing} -> handle_values(Trailing);
                {error, Reason} -> handle_truncated_stream(Reason)
            end
    end.
The same decoder fits naturally into a gen_server driving an
active-mode socket: keep the json_stream_decoder() in the process state,
feed it from handle_info({tcp, ...}), and flush it on
{tcp_closed, ...}:
-module(json_conn).
-behaviour(gen_server).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {socket, decoder}).

start_link(Socket) ->
    gen_server:start_link(?MODULE, Socket, []).

init(Socket) ->
    inet:setopts(Socket, [{active, once}]),
    {ok, #state{socket = Socket, decoder = glazer:json_stream_decoder()}}.

%% Feed each packet to the decoder, then re-arm the socket for the next one.
handle_info({tcp, Socket, Data}, #state{socket = Socket, decoder = D0} = State) ->
    {Vals, D1} = glazer:json_stream_feed(D0, Data),
    lists:foreach(fun handle_value/1, Vals),
    inet:setopts(Socket, [{active, once}]),
    {noreply, State#state{decoder = D1}};
%% Peer closed the connection: flush any trailing value before stopping.
handle_info({tcp_closed, Socket}, #state{socket = Socket, decoder = D0} = State) ->
    case glazer:json_stream_eof(D0) of
        {ok, Trailing} -> lists:foreach(fun handle_value/1, Trailing);
        {error, Reason} -> handle_truncated_stream(Reason)
    end,
    {stop, normal, State};
handle_info({tcp_error, Socket, Reason}, #state{socket = Socket} = State) ->
    {stop, Reason, State}.

handle_call(_Request, _From, State) -> {reply, ok, State}.

handle_cast(_Request, State) -> {noreply, State}.

handle_value(Val) ->
    io:format("received: ~p~n", [Val]).

%% Placeholder handler for a stream that ended mid-value.
handle_truncated_stream(Reason) ->
    io:format("stream truncated: ~p~n", [Reason]).
Decode a JSON binary or iolist, returning {ok, Term} or
{error, Reason} instead of raising.
-spec json_try_decode(binary() | iolist(), decode_opts()) -> {ok, term()} | {error, binary()}.
Decode a JSON binary or iolist with options, returning {ok, Term} or
{error, Reason} instead of raising.
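A minimal sketch of the non-raising variant in a request handler, where malformed input is expected and should not crash the caller. The module name, function name and the bad_json tag are hypothetical.

```erlang
-module(json_try_decode_example).
-export([safe_decode/1]).

%% Hypothetical helper: decodes untrusted input without creating new atoms.
safe_decode(Body) ->
    case glazer:json_try_decode(Body, [{keys, existing_atom}]) of
        {ok, Term} ->
            {ok, Term};
        {error, Msg} when is_binary(Msg) ->
            %% Msg is the parser's error message, per the spec above.
            {error, {bad_json, Msg}}
    end.
```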
Decode a JSON number string to an integer, returning {ok, Int} or
{error, invalid_number_format} instead of raising.
Decode a YAML binary or iolist to an Erlang term. YAML mappings are returned
as maps (default). Raises {parse_error, Reason} on invalid input.
-spec yaml_decode(binary() | iolist(), yaml_decode_opts()) -> term().
Decode a YAML binary or iolist to an Erlang term with options.
Raises {parse_error, Msg} on invalid input.
Encode an Erlang term to a YAML binary in block style (2-space indentation, sequences at the same indentation as the mapping key that owns them).
-spec yaml_encode(term(), yaml_encode_opts()) -> binary().
Encode an Erlang term to a YAML binary in block style with options.
Decode a YAML binary or iolist, returning {ok, Term} or
{error, Msg} instead of raising.
-spec yaml_try_decode(binary() | iolist(), yaml_decode_opts()) -> {ok, term()} | {error, binary()}.
Decode a YAML binary or iolist with options, returning {ok, Term} or
{error, Msg} instead of raising.
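For instance, a configuration loader might combine file:read_file/1 with yaml_try_decode/2 so that both I/O and parse failures come back as tagged errors. This is only a sketch; the module name, function name and error tags are invented for the example.

```erlang
-module(yaml_try_decode_example).
-export([load_config/1]).

%% Hypothetical helper: reads a YAML file and decodes it with atom keys
%% for atoms that already exist, leaving unknown keys as binaries.
load_config(Path) ->
    case file:read_file(Path) of
        {ok, Bin} ->
            case glazer:yaml_try_decode(Bin, [{keys, existing_atom}, use_nil]) of
                {ok, Config} -> {ok, Config};
                {error, Msg} -> {error, {invalid_yaml, Msg}}
            end;
        {error, Posix} ->
            {error, {read_failed, Posix}}
    end.
```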