glazer (glazer v0.2.6)

Fast JSON encoding and decoding using the glaze C++ library.

By default JSON null is represented as the atom null. To change it application-wide, set the null env key in your config:

{glazer, [{null, nil}]}.

See also https://github.com/stephenberry/glaze

Types

decode_opt()

-type decode_opt() ::
          object_as_tuple | use_nil |
          {null_term, atom()} |
          {keys, atom | existing_atom | binary} |
          dedupe_keys.

decode_opts()

-type decode_opts() :: [decode_opt()].

Decode options:

  • object_as_tuple - decode JSON objects as {[{K, V}]} proplists rather than maps
  • use_nil - use the atom nil for JSON null
  • {null_term, Atom} - use Atom for JSON null
  • {keys, atom} - decode object keys as atoms
  • {keys, existing_atom} - decode keys as existing atoms, fall back to binary
  • {keys, binary} - decode keys as binaries (default)
  • dedupe_keys - with object_as_tuple, eliminate duplicate object keys from the resulting proplist, keeping the last occurrence's value (and position). Has no effect when objects are decoded as maps (the default) or with {keys, atom | existing_atom}: a JSON object with duplicate keys is always deduped (last value wins) when decoded to a map, since maps cannot represent duplicate keys.
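
The options above can be tried side by side. The following is a minimal sketch based only on the option descriptions; the module name decode_opts_demo and the sample document are illustrative, and the comments describe what the option list implies rather than verified output.

```erlang
-module(decode_opts_demo).
-export([run/0]).

%% Decode the same document under several option sets and return the results.
run() ->
    Json = <<"{\"id\": 1, \"tag\": null}">>,
    %% Default: objects become maps with binary keys.
    Default = glazer:decode(Json),
    %% JSON null should decode to the atom nil instead of null.
    WithNil = glazer:decode(Json, [use_nil]),
    %% Any other atom can stand in for JSON null via null_term.
    WithUndefined = glazer:decode(Json, [{null_term, undefined}]),
    %% Keys become atoms only if those atoms already exist. This is the safer
    %% choice for untrusted input, since atoms are never garbage collected.
    AtomKeys = glazer:decode(Json, [{keys, existing_atom}]),
    %% Objects come back as {[{Key, Value}]} instead of maps.
    AsTuple = glazer:decode(Json, [object_as_tuple]),
    [Default, WithNil, WithUndefined, AtomKeys, AsTuple].
```

Calling decode_opts_demo:run() in a shell prints the five results, which makes the differences between the representations easy to compare.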

encode_opt()

-type encode_opt() :: pretty | uescape | force_utf8 | use_nil | {null_term, atom()}.

encode_opts()

-type encode_opts() :: [encode_opt()].

Encode options:

  • pretty - pretty-print the JSON output
  • uescape - escape non-ASCII characters as \uXXXX sequences
  • force_utf8 - fix invalid UTF-8 sequences before encoding
  • use_nil - encode the atom nil as JSON null
  • {null_term, Atom} - encode Atom as JSON null
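
As a rough illustration of combining these options, the sketch below encodes one term three ways. The module name encode_opts_demo and the sample map are hypothetical, and the sketch assumes that maps with binary keys (the shape decode/1 returns) are accepted by encode/2.

```erlang
-module(encode_opts_demo).
-export([run/0]).

%% Encode one term with different option sets and return the three binaries.
run() ->
    Term = #{<<"name">> => <<"Zoë"/utf8>>, <<"deleted_at">> => nil},
    %% use_nil: the atom nil is written as JSON null.
    Compact = glazer:encode(Term, [use_nil]),
    %% pretty: same content, pretty-printed.
    Pretty = glazer:encode(Term, [pretty, use_nil]),
    %% uescape: non-ASCII characters are written as \uXXXX escapes.
    Ascii = glazer:encode(Term, [uescape, use_nil]),
    {Compact, Pretty, Ascii}.
```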

scan_state()

-type scan_state() :: tuple().

stream_decoder()

-opaque stream_decoder()

Functions

decode(Input)

-spec decode(binary() | iolist()) -> term().

Decode a JSON binary or iolist to an Erlang term. JSON objects are returned as maps (default). Raises {parse_error, Msg} on invalid input.

decode(Input, Opts)

-spec decode(binary() | iolist(), decode_opts()) -> term().

Decode a JSON binary or iolist to an Erlang term with options. Raises {parse_error, Msg} on invalid input.

decode_integer(NumberString)

-spec decode_integer(binary() | iolist()) -> integer().

Decode a JSON number string to an integer. Raises invalid_number_format on invalid input.

encode(Data)

-spec encode(term()) -> binary().

Encode an Erlang term to a JSON binary.

encode(Data, Opts)

-spec encode(term(), encode_opts()) -> binary().

Encode an Erlang term to a JSON binary with options.

encode_integer(Int)

-spec encode_integer(integer()) -> binary().

Encode an integer to its JSON string representation. Raises badarg if Int is not an integer.

minify(Input)

-spec minify(binary() | iolist()) -> binary().

Minify a JSON binary or iolist, removing all unnecessary whitespace.

prettify(Input)

-spec prettify(binary() | iolist()) -> binary().

Pretty-print a JSON binary or iolist with two-space indentation.

scan(Bin)

-spec scan(binary() | iolist()) -> {complete, non_neg_integer()} | {incomplete, scan_state()}.

Locate the end of the next complete top-level JSON value in Bin, without decoding it.

Returns:

  • {complete, EndOffset} - a complete value spans binary:part(Bin, 0, EndOffset); the rest of Bin (if any) is left over for the next call
  • {incomplete, ScanState} - Bin doesn't yet contain a complete value; feed more data via scan/2 once it's available, passing the entire unconsumed remainder (this Bin, with new bytes appended) plus ScanState

This is the low-level primitive behind stream_feed/2; most callers should use the stream_* API instead.

Example

Slicing off complete values from a buffer of concatenated JSON:

1> Buf0 = <<"{\"a\":1} {\"b\":2}">>,
2> {complete, End1} = glazer:scan(Buf0).
{complete, 7}
3> <<Val1:End1/binary, Buf1/binary>> = Buf0,
4> Val1.
<<"{\"a\":1}">>
5> Buf1.
<<" {\"b\":2}">>
6> {complete, End2} = glazer:scan(Buf1).
{complete, 8}

Resuming a scan once more bytes arrive:

1> {incomplete, S0} = glazer:scan(<<"{\"a\":">>).
{incomplete, {6,1,false,false,true,false}}
2> glazer:scan(<<"{\"a\":1}">>, S0).
{complete, 7}
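
The slicing steps above can be wrapped in a small loop. This is a sketch only: the module scan_split and its function names are hypothetical, and for simplicity it discards the ScanState of an incomplete tail, so a later call rescans that tail from the start.

```erlang
-module(scan_split).
-export([split/1]).

%% Split Buf into the raw (still undecoded) binaries of each complete
%% top-level value, returning them with the unconsumed remainder.
%% As in the shell example, a slice may carry leading whitespace.
split(Buf) when is_binary(Buf) ->
    split_loop(Buf, []).

split_loop(<<>>, Acc) ->
    {lists:reverse(Acc), <<>>};
split_loop(Buf, Acc) ->
    case glazer:scan(Buf) of
        %% The End > 0 guard ensures each iteration consumes input.
        {complete, End} when End > 0 ->
            <<Val:End/binary, Rest/binary>> = Buf,
            split_loop(Rest, [Val | Acc]);
        {incomplete, _ScanState} ->
            {lists:reverse(Acc), Buf}
    end.
```

Each returned slice can then be passed to decode/1 or try_decode/1 when, and if, the value is actually needed.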

scan(Bin, ScanState)

-spec scan(binary() | iolist(), scan_state()) ->
              {complete, non_neg_integer()} | {incomplete, scan_state()}.

Resume scanning Bin (the unconsumed remainder plus newly-appended bytes) from ScanState.
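
One way to follow the resume contract is to keep the unconsumed buffer and the ScanState together and append new bytes before each call. The sketch below assumes exactly that; the module scan_resume and the {value, ...} and {more, ...} return shapes are illustrative and not part of glazer.

```erlang
-module(scan_resume).
-export([start/1, more/3]).

%% First attempt on a fresh buffer.
start(Buf) when is_binary(Buf) ->
    result(Buf, glazer:scan(Buf)).

%% Called when NewBytes arrive after an earlier {more, Buf, ScanState}.
%% The whole unconsumed remainder is passed again, with the new bytes appended.
more(Buf, NewBytes, ScanState) when is_binary(Buf), is_binary(NewBytes) ->
    Buf1 = <<Buf/binary, NewBytes/binary>>,
    result(Buf1, glazer:scan(Buf1, ScanState)).

%% Translate the scanner's answer into a value slice or a request for more data.
result(Buf, {complete, End}) ->
    <<Val:End/binary, Rest/binary>> = Buf,
    {value, Val, Rest};
result(Buf, {incomplete, ScanState}) ->
    {more, Buf, ScanState}.
```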

stream_decoder()

-spec stream_decoder() -> stream_decoder().

Create a new incremental decoder for feeding JSON in chunks (e.g. from a socket or file), useful when a complete document isn't available up front or when a stream contains a sequence of concatenated/whitespace-separated JSON values (e.g. newline-delimited JSON).

Decoding itself is not incremental — each complete top-level value is still decoded in a single pass via decode/2 using the library's fast whole-buffer decoder. Only the boundary detection (finding where one value ends and the next begins) is incremental, via a small byte-scanner that tracks nesting/string state across chunks.

Example

1> D0 = glazer:stream_decoder(),
2> {Vals1, D1} = glazer:stream_feed(D0, <<"{\"a\":1} {\"b\":">>),
3> Vals1.
[#{<<"a">> => 1}]
4> {Vals2, _D2} = glazer:stream_feed(D1, <<"2}">>),
5> Vals2.
[#{<<"b">> => 2}]

stream_decoder(Opts)

-spec stream_decoder(decode_opts()) -> stream_decoder().

Create a new incremental decoder, passing Opts through to every decode/2 call.
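
As a sketch of how Opts flows through a stream, the helper below folds a list of chunks through one decoder and then flushes it. The module ndjson_fold and the function decode_chunks/2 are hypothetical names. Note that stream_feed/2 may raise if a completed value fails to decode; the sketch lets that exception propagate.

```erlang
-module(ndjson_fold).
-export([decode_chunks/2]).

%% Feed every chunk through a single decoder created with Opts, then signal
%% end-of-stream. Returns {ok, Values} in arrival order, or {error, Reason}
%% if the stream ends mid-value.
decode_chunks(Chunks, Opts) when is_list(Chunks) ->
    D0 = glazer:stream_decoder(Opts),
    {RevVals, D1} =
        lists:foldl(
          fun(Chunk, {Acc, D}) ->
                  {Vals, DNext} = glazer:stream_feed(D, Chunk),
                  %% Prepend this chunk's values in reverse to keep Acc reversed.
                  {lists:reverse(Vals, Acc), DNext}
          end,
          {[], D0},
          Chunks),
    case glazer:stream_eof(D1) of
        {ok, Trailing}  -> {ok, lists:reverse(RevVals, Trailing)};
        {error, Reason} -> {error, Reason}
    end.
```

For example, ndjson_fold:decode_chunks([<<"{\"a\":1} {\"b\"">>, <<":2}">>], [{keys, atom}]) would be expected to return both objects with atom keys, given the {keys, atom} description above.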

stream_eof/1

-spec stream_eof(stream_decoder()) -> {ok, [term()]} | {error, term()}.

Signal end-of-stream: decode any remaining buffered bytes as a final value. This is useful for a trailing bare scalar (e.g. a lone number or true/null), which the scanner cannot otherwise distinguish from a value that is still arriving.

Returns {ok, [Term]} with zero or one trailing value, or {error, Reason} if the remaining bytes don't form a complete value.

Example

1> D0 = glazer:stream_decoder(),
2> {Vals1, D1} = glazer:stream_feed(D0, <<"123">>),
3> Vals1.
[]
4> glazer:stream_eof(D1).
{ok, [123]}

A stream that ends mid-value (e.g. a dropped connection) yields an error instead of silently dropping the partial data:

1> D0 = glazer:stream_decoder(),
2> {Vals1, D1} = glazer:stream_feed(D0, <<"{\"a\":1, \"b\":">>),
3> Vals1.
[]
4> glazer:stream_eof(D1).
{error, {parse_error, _Reason}}

stream_feed/2

-spec stream_feed(stream_decoder(), binary() | iolist()) -> {[term()], stream_decoder()}.

Feed a chunk of bytes into the decoder, returning any complete JSON values found so far (in order) along with the updated decoder.

Raises the same exceptions as decode/2 (e.g. {parse_error, Reason}) if a value that the scanner deemed complete fails to decode.

Example

Call stream_feed/2 for each chunk received from the source while more data may still arrive, and stream_eof/1 once the source is exhausted to flush any trailing value:

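%% handle_values/1 and handle_truncated_stream/1 are application-defined.
loop(Socket, D0) ->
  case gen_tcp:recv(Socket, 0) of
    {ok, Chunk} ->
      %% Decode every value completed by this chunk, then keep reading.
      {Vals, D1} = glazer:stream_feed(D0, Chunk),
      handle_values(Vals),
      loop(Socket, D1);
    {error, closed} ->
      %% Peer closed the connection: flush a trailing value or report truncation.
      case glazer:stream_eof(D0) of
        {ok, Trailing}  -> handle_values(Trailing);
        {error, Reason} -> handle_truncated_stream(Reason)
      end
  end.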

The same decoder fits naturally into a gen_server driving an active-mode socket: keep the stream_decoder() in the process state, feed it from handle_info({tcp, ...}), and flush it on {tcp_closed, ...}:

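-module(json_conn).
-behaviour(gen_server).
-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% socket: the connected gen_tcp socket; decoder: the current stream_decoder().
-record(state, {socket, decoder}).

%% This process only receives {tcp, ...} messages if it is the socket's
%% controlling process (see gen_tcp:controlling_process/2).
start_link(Socket) ->
  gen_server:start_link(?MODULE, Socket, []).

init(Socket) ->
  inet:setopts(Socket, [{active, once}]),
  {ok, #state{socket = Socket, decoder = glazer:stream_decoder()}}.

%% New bytes: decode whatever values they complete, then re-arm the socket.
handle_info({tcp, Socket, Data}, #state{socket = Socket, decoder = D0} = State) ->
  {Vals, D1} = glazer:stream_feed(D0, Data),
  lists:foreach(fun handle_value/1, Vals),
  inet:setopts(Socket, [{active, once}]),
  {noreply, State#state{decoder = D1}};

%% Peer closed: flush a trailing value, or report a stream cut off mid-value.
handle_info({tcp_closed, Socket}, #state{socket = Socket, decoder = D0} = State) ->
  case glazer:stream_eof(D0) of
    {ok, Trailing}  -> lists:foreach(fun handle_value/1, Trailing);
    {error, Reason} -> handle_truncated_stream(Reason)
  end,
  {stop, normal, State};

handle_info({tcp_error, Socket, Reason}, #state{socket = Socket} = State) ->
  {stop, Reason, State};

%% Ignore unrelated messages.
handle_info(_Other, State) ->
  {noreply, State}.

handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Request, State)        -> {noreply, State}.

handle_value(Val) ->
  io:format("received: ~p~n", [Val]).

%% Placeholder: replace with application-specific error handling.
handle_truncated_stream(Reason) ->
  io:format("stream ended mid-value: ~p~n", [Reason]).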

try_decode(Input)

-spec try_decode(binary() | iolist()) -> {ok, term()} | {error, {parse_error, binary()}}.

Decode a JSON binary or iolist, returning {ok, Term} or {error, {parse_error, Msg}} instead of raising.

try_decode(Input, Opts)

-spec try_decode(binary() | iolist(), decode_opts()) -> {ok, term()} | {error, {parse_error, binary()}}.

Decode a JSON binary or iolist with options, returning {ok, Term} or {error, {parse_error, Msg}} instead of raising.
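
The tagged-tuple variants suit code that treats malformed input as an expected case rather than a crash. A minimal sketch follows; the module config_loader, the function parse_config/1, and the error atoms are made up for illustration.

```erlang
-module(config_loader).
-export([parse_config/1]).

%% Accept only a JSON object as configuration and map every other
%% outcome to a tagged error instead of raising.
parse_config(Json) ->
    case glazer:try_decode(Json, [use_nil]) of
        {ok, Map} when is_map(Map) ->
            {ok, Map};
        {ok, Other} ->
            %% Valid JSON, but not an object (e.g. a list or a number).
            {error, {not_an_object, Other}};
        {error, {parse_error, Msg}} ->
            {error, {bad_json, Msg}}
    end.
```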

try_decode_integer(NumberString)

-spec try_decode_integer(binary() | iolist()) -> {ok, integer()} | {error, invalid_number_format}.

Decode a JSON number string to an integer, returning {ok, Int} or {error, invalid_number_format} instead of raising.
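
A short sketch of validating an untrusted numeric field with the non-raising variant. The module id_parser, the function parse_id/1, and the error atoms are hypothetical; only try_decode_integer/1 and its return shapes come from the spec above.

```erlang
-module(id_parser).
-export([parse_id/1]).

%% Parse a non-negative integer identifier from a number string.
parse_id(Bin) ->
    case glazer:try_decode_integer(Bin) of
        {ok, Id} when Id >= 0 ->
            {ok, Id};
        {ok, _Negative} ->
            {error, negative_id};
        {error, invalid_number_format} ->
            {error, not_an_integer}
    end.
```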