Saxy.parse_stream
Specs
parse_stream(
  stream :: Enumerable.t(),
  handler :: module(),
  initial_state :: term(),
  options :: Keyword.t()
) ::
  {:ok, state :: term()}
  | {:halt, state :: term(), rest :: String.t()}
  | {:error, exception :: Saxy.ParseError.t()}
Parses XML stream data.
This function takes a stream, a SAX event handler (see Saxy.Handler), and an initial state as input. It returns {:ok, state} if parsing succeeds; otherwise it returns {:error, exception}, where exception is a Saxy.ParseError struct that can be converted into a readable message with Exception.message/1.
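As a minimal sketch of handling each return shape listed in the spec, the snippet below parses one well-formed and one malformed input. CountingHandler and the sample chunks are hypothetical names made up for illustration, and plain lists of binaries stand in for a real stream on the assumption that any enumerable of binaries is accepted, as the spec suggests.

```elixir
# Only needed when running this as a standalone script; inside a Mix project,
# list :saxy in mix.exs instead.
Mix.install([{:saxy, "~> 1.0"}])

defmodule CountingHandler do
  # Hypothetical handler for illustration: counts start tags.
  @behaviour Saxy.Handler

  @impl true
  def handle_event(:start_element, {_name, _attributes}, count), do: {:ok, count + 1}
  def handle_event(_event, _data, count), do: {:ok, count}
end

# Lists of binaries stand in for a file stream here.
well_formed = ["<foo><bar/>", "<bar/></foo>"]
malformed = ["<foo><bar>", "</foo>"]

Enum.each([well_formed, malformed], fn chunks ->
  case Saxy.parse_stream(chunks, CountingHandler, 0) do
    {:ok, count} ->
      IO.puts("parsed #{count} start tags")

    {:halt, count, rest} ->
      # Listed in the spec; this handler never asks the parser to halt.
      IO.puts("halted after #{count} start tags, #{byte_size(rest)} bytes left")

    {:error, exception} ->
      # Saxy.ParseError is an exception struct, so Exception.message/1 applies.
      IO.puts("parse failed: " <> Exception.message(exception))
  end
end)
```

Matching on all three tuples keeps the caller explicit about failures instead of assuming {:ok, state}.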
Examples
defmodule MyTestHandler do
  @behaviour Saxy.Handler

  def handle_event(:start_document, prolog, state) do
    {:ok, [{:start_document, prolog} | state]}
  end

  def handle_event(:end_document, _data, state) do
    {:ok, [{:end_document} | state]}
  end

  def handle_event(:start_element, {name, attributes}, state) do
    {:ok, [{:start_element, name, attributes} | state]}
  end

  def handle_event(:end_element, name, state) do
    {:ok, [{:end_element, name} | state]}
  end

  def handle_event(:characters, chars, state) do
    {:ok, [{:characters, chars} | state]}
  end
end
iex> stream = File.stream!("./test/support/fixture/foo.xml")
iex> Saxy.parse_stream(stream, MyTestHandler, [])
{:ok,
[{:end_document},
{:end_element, "foo"},
{:start_element, "foo", [{"bar", "value"}]},
{:start_document, [version: "1.0"]}]}
Memory usage
Saxy.parse_stream/3 takes a File.Stream or Stream as input, so the number of bytes buffered in each chunk can be controlled through the File.stream!/3 API.
During parsing, the actual memory used by Saxy may be higher than the configured chunk size, because Saxy keeps some already-parsed parts of the original binary in memory to take advantage of Erlang sub-binary extraction. Saxy tries to release those parts when it makes sense.
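A rough sketch of controlling the chunk size might look like the following. ElementCounter, the temporary file name, and the 2048-byte chunk size are illustrative assumptions, not values recommended by Saxy.

```elixir
# Only needed when running this as a standalone script; inside a Mix project,
# list :saxy in mix.exs instead.
Mix.install([{:saxy, "~> 1.0"}])

defmodule ElementCounter do
  # Hypothetical handler for illustration: counts start tags.
  @behaviour Saxy.Handler

  @impl true
  def handle_event(:start_element, _data, count), do: {:ok, count + 1}
  def handle_event(_event, _data, count), do: {:ok, count}
end

# Write a throwaway document so the example does not depend on a fixture.
path = Path.join(System.tmp_dir!(), "saxy_chunk_example.xml")
items = String.duplicate("<item>value</item>", 1_000)
File.write!(path, "<items>" <> items <> "</items>")

# Read the file in 2048-byte chunks instead of line by line.
# Elixir 1.16+ prefers the argument order File.stream!(path, 2048, []).
stream = File.stream!(path, [], 2048)

result = Saxy.parse_stream(stream, ElementCounter, 0)
IO.inspect(result)

File.rm!(path)
```

A smaller chunk size lowers the amount read per step at the cost of more iterations; the right value depends on the documents being parsed.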
Options
See the “Shared options” section in the module documentation.
:character_data_max_length - tells the parser to emit the :characters event when its length exceeds the specified number. The option is useful when the tag being parsed contains a very large chunk of data. Defaults to :infinity.
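To illustrate the option, here is a hedged sketch that records the size of every :characters event. ChunkSizeHandler, the input chunks, and the limit of 1_000 are hypothetical, and the snippet assumes a Saxy release that supports :character_data_max_length. How the text is split depends on the parser and on how the input is chunked, so the snippet only prints the observed sizes.

```elixir
# Only needed when running this as a standalone script; inside a Mix project,
# list :saxy in mix.exs instead.
Mix.install([{:saxy, "~> 1.0"}])

defmodule ChunkSizeHandler do
  # Hypothetical handler for illustration: records the byte size of each
  # :characters event, most recent first.
  @behaviour Saxy.Handler

  @impl true
  def handle_event(:characters, chars, sizes), do: {:ok, [byte_size(chars) | sizes]}
  def handle_event(_event, _data, sizes), do: {:ok, sizes}
end

# One element holding 10_000 bytes of text, delivered in 500-byte chunks.
text_chunks = List.duplicate(String.duplicate("a", 500), 20)
stream = ["<doc>"] ++ text_chunks ++ ["</doc>"]

{:ok, sizes} =
  Saxy.parse_stream(stream, ChunkSizeHandler, [], character_data_max_length: 1_000)

# The individual sizes depend on where the parser decides to emit.
IO.inspect(Enum.reverse(sizes), label: "characters event sizes")
IO.inspect(Enum.sum(sizes), label: "total bytes of character data")
```

With the default of :infinity, a handler has to be prepared to receive the whole text of an element at once; a finite limit lets it process large text in pieces.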