Saxy.parse_stream

parse_stream(stream, handler, initial_state, options \\ [])

Specs

parse_stream(
  stream :: Enumerable.t(),
  handler :: module(),
  initial_state :: term(),
  options :: Keyword.t()
) ::
  {:ok, state :: term()}
  | {:halt, state :: term(), rest :: String.t()}
  | {:error, exception :: Saxy.ParseError.t()}

Parses XML stream data.

This function takes a stream, a SAX event handler (see Saxy.Handler), and an initial state as input. It returns {:ok, state} if parsing succeeds, or {:error, exception} otherwise, where exception is a Saxy.ParseError struct that can be converted into a readable message with Exception.message/1. As listed in the spec, it may also return {:halt, state, rest} when the handler halts parsing.

Examples

defmodule MyTestHandler do
  @behaviour Saxy.Handler

  def handle_event(:start_document, prolog, state) do
    {:ok, [{:start_document, prolog} | state]}
  end

  def handle_event(:end_document, _data, state) do
    {:ok, [{:end_document} | state]}
  end

  def handle_event(:start_element, {name, attributes}, state) do
    {:ok, [{:start_element, name, attributes} | state]}
  end

  def handle_event(:end_element, name, state) do
    {:ok, [{:end_element, name} | state]}
  end

  def handle_event(:characters, chars, state) do
    {:ok, [{:characters, chars} | state]}
  end
end

iex> stream = File.stream!("./test/support/fixture/foo.xml")
iex> Saxy.parse_stream(stream, MyTestHandler, [])
{:ok,
 [{:end_document},
  {:end_element, "foo"},
  {:start_element, "foo", [{"bar", "value"}]},
  {:start_document, [version: "1.0"]}]}
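
The return shapes listed in the spec can be handled with a single pattern match. The following is a minimal sketch, assuming Saxy is available as a dependency and that parse_stream accepts any enumerable of binaries (a plain list here), as the Enumerable.t() in the spec suggests. CountingHandler and describe are hypothetical names used only for illustration.

```elixir
defmodule CountingHandler do
  @behaviour Saxy.Handler

  # Count opening tags; every other event leaves the count unchanged.
  def handle_event(:start_element, _data, count), do: {:ok, count + 1}
  def handle_event(_event, _data, count), do: {:ok, count}
end

# Hypothetical helper: map each return shape to a tagged, log-friendly value.
describe = fn
  {:ok, state} -> {:parsed, state}
  {:halt, state, rest} -> {:halted, state, byte_size(rest)}
  {:error, exception} -> {:failed, Exception.message(exception)}
end

# A well-formed document and one with a mismatched closing tag,
# each split into arbitrary chunks.
good = ["<foo>", "<bar/>", "</foo>"]
bad = ["<foo>", "</bar>"]

good_result = describe.(Saxy.parse_stream(good, CountingHandler, 0))
bad_result = describe.(Saxy.parse_stream(bad, CountingHandler, 0))
```

The error branch relies only on Exception.message/1, so it does not need to know the internal fields of Saxy.ParseError.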

Memory usage

Saxy.parse_stream/3 takes a File.Stream or Stream as input, so the number of bytes buffered in each chunk can be controlled through the File.stream!/3 API.

During parsing, the actual memory used by Saxy may be higher than the configured chunk size, because Saxy keeps some already-parsed parts of the original binary in memory to take advantage of Erlang sub-binary extraction. Saxy tries to release those parts when it makes sense.
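
To illustrate how the chunk size is set, here is a small sketch that reads a file in fixed-size byte chunks. It assumes Saxy is available as a dependency; ChunkDemoHandler and the temporary file path are hypothetical, and the 16-byte chunk size is deliberately tiny so that chunk boundaries fall in the middle of tokens.

```elixir
defmodule ChunkDemoHandler do
  @behaviour Saxy.Handler

  # Collect element names as they open; ignore every other event.
  def handle_event(:start_element, {name, _attributes}, names), do: {:ok, [name | names]}
  def handle_event(_event, _data, names), do: {:ok, names}
end

# Write a small document to a temporary file (hypothetical path).
path = Path.join(System.tmp_dir!(), "saxy_chunk_demo.xml")
File.write!(path, ~s(<?xml version="1.0"?><foo bar="value"></foo>))

# Read 16 bytes at a time instead of line by line.
# The argument order shown is for Elixir 1.16+; on older releases the
# equivalent call is File.stream!(path, [], 16).
stream = File.stream!(path, 16)

result = Saxy.parse_stream(stream, ChunkDemoHandler, [])
```

In practice a much larger chunk size (for example several kilobytes) is a more typical choice; the right value depends on the input and the memory budget.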

Options

See the “Shared options” section in the module documentation.

  • :character_data_max_length - tells the parser to emit the :characters event once the buffered character data exceeds the specified length. This is useful when the element being parsed contains a very large chunk of data. Defaults to :infinity.
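
The effect of :character_data_max_length can be sketched as follows, assuming Saxy is available as a dependency and that a list of binaries is accepted as the stream. ChunkedTextHandler is a hypothetical handler that records each :characters payload. The exact points at which the text is split are up to the parser and the incoming chunk boundaries, so the sketch only reassembles the pieces rather than relying on their sizes.

```elixir
defmodule ChunkedTextHandler do
  @behaviour Saxy.Handler

  # Record each :characters payload; ignore every other event.
  def handle_event(:characters, chars, acc), do: {:ok, [chars | acc]}
  def handle_event(_event, _data, acc), do: {:ok, acc}
end

# One element holding 100 bytes of text, delivered in 10-byte pieces.
pieces = List.duplicate(String.duplicate("a", 10), 10)
xml = ["<foo>"] ++ pieces ++ ["</foo>"]

{:ok, chunks} =
  Saxy.parse_stream(xml, ChunkedTextHandler, [], character_data_max_length: 16)

# Payloads were prepended, so reverse them to restore document order.
text = chunks |> Enum.reverse() |> Enum.join()
```

A handler written this way has to tolerate several :characters events for a single element, which is the trade-off for not holding the whole text in memory at once.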