SweetXml.stream_tags
Most common usage of streaming: stream a given tag or a list of tags, and optionally "discard" some DOM elements in order to free memory during streaming for big files which cannot fit entirely in memory.
Note that each matched tag produces its own tree. If a given tag appears in the discarded options, it is ignored.
- doc is an enumerable; data will be pulled during the result stream enumeration, e.g. File.stream!("some_file.xml")
- tags is an atom or a list of atoms you want to extract, e.g. :li, :header. Each stream element will be {:tagname, xmlelem}.
- options[:discard] is the list of tags which will be discarded: they are not added to their parent DOM.
- More details on the options are available with parse/2.
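The arguments above can be combined into a small helper. This is a minimal sketch rather than part of SweetXml itself: the module name ItemTexts and the function name stream/1 are hypothetical, and it assumes the :sweet_xml dependency is already available in the project.

```elixir
defmodule ItemTexts do
  # Hypothetical helper module, not part of SweetXml.
  import SweetXml

  # Takes any enumerable of XML chunks and returns a lazy stream with
  # the text of each <li> element. Discarding :li keeps each matched
  # element out of its parent tree once it has been emitted.
  def stream(chunks) do
    chunks
    |> stream_tags(:li, discard: [:li])
    |> Stream.map(fn {:li, elem} -> xpath(elem, ~x"./text()") end)
  end
end
```

For a large file, the same helper could be fed with File.stream!("some_file.xml") instead of an in-memory list. Nothing is read until the resulting stream is enumerated.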
Examples
iex> import SweetXml
iex> doc = ["<ul><li>l1</li><li>l2", "</li><li>l3</li></ul>"]
iex> SweetXml.stream_tags(doc, :li, discard: [:li])
...> |> Stream.map(fn {:li, doc} -> doc |> SweetXml.xpath(~x"./text()") end)
...> |> Enum.to_list
['l1', 'l2', 'l3']
iex> SweetXml.stream_tags(doc, [:ul, :li])
...> |> Stream.map(fn {_, doc} -> doc |> SweetXml.xpath(~x"./text()") end)
...> |> Enum.to_list
['l1', 'l2', 'l3', nil]
Be careful if you set options[:discard]. If any of the discarded tags is nested inside a kept tag, you will not be able to access them.
Examples
iex> import SweetXml
iex> doc = ["<header>", "<title>XML</title", "><header><title>Nested</title></header></header>"]
iex> SweetXml.stream_tags(doc, :header)
...> |> Stream.map(fn {_, doc} -> SweetXml.xpath(doc, ~x".//title/text()") end)
...> |> Enum.to_list
['Nested', 'XML']
iex> SweetXml.stream_tags(doc, :header, discard: [:title])
...> |> Stream.map(fn {_, doc} -> SweetXml.xpath(doc, ~x"./title/text()") end)
...> |> Enum.to_list
[nil, nil]