CSV.Decoder
The CSV Decoder module feeds lines of delimited values from a stream to the parser and converts the rows returned by the CSV parser module into a consumable stream. During setup, it parallelises lexing and parsing, and runs multiple lexer/parser pairs in parallel as pipes. The number of pipes can be controlled via options.
Summary
Decode a stream of comma-separated lines into a table.
Functions
Decode a stream of comma-separated lines into a table.
You can control the number of parallel operations via the option :num_pipes. The default is the number of Erlang schedulers times 3.
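As a rough illustration of that default, the value can be computed by hand. This is only a sketch: it assumes that "the number of Erlang schedulers" means the value reported by :erlang.system_info(:schedulers), and the variable names are made up for the example rather than taken from the library.

```elixir
# Assumption: the scheduler count is the one reported by the Erlang VM.
schedulers = :erlang.system_info(:schedulers)

# Documented default for :num_pipes: schedulers times 3.
default_num_pipes = schedulers * 3

IO.puts("schedulers: #{schedulers}, default :num_pipes: #{default_num_pipes}")
```

On a machine with 4 schedulers this would give 12 pipes. Passing an explicit value, as in decode(num_pipes: 2), overrides the default.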
Options
These are the options:
:separator - The separator token to use. Must be a codepoint (syntax: ? + your separator). Defaults to ?,
:delimiter - The delimiter token to use, defaults to \r\n. Must be a string.
:strip_cells - When set to true, will strip whitespace from cells. Defaults to false.
:multiline_escape - Whether to allow multiline escape sequences. Defaults to true.
:num_pipes - Will be deprecated in 2.0; see :num_workers.
:num_workers - The number of parallel operations to run when producing the stream.
:worker_work_ratio - The available work per worker, defaults to 5. Higher rates will mean more work sharing, but might also lead to work fragmentation slowing down the queues.
:headers - When set to true, will take the first row of the CSV and use it as header values. When set to a list, will use the given list as header values. When set to false (default), will use no header values. When set to anything but false, the resulting rows in the matrix will be maps instead of lists.
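The three :headers modes can be sketched in plain Elixir without the library. HeaderSketch and apply_headers/2 are hypothetical names invented for this illustration; they are not part of CSV.Decoder and say nothing about how the decoder is implemented. The sketch also works on already-split rows held in a list, not on a stream.

```elixir
defmodule HeaderSketch do
  # Hypothetical helper mirroring the documented :headers behaviour.

  # headers: false, rows stay plain lists.
  def apply_headers(rows, false), do: rows

  # headers: true, the first row supplies the header values.
  def apply_headers([], true), do: []
  def apply_headers([first | rest], true), do: apply_headers(rest, first)

  # headers: a list, every row is zipped with the given header values.
  def apply_headers(rows, headers) when is_list(headers) do
    Enum.map(rows, fn row -> headers |> Enum.zip(row) |> Map.new() end)
  end
end

rows = [["a", "b"], ["c", "d"], ["e", "f"]]

IO.inspect(HeaderSketch.apply_headers(rows, false))
IO.inspect(HeaderSketch.apply_headers(rows, true))
IO.inspect(HeaderSketch.apply_headers(rows, [:x, :y]))
```

With true, the first row is consumed as the header and does not appear in the result. With an explicit list, every input row, including the first, becomes a map.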
Examples
Convert a filestream into a stream of rows:
iex> "../../test/fixtures/docs.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!()
...> |> CSV.Decoder.decode()
...> |> Enum.take(2)
[["a","b","c"], ["d","e","f"]]
Map an existing stream of lines separated by a token to a stream of rows with a header row:
iex> ["a;b", "c;d", "e;f"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoder.decode(separator: ?;, headers: true)
...> |> Enum.take(2)
[%{"a" => "c", "b" => "d"}, %{"a" => "e", "b" => "f"}]
Map an existing stream of lines separated by a token to a stream of rows with a given header row:
iex> ["a;b", "c;d", "e;f"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoder.decode(separator: ?;, headers: [:x, :y])
...> |> Enum.take(2)
[%{:x => "a", :y => "b"}, %{:x => "c", :y => "d"}]