CSV v2.2.0
RFC 4180 compliant CSV parsing and encoding for Elixir. Other separators can be specified, so it could also be named TSV, but it isn't.
Summary
Functions
decode(stream, options \\ [])
Decode a stream of comma-separated lines into a stream of tuples. Decoding errors will be inlined into the stream.
decode!(stream, options \\ [])
Decode a stream of comma-separated lines into a stream of rows. Decoding errors will be raised immediately.
encode(stream, options \\ [])
Encode a table stream into a stream of RFC 4180 compliant CSV lines for writing to a file or other IO.
Functions
decode(stream, options \\ [])
Decode a stream of comma-separated lines into a stream of tuples. Decoding errors will be inlined into the stream.
Options
These are the options:
:separator – The separator token to use, defaults to ?, (a comma). Must be a codepoint (syntax: ? + (your separator)).
:strip_fields – When set to true, will strip whitespace from cells. Defaults to false.
:preprocessor – Which preprocessor to use:
  :lines (default) -> Will preprocess line-by-line input, respecting escape sequences.
  :none -> Will not preprocess input; expects line-by-line input with multiline escape sequences already aggregated into one line.
:validate_row_length – If set to false, will disable validation for row length. This allows rows of variable length. Defaults to true.
:escape_max_lines – The maximum number of lines to aggregate for multiline escapes. Defaults to 1000.
:num_workers – The number of parallel operations to run when producing the stream.
:worker_work_ratio – The available work per worker, defaults to 5. Higher rates will mean more work sharing, but might also lead to work fragmentation slowing down the queues.
:headers – When set to true, will take the first row of the CSV and use it as header values. When set to a list, will use the given list as header values. When set to false (default), will use no header values. When set to anything but false, the resulting rows will be maps instead of lists.
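The codepoint syntax used by :separator is plain Elixir and can be tried without the library. The following is a small sketch using only the standard language; the variable names are illustrative:

```elixir
# A ? followed by a character is an integer literal holding that
# character's codepoint.
comma = ?,
semicolon = ?;
tab = ?\t

IO.inspect(comma)      # 44
IO.inspect(semicolon)  # 59
IO.inspect(tab)        # 9

# Converting a codepoint back into a one-character string.
separator_string = <<semicolon::utf8>>
IO.inspect(separator_string)  # ";"
```

This is why a separator is written as ?; rather than ";" when passed as an option.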
Examples
Convert a filestream into a stream of rows in order of the given stream:
iex> "../test/fixtures/docs/valid.csv"
iex> |> Path.expand(__DIR__)
iex> |> File.stream!
iex> |> CSV.decode
iex> |> Enum.take(2)
[ok: ["a","b","c"], ok: ["d","e","f"]]
Errors will show up as error tuples:
iex> "../test/fixtures/docs/escape-errors.csv"
iex> |> Path.expand(__DIR__)
iex> |> File.stream!
iex> |> CSV.decode
iex> |> Enum.take(2)
[
ok: ["a","b","c"],
error: "Escape sequence started on line 2 near \"\\d,e,f\n\" did not terminate.\n\nEscape sequences are allowed to span up to 1000 lines. This threshold avoids collecting the whole file into memory when an escape sequence does not terminate. You can change it using the escape_max_lines option: https://hexdocs.pm/csv/CSV.html#decode/2"
]
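Because errors arrive inline, a consumer usually has to separate good rows from error messages. The sketch below shows one way to do that with the standard Enum module; the results list is a hand-written stand-in for the output of CSV.decode (with a shortened error message), so it runs without the library:

```elixir
# Hand-written stand-in for a decoded stream; not produced by the library here.
results = [
  ok: ["a", "b", "c"],
  error: "Escape sequence started on line 2 did not terminate.",
  ok: ["g", "h", "i"]
]

# Split the tagged tuples into successes and failures.
{oks, errors} = Enum.split_with(results, &match?({:ok, _}, &1))

# Unwrap the payloads.
rows = Enum.map(oks, fn {:ok, row} -> row end)
messages = Enum.map(errors, fn {:error, message} -> message end)

IO.inspect(rows)
IO.inspect(messages)
```

Enum.split_with consumes the whole enumerable, so for very large inputs a lazy approach such as Stream.filter may be a better fit.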
Map an existing stream of lines separated by a token to a stream of rows with a header row:
iex> ["a;b","c;d", "e;f"]
iex> |> Stream.map(&(&1))
iex> |> CSV.decode(separator: ?;, headers: true)
iex> |> Enum.take(2)
[
ok: %{"a" => "c", "b" => "d"},
ok: %{"a" => "e", "b" => "f"}
]
Map an existing stream of lines separated by a token to a stream of rows with a given header row:
iex> ["a;b","c;d", "e;f"]
iex> |> Stream.map(&(&1))
iex> |> CSV.decode(separator: ?;, headers: [:x, :y])
iex> |> Enum.take(2)
[
ok: %{:x => "a", :y => "b"},
ok: %{:x => "c", :y => "d"}
]
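To make the two header modes concrete, here is a rough sketch of the row-to-map conversion using only the standard library. It illustrates the behaviour shown in the examples above and is not the library's actual implementation; the variable names are made up for this sketch:

```elixir
# Already-split rows, standing in for parsed lines.
parsed = [["a", "b"], ["c", "d"], ["e", "f"]]

# Like headers: true, the first row supplies the keys and is not emitted.
[header | data_rows] = parsed
with_first_row =
  Enum.map(data_rows, fn row -> header |> Enum.zip(row) |> Map.new() end)

# Like headers: [:x, :y], the given list supplies the keys and every
# row, including the first, is treated as data.
with_given_keys =
  Enum.map(parsed, fn row -> [:x, :y] |> Enum.zip(row) |> Map.new() end)

IO.inspect(with_first_row)
IO.inspect(with_given_keys)
```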
decode!(stream, options \\ [])
Decode a stream of comma-separated lines into a stream of rows. Decoding errors will be raised immediately.
Options
These are the options:
:separator – The separator token to use, defaults to ?, (a comma). Must be a codepoint (syntax: ? + (your separator)).
:strip_fields – When set to true, will strip whitespace from cells. Defaults to false.
:preprocessor – Which preprocessor to use:
  :lines (default) -> Will preprocess line-by-line input, respecting escape sequences.
  :none -> Will not preprocess input; expects line-by-line input with multiline escape sequences already aggregated into one line.
:escape_max_lines – The maximum number of lines to aggregate for multiline escapes. Defaults to 1000.
:validate_row_length – If set to false, will disable validation for row length. This allows rows of variable length. Defaults to true.
:num_workers – The number of parallel operations to run when producing the stream.
:worker_work_ratio – The available work per worker, defaults to 5. Higher rates will mean more work sharing, but might also lead to work fragmentation slowing down the queues.
:headers – When set to true, will take the first row of the CSV and use it as header values. When set to a list, will use the given list as header values. When set to false (default), will use no header values. When set to anything but false, the resulting rows will be maps instead of lists.
Examples
Convert a filestream into a stream of rows in order of the given stream:
iex> "../test/fixtures/docs/valid.csv"
iex> |> Path.expand(__DIR__)
iex> |> File.stream!
iex> |> CSV.decode!
iex> |> Enum.take(2)
[["a","b","c"], ["d","e","f"]]
Errors will be raised:
iex> "../test/fixtures/docs/row-length-errors.csv"
iex> |> Path.expand(__DIR__)
iex> |> File.stream!
iex> |> CSV.decode!
iex> |> Enum.take(2)
** (CSV.RowLengthError) Row has length 3 - expected length 2 on line 2
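The kind of check behind this error can be sketched in plain Elixir. The module below is hypothetical and not part of the CSV library; it assumes a non-empty list of rows and that the expected length is taken from the first row, which matches the message above but is not confirmed by the source:

```elixir
# Hypothetical helper for illustration only.
defmodule RowLengthSketch do
  # Tags each row as {:ok, row} or {:error, message}, comparing its
  # length with the length of the first row. Assumes rows is non-empty.
  def check(rows) do
    expected = rows |> List.first() |> length()

    rows
    |> Enum.with_index(1)
    |> Enum.map(fn {row, line} ->
      case length(row) do
        ^expected ->
          {:ok, row}

        actual ->
          {:error, "Row has length #{actual} - expected length #{expected} on line #{line}"}
      end
    end)
  end
end

checked = RowLengthSketch.check([["a", "b"], ["c", "d", "e"]])
IO.inspect(checked)
```

Setting validate_row_length to false skips this kind of check, so rows of differing lengths pass through.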
Map an existing stream of lines separated by a token to a stream of rows with a header row:
iex> ["a;b","c;d", "e;f"]
iex> |> Stream.map(&(&1))
iex> |> CSV.decode!(separator: ?;, headers: true)
iex> |> Enum.take(2)
[
%{"a" => "c", "b" => "d"},
%{"a" => "e", "b" => "f"}
]
Map an existing stream of lines separated by a token to a stream of rows with a given header row:
iex> ["a;b","c;d", "e;f"]
iex> |> Stream.map(&(&1))
iex> |> CSV.decode!(separator: ?;, headers: [:x, :y])
iex> |> Enum.take(2)
[
%{:x => "a", :y => "b"},
%{:x => "c", :y => "d"}
]
encode(stream, options \\ [])
Encode a table stream into a stream of RFC 4180 compliant CSV lines for writing to a file or other IO.
Options
These are the options:
:separator – The separator token to use, defaults to ?, (a comma). Must be a codepoint (syntax: ? + (your separator)).
:delimiter – The delimiter token to use, defaults to \r\n. Must be a string.
Examples
Convert a stream of rows with cells into a stream of lines:
iex> [~w(a b), ~w(c d)]
iex> |> CSV.encode
iex> |> Enum.take(2)
["a,b\r\n", "c,d\r\n"]
Convert a stream of rows with cells with escape sequences into a stream of lines:
iex> [["a\nb", "\tc"], ["de", "\tf\""]]
iex> |> CSV.encode(separator: ?\t, delimiter: "\n")
iex> |> Enum.take(2)
["\"a\\nb\"\t\"\\tc\"\n", "de\t\"\\tf\"\"\"\n"]
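Since the encoded lines are meant for writing to a file or other IO, the following sketch shows one way to write them out with the standard library. The lines list is copied from the output of the first example above rather than produced by calling the library, and the file name is an arbitrary choice for this sketch:

```elixir
# Lines as produced by the first encode example above.
lines = ["a,b\r\n", "c,d\r\n"]

# Arbitrary temporary file path, chosen for this sketch.
path = Path.join(System.tmp_dir!(), "csv_encode_sketch.csv")

# File.stream!/1 is collectable, so the lines can be piped into it.
Enum.into(lines, File.stream!(path))

written = File.read!(path)
IO.inspect(written)  # "a,b\r\nc,d\r\n"

# Clean up the temporary file.
File.rm!(path)
```

With a real encoded stream, Stream.into followed by Stream.run would keep the pipeline lazy instead of collecting eagerly.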