Dataset v0.4.0
Datasets represent labeled tabular data.
Datasets are enumerable:
iex> Dataset.new([{:a, :b, :c},
...> {:A, :B, :C},
...> {:i, :ii, :iii},
...> {:I, :II, :III}],
...> {"one", "two", "three"})
...> |> Enum.map(&elem(&1, 2))
[:c, :C, :iii, :III]
Datasets are also collectable:
iex> for x <- 0..10, into: Dataset.empty({:n}), do: x
%Dataset{labels: {:n}, rows: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
Summary
Functions
empty(labels \\ nil)
Return a dataset with no rows and with the labels given by the tuple passed as labels.
inner_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an inner join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset.
left_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a left join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset.
new(rows \\ [], labels \\ nil)
Construct a new dataset. A dataset is a list of tuples.
outer_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an outer join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset.
right_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a right join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset.
rotate(dataset)
Return a dataset with each value in row i and column j transposed into row j and column i.
select(ds, out_labels)
Return a new dataset with columns chosen from the input dataset ds.
to_map_list(ds)
Return the contents of ds as a list of maps.
Functions
empty(labels \\ nil)
Return a dataset with no rows and with the labels given by the tuple passed as labels. If labels is not specified, return an empty dataset with zero columns.
inner_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an inner join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset. The returned dataset will contain columns for each label specified in out_labels, which is a keyword list of the form [left_or_right: label, ...]. In the example below, k2 is omitted and :country_name serves as the key label for both datasets.
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.inner_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [{"ca", "4"}, {"de", "4"}, {"uk", "11"}, {"us", "13"}]
}
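The matching rule of an inner join can be illustrated without the library. The following is a minimal sketch, assuming rows are plain tuples and keys are given as tuple positions rather than labels. The module JoinSketch and its function are hypothetical names used only for illustration; they are not part of Dataset and do not describe how Dataset implements its joins.

```elixir
defmodule JoinSketch do
  # Hypothetical helper, not part of Dataset: inner-joins two lists of
  # tuples on the values found at position i1 (left) and i2 (right).
  # Each result is a {left_row, right_row} pair.
  def inner_join(left, right, i1, i2) do
    # Group the right-hand rows by key so each lookup is a map access.
    index = Enum.group_by(right, &elem(&1, i2))

    # Left rows whose key has no entry in the index produce no output.
    for l <- left,
        r <- Map.get(index, elem(l, i1), []) do
      {l, r}
    end
  end
end

clicks = [{"Canada", "4"}, {"France", "2"}]
countries = [{"ca", "Canada"}, {"nl", "Netherlands"}]

JoinSketch.inner_join(clicks, countries, 0, 1)
|> IO.inspect()
# prints [{{"Canada", "4"}, {"ca", "Canada"}}]
```

Only "Canada" appears on both sides, so it is the only row in the result, which mirrors how "France", "Netherlands" and "Singapore" are absent from the output above.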
left_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a left join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset. The returned dataset will contain columns for each label specified in out_labels, which is a keyword list of the form [left_or_right: label, ...].
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.left_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [{"ca", "4"}, {nil, "2"}, {"de", "4"}, {"uk", "11"}, {"us", "13"}]
}
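The nil in the {nil, "2"} row comes from a left row that has no match on the right. A minimal sketch of that rule, again assuming plain tuples and positional keys, is shown here. LeftJoinSketch is a hypothetical module, not part of Dataset.

```elixir
defmodule LeftJoinSketch do
  # Hypothetical helper, not part of Dataset: left-joins two lists of
  # tuples on position i1 (left) and i2 (right). A left row with no
  # match is kept and paired with nil instead of being dropped.
  def left_join(left, right, i1, i2) do
    index = Enum.group_by(right, &elem(&1, i2))

    Enum.flat_map(left, fn l ->
      case Map.get(index, elem(l, i1)) do
        nil -> [{l, nil}]
        matches -> Enum.map(matches, &{l, &1})
      end
    end)
  end
end

clicks = [{"Canada", "4"}, {"France", "2"}]
countries = [{"ca", "Canada"}, {"nl", "Netherlands"}]

LeftJoinSketch.left_join(clicks, countries, 0, 1)
|> IO.inspect()
# prints [{{"Canada", "4"}, {"ca", "Canada"}}, {{"France", "2"}, nil}]
```

A right join follows the same rule with the roles of the two inputs swapped, and an outer join keeps unmatched rows from both sides.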
new(rows \\ [], labels \\ nil)
Construct a new dataset. A dataset is a list of tuples. With no arguments, an empty dataset with zero columns is constructed. With one argument, the passed object is interpreted as the rows, and integer labels beginning with 0 are generated; the number of labels is determined by the size of the first tuple in the data. With two arguments, the second is used as the tuple of labels.
iex> Dataset.new()
%Dataset{rows: [], labels: {}}
iex> Dataset.new([{:foo, :bar}, {:eggs, :ham}])
%Dataset{rows: [foo: :bar, eggs: :ham], labels: {0, 1}}
iex> Dataset.new([{0,0}, {1, 1}, {2, 4}, {3, 9}],
...> {:x, :x_squared})
%Dataset{labels: {:x, :x_squared}, rows: [{0, 0}, {1, 1}, {2, 4}, {3, 9}]}
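The label generation described for the one-argument form can be sketched as follows. This is an illustration under the assumption that rows are tuples; LabelSketch.default_labels/1 is a hypothetical function, not part of Dataset.

```elixir
defmodule LabelSketch do
  # Hypothetical helper, not part of Dataset: builds the tuple
  # {0, 1, ..., n - 1}, where n is the size of the first row.
  def default_labels([]), do: {}

  def default_labels([first | _rest]) do
    n = tuple_size(first)

    # Count upward from 0 and keep the first n integers.
    Stream.iterate(0, &(&1 + 1))
    |> Enum.take(n)
    |> List.to_tuple()
  end
end

LabelSketch.default_labels([{:foo, :bar}, {:eggs, :ham}])
|> IO.inspect()
# prints {0, 1}
```

Only the first row is inspected, so in this sketch a later row of a different size would not change the generated labels.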
outer_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an outer join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset. The returned dataset will contain columns for each label specified in out_labels, which is a keyword list of the form [left_or_right: label, ...].
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.outer_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [
    {"ca", "4"},
    {nil, "2"},
    {"de", "4"},
    {"nl", nil},
    {"sg", nil},
    {"uk", "11"},
    {"us", "13"}
  ]
}
right_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a right join on datasets ds1 and ds2, using k1 and k2 as the key labels on each respective dataset. The returned dataset will contain columns for each label specified in out_labels, which is a keyword list of the form [left_or_right: label, ...].
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.right_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [
    {"ca", "4"},
    {"de", "4"},
    {"nl", nil},
    {"sg", nil},
    {"uk", "11"},
    {"us", "13"}
  ]
}
rotate(dataset)
Return a dataset with each value in row i and column j transposed into row j and column i. The resulting dataset is labelled with integer indices beginning with zero.
iex> Dataset.new([{:a,:b,:c},
...> {:A, :B, :C},
...> {:i, :ii, :iii},
...> {:I, :II, :III}])
...> |> Dataset.rotate()
%Dataset{
labels: {0, 1, 2, 3},
rows: [{:a, :A, :i, :I},
{:b, :B, :ii, :II},
{:c, :C, :iii, :III}]
}
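The transposition itself can be expressed with standard library functions. The sketch below assumes every row is a tuple of the same size; RotateSketch is a hypothetical module, not part of Dataset.

```elixir
defmodule RotateSketch do
  # Hypothetical helper, not part of Dataset: transposes a list of
  # equally sized tuples, so the value at row i, column j ends up at
  # row j, column i.
  def transpose(rows) do
    rows
    |> Enum.map(&Tuple.to_list/1)
    # Enum.zip/1 takes the k-th element of every list and returns
    # them together as the k-th tuple of the result.
    |> Enum.zip()
  end
end

RotateSketch.transpose([{:a, :b}, {:A, :B}, {:i, :ii}])
|> IO.inspect()
# prints [{:a, :A, :i}, {:b, :B, :ii}]
```

Three input rows of two columns become two output rows of three columns, which is why the rotated dataset above needs four integer labels for its four input rows.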
select(ds, out_labels)
Return a new dataset with columns chosen from the input dataset ds. The columns to keep are named by the list out_labels.
iex> Dataset.new([{:a,:b,:c},
...> {:A, :B, :C},
...> {:i, :ii, :iii},
...> {:I, :II, :III}],
...> {"first", "second", "third"})
...> |> Dataset.select(["second"])
%Dataset{rows: [{:b}, {:B}, {:ii}, {:II}], labels: {"second"}}
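Column selection amounts to resolving each requested label to a column position and picking those positions from every row. The following is a minimal sketch under that assumption; SelectSketch is a hypothetical module, not part of Dataset, and it takes rows and labels as separate arguments rather than a %Dataset{} struct.

```elixir
defmodule SelectSketch do
  # Hypothetical helper, not part of Dataset: keeps only the columns
  # named in out_labels, in the order they are requested.
  def select(rows, labels, out_labels) do
    all = Tuple.to_list(labels)

    # Resolve each requested label to its column position.
    positions =
      Enum.map(out_labels, fn label ->
        Enum.find_index(all, &(&1 == label))
      end)

    Enum.map(rows, fn row ->
      positions
      |> Enum.map(&elem(row, &1))
      |> List.to_tuple()
    end)
  end
end

rows = [{:a, :b, :c}, {:A, :B, :C}]

SelectSketch.select(rows, {"first", "second", "third"}, ["second"])
|> IO.inspect()
# prints [{:b}, {:B}]
```

In this sketch an unknown label resolves to nil and elem/2 raises an ArgumentError; how Dataset.select handles that case is not stated in the documentation above.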
to_map_list(ds)
Return the contents of ds as a list of maps.
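The documentation gives no example for this function, so the exact shape of the returned maps is not confirmed here. One plausible reading, sketched below as an assumption, is that each row becomes a map whose keys are the dataset's labels. MapListSketch is a hypothetical module, not part of Dataset.

```elixir
defmodule MapListSketch do
  # Hypothetical helper, not part of Dataset: pairs each label with the
  # value in the same column and builds one map per row.
  def to_map_list(rows, labels) do
    keys = Tuple.to_list(labels)

    Enum.map(rows, fn row ->
      keys
      |> Enum.zip(Tuple.to_list(row))
      |> Map.new()
    end)
  end
end

MapListSketch.to_map_list([{0, 0}, {2, 4}], {:x, :x_squared})
|> IO.inspect()
# prints [%{x: 0, x_squared: 0}, %{x: 2, x_squared: 4}]
```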