Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
v0.4.0 - 2022-11-29
Added
- Add Series.quotient/2 and Series.remainder/2 to work with integer division.
- Add Series.bintype/1 to return the underlying representation type.
- Allow series on both sides of binary operations, like add(series, 1) and add(1, series).
- Allow comparison, concat and coalesce operations on "(series, lazy series)".
- Add lazy version of Series.sample/3 and Series.size/1.
- Add support for Arrow IPC Stream files.
- Add Explorer.Query and the macros that allow a simplified query API. This is a huge improvement to some of the main functions, and allows referring to columns as if they were variables.
  Before this change we would need to write a filter like this:
      Explorer.DataFrame.filter_with(df, &Explorer.Series.greater(&1["col1"], 42))
  But now it's also possible to write this operation like this:
      Explorer.DataFrame.filter(df, col1 > 42)
  This operation is going to use filter_with/2 underneath, which means that it is going to use lazy series and compute the results at once. Note that it is mandatory to "require" the DataFrame module, since these operations are implemented as macros.
  The following new macros were added:
  - filter/2
  - mutate/2
  - summarise/2
  - arrange/2
  They substitute older versions that did not accept the new query syntax.
- Add DataFrame.put/3 to enable adding or replacing columns in an eager manner. This works similarly to the previous version of mutate/2.
- Add Series.select/3 operation that enables selecting a value from two series based on a predicate.
- Add "dump" and "load" functions to IO operations. They are useful to load or dump dataframes from/to memory.
- Add Series.to_iovec/2 and Series.to_binary/1. They return the underlying representation of series as binary. The first one returns a list of binaries, possibly with one element if the series is contiguous in memory. The second one returns a single binary representing the series.
- Add Series.shift/2 that shifts the series by an offset with nil values.
- Rename Series.fetch!/2 and Series.take_every/2 to Series.at/2 and Series.at_every/2.
- Add DataFrame.discard/2 to drop columns. This is the opposite of select/2.
- Implement Nx.LazyContainer for Explorer.DataFrame and Explorer.Series so data can be passed into Nx.
- Add Series.not/1 that negates values in a boolean series.
- Add the :binary dtype for Series. This enables the usage of arbitrary binaries.
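The Explorer.Query entry above can be tried with a minimal sketch like the following. It assumes Explorer v0.4.0 or later is installed through Mix.install/1. The DF alias and the sample dataframe are made up for illustration and are not part of the release.

```elixir
# Assumption: Explorer v0.4.0 or later can be fetched from Hex.
Mix.install([{:explorer, "~> 0.4"}])

# The query functions are macros, so the module has to be required first.
require Explorer.DataFrame, as: DF

# Hypothetical sample data.
df = DF.new(col1: [10, 42, 100], col2: ["a", "b", "c"])

# Query syntax: the column is referenced as if it were a variable.
filtered = DF.filter(df, col1 > 42)

# Callback form that works on lazy series.
filtered_with = DF.filter_with(df, &Explorer.Series.greater(&1["col1"], 42))

IO.inspect(DF.to_columns(filtered))
IO.inspect(DF.to_columns(filtered_with))
```

Both calls should keep only the row where col1 is greater than 42. Without the require line, the call to DF.filter/2 is expected to fail at compile time because it is a macro.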
Changed
- Change DataFrame's to_* functions to return only :ok.
- Change series inspect to resemble the dataframe inspect with the backend name.
- Rename Series.var/1 to Series.variance/1
- Rename Series.std/1 to Series.standard_deviation/1
- Rename Series.count/2 to Series.frequencies/1 and add a new Series.count/1 that returns the size of an "eager" series, or the count of members in a group for a lazy series. If there are no groups, it calculates the size of the dataframe.
- Change the option to control direction in Series.sort/2 and Series.argsort/2. Instead of a boolean, there is now a new option called :direction that accepts :asc or :desc.
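As a rough illustration of the :direction option for Series.sort/2 and Series.argsort/2, the following sketch sorts a small series in both directions. It assumes Explorer v0.4.0 or later is installed through Mix.install/1. The sample values are made up.

```elixir
# Assumption: Explorer v0.4.0 or later can be fetched from Hex.
Mix.install([{:explorer, "~> 0.4"}])

alias Explorer.Series

# Hypothetical sample data without nil values.
s = Series.from_list([3, 1, 2])

# :direction takes the place of the old boolean flag.
ascending = Series.sort(s, direction: :asc)
descending = Series.sort(s, direction: :desc)

# argsort/2 takes the same option and returns positions instead of values.
positions = Series.argsort(s, direction: :desc)

IO.inspect(Series.to_list(ascending))
IO.inspect(Series.to_list(descending))
IO.inspect(Series.to_list(positions))
```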
Fixed
- Fix the following DataFrame functions to work with groups:
  - filter_with/2
  - head/2
  - tail/2
  - slice/2
  - slice/3
  - pivot_longer/3
  - pivot_wider/4
  - concat_rows/1
  - concat_columns/1
- Improve the documentation of functions that behave differently with groups.
- Fix arrange_with/2 to use a stable "group by", making results more predictable.
- Add nil as a possible return value of aggregations.
- Fix the behaviour of Series.sort/2 and Series.argsort/2 to add nils at the front when direction is descending, or at the back when the direction is ascending. This also adds an option to control this behaviour.
Removed
- Remove support for NDJSON read and write for ARM 32-bit targets. This is due to a limitation of a dependency of Polars.
v0.3.1 - 2022-09-09
Fixed
- Define multiply inside *_with operations.
- Fix column types in several operations, such as n_distinct.
v0.3.0 - 2022-09-01
Added
- Add DataFrame.concat_columns/1 and DataFrame.concat_columns/2 for horizontally stacking dataframes.
- Add compression as an option to write parquet files.
- Add count metadata to DataFrame table reader.
- Add DataFrame.filter_with/2, DataFrame.summarise_with/2, DataFrame.mutate_with/2 and DataFrame.arrange_with/2. They all accept a DataFrame and a function, and they all work with a new concept called "lazy series".
  Lazy Series is an opaque representation of a series that can be used to perform complex operations without pulling data from the series. This is faster than using masks. There is no big difference from the API perspective compared to the functions that were accepting callbacks before (e.g. filter/2 and the new filter_with/2), with the exception being DataFrame.summarise_with/2 that now accepts a lot more operations.
Changed
- Bump version requirement of the table dependency to ~> 0.1.2, and raise for non-tabular values.
- Normalize how columns are handled. This changes some functions to accept one column or a list of columns, ranges, indexes and callbacks selecting columns.
- Rename DataFrame.filter/2 to DataFrame.mask/2.
- Rename Series.filter/2 to Series.mask/2.
- Rename take/2 from both Series and DataFrame to slice/2. slice/2 now accepts ranges as well.
- Raise an error if DataFrame.pivot_wider/4 has float columns as IDs. This is because we can't properly compare floats.
- Change DataFrame.distinct/2 to accept columns as argument instead of receiving it as option.
Fixed
- Ensure that we can compare boolean series in functions like Series.equal/2.
- Fix rename of columns after summarise.
- Fix inspect of float series containing NaN or Infinity values. They are represented as atoms.
Deprecated
- Deprecate DataFrame.filter/2 with a callback in favor of DataFrame.filter_with/2.
v0.2.0 - 2022-06-22
Added
- Consistently support ranges throughout the columns API
- Support negative indexes throughout the columns API
- Integrate with the table package
- Add Series.to_enum/1 for lazily traversing the series
- Add Series.coalesce/1 and Series.coalesce/2 for finding the first non-null value in a list of series
Changed
- Series.length/1 is now Series.size/1 in keeping with Elixir idioms
- Nx is now an optional dependency
- Minimum Elixir version is now 1.13
- DataFrame.to_map/2 is now DataFrame.to_columns/2 and DataFrame.to_series/2
- Rustler is now an optional dependency
- read_ and write_ IO functions are now from_ and to_
- to_binary is now dump_csv
- Now uses polars's "simd" feature
- Now uses polars's "performant" feature
- Explorer.default_backend/0 is now Explorer.Backend.get/0
- Explorer.default_backend/1 is now Explorer.Backend.put/1
- Series.cum_* functions are now Series.cumulative_* to mirror Nx
- Series.rolling_* functions are now Series.window_* to mirror Nx
- reverse? is now an option instead of an argument in Series.cumulative_* functions
- DataFrame.from_columns/2 and DataFrame.from_rows/2 are now DataFrame.new/2
- Rename "col" to "column" throughout the API
- Remove "with_" prefix in options throughout the API
- DataFrame.table/2 accepts options with :limit instead of a single integer
- rename/2 no longer accepts a function, use rename_with/2 instead
- rename_with/3 now expects the function as the last argument
Fixed
- Explorer now works on Linux with musl
v0.1.1 - 2022-04-27
Security
v0.1.0 - 2022-04-26
First release.