Explorer.Query (Explorer v0.5.1)
High-level query for Explorer.
Queries convert regular Elixir code into expressions that compile to
efficient dataframe operations. Inside a query, only a limited set of
Series operations is available, and identifiers, such as strs and nums,
represent dataframe column names:
iex> df = Explorer.DataFrame.new(strs: ["a", "b", "c"], nums: [1, 2, 3])
iex> Explorer.DataFrame.filter(df, nums > 2)
#Explorer.DataFrame<
  Polars[1 x 2]
  strs string ["c"]
  nums integer [3]
>
If a column name has an unusual format (for example, it contains spaces),
you can either rename it beforehand or use col/1 inside queries:
iex> df = Explorer.DataFrame.new("unusual nums": [1, 2, 3])
iex> Explorer.DataFrame.filter(df, col("unusual nums") > 2)
#Explorer.DataFrame<
  Polars[1 x 1]
  unusual nums integer [3]
>
All operations from Explorer.Series are imported inside queries.
This module also provides operators for use in queries, and they are
imported into queries as well.
Supported operations
Queries are supported in the following operations:
Explorer.DataFrame.arrange/2
Explorer.DataFrame.filter/2
Explorer.DataFrame.mutate/2
Explorer.DataFrame.summarise/2
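As a rough illustration of queries outside of filter/2, the sketch below uses mutate/2 and arrange/2 on the same small dataframe used throughout this page. It assumes it runs as a standalone script (hence the Mix.install/1 call, which is not needed inside a project that already depends on Explorer), and the DF alias and the doubled column name are chosen only for this example.

```elixir
# Assumption: run as a standalone script; inside a Mix project, drop this line.
Mix.install([{:explorer, "~> 0.5.1"}])

# The query-based functions are macros, so the module must be required first.
require Explorer.DataFrame, as: DF

df = DF.new(strs: ["a", "b", "c"], nums: [1, 2, 3])

# mutate/2: `nums` refers to the column, `*` is the query operator.
doubled = DF.mutate(df, doubled: nums * 2)

# arrange/2: sort rows by the `nums` column in descending order.
sorted = DF.arrange(df, desc: nums)

IO.inspect(Explorer.Series.to_list(doubled["doubled"]))
IO.inspect(Explorer.Series.to_list(sorted["strs"]))
```

The require line matters: without it, the compiler treats nums as an undefined variable instead of a column name.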
Interpolation
If you want to access variables defined outside of the query,
or get access to all Elixir constructs, you must use ^:
iex> min = 2
iex> df = Explorer.DataFrame.new(strs: ["a", "b", "c"], nums: [1, 2, 3])
iex> Explorer.DataFrame.filter(df, nums > ^min)
#Explorer.DataFrame<
  Polars[1 x 2]
  strs string ["c"]
  nums integer [3]
>
iex> min = 2
iex> df = Explorer.DataFrame.new(strs: ["a", "b", "c"], nums: [1, 2, 3])
iex> Explorer.DataFrame.filter(df, nums < ^if(min > 0, do: 10, else: -10))
#Explorer.DataFrame<
  Polars[3 x 2]
  strs string ["a", "b", "c"]
  nums integer [1, 2, 3]
>
^ can be used with col to access columns dynamically:
iex> df = Explorer.DataFrame.new("unusual nums": [1, 2, 3])
iex> name = "unusual nums"
iex> Explorer.DataFrame.filter(df, col(^name) > 2)
#Explorer.DataFrame<
  Polars[1 x 1]
  unusual nums integer [3]
>
Implementation details
Queries simply become lazy dataframe operations at runtime. For example, the following query
Explorer.DataFrame.filter(df, nums > 2)
is equivalent to
Explorer.DataFrame.filter_with(df, fn df -> Explorer.Series.greater(df["nums"], 2) end)
This means that, whenever you want to generate queries programmatically,
you can fall back to the regular _with APIs.
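To make the fallback concrete, here is a minimal sketch of a filter whose column and threshold are only known at runtime. The QuerySketch module and its above/3 function are hypothetical names introduced for this example; only filter_with/2 and Explorer.Series.greater/2 come from Explorer itself. The Mix.install/1 call assumes a standalone script.

```elixir
# Assumption: run as a standalone script; inside a Mix project, drop this line.
Mix.install([{:explorer, "~> 0.5.1"}])

defmodule QuerySketch do
  # Hypothetical helper: keeps the rows where `column` is greater than `threshold`.
  # No macros are involved, so both arguments can be plain runtime values.
  def above(df, column, threshold) do
    Explorer.DataFrame.filter_with(df, fn ldf ->
      # `ldf` is the dataframe handed to the callback; index it by column name.
      Explorer.Series.greater(ldf[column], threshold)
    end)
  end
end

df = Explorer.DataFrame.new(strs: ["a", "b", "c"], nums: [1, 2, 3])
result = QuerySketch.above(df, "nums", 2)

IO.inspect(Explorer.Series.to_list(result["strs"]))
```

Because above/3 is an ordinary function, the column name can come from user input or configuration without any interpolation.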
Summary
Functions
left ** right
Delegate to Explorer.Series.pow/2.
left * right
Delegate to Explorer.Series.multiply/2.
+number
Unary plus operator.
left + right
Delegate to Explorer.Series.add/2.
-number
Unary minus operator.
left - right
Delegate to Explorer.Series.subtract/2.
left / right
Delegate to Explorer.Series.divide/2.
left != right
Delegate to Explorer.Series.not_equal/2.
left < right
Delegate to Explorer.Series.less/2.
left <= right
Delegate to Explorer.Series.less_equal/2.
left == right
Delegate to Explorer.Series.equal/2.
left > right
Delegate to Explorer.Series.greater/2.
left >= right
Delegate to Explorer.Series.greater_equal/2.
col(name)
Access a column programmatically.
query(expression)
Builds an anonymous function from a query.
Functions
left ** right
Delegate to Explorer.Series.pow/2.
left * right
Delegate to Explorer.Series.multiply/2.
+number
Unary plus operator.
Works with numbers and series.
left + right
Delegate to Explorer.Series.add/2.
-number
Unary minus operator.
Works with numbers and series.
left - right
Delegate to Explorer.Series.subtract/2.
left / right
Delegate to Explorer.Series.divide/2.
left != right
Delegate to Explorer.Series.not_equal/2.
left < right
Delegate to Explorer.Series.less/2.
left <= right
Delegate to Explorer.Series.less_equal/2.
left == right
Delegate to Explorer.Series.equal/2.
left > right
Delegate to Explorer.Series.greater/2.
left >= right
Delegate to Explorer.Series.greater_equal/2.
col(name)
Access a column programmatically.
name must be an atom, a string, or an integer.
It is equivalent to df[name], but inside a query.
query(expression)
Builds an anonymous function from a query.
This is the entry point used by Explorer.DataFrame.filter/2
and friends to convert queries into anonymous functions.
See the moduledoc for more information.
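The following sketch shows one way the resulting anonymous function might be used directly, by passing it to Explorer.DataFrame.filter_with/2. It assumes query is a macro (so Explorer.Query must be required first) and that it runs as a standalone script; the keep_large variable name is made up for the example.

```elixir
# Assumption: run as a standalone script; inside a Mix project, drop this line.
Mix.install([{:explorer, "~> 0.5.1"}])

# Assumption: query is a macro, so the module has to be required before use.
require Explorer.Query

df = Explorer.DataFrame.new(strs: ["a", "b", "c"], nums: [1, 2, 3])

# Build the anonymous function once; `nums` is resolved as a column name.
keep_large = Explorer.Query.query(nums > 2)

# Hand the function to the matching _with API.
result = Explorer.DataFrame.filter_with(df, keep_large)

IO.inspect(Explorer.Series.to_list(result["strs"]))
```

This should behave the same as Explorer.DataFrame.filter(df, nums > 2), with the query stored in a variable so it can be reused.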