View Source ReqAthena (ReqAthena v0.2.0)

Req plugin for AWS Athena.

ReqAthena makes it easy to make Athena queries and save the results into S3 buckets.

By default, ReqAthena will query results and use the default output format, which is CSV. To change that, you can use the :format option documented bellow.

Summary

Functions

Attaches to Req request.

Functions

Link to this function

attach(request, options \\ [])

View Source

Attaches to Req request.

Request Options

  • :access_key_id - Required. The Access Key ID from AWS credentials.

  • :secret_access_key - Required. The Secret Access Key from AWS credentials.

  • :token - Optional. The Session Token from AWS credentials.

  • :region - Required. The AWS region where AWS Athena is installed.

  • :database - Required. The AWS Athena database name.

  • :output_location - Optional. The S3 URL location to output AWS Athena query results. Results will be saved as Parquet and loaded with Explorer only if this option is given.

  • :workgroup - Conditional. The AWS Athena workgroup.

  • :cache_query - Optional. Forces a non-cached result from AWS Athena.

  • :format - Optional. The output format. Can be one of:

    • :none (default) - return decoded API response from Athena.

    • :csv - return contents of the CSV file.

    • :json - return contents of the JSON file.

      Note: Req by default automatically decodes JSON response body (decode_body step) and to prevent it from doing so, set decode_body: false.

    • :explorer - return contents in parquet format, lazy loaded into Explorer data frame.

    There are some limitations when using the :json and :explorer format. First, you need to install Explorer in order to use the :explorer format. Second, when using these format, you always need to provide a different output location. See the UNLOAD command docs for more details.

  • :output_compression - Optional. Sets the Parquet compression format and level for the output when using the Explorer output format. This can be a string, like "gzip", or a tuple with {format, level}, like: {"ZSTD", 4}. By default this is nil, which means that for Parquet (the format that Explorer uses) this is going to be "gzip".

  • :athena - Required. The query to execute. It can be a plain SQL string or a {query, params} tuple, where query can contain ? placeholders and params is a list of corresponding values.

    There is a limitation of Athena that requires the :output_location to be present for every query that outputs to a format other than "CSV". So we append "results" to the :output_location to make the partition files be saved there.

Conditional fields must always be defined, and can be one of the fields or both.

If you want to set any of these options when attaching the plugin, pass them as the second argument.

Examples

With plain query string:

iex> opts = [
...>   access_key_id: System.fetch_env!("AWS_ACCESS_KEY_ID"),
...>   secret_access_key: System.fetch_env!("AWS_SECRET_ACCESS_KEY"),
...>   region: System.fetch_env!("AWS_REGION"),
...>   database: "default",
...>   output_location: System.fetch_env!("AWS_ATHENA_OUTPUT_LOCATION")
...> ]
iex> query = "SELECT id, type, tags, members, timestamp, visible FROM planet WHERE id = 470454 and type = 'relation'"
iex> req = Req.new() |> ReqAthena.attach(opts)
iex> Req.post!(req, athena: query, format: :json).body
%{
  "id" => 470454,
  "members" => [
    %{"ref" => 670007839, "role" => "", "type" => "node"},
    %{"ref" => 670007840, "role" => "", "type" => "node"}
  ],
  "tags" => %{
    "name" => "Mérignac A",
    "network" => "NTF-5",
    "ref" => "17229A",
    "site" => "geodesic",
    "source" => "©IGN 2010 dans le cadre de la cartographie réglementaire",
    "type" => "site",
    "url" => "http://geodesie.ign.fr/fiches/index.php?module=e&action=fichepdf&source=carte&sit_no=17229A"
  },
  "timestamp" => "2017-01-21 12:51:34",
  "type" => "relation",
  "visible" => true
}

With parameterized query:

iex> opts = [
...>   access_key_id: System.fetch_env!("AWS_ACCESS_KEY_ID"),
...>   secret_access_key: System.fetch_env!("AWS_SECRET_ACCESS_KEY"),
...>   region: System.fetch_env!("AWS_REGION"),
...>   database: "default",
...>   output_location: System.fetch_env!("AWS_ATHENA_OUTPUT_LOCATION")
...> ]
iex> query = "SELECT id, type FROM planet WHERE id = ? and type = ?"
iex> req = Req.new() |> ReqAthena.attach(opts)
iex> Req.post!(req, athena: {query, [239_970_142, "node"]}, format: :json).body
[%{"id" => 239970142, "type" => "node"}]