scrapy_cloud_ex v0.1.2 ScrapyCloudEx.Endpoints.Storage View Source
Documents commonalities between all storage endpoint-related functions.
Format
The :format
option given as an optional parameter must be one of
:json
, :csv
, :html
, :jl
, :text
, :xml
. If none is given, it
defaults to :json
. Note that not all functions will accept all format
values.
CSV options
When requesting results in CSV format with format: :csv
, additional
configuration parameters must be provided within the value associated
to the :csv
key:
:fields
- required, list of binaries indicating the fields to include, in order from left to right.:include_headers
- optional, boolean indicating whether to include the header names in the first row.:sep
- optional, separator character to use between cells.:quote
- optional, quote character.:escape
- optional, escape character.:lineend
- line end string.
Example
params = [format: :csv, csv: [fields: ~w(foo bar), include_headers: true]]
Pagination
The :pagination
option must be a keyword list containing pagination-relevant
options. Note that not all functions will accept all pagination options.
Providing pagination options outside of the :pagination
keyword list will
result in a warning.
Parameters:
:count
- number of results to provide.:start
- skip results before the given one. See a note about format below.:startafter
- return results after the given one. See a note about format below.:index
- a non-zero positive offset to retrieve specific records. May be provided multiple times.
While the index
parameter is just a short <entity_id>
(ex: [index: 4]
), start
and startafter
parameters should have the full form with 4 sections
<project_id>/<spider_id>/<job_id>/<entity_id>
(ex: [start: "1/2/3/4"]
, [startafter: "1/2/3/3"]
).
Example
params = [format: :json, pagination: [count: 100, index: 101]]
Meta parameters
You can use the :meta
parameter to return metadata for the record in addition to its core data.
The following values are available:
:_key
- the item key in the format:project_id/:spider_id/:job_id/:item_no
(String.t/0
).:_project
- the project id (integer/0
).:_ts
- timestamp in milliseconds for when the item was added (integer/0
).
Example
params = [meta: [:_key, :_ts]]