scrapy_cloud_ex v0.1.0 ScrapyCloudEx.Endpoints.Storage.JobQ
Wraps the JobQ endpoint.
The JobQ API allows you to retrieve finished jobs from the queue.
Summary
Functions
Counts the jobs for the specified project.
Lists the jobs for the specified project, ordered from most recent to oldest.
Functions
Counts the jobs for the specified project.
The following parameters are supported in the params argument:
:spider - the spider name.
:state - return jobs with the specified state. Supported values: "pending", "running", "finished", "deleted".
:startts - UNIX timestamp at which to begin results, in milliseconds.
:endts - UNIX timestamp at which to end results, in milliseconds.
:count - limit results to a given number of jobs.
:has_tag - return jobs with the specified tag. May be given multiple times, and will behave as a logical OR operation among the values.
:lacks_tag - return jobs that lack the specified tag. May be given multiple times, and will behave as a logical AND operation among the values.
The opts value is documented here.
See docs here.
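The OR and AND semantics of :has_tag and :lacks_tag can be illustrated with a small local sketch. The `TagFilter` module and its `keeps?/2` function are hypothetical and not part of scrapy_cloud_ex; the real filtering happens on the server. The sketch only mirrors the rules stated above, and assumes that repeated keys in a keyword list are how a parameter is "given multiple times".

```elixir
defmodule TagFilter do
  # Hypothetical helper that mimics the documented tag semantics locally.
  # It does not call the API.
  def keeps?(job_tags, params) do
    has = Keyword.get_values(params, :has_tag)
    lacks = Keyword.get_values(params, :lacks_tag)

    # :has_tag values combine with OR: one matching tag is enough.
    has_ok = has == [] or Enum.any?(has, fn tag -> tag in job_tags end)
    # :lacks_tag values combine with AND: every listed tag must be absent.
    lacks_ok = Enum.all?(lacks, fn tag -> tag not in job_tags end)

    has_ok and lacks_ok
  end
end

# A keyword list may repeat a key, which is how a parameter is given twice.
params = [has_tag: "nightly", has_tag: "weekly", lacks_tag: "consumed"]

TagFilter.keeps?(["nightly"], params)             # true
TagFilter.keeps?(["nightly", "consumed"], params) # false
TagFilter.keeps?(["manual"], params)              # false
```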
Example
ScrapyCloudEx.Endpoints.Storage.JobQ.count("API_KEY", "14", state: "running", has_tag: "sometag")
# {:ok, 4}
Lists the jobs for the specified project, ordered from most recent to oldest.
The following parameters are supported in the params argument:
:format - the format to be used for returning results. Can be :json or :jl. Defaults to :json.
:pagination - the :count pagination parameter is supported.
:spider - the spider name.
:state - return jobs with the specified state. Supported values: "pending", "running", "finished", "deleted".
:startts - UNIX timestamp at which to begin results, in milliseconds.
:endts - UNIX timestamp at which to end results, in milliseconds.
:start - offset of initial jobs to skip in returned results.
:end - job key at which to stop showing results.
:key - job key for which to get job data. May be given multiple times.
:has_tag - return jobs with the specified tag. May be given multiple times, and will behave as a logical OR operation among the values.
:lacks_tag - return jobs that lack the specified tag. May be given multiple times, and will behave as a logical AND operation among the values.
The opts value is documented here.
See docs here.
List jobs finished between two timestamps
If you pass the startts and endts parameters, the API will return only the jobs finished between them.
ScrapyCloudEx.Endpoints.Storage.JobQ.list("API_KEY", 53, startts: 1359774955431, endts: 1359774955440)
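Since startts and endts are expressed in milliseconds, it can be convenient to derive them from DateTime values rather than typing raw integers. The following is a minimal sketch; the `JobQWindow` module and its `params/2` function are hypothetical helpers, not part of scrapy_cloud_ex.

```elixir
defmodule JobQWindow do
  # Hypothetical helper: builds the :startts/:endts params from two
  # DateTime values, converting each to a UNIX timestamp in milliseconds.
  def params(%DateTime{} = from, %DateTime{} = to) do
    [
      startts: DateTime.to_unix(from, :millisecond),
      endts: DateTime.to_unix(to, :millisecond)
    ]
  end
end

{:ok, from, 0} = DateTime.from_iso8601("2013-02-02T03:15:55.431Z")
{:ok, to, 0} = DateTime.from_iso8601("2013-02-02T03:15:55.440Z")

JobQWindow.params(from, to)
# [startts: 1359774955431, endts: 1359774955440]
```

The resulting keyword list can then be passed as the params argument, for example `ScrapyCloudEx.Endpoints.Storage.JobQ.list("API_KEY", 53, JobQWindow.params(from, to))`.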
Retrieve jobs finished after some job
JobQ returns the list of jobs, with the most recently finished first. It is recommended to associate the key of the most recently finished job with the downloaded data. When you want to update your data later on, you can list the jobs and stop at the previously downloaded job, through the :stop parameter.
ScrapyCloudEx.Endpoints.Storage.JobQ.list("API_KEY", 53, stop: "53/7/81")
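The update pattern described above can be sketched as a small checkpoint helper. `JobQCheckpoint` and `next_stop/2` are hypothetical names, not part of scrapy_cloud_ex. The sketch assumes only that the job list arrives most recent first and that each job map carries a "key" entry, as in the example return value.

```elixir
defmodule JobQCheckpoint do
  # Hypothetical helper: picks the job key to store alongside the
  # downloaded data, for use as the stop value on the next update.

  # The list is ordered most recent first, so the head holds the newest key.
  def next_stop([%{"key" => key} | _older], _previous), do: key

  # No new jobs were returned: keep the previous checkpoint.
  def next_stop([], previous), do: previous
end

# Intended use (not run here; it needs the library and a real API key):
#   {:ok, jobs} = ScrapyCloudEx.Endpoints.Storage.JobQ.list("API_KEY", 53, stop: previous)
#   checkpoint = JobQCheckpoint.next_stop(jobs, previous)

# Illustrative data only, shaped like the documented job maps.
jobs = [%{"key" => "53/7/83"}, %{"key" => "53/7/82"}]
JobQCheckpoint.next_stop(jobs, "53/7/81")
# "53/7/83"
```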
Example return value
{:ok, [
  %{
    "close_reason" => "cancelled",
    "elapsed" => 485061225,
    "errors" => 1,
    "finished_time" => 1540745154657,
    "items" => 2783,
    "key" => "345675/1/26",
    "logs" => 20,
    "pages" => 2888,
    "pending_time" => 1540744974169,
    "running_time" => 1540744974190,
    "spider" => "sixbid.com",
    "state" => "finished",
    "ts" => 1540745141316,
    "version" => "5ef2169-master"
  }
]}
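A return value of this shape can be consumed with ordinary pattern matching. The sketch below is illustrative only: `JobQSummary` is a hypothetical module, and the `{:error, reason}` clause is an assumption about the failure shape, which this page does not document.

```elixir
defmodule JobQSummary do
  # Hypothetical helper: condenses a list result into a few totals.
  def summarize({:ok, jobs}) when is_list(jobs) do
    {:ok,
     %{
       jobs: length(jobs),
       # Treat a missing "items" entry as zero scraped items.
       items: jobs |> Enum.map(&Map.get(&1, "items", 0)) |> Enum.sum(),
       keys: Enum.map(jobs, &Map.get(&1, "key"))
     }}
  end

  # Assumption: failures arrive as {:error, reason} and are passed through.
  def summarize({:error, _reason} = error), do: error
end

result = {:ok, [%{"items" => 2783, "key" => "345675/1/26", "state" => "finished"}]}
JobQSummary.summarize(result)
# {:ok, %{items: 2783, jobs: 1, keys: ["345675/1/26"]}}
```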