Crawly v0.7.0 Crawly.Utils
Utility functions for Crawly
Summary
Functions
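build_absolute_url(url, base_url)
A helper function that joins a relative URL with a base URL.
build_absolute_urls(urls, base_url)
A helper function that joins each relative URL in a list with a base URL.
pipe(pipelines, item, state)
Pipeline/Middleware helper.
request_from_url(url)
A helper function that returns a Request structure for the given URL.
requests_from_urls(urls)
A helper function that converts a list of URLs into a list of requests.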
Functions
build_absolute_url(url, base_url)
A helper function that joins a relative URL with a base URL.
build_absolute_urls(urls, base_url)
A helper function that joins each relative URL in a list with a base URL.
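The joining behaviour described for these two helpers can be illustrated with Elixir's standard URI module. This is a minimal sketch, assuming the result should follow ordinary relative-reference resolution; UrlJoinSketch and its functions are hypothetical names used only for illustration and are not Crawly's source.

```elixir
defmodule UrlJoinSketch do
  # Hypothetical stand-in for illustration only; not Crawly's implementation.

  # Resolve a single (possibly relative) URL against an absolute base URL.
  def join(url, base_url) do
    base_url
    |> URI.merge(url)
    |> to_string()
  end

  # Resolve every URL in a list against the same base URL.
  def join_all(urls, base_url) do
    Enum.map(urls, &join(&1, base_url))
  end
end

base = "https://example.com/blog/index.html"

UrlJoinSketch.join("/about", base)
# => "https://example.com/about"

UrlJoinSketch.join("post-1.html", base)
# => "https://example.com/blog/post-1.html"

UrlJoinSketch.join_all(["/a", "https://other.example/b"], base)
# => ["https://example.com/a", "https://other.example/b"]
```

Note that URI.merge/2 raises an ArgumentError when the base URL is not absolute, and that an already absolute URL is returned unchanged.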
pipe(pipelines, item, state)
pipe(pipelines, item, state) :: result
when pipelines: [Crawly.Pipeline.t()],
item: map(),
state: map(),
result: {new_item | false, new_state},
new_item: map(),
new_state: map()
Pipeline/Middleware helper
Executes a given list of pipelines on the given item, mimicking filtermap behavior.
Takes an item and a state and passes them through a list of modules that implement the pipeline behaviour, executing each pipeline's Crawly.Pipeline.run/3 function.
A pipeline must return either a boolean (false) or an updated item.
If a pipeline returns false, the item is dropped and is not processed by any subsequent pipelines.
If a pipeline crashes, it is skipped and the item is passed on to the subsequent pipelines.
The state variable is used to persist information across multiple items.
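The drop-on-false and skip-on-crash rules above can be sketched in a few lines. This is an illustrative model only, not Crawly's source: PipeSketch is a hypothetical module, and pipelines are represented here as plain two-argument functions returning {item | false, state} rather than modules implementing Crawly.Pipeline.

```elixir
defmodule PipeSketch do
  # Illustrative model of the documented semantics; NOT Crawly's implementation.
  # Each pipeline is an anonymous function: (item, state) -> {item | false, state}.

  # No pipelines left: return whatever we have.
  def pipe([], item, state), do: {item, state}

  # The item was dropped earlier: skip all remaining pipelines.
  def pipe(_pipelines, false, state), do: {false, state}

  def pipe([pipeline | rest], item, state) do
    {new_item, new_state} =
      try do
        pipeline.(item, state)
      rescue
        # A pipeline that raises is skipped; item and state pass through unchanged.
        _error -> {item, state}
      end

    pipe(rest, new_item, new_state)
  end
end

upcase_title = fn item, state -> {Map.update!(item, :title, &String.upcase/1), state} end
count_items = fn item, state -> {item, Map.update(state, :seen, 1, &(&1 + 1))} end
crashing = fn _item, _state -> raise "boom" end
drop_all = fn _item, state -> {false, state} end

PipeSketch.pipe([upcase_title, crashing, count_items], %{title: "hello"}, %{})
# => {%{title: "HELLO"}, %{seen: 1}}

PipeSketch.pipe([drop_all, count_items], %{title: "hello"}, %{})
# => {false, %{}}
```

In the second call count_items never runs, which is why the state comes back empty.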
Usage in Tests
The Crawly.Utils.pipe/3 helper can be used in pipeline testing to simulate a set of middlewares/pipelines.
Internally, this function is used for both middlewares and pipelines. Hence, you can use it for testing modules that implement the Crawly.Pipeline behaviour.
For example, one can test that a given item is manipulated by a pipeline as follows:
item = %{my: "item"}
state = %{}
pipelines = [ MyCustomPipelineOrMiddleware ]
{new_item, new_state} = Crawly.Utils.pipe(pipelines, item, state)
request_from_url(url)
request_from_url(binary()) :: Crawly.Request.t()
A helper function that returns a Request structure for the given URL.
requests_from_urls(urls)
requests_from_urls([binary()]) :: [Crawly.Request.t()]
A helper function that converts a list of URLs into a list of requests.