Spidey v0.3.2 Spidey.Filter.DefaultFilter

An implementation of the Spidey.Filter behaviour which:

  1. Transforms relative URLs into absolute URLs.
  2. Strips the query parameters from all URLs, so that variants of the same page are easier to deduplicate.
  3. Strips the trailing slashes from all URLs.
  4. Rejects all URLs whose domain differs from the seed's.
  5. Rejects invalid URLs.
  6. Rejects static resources based on criteria such as WordPress paths and file type.

This implementation requires the :seed option.
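
Steps 2 and 3 exist so that several spellings of the same page collapse into a single entry. The sketch below illustrates that effect with the standard library only. `NormalizeSketch` and `normalize/1` are hypothetical names used for illustration and are not part of Spidey.

```elixir
defmodule NormalizeSketch do
  # Hypothetical helper, not Spidey's code: drops the query string and any
  # trailing slashes from a single URL.
  def normalize(url) do
    url
    |> URI.parse()
    |> Map.put(:query, nil)
    |> URI.to_string()
    |> String.trim_trailing("/")
  end
end

variants = [
  "https://example.com/blog",
  "https://example.com/blog/",
  "https://example.com/blog?page=2"
]

# All three variants normalize to the same string, so Enum.uniq/1 keeps one.
variants |> Enum.map(&NormalizeSketch.normalize/1) |> Enum.uniq()
# => ["https://example.com/blog"]
```

The trade-off is that pages which differ only by query string (pagination, for instance) are treated as one URL.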

Functions

process_relative_urls(urls, seed)

Specs

process_relative_urls(Enumerable.t(), String.t()) :: Enumerable.t()
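
As a rough illustration of what resolving relative URLs against a seed can look like, the sketch below uses `URI.merge/2` from the standard library. The module name `RelativeUrlSketch` is hypothetical, and whether Spidey resolves URLs this way is an assumption.

```elixir
defmodule RelativeUrlSketch do
  # Hypothetical re-implementation for illustration only.
  # URI.merge/2 resolves a relative reference against an absolute base and
  # returns already-absolute URLs unchanged.
  def process_relative_urls(urls, seed) do
    Stream.map(urls, fn url ->
      seed |> URI.merge(url) |> URI.to_string()
    end)
  end
end

["/about", "https://other.test/x"]
|> RelativeUrlSketch.process_relative_urls("https://example.com")
|> Enum.to_list()
# => ["https://example.com/about", "https://other.test/x"]
```

`URI.merge/2` raises if the base is not an absolute URI, so in this sketch the seed must include a scheme and host.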

reject_invalid_urls(urls)

Specs

reject_invalid_urls(Enumerable.t()) :: Enumerable.t()
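
The page does not say what counts as an invalid URL. The sketch below assumes one plausible rule: a URL is valid only if it parses to an `http` or `https` scheme with a non-empty host. Both the rule and the module name `InvalidUrlSketch` are assumptions made for illustration.

```elixir
defmodule InvalidUrlSketch do
  # Hypothetical validity rule (an assumption, not Spidey's definition):
  # keep only http/https URLs that have a host.
  def reject_invalid_urls(urls), do: Stream.reject(urls, &invalid?/1)

  defp invalid?(url) when is_binary(url) do
    case URI.parse(url) do
      %URI{scheme: scheme, host: host}
      when scheme in ["http", "https"] and is_binary(host) and host != "" ->
        false

      _ ->
        true
    end
  end

  # Anything that is not a string is treated as invalid.
  defp invalid?(_other), do: true
end

["https://example.com/page", "mailto:someone@example.com", "javascript:void(0)", ""]
|> InvalidUrlSketch.reject_invalid_urls()
|> Enum.to_list()
# => ["https://example.com/page"]
```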

reject_non_domain_urls(urls, seed)

Specs

reject_non_domain_urls(Enumerable.t(), String.t()) :: Enumerable.t()
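
A minimal sketch of domain filtering, assuming that "same domain" means an exact match on the host of the parsed URI. `DomainFilterSketch` is a hypothetical module, and the exact comparison Spidey performs may differ.

```elixir
defmodule DomainFilterSketch do
  # Hypothetical re-implementation for illustration only.
  # Compares hosts exactly, so subdomains count as a different domain here.
  def reject_non_domain_urls(urls, seed) do
    %URI{host: seed_host} = URI.parse(seed)

    Stream.reject(urls, fn url -> URI.parse(url).host != seed_host end)
  end
end

[
  "https://example.com/a",
  "https://blog.example.com/b",
  "https://other.test/c"
]
|> DomainFilterSketch.reject_non_domain_urls("https://example.com")
|> Enum.to_list()
# => ["https://example.com/a"]
```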

reject_static_resources(urls)

Specs

reject_static_resources(Enumerable.t()) :: Enumerable.t()
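
The exact criteria for static resources are not listed on this page. The sketch below shows one way the two mentioned criteria, WordPress paths and file type, could be combined. The extension list, the WordPress path segments, and the module name `StaticResourceSketch` are all assumptions chosen for illustration.

```elixir
defmodule StaticResourceSketch do
  # Both lists are illustrative assumptions, not Spidey's actual criteria.
  @static_extensions ~w(.css .js .png .jpg .jpeg .gif .svg .ico .pdf)
  @wordpress_segments ~w(wp-content wp-includes wp-json)

  def reject_static_resources(urls), do: Stream.reject(urls, &static?/1)

  defp static?(url) do
    path = URI.parse(url).path || ""
    extension = path |> Path.extname() |> String.downcase()

    extension in @static_extensions or
      Enum.any?(@wordpress_segments, &String.contains?(path, "/" <> &1))
  end
end

[
  "https://example.com/blog/post",
  "https://example.com/wp-content/uploads/a",
  "https://example.com/logo.PNG"
]
|> StaticResourceSketch.reject_static_resources()
|> Enum.to_list()
# => ["https://example.com/blog/post"]
```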

strip_query_params(urls)

Specs

strip_query_params(Enumerable.t()) :: Enumerable.t()

strip_trailing_slashes(urls)

Specs

strip_trailing_slashes(Enumerable.t()) :: Enumerable.t()