gollum v0.2.2 Gollum.Cache

Caches the robots.txt files from different hosts in memory.

Add this module to your supervision tree. Use it to fetch robots.txt files and cache the results automatically. It also ensures that two identical requests do not run at the same time.
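
As a rough illustration of adding the cache to a supervision tree, the sketch below uses a plain map child spec (Elixir 1.5 or later), so it does not assume that Gollum.Cache defines child_spec/1. MyApp.Application, MyApp.Supervisor and the refresh_secs value are hypothetical; only Gollum.Cache.start_link and its options come from this page.

```elixir
defmodule MyApp.Application do
  # Hypothetical application module, shown only to illustrate the wiring.
  use Application

  # Child spec as a plain map: the supervisor will call
  # Gollum.Cache.start_link([name: Gollum.Cache, refresh_secs: 3_600]).
  def cache_child_spec do
    %{
      id: Gollum.Cache,
      start: {Gollum.Cache, :start_link, [[name: Gollum.Cache, refresh_secs: 3_600]]}
    }
  end

  @impl true
  def start(_type, _args) do
    Supervisor.start_link([cache_child_spec()],
      strategy: :one_for_one,
      name: MyApp.Supervisor
    )
  end
end
```

Because the cache is started under its default name here, later fetch and get calls would not need the name option.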

Summary

Functions

fetch(host, opts \\ [])

Fetches the robots.txt from a host and stores it in the cache. It performs the HTTP request only if the cache holds no data for the host, the cached data is too old (as specified by the refresh_secs option in start_link/1), or the force flag is set. This function is useful if you know beforehand which hosts you need to request.

get(host, opts \\ [])

Gets the Gollum.Host struct for the specified host from the cache.

start_link(opts \\ [])

Starts up the cache.

Functions

fetch(host, opts \\ [])
fetch(binary, keyword) :: :ok | {:error, term}

Fetches the robots.txt from a host and stores it in the cache. It performs the HTTP request only if the cache holds no data for the host, the cached data is too old (as specified by the refresh_secs option in start_link/1), or the force flag is set. This function is useful if you know beforehand which hosts you need to request.
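
The three conditions that trigger an HTTP request can be read as a single predicate. The following is a minimal sketch, assuming timestamps in seconds and a strict "older than" comparison. RefreshRule and fetch_needed?/4 are hypothetical names used for illustration only; this is not Gollum's actual implementation.

```elixir
defmodule RefreshRule do
  # Hypothetical illustration of the rule described above, not library code.
  #
  # cached_at    - time of the last successful fetch in seconds, or nil if nothing is cached
  # now          - current time in seconds
  # refresh_secs - maximum age of cached data, as passed to start_link
  # force        - the force flag given to fetch
  def fetch_needed?(cached_at, now, refresh_secs, force) do
    # Assumption: data counts as too old only when strictly older than refresh_secs.
    force or is_nil(cached_at) or now - cached_at > refresh_secs
  end
end
```

With the default refresh_secs of 86_400, data cached 100 seconds ago would not trigger a request unless force is true.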

Options

  • name - The name of the GenServer. Default value is Gollum.Cache.

  • async - Whether the call is asynchronous. An asynchronous call always returns :ok. Defaults to false.

  • force - If the cache has already fetched from the host, this flag determines whether it should force a refresh. Default is false.
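
Since fetch returns either :ok or {:error, term}, a list of known hosts can be warmed up front and the failures collected. The sketch below assumes the cache is already running under its default name. RobotsWarmup is a hypothetical helper, the host strings are placeholders (check which host format the library expects), and the fetch parameter exists only so the helper can be exercised without network access.

```elixir
defmodule RobotsWarmup do
  # Hypothetical helper, not part of Gollum.
  # Calls fetch for every host and splits the results into
  # {succeeded, failed}, each a list of {host, result} tuples.
  def warm(hosts, fetch \\ &Gollum.Cache.fetch/2) do
    hosts
    |> Enum.map(fn host -> {host, fetch.(host, [])} end)
    |> Enum.split_with(fn {_host, result} -> result == :ok end)
  end
end

# Typical use at startup (requires the gollum dependency and a running cache):
#   {_ok, failed} = RobotsWarmup.warm(["https://example.com", "https://example.org"])
```

Passing force: true or async: true instead of the empty option list would follow the same pattern; with async: true every result would be :ok, so the failed list would always be empty.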

get(host, opts \\ [])
get(binary, keyword) :: Gollum.Host.t | nil

Gets the Gollum.Host struct for the specified host from the cache.

Options

  • name - The name of the GenServer. Default value is Gollum.Cache.
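
Because get returns nil when there is no entry for the host, callers usually need to handle both outcomes. A small sketch under that assumption follows; RobotsLookup and the :not_cached atom are hypothetical, and the get parameter is there only so the function can be tried without the library loaded.

```elixir
defmodule RobotsLookup do
  # Hypothetical wrapper, not part of Gollum.
  # Turns the `Gollum.Host.t | nil` return value into a tagged result.
  def lookup(host, get \\ &Gollum.Cache.get/2) do
    case get.(host, []) do
      nil -> :not_cached
      host_struct -> {:ok, host_struct}
    end
  end
end
```
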

start_link(opts \\ [])
start_link(keyword) :: {:ok, pid} | {:error, term}

Starts up the cache.

Options

  • name - The name of the GenServer. Default value is Gollum.Cache.

  • refresh_secs - The number of seconds until the robots.txt will be refetched from the host. Defaults to 86_400, which is 1 day.

  • lazy_refresh - If set to true, the file is refetched from the host only when needed. Otherwise, the file is refreshed automatically at the interval specified by refresh_secs. Defaults to false.

  • user_agent - The user agent to use when performing the GET request. Default is "Gollum".
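
To show how these options fit together, the sketch below starts a separately named cache. The option names come from the list above; every value, including the :robots_cache name, the "MyCrawler" user agent and the host string, is a placeholder. The guard lets the snippet load even when the gollum dependency is not available.

```elixir
# Option names are from the documentation above; the values are illustrative only.
opts = [
  name: :robots_cache,        # hypothetical non-default name
  refresh_secs: 6 * 60 * 60,  # treat cached files as stale after 6 hours
  lazy_refresh: true,         # refetch only when needed, not on a fixed interval
  user_agent: "MyCrawler"     # hypothetical user agent string
]

# Only runs when the gollum dependency is available.
if Code.ensure_loaded?(Gollum.Cache) do
  {:ok, _pid} = Gollum.Cache.start_link(opts)

  # A cache started under a non-default name has to be addressed
  # with the same name option in fetch and get.
  _cached = Gollum.Cache.get("https://example.com", name: :robots_cache)
end
```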