Gollum.Host (gollum v0.5.0)
Represents one host's robots.txt files.
Summary
Functions
Returns whether a path is crawlable by the given user agent, based on the rules defined in the given host struct.
Creates a new Gollum.Host struct, passing in the host and rules. The rules are usually the output of the parser.
Functions
crawlable?(host, user_agent, path)
Returns whether a path is crawlable by the given user agent, based on the rules defined in the given host struct.
Checks are done based on the robots.txt specification published by Google.
Examples
iex> alias Gollum.Host
iex> rules = %{
...> "hello" => %{
...> allowed: ["/p"],
...> disallowed: ["/"],
...> },
...> "otherhello" => %{
...> allowed: ["/$"],
...> disallowed: ["/"],
...> },
...> "*" => %{
...> allowed: ["/page"],
...> disallowed: ["/*.htm"],
...> },
...> }
iex> host = Host.new("hello.net", rules)
iex> Host.crawlable?(host, "Hello", "/page")
:crawlable
iex> Host.crawlable?(host, "OtherHello", "/page.htm")
:uncrawlable
iex> Host.crawlable?(host, "NotHello", "/page.htm")
:uncrawlable
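Per Google's specification, the most specific (longest) matching rule takes precedence, "*" matches any sequence of characters, and "$" anchors the end of the URL. The calls below are an illustrative addition (not part of the library's own doctests) against the host built above: "/photos" matches the longer allow rule "/p", while "/about" matches only the disallow rule "/".
iex> Host.crawlable?(host, "Hello", "/photos")
:crawlable
iex> Host.crawlable?(host, "Hello", "/about")
:uncrawlable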
new(host, rules)
Creates a new Gollum.Host struct, passing in the host and rules. The rules are usually the output of the parser (Gollum.Parser).
Examples
iex> alias Gollum.Host
iex> rules = %{"Hello" => %{allowed: [], disallowed: []}}
iex> Host.new("hello.net", rules)
%Gollum.Host{host: "hello.net", rules: %{"Hello" => %{allowed: [], disallowed: []}}}
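In practice, the rules map is usually produced from a robots.txt body by the parser rather than written by hand. A minimal sketch, assuming Gollum.Parser.parse/1 accepts the raw robots.txt content and returns a rules map in the shape shown above:
iex> alias Gollum.{Host, Parser}
iex> robots_txt = """
...> User-agent: *
...> Disallow: /private
...> """
iex> rules = Parser.parse(robots_txt)
iex> Host.new("hello.net", rules)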