Link (link v1.0.2)

Link is a little link parsing, compacting and shortening library.

Summary

Functions

compact/1 reduces a URL to its most compact form. This is highly specific to our use case.

compact_github_url/1 does exactly as its name implies: compacts a GitHub URL down to its simplest version so that we aren't wasting characters on a mobile screen.

find/1 finds all instances of a URL in a block of text.

find_replace_compact/1 finds all instances of a URL in a block of text and replaces them with the compact/1 version.

regex/0 returns the Regular Expression needed to match URLs. It is written according to RFC 3986 (https://www.rfc-editor.org/rfc/rfc3986) and based on reading https://mathiasbynens.be/demo/url-regex. After much searching on Google, GitHub and StackOverflow, this is what we came up with.

strip_protocol/1 strips the protocol e.g: "https://" from a URL.

strip_trailing_slash/1 strips trailing forward slash from URL.

valid?/1 confirms if a URL is valid using the RFC 3986 compliant regex/0 above.

Functions

compact/1 reduces a URL to its most compact form. This is highly specific to our use case.

Examples

iex> Link.compact("https://github.com/dwyl/mvp/issues/141")
"dwyl/mvp#141"

# Can't understand the URL, just return it sans protocol:
iex> Link.compact("https://git.io/top")
"git.io/top"

iex> Link.compact("https://mvp.fly.dev/")
"mvp.fly.dev"

compact_github_url(url)

compact_github_url/1 does exactly as its name implies: compacts a GitHub URL down to its simplest version so that we aren't wasting characters on a mobile screen.

Examples

iex> Link.compact_github_url("https://github.com/dwyl/mvp/issues/141")
"dwyl/mvp#141"

iex> Link.compact_github_url("https://github.com/dwyl/app/issues/275#issuecomment-1646862277")
"dwyl/app#275"

iex> Link.compact_github_url("https://github.com/dwyl/link#123")
"dwyl/link"
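
The behaviour shown in these examples can be approximated with two regular expressions. This is only a minimal sketch, not the library's implementation: the module name `GithubCompactSketch`, both patterns, and the choice to return unrecognised URLs unchanged are assumptions made for illustration.

```elixir
defmodule GithubCompactSketch do
  # Hypothetical helper (not part of Link): compacts a GitHub URL to
  # "owner/repo#number" or "owner/repo". Anything else is returned unchanged.
  def compact(url) do
    # Owner, repo and an issue or pull request number.
    issue = Regex.run(~r{^(?:https?://)?github\.com/([^/#?]+)/([^/#?]+)/(?:issues|pull)/(\d+)}, url)
    # Owner and repo only.
    repo = Regex.run(~r{^(?:https?://)?github\.com/([^/#?]+)/([^/#?]+)}, url)

    case {issue, repo} do
      {[_, owner, name, number], _} -> "#{owner}/#{name}##{number}"
      {nil, [_, owner, name]} -> "#{owner}/#{name}"
      _ -> url
    end
  end
end
```

Because the first pattern is not anchored at the end of the string, a trailing fragment such as `#issuecomment-1646862277` is ignored, which matches the second example above.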

find/1 finds all instances of a URL in a block of text.

Examples

iex> Link.find("Text with links http://goo.gl/3co4ae and https://git.io/top & www.dwyl.com etc.")
["http://goo.gl/3co4ae", "https://git.io/top", "www.dwyl.com"]
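
Extracting every match from a block of text is the kind of job `Regex.scan/2` handles. The sketch below uses a deliberately simplified pattern that only recognises URLs starting with `http://`, `https://` or `www.`, so it is narrower than the expression returned by regex/0. The module name `FindSketch` and the pattern are assumptions for illustration.

```elixir
defmodule FindSketch do
  # Hypothetical helper (not part of Link): returns every URL-like token
  # that starts with a scheme or "www." and runs until the next whitespace.
  def find(text) do
    ~r{(?:https?://|www\.)\S+}
    |> Regex.scan(text)
    # Regex.scan/2 returns one list per match; flatten to a list of strings.
    |> List.flatten()
  end
end
```

Note that this simplified pattern would also swallow punctuation directly attached to a URL, such as a full stop at the end of a sentence.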


find_replace_compact(text)

find_replace_compact/1 finds all instances of a URL in a block of text and replaces them with the compact/1 version.

Examples

iex> md = "# Hello World! https://github.com/dwyl/mvp/issues/141#issuecomment-1657954420 and https://mvp.fly.dev/"
iex> Link.find_replace_compact(md)
"# Hello World! dwyl/mvp#141 and mvp.fly.dev"
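
Finding and replacing in one pass can be done by giving `Regex.replace/3` a function as the replacement. The following is a rough sketch under stated assumptions: `ReplaceSketch` and `shorten/1` are hypothetical names, the pattern only matches URLs with an explicit scheme, and `shorten/1` is a stand-in that drops the scheme and trailing slashes rather than the full compact/1 logic.

```elixir
defmodule ReplaceSketch do
  # Hypothetical helper (not part of Link): rewrites each URL in the text
  # using shorten/1.
  def find_replace(text) do
    # With an arity-1 function, Regex.replace/3 passes the whole match.
    Regex.replace(~r{https?://\S+}, text, fn url -> shorten(url) end)
  end

  # Stand-in for compact/1: drop the scheme and any trailing slashes.
  defp shorten(url) do
    url
    |> String.replace(~r{^https?://}, "")
    |> String.trim_trailing("/")
  end
end
```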

regex/0 returns the Regular Expression needed to match URLs. It is written according to RFC 3986 (https://www.rfc-editor.org/rfc/rfc3986) and based on reading https://mathiasbynens.be/demo/url-regex. After much searching on Google, GitHub and StackOverflow, this is what we came up with.

#HELPWANTED: if you find a better (faster, more compliant) RegEx that passes all our tests, please share! github.com/dwyl/link/issues/new

Explanation:

((http(s)?)://                # Optional scheme (http or https)
(www.)?                       # Optional "www"
a-zA-Z0-9@:%._+~#=]{2,256}    # Domain (IPv4, IPv6 or hostname)
:%                            # Optional port number
[a-z]{2,6}                    # Domain extension e.g: ".com"
(?:?[^\s]*)?                  # Optional query string ?q=
(?:#[^\s]*)?                  # Optional fragment e.g: #comment

Examples

iex> Regex.run(Link.regex(), "dwyl.com") |> List.flatten |> Enum.filter(& &1 != "") |> List.first
"dwyl.com"


strip_protocol(url)

strip_protocol/1 strips the protocol e.g: "https://" from a URL.

Examples

iex> Link.strip_protocol("https://dwyl.com")
"dwyl.com"
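
One simple way to get this effect is an anchored `String.replace/3`. This is a sketch rather than the library's code: the module name `ProtocolSketch` is hypothetical, and it assumes only `http://` and `https://` need to be handled.

```elixir
defmodule ProtocolSketch do
  # Hypothetical helper (not part of Link): removes a leading "http://"
  # or "https://". The ^ anchor leaves any later "://" in the URL alone.
  def strip(url), do: String.replace(url, ~r{^https?://}, "")
end
```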


strip_trailing_slash(url)

strip_trailing_slash/1 strips trailing forward slash from URL.

Examples

iex> Link.strip_trailing_slash("dwyl.com/")
"dwyl.com"
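
A comparable result is available from the standard library via `String.trim_trailing/2`. The sketch below uses the hypothetical module name `TrailingSlashSketch`. Unlike a function that removes exactly one slash, it removes every trailing slash, which may or may not match how the library treats input such as "dwyl.com//".

```elixir
defmodule TrailingSlashSketch do
  # Hypothetical helper (not part of Link): removes all trailing "/"
  # characters and leaves slashes elsewhere in the URL untouched.
  def strip(url), do: String.trim_trailing(url, "/")
end
```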

valid?/1 confirms if a URL is valid using the RFC 3986 compliant regex/0 above.

Examples

iex> Link.valid?("example.c")
false

iex> Link.valid?("https://www.example.com")
true
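
Validation of this kind usually comes down to matching the whole string against an anchored pattern. The sketch below is an assumption-laden illustration, not the expression returned by regex/0: `ValidSketch` is a hypothetical module and its pattern is a much simpler approximation that requires a hostname followed by a domain extension of 2 to 6 lowercase letters.

```elixir
defmodule ValidSketch do
  # Hypothetical helper (not part of Link): true when the whole string looks
  # like an optional scheme, optional "www.", a hostname, a 2-6 letter
  # extension and an optional path, query or fragment.
  def valid?(url) do
    Regex.match?(
      ~r|^(?:https?://)?(?:www\.)?[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\.[a-z]{2,6}(?:[/?#]\S*)?$|,
      url
    )
  end
end
```

With this pattern "example.c" is rejected because the extension has only one letter, while "https://www.example.com" is accepted.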