Metalove
Metalove is an Elixir Application to scrape podcast RSS feeds to extract and provide as much of the available metadata as possible. This includes relevant ID3 tag parsing to extract chapter, link and image metadata.
Metalove is intended to be a stateful live repository caching the scraped data. A one shot mode to just get one specific feed/metadata is also provided.
Installation
The package can be installed
by adding metalove
to your list of dependencies in mix.exs
:
def deps do
[
{:metalove, "~> 0.2"}
]
end
Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/metalove.
Basic Usage
Use the main Metalove module to trigger scraping of the urls you like. The use the hierarchy of structs/modules to access them.
A Metalove.Podcast
can reference many Metalove.PodcastFeed
s which in turn have Metalove.Episode
s with Metalove.Enclosure
s. Once scraped, PodcastFeed
s and their Episode
s can be fetched using their corresponding get_…
functions.
iex> feed_or_website_url = "forschergeist.de"
iex> podcast = Metalove.get_podcast(feed_or_website_url)
%Metalove.Podcast{
created_at: #DateTime<2019-02-23 13:09:48.632101Z>,
feed_urls: ["http://forschergeist.de/feed/opus/",
"http://forschergeist.de/feed/oga/",
"http://forschergeist.de/feed/m4a/",
"http://forschergeist.de/feed/mp3/"],
id: "forschergeist.de",
main_feed_url: "http://forschergeist.de/feed/mp3/",
updated_at: #DateTime<2019-02-23 13:23:10.917299Z>
}
iex> feed = Metalove.PodcastFeed.get_by_feed_url(podcast.main_feed_url)
iex> most_recent_episode = Metalove.Episode.get_by_episode_id(hd(feed.episodes))
Mix Tasks
ml.chapter
Parses the ID3 tag of an mp3 url or file, writes out the images and the podlove simple chapter tags. E.g.
$ mix ml.chapter --base-url http://atp.fm/img/chapter/atp312/ -o /tmp http://traffic.libsyn.com/atpfm/atp312.mp3
Extracted: /tmp/Chapter01.png
Extracted: /tmp/Chapter03.jpeg
Extracted: /tmp/Chapter05.jpeg
Extracted: /tmp/Chapter06.jpeg
Extracted: /tmp/Chapter07.jpeg
<psc:chapters version="1.2" xmlns:psc="http://podlove.org/simple-chapters">
<psc:chapter start="00:00:00.000" title="ATP_progrm_chptr()" image="http://atp.fm/img/chapter/atp312/Chapter01.png"/>
<psc:chapter start="00:09:12.500" title="Follow-up: Apple-Facebook" href="https://techcrunch.com/2019/01/31/mess-with-the-cook/"/>
<psc:chapter start="00:12:12.000" title="Follow-up: FaceTime bug" href="http://www.loopinsight.com/2019/02/05/high-level-apple-exec-flies-to-tucson-to-meet-with-14-year-old-who-discovered-facetime-flaw/" image="http://atp.fm/img/chapter/atp312/Chapter03.jpeg"/>
<psc:chapter start="00:14:54.979" title="Sponsor: Eero (code ATP)" href="https://eero.com/"/>
<psc:chapter start="00:16:44.107" title='USB-C "MagSafe"' href="https://amzn.to/2t6fibm" image="http://atp.fm/img/chapter/atp312/Chapter05.jpeg"/>
<psc:chapter start="00:20:43.000" title="USB-C LED charging cable" href="http://www.amazon.com/dp/B07CHJYPCC/?tag=marcoorg-20" image="http://atp.fm/img/chapter/atp312/Chapter06.jpeg"/>
<psc:chapter start="00:22:01.500" title="Screen-protector update" href="https://paperlike.com/" image="http://atp.fm/img/chapter/atp312/Chapter07.jpeg"/>
<psc:chapter start="00:27:20.500" title="Sponsor: Molekule (code ATP)" href="https://molekule.com/"/>
<psc:chapter start="00:29:02.500" title="Ahrendts leaving Apple" href="https://www.apple.com/newsroom/2019/02/apple-names-deirdre-obrien-senior-vice-president-of-retail-and-people/"/>
<psc:chapter start="01:12:09.331" title="Sponsor: Squarespace (code ATP)" href="https://squarespace.com/atp"/>
<psc:chapter start="01:13:31.500" title="#askatp: Hard-drive brands" href="https://www.backblaze.com/blog/2018-hard-drive-failure-rates/"/>
<psc:chapter start="01:21:13.500" title="#askatp: Reopening windows"/>
<psc:chapter start="01:26:41.500" title="#askatp: Gimlet-Spotify" href="https://newsroom.spotify.com/2019-02-06/audio-first/"/>
<psc:chapter start="01:46:40.500" title="Ending theme" href="http://jonathanmann.net/"/>
<psc:chapter start="01:47:43.500" title="Gas station update"/>
</psc:chapters>
ml.podcast
Output a human friendly summary for the podcast found at the url given.
$ mix ml.podcast http://wtfpod.libsyn.com/rss
ID: wtfpod.com
Main Feed: http://wtfpod.libsyn.com/rss
Subtitle: Get all your WTF needs at wtfpod.com
Summary: Comedian Marc Maron is tackling the most complex philosophical question of our day - WTF? He'll get to the bottom of it with help from comedian friends, celebrity guests and the voices in his own head.
Description: Comedian Marc Maron is tackling the most complex philosophical question of our day - WTF? He'll get to the bottom of it with help from comedian friends, celebrity guests and the voices in his own head.
Cover: http://static.libsyn.com/p/assets/6/c/a/3/6ca38c2fefa1e989/WTF_-_new_larger_cover.jpg
Episodes available: 60
S01E997: Episode 997 - Andrea Savage (01:24:17|2019-02-25)
Andrea Savage visits the garage to talk about her show I'm Sorry, life with agents, Jason Mantzoukas, and being cut from The Groundlings. http://wtfpod.libsyn.com/episode-997-andrea-savage
.mp3: http://traffic.libsyn.com/wtfpod/WTF_-_EPISODE_997_ANDREA_SAVAGE.mp3?dest-id=14434 (24.42 MB)
S01E996: Episode 996 - Jon Bernthal (01:40:59|2019-02-21)
Jon Bernthal talks about The Punisher, The Walking Dead, Martin Scorsese, and how he went from heading down a bad path in life to salvation by way of acting. http://wtfpod.libsyn.com/episode-996-jon-bernthal
.mp3: http://traffic.libsyn.com/wtfpod/WTF_-_EPISODE_996_JON_BERNTHAL.mp3?dest-id=14434 (29.2 MB)
…
License
Metalove is released under the MIT license - see the LICENSE.txt file.