lastfm_archive v0.3.1 LastfmArchive
lastfm_archive is a tool for creating a local Last.fm scrobble data archive and for analytics.
The software is currently experimental and in preliminary development. It should eventually provide the capability to perform ETL and analytic tasks on Lastfm scrobble data.
Current usage:
Summary
Functions
Download all scrobbled tracks and create an archive on local filesystem for the default user
Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user
Functions
Download all scrobbled tracks and create an archive on local filesystem for the default user.
Example
LastfmArchive.archive
The archive belongs to a default user specified in configuration, for example user_a (in config/config.exs):
config :lastfm_archive,
user: "user_a",
...
See archive/2 for further details on archive format and file location.
archive(binary(), keyword()) :: :ok | {:error, :file.posix()}
Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user.
Example
LastfmArchive.archive("a_lastfm_user")
LastfmArchive.archive("a_lastfm_user", interval: 300) # 300ms interval between Lastfm API requests
Older scrobbles are archived on a yearly basis, whereas the latest (current year) scrobbles are extracted on a daily basis to ensure data immutability and updatability.
The data is currently in raw Lastfm recenttracks JSON format, chunked into 200-track (max) gzip compressed pages and stored within directories corresponding to the years and days when tracks were scrobbled.
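As a rough illustration of how one archived page might be read back, the sketch below gunzips a file with Erlang's :zlib module and returns the raw JSON text. The module name PageReaderSketch and the demo file name are assumptions made for this example; the file naming scheme actually used by lastfm_archive is not described here.

```elixir
defmodule PageReaderSketch do
  # Hypothetical helper for illustration; not part of lastfm_archive.
  # Reads a gzip-compressed page and returns the raw JSON text as a binary.
  def read_page(path) do
    path
    |> File.read!()
    |> :zlib.gunzip()
  end
end

# Demo with a stand-in page written to a temporary directory
# (the file name is an assumption, not the library's naming scheme).
demo_path = Path.join(System.tmp_dir!(), "lastfm_page_sketch.gz")
File.write!(demo_path, :zlib.gzip(~s({"recenttracks": {"track": []}})))

json = PageReaderSketch.read_page(demo_path)
IO.puts(json)
```

The returned binary can then be passed to whichever JSON decoder the project already uses.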
Options - also configurable:
:interval - the duration (in milliseconds) between successive requests sent to Lastfm API. It provides control over the maximum rate of requests. The default (500ms) ensures a safe rate that is within Lastfm’s terms of service: no more than 5 requests per second
:per_page - number of scrobbles per page in archive. The default is 200, the maximum number of tracks per request permitted by Lastfm API
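The relationship between :interval and the request rate is simple arithmetic: waiting a given number of milliseconds between requests allows at most 1000 divided by that number of requests per second. A minimal sketch, with RateSketch as a hypothetical module name and ignoring the time each request itself takes:

```elixir
defmodule RateSketch do
  # Hypothetical helper for illustration; not part of lastfm_archive.
  # Upper bound on requests per second for a given interval in milliseconds,
  # ignoring the time each request itself takes.
  def max_requests_per_second(interval_ms)
      when is_integer(interval_ms) and interval_ms > 0 do
    1000 / interval_ms
  end
end

IO.inspect(RateSketch.max_requests_per_second(500)) # default interval: 2.0
IO.inspect(RateSketch.max_requests_per_second(200)) # 5.0, the limit quoted above
```

Under this bound, any interval of 200ms or more stays within the limit of 5 requests per second.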
The data is written to a main directory, e.g. ./lastfm_data/a_lastfm_user/ as configured in config/config.exs:
config :lastfm_archive,
...
data_dir: "./lastfm_data/"
Note: Lastfm API calls can occasionally time out. When this happens, the function continues archiving and moves on to the next data chunk (page). It logs the missing page in an error directory. Re-run the function to download any missing data chunks. The function will skip all existing archived pages.
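The skip-and-retry behaviour described in the note can be pictured as follows. This is a sketch of the general pattern only, not the library's implementation: SkipSketch, ensure_page/2 and the fetch function passed in are hypothetical names.

```elixir
defmodule SkipSketch do
  # Hypothetical illustration; not the library's code.
  # Writes a page only when it is not already on disk, so re-running
  # the same job fills gaps without touching existing files.
  def ensure_page(path, fetch_fun) when is_function(fetch_fun, 0) do
    if File.exists?(path) do
      :skipped
    else
      case fetch_fun.() do
        {:ok, body} ->
          File.mkdir_p!(Path.dirname(path))
          File.write!(path, :zlib.gzip(body))
          :written

        {:error, reason} ->
          # e.g. a timeout: leave the page missing so a later run retries it
          {:missing, reason}
      end
    end
  end
end

# Demo in a throwaway directory, cleared first so each run starts empty.
dir = Path.join(System.tmp_dir!(), "skip_sketch_demo")
File.rm_rf!(dir)
path = Path.join(dir, "page.gz")

first = SkipSketch.ensure_page(path, fn -> {:error, :timeout} end)
second = SkipSketch.ensure_page(path, fn -> {:ok, "{}"} end)
third = SkipSketch.ensure_page(path, fn -> {:ok, "{}"} end)
IO.inspect({first, second, third})
```

In the demo, the first call simulates a timeout, the second call fills the gap, and the third call finds the page already present and skips it.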
To create a fresh archive or refresh part of an existing one: delete all or some files in the archive and re-run the function.
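For example, refreshing a single year could amount to deleting that year's directory and re-running the archive function. The sketch below assumes that year directories are named after the year (such as 2017) and sit directly under the user's directory; that naming, and the RefreshSketch module, are assumptions made for illustration.

```elixir
defmodule RefreshSketch do
  # Hypothetical helper for illustration; not part of lastfm_archive.
  # Deletes one year's directory so the next archive run downloads it again.
  def clear_year(data_dir, user, year) when is_integer(year) do
    year_dir = Path.join([data_dir, user, Integer.to_string(year)])
    File.rm_rf!(year_dir)
    year_dir
  end
end

# Demo against a throwaway directory rather than a real archive.
data_dir = Path.join(System.tmp_dir!(), "refresh_sketch_demo")
File.mkdir_p!(Path.join([data_dir, "a_lastfm_user", "2017"]))

RefreshSketch.clear_year(data_dir, "a_lastfm_user", 2017)
IO.inspect(File.exists?(Path.join([data_dir, "a_lastfm_user", "2017"])))

# With a real archive, the next step would be:
# LastfmArchive.archive("a_lastfm_user")
```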