lastfm_archive v0.3.2 LastfmArchive
lastfm_archive is a tool for creating a local Last.fm scrobble data archive and running analytics on it.
The software is currently experimental and in preliminary development. It should eventually provide the capability to perform ETL and analytic tasks on Lastfm scrobble data.
Current usage:
Summary
Functions
archive/0: Download all scrobbled tracks and create an archive on local filesystem for the default user
archive/2: Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user
Functions
Download all scrobbled tracks and create an archive on local filesystem for the default user.
Example
LastfmArchive.archive
The archive belongs to the default user specified in configuration, for example user_a (in config/config.exs):
config :lastfm_archive,
  user: "user_a",
  ...
See archive/2 for further details on archive format and file location.
archive(binary(), keyword()) :: :ok | {:error, :file.posix()}
Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user.
Example
LastfmArchive.archive("a_lastfm_user")
LastfmArchive.archive("a_lastfm_user", interval: 300) # 300ms interval between Lastfm API requests
LastfmArchive.archive("a_lastfm_user", overwrite: true) # re-fetch / overwrite downloaded data
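The typespec above says the call returns either :ok or {:error, reason}, where reason is a POSIX file error atom. A minimal sketch of handling both outcomes follows; ArchiveResultSketch and describe/1 are hypothetical names used for illustration and are not part of lastfm_archive.

```elixir
# Hypothetical helper, not part of lastfm_archive: turns the documented
# return value into a log-friendly message.
defmodule ArchiveResultSketch do
  def describe(:ok), do: "archive complete"
  def describe({:error, reason}), do: "archive failed: #{inspect(reason)}"
end

IO.puts(ArchiveResultSketch.describe(:ok))
IO.puts(ArchiveResultSketch.describe({:error, :enoent}))

# In a project that depends on lastfm_archive, the result of the real call
# could be passed straight in:
#   "a_lastfm_user" |> LastfmArchive.archive(interval: 300) |> ArchiveResultSketch.describe()
```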
Older scrobbles are archived on a yearly basis, whereas the latest (current year) scrobbles are extracted on a daily basis to ensure data immutability and updatability.
The data is currently in raw Lastfm recenttracks JSON format, chunked into 200-track (max) gzip compressed pages and stored within directories corresponding to the years and days when tracks were scrobbled.
Options - also configurable:
- :interval - the duration (in milliseconds) between successive requests sent to the Lastfm API. It caps the request rate. The default is 500 (ms), i.e. at most 2 requests per second, which is safely within Lastfm's terms of service (no more than 5 requests per second)
- :overwrite - if true, fetch and overwrite any previously downloaded data. Use this option to refresh the file archive. The default is false: if existing data chunks / pages are found, the system will not call Lastfm to re-fetch them
- :per_page - the number of scrobbles per page in the archive. The default is 200, the maximum number of tracks per request permitted by the Lastfm API
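The relationship between :interval and the request rate can be checked with a little arithmetic. The sketch below is illustrative only: IntervalSketch is a hypothetical module, and it assumes requests are spaced by exactly the interval (network latency would only lower the real rate).

```elixir
# Hypothetical helper for illustration; not part of lastfm_archive.
defmodule IntervalSketch do
  # Upper bound on requests per second implied by a fixed delay between requests.
  def max_rate(interval_ms) when is_integer(interval_ms) and interval_ms > 0 do
    1000 / interval_ms
  end

  # True when the implied rate stays at or below the limit (requests per second).
  # The default limit of 5 is the figure quoted in the :interval description.
  def within_limit?(interval_ms, limit \\ 5), do: max_rate(interval_ms) <= limit
end

IO.inspect(IntervalSketch.max_rate(500))       # 2.0
IO.inspect(IntervalSketch.within_limit?(500))  # true
IO.inspect(IntervalSketch.within_limit?(100))  # false (10 requests per second)
```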
The data is written to a main directory, e.g. ./lastfm_data/a_lastfm_user/, as configured in config/config.exs:
config :lastfm_archive,
  ...
  data_dir: "./lastfm_data/"
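As a rough illustration of how the example path relates to the setting, the snippet below reads data_dir from the application environment and joins it with the user name. It assumes the user's directory is simply data_dir followed by the user name, as the example path suggests; the fallback value is only there so the snippet also runs outside a configured project.

```elixir
# Assumption: the per-user directory is data_dir joined with the Lastfm user name.
# The third argument is a fallback used when no configuration is loaded.
data_dir = Application.get_env(:lastfm_archive, :data_dir, "./lastfm_data/")
user_dir = Path.join(data_dir, "a_lastfm_user")

IO.puts(user_dir)
```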
Reruns and refresh archive
Lastfm API calls can occasionally time out. When this happens, the function continues archiving and moves on to the next data chunk (page). It logs the missing page event(s) in an error directory.
Rerun the function to download any missing data chunks. The function skips all existing archived pages by default so that it will not make repeated calls to Lastfm. Use the overwrite: true option to re-fetch existing data.
To create a fresh archive or refresh part of it, delete all or some of the files in the archive and re-run the function, or use the overwrite: true option.
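The skip-or-refetch behaviour described above comes down to a simple decision per page. The following is a minimal sketch of that decision, not the library's actual implementation: RerunSketch and fetch?/2 are hypothetical names, and it assumes each page maps to one file on disk.

```elixir
# Hypothetical sketch; module and function names are illustrative only.
defmodule RerunSketch do
  # A page is fetched when overwrite is requested or its file is missing.
  def fetch?(path, opts \\ []) do
    overwrite = Keyword.get(opts, :overwrite, false)
    overwrite or not File.exists?(path)
  end
end

# Demonstration with a throwaway file in the system temp directory.
page = Path.join(System.tmp_dir!(), "rerun_sketch_#{System.unique_integer([:positive])}.gz")

IO.inspect(RerunSketch.fetch?(page))                   # missing page: fetch it
File.write!(page, "")
IO.inspect(RerunSketch.fetch?(page))                   # existing page: skip it
IO.inspect(RerunSketch.fetch?(page, overwrite: true))  # overwrite: fetch again
File.rm!(page)
```

Deleting a page file and rerunning has the same effect as overwrite: true for that page, which is why both approaches refresh the archive.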