lastfm_archive v0.3.0 LastfmArchive

lastfm_archive is a tool for creating a local archive of Last.fm scrobble data and running analytics on it.

The software is currently experimental and at an early stage of development. It should eventually be able to perform ETL and analytic tasks on Lastfm scrobble data.

Current usage:

Summary

Functions

archive/0: Download all scrobbled tracks and create an archive on local filesystem for the default user

archive/2: Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user

Functions

archive()
archive() :: :ok | {:error, :file.posix()}

Download all scrobbled tracks and create an archive on local filesystem for the default user.

Example

  LastfmArchive.archive()

The archive belongs to a default user specified in configuration, for example user_a (in config/config.exs):

  config :lastfm_archive,
    user: "user_a",
    ...

See archive/2 for further details on archive format and file location.

archive(user, interval \\ Application.get_env(:lastfm_archive, :req_interval) || 500)
archive(binary(), integer()) :: :ok | {:error, :file.posix()}

Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user.

Example

  LastfmArchive.archive("a_lastfm_user")

Scrobbles from past years are archived on a yearly basis, whereas scrobbles from the current year are extracted on a daily basis. This keeps completed periods immutable while still allowing the most recent data to be updated.

The data is currently in raw Lastfm recenttracks JSON format, chunked into 200-track (max) gzip compressed pages and stored within directories corresponding to the years and days when tracks were scrobbled.
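
The following is a minimal sketch of how a gzip-compressed page could be written and read back using Erlang's built-in :zlib module. The directory, the file name and the JSON body are placeholders chosen for illustration; they are assumptions and not the actual layout or content produced by LastfmArchive.

```elixir
# Hypothetical location and file name; the real archive layout may differ.
dir = Path.join(System.tmp_dir!(), "lastfm_archive_sketch")
File.mkdir_p!(dir)
path = Path.join(dir, "page_1.gz")

# Stand-in for a raw recenttracks JSON page (not real Lastfm data).
json = ~s({"recenttracks":{"track":[]}})

# Write the page gzip-compressed, then read it back and decompress it.
File.write!(path, :zlib.gzip(json))
restored = path |> File.read!() |> :zlib.gunzip()
```

Here restored holds the original JSON text. Turning it into Elixir data structures would additionally require a JSON decoder, which is not shown.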

interval is the duration (in milliseconds) between successive requests sent to the Lastfm API. It caps the maximum request rate. The default (500ms) corresponds to at most 2 requests per second, which is safely within the limit in Lastfm’s terms of service: no more than 5 requests per second.
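
The relationship between the interval and the request rate can be checked with a small calculation. The RateSketch module below is a hypothetical helper written for this illustration and is not part of LastfmArchive; it assumes one request is sent per interval.

```elixir
defmodule RateSketch do
  # Maximum requests per second implied by an interval in milliseconds.
  def max_requests_per_second(interval_ms) when is_integer(interval_ms) and interval_ms > 0 do
    1000 / interval_ms
  end

  # True when the interval keeps the rate at or below `limit` requests per second.
  def within_limit?(interval_ms, limit \\ 5) do
    max_requests_per_second(interval_ms) <= limit
  end
end

default_rate = RateSketch.max_requests_per_second(500) # 2.0 requests per second
default_ok = RateSketch.within_limit?(500)             # true
too_fast = RateSketch.within_limit?(100)               # false: 10 requests per second
```

Under this assumption, any interval of 200ms or more stays within the 5 requests per second limit. A larger value can be passed as the second argument of archive/2 to slow requests down further.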

The data is written to a main directory, e.g. ./lastfm_data/a_lastfm_user/ as configured in config/config.exs:

  config :lastfm_archive,
    ...
    data_dir: "./lastfm_data/"

Note: Lastfm API calls can occasionally time out. When this happens, the function continues archiving and moves on to the next data chunk (page), logging the missing page in an error directory. Re-run the function to download any missing data chunks; it skips all pages that are already archived.
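
The skip-and-retry behaviour described in the note can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: ResumeSketch, the page file names and the fetch functions are all hypothetical, and failed pages are returned to the caller here rather than logged in an error directory.

```elixir
defmodule ResumeSketch do
  # Archives each page unless its file already exists.
  # `fetch` is a stand-in for the API call and must return
  # {:ok, body} or {:error, reason}.
  def archive_pages(dir, pages, fetch) do
    File.mkdir_p!(dir)

    Enum.map(pages, fn page ->
      path = Path.join(dir, "#{page}.gz")

      if File.exists?(path) do
        # Already archived in an earlier run: do not request it again.
        {page, :skipped}
      else
        case fetch.(page) do
          {:ok, body} ->
            File.write!(path, :zlib.gzip(body))
            {page, :archived}

          {:error, reason} ->
            # Record the failure and carry on with the next page.
            {page, {:missing, reason}}
        end
      end
    end)
  end
end

dir = Path.join(System.tmp_dir!(), "resume_sketch")
File.rm_rf!(dir)

# Simulated API: page 2 times out on the first run only.
flaky = fn
  2 -> {:error, :timeout}
  page -> {:ok, "page #{page}"}
end

healthy = fn page -> {:ok, "page #{page}"} end

first_run = ResumeSketch.archive_pages(dir, [1, 2, 3], flaky)
second_run = ResumeSketch.archive_pages(dir, [1, 2, 3], healthy)
```

In this sketch the first run archives pages 1 and 3 and reports page 2 as missing; the second run fetches only page 2 and skips the rest.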

To create a fresh archive or to refresh part of an existing one, delete all or some of the files in the archive and re-run the function.