lastfm_archive v0.3.2 LastfmArchive

lastfm_archive is a tool for creating a local archive of Last.fm scrobble data and for running analytics on it.

The software is currently experimental and in preliminary development. It should eventually provide the capability to perform ETL and analytic tasks on Lastfm scrobble data.

Current usage:

Summary

Functions

archive()

Download all scrobbled tracks and create an archive on local filesystem for the default user

archive(user, options \\ [])

Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user

Functions

archive()
archive() :: :ok | {:error, :file.posix()}

Download all scrobbled tracks and create an archive on local filesystem for the default user.

Example

  LastfmArchive.archive

The archive belongs to a default user specified in configuration, for example user_a (in config/config.exs):

  config :lastfm_archive,
    user: "user_a",
    ...

See archive/2 for further details on archive format and file location.

archive(user, options \\ [])
archive(binary(), keyword()) :: :ok | {:error, :file.posix()}

Download all scrobbled tracks and create an archive on local filesystem for a Lastfm user.

Example

  LastfmArchive.archive("a_lastfm_user")
  LastfmArchive.archive("a_lastfm_user", interval: 300) # 300ms interval between Lastfm API requests
  LastfmArchive.archive("a_lastfm_user", overwrite: true) # re-fetch / overwrite downloaded data
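Per the specs above, both archive/0 and archive/2 return :ok | {:error, :file.posix()}. The following is a minimal sketch of how a caller might turn that result into a readable message. The ArchiveResult module and its describe/1 function are hypothetical and not part of lastfm_archive; :file.format_error/1 is the standard Erlang function for describing POSIX error atoms.

```elixir
defmodule ArchiveResult do
  # Hypothetical helper, not part of lastfm_archive.
  # Maps the documented return values to a human-readable message.
  def describe(:ok), do: "archive complete"

  def describe({:error, reason}) do
    # :file.format_error/1 returns a charlist describing the POSIX error atom.
    "archive failed: #{:file.format_error(reason)}"
  end
end

# Intended usage (requires lastfm_archive to be installed and configured):
#   "a_lastfm_user" |> LastfmArchive.archive() |> ArchiveResult.describe()
```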

Older scrobbles are archived on a yearly basis, whereas the latest (current year) scrobbles are extracted on a daily basis to ensure data immutability and updatability.

The data is currently in raw Lastfm recenttracks JSON format, chunked into 200-track (max) gzip compressed pages and stored within directories corresponding to the years and days when tracks were scrobbled.
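Because each page is gzip-compressed JSON, it can be read back with the standard library alone. The sketch below assumes nothing about the archive's actual file names: PageReader, read_page/1 and example_page.gz are hypothetical, and a temporary file stands in for an archived page. It returns the raw JSON text and leaves decoding to whichever JSON library the project uses.

```elixir
defmodule PageReader do
  # Hypothetical helper, not part of lastfm_archive.
  # Reads one gzip-compressed page and returns {:ok, raw_json_text},
  # or {:error, posix} if the file cannot be read.
  def read_page(path) do
    with {:ok, gzipped} <- File.read(path) do
      {:ok, :zlib.gunzip(gzipped)}
    end
  end
end

# Round trip on a temporary file standing in for an archived page.
path = Path.join(System.tmp_dir!(), "example_page.gz")
File.write!(path, :zlib.gzip(~s({"recenttracks": {"track": []}})))
{:ok, _json} = PageReader.read_page(path)
```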

Options - also configurable:

  • :interval the duration (in milliseconds) between successive requests sent to the Lastfm API. It caps the maximum request rate. The default is 500 (ms), a safe rate that is within Lastfm’s terms of service (no more than 5 requests per second)

  • :overwrite if true, fetch and overwrite any previously downloaded data. Use this option to refresh the file archive. The default is false: if existing data chunks / pages are found, the system does not call Lastfm to re-fetch them

  • :per_page number of scrobbles per page in the archive. The default is 200, the maximum number of tracks per request permitted by the Lastfm API
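The :interval and :per_page options together determine roughly how long a full archive run takes. As a rough illustration, the sketch below computes a lower bound on the page count and on the time spent waiting between requests. ArchiveEstimate and its functions are hypothetical, not part of lastfm_archive, and the estimate ignores the per-year and per-day chunking described above, which can only increase the number of requests.

```elixir
defmodule ArchiveEstimate do
  # Hypothetical helper, not part of lastfm_archive.

  # Minimum number of pages needed for `total` scrobbles (ceiling division).
  def pages(total, per_page \\ 200) when total >= 0 and per_page > 0 do
    div(total + per_page - 1, per_page)
  end

  # Lower bound, in milliseconds, on the time spent waiting between requests.
  # Defaults mirror the documented ones: per_page 200, interval 500.
  def min_wait_ms(total, opts \\ []) do
    per_page = Keyword.get(opts, :per_page, 200)
    interval = Keyword.get(opts, :interval, 500)
    pages(total, per_page) * interval
  end
end

ArchiveEstimate.pages(50_000)                      # 250 pages
ArchiveEstimate.min_wait_ms(50_000)                # 125_000 ms, about two minutes
ArchiveEstimate.min_wait_ms(50_000, interval: 300) # 75_000 ms
```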

The data is written to a main directory, e.g. ./lastfm_data/a_lastfm_user/ as configured in config/config.exs:

  config :lastfm_archive,
    ...
    data_dir: "./lastfm_data/"
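For scripts that need to locate a user's archive, the directory can be derived from the same configuration key. This is a minimal sketch, not the library's own code: ArchivePath and user_dir/1 are hypothetical, and the "./lastfm_data/" fallback is an assumption that only mirrors the example value above.

```elixir
defmodule ArchivePath do
  # Hypothetical helper, not part of lastfm_archive.
  # Joins the configured :data_dir with the Lastfm user name.
  def user_dir(user) do
    # The fallback value is an assumption for illustration only.
    data_dir = Application.get_env(:lastfm_archive, :data_dir, "./lastfm_data/")
    Path.join(data_dir, user)
  end
end

ArchivePath.user_dir("a_lastfm_user")
```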

Reruns and refreshing the archive

Lastfm API calls can occasionally time out. When this happens, the function continues archiving and moves on to the next data chunk (page). It logs the missing page event(s) in an error directory.

Rerun the function to download any missing data chunks. The function skips all existing archived pages by default so that it will not make repeated calls to Lastfm. Use the overwrite: true option to re-fetch existing data.
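The skip-or-fetch rule described above can be expressed as a small decision function. This is only an illustration of the documented behaviour, not the library's actual implementation; RerunSketch and fetch?/2 are hypothetical names.

```elixir
defmodule RerunSketch do
  # Hypothetical decision function; the library's real logic may differ.
  # A page is fetched when it is missing on disk, or when overwrite: true is given.
  def fetch?(page_path, opts \\ []) do
    Keyword.get(opts, :overwrite, false) or not File.exists?(page_path)
  end
end
```

With this rule, a rerun touches only pages that are missing on disk, which is why repeated runs do not repeat calls to Lastfm unless overwrite: true is passed.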

To create a fresh archive or refresh part of it, delete all or some files in the archive and re-run the function, or use the overwrite: true option.