lastfm_archive v0.3.1 LastfmArchive

lastfm_archive is a tool for creating a local archive of Last.fm scrobble data and running analytics on it.

The software is currently experimental and at an early stage of development. It should eventually be able to perform ETL and analytic tasks on Lastfm scrobble data.

Current usage: archive/0 and archive/2 download raw Lastfm scrobble data to the local filesystem.

Summary

Functions

Download all scrobbled tracks and create an archive on the local filesystem for the default user

Download all scrobbled tracks and create an archive on the local filesystem for a Lastfm user

Functions

archive()
archive() :: :ok | {:error, :file.posix()}

Download all scrobbled tracks and create an archive on the local filesystem for the default user.

Example

  LastfmArchive.archive

The archive belongs to the default user specified in the configuration, for example user_a (in config/config.exs):

  config :lastfm_archive,
    user: "user_a",
    ...

See archive/2 for further details on the archive format and file location.

archive(user, options \\ [])
archive(binary(), keyword()) :: :ok | {:error, :file.posix()}

Download all scrobbled tracks and create an archive on the local filesystem for a Lastfm user.

Example

  LastfmArchive.archive("a_lastfm_user")
  LastfmArchive.archive("a_lastfm_user", interval: 300) # 300ms interval between Lastfm API requests
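Both arities return :ok or {:error, reason}, where the reason is a POSIX error atom. A minimal sketch of turning that result into a readable message follows; the ArchiveResult module and its describe/1 function are hypothetical helpers for illustration, not part of lastfm_archive.

```elixir
defmodule ArchiveResult do
  # Hypothetical helper, not part of lastfm_archive: turns the documented
  # return value of archive/0 or archive/2 into a human-readable string.
  @spec describe(:ok | {:error, :file.posix()}) :: String.t()
  def describe(:ok), do: "archive complete"

  def describe({:error, reason}) do
    # :file.format_error/1 returns a charlist describing the POSIX error.
    "archive failed: " <> List.to_string(:file.format_error(reason))
  end
end

# Usage, assuming lastfm_archive is available in the project:
#   "a_lastfm_user" |> LastfmArchive.archive() |> ArchiveResult.describe() |> IO.puts()
```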

Older scrobbles are archived on a yearly basis, whereas scrobbles from the current year are extracted on a daily basis, so that completed archive files stay immutable while new data can still be added.

The data is currently in raw Lastfm recenttracks JSON format, chunked into gzip-compressed pages of at most 200 tracks each and stored within directories corresponding to the years and days when the tracks were scrobbled.
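Since each page is gzip-compressed JSON, it can be read back with Erlang's :zlib module. The following is a sketch only: PageReader is a hypothetical helper, and the file name and JSON body are made-up stand-ins, because the exact file naming inside the archive is not described here.

```elixir
defmodule PageReader do
  # Hypothetical helper: returns the raw JSON text of one gzip-compressed page.
  @spec read_raw(Path.t()) :: {:ok, binary()} | {:error, File.posix()}
  def read_raw(path) do
    with {:ok, compressed} <- File.read(path) do
      {:ok, :zlib.gunzip(compressed)}
    end
  end
end

# Round trip with a stand-in page; the file name and JSON body are made up.
path = Path.join(System.tmp_dir!(), "lastfm_page_example.gz")
File.write!(path, :zlib.gzip(~s({"recenttracks": {"track": []}})))
{:ok, json} = PageReader.read_raw(path)
File.rm!(path)
```

The returned binary is plain JSON text and can be passed to whichever JSON decoder the project already uses.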

Options - also configurable:

  • :interval the duration (in milliseconds) between successive requests sent to the Lastfm API. It caps the request rate. The default (500ms, i.e. at most 2 requests per second) stays safely within Lastfm’s terms of service, which allow no more than 5 requests per second
  • :per_page the number of scrobbles per page in the archive. The default is 200, the maximum number of tracks per request permitted by the Lastfm API
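The two options together determine how many requests an archive run needs and how fast they are sent. A rough sketch of that arithmetic is given below; ArchiveEstimate is a hypothetical helper, and the page count is only a lower bound, since pages are split by year and day and the real number of requests is therefore higher.

```elixir
defmodule ArchiveEstimate do
  # Hypothetical helper, not part of lastfm_archive.

  # Maximum request rate (per second) implied by an :interval in milliseconds.
  def max_rate(interval_ms) when interval_ms > 0, do: 1000 / interval_ms

  # Fewest pages needed to hold `scrobbles` tracks at `per_page` tracks each.
  def min_pages(scrobbles, per_page \\ 200) when per_page > 0 do
    div(scrobbles + per_page - 1, per_page)
  end

  # Approximate lower bound on time spent waiting between requests (ms).
  def min_wait_ms(scrobbles, per_page \\ 200, interval_ms \\ 500) do
    min_pages(scrobbles, per_page) * interval_ms
  end
end

ArchiveEstimate.max_rate(500)       # 2.0 requests per second
ArchiveEstimate.min_pages(10_000)   # 50 pages
ArchiveEstimate.min_wait_ms(10_000) # 25_000 ms
```

Any :interval of 200ms or more keeps the rate at or below 5 requests per second.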

The data is written to a main directory, e.g. ./lastfm_data/a_lastfm_user/ as configured in config/config.exs:

  config :lastfm_archive,
    ...
    data_dir: "./lastfm_data/"

Note: Lastfm API calls can occasionally time out. When this happens, the function continues archiving and moves on to the next data chunk (page). It logs the missing page in an error directory. Re-run the function to download any missing data chunks; it will skip all pages that are already archived.
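The skip-and-continue behaviour described in this note can be illustrated with a small sketch. This is not the library's implementation: ResumableFetch, the "<page>.gz" file naming and the fetch function are all assumptions for illustration, and failures are collected in a map rather than logged to an error directory.

```elixir
defmodule ResumableFetch do
  # Hypothetical sketch, not lastfm_archive's implementation.
  # `fetch` is any 1-arity function returning {:ok, binary} or {:error, reason}.
  def run(pages, dir, fetch) do
    File.mkdir_p!(dir)

    Enum.reduce(pages, %{written: [], skipped: [], failed: []}, fn page, acc ->
      path = Path.join(dir, "#{page}.gz")

      if File.exists?(path) do
        # Already archived: leave the file untouched.
        Map.update!(acc, :skipped, &[page | &1])
      else
        case fetch.(page) do
          {:ok, body} ->
            File.write!(path, :zlib.gzip(body))
            Map.update!(acc, :written, &[page | &1])

          {:error, _reason} ->
            # Record the miss and carry on with the next page.
            Map.update!(acc, :failed, &[page | &1])
        end
      end
    end)
  end
end

# First run: page 2 "times out". Second run: only page 2 is fetched again.
dir = Path.join(System.tmp_dir!(), "resumable_fetch_example")
File.rm_rf!(dir)
flaky = fn page -> if page == 2, do: {:error, :timeout}, else: {:ok, "page #{page}"} end
first = ResumableFetch.run([1, 2, 3], dir, flaky)
second = ResumableFetch.run([1, 2, 3], dir, fn page -> {:ok, "page #{page}"} end)
File.rm_rf!(dir)
```

Because existing files are never overwritten, re-running is safe and only fills the gaps left by earlier failures.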

To create a fresh archive or refresh part of an existing one, delete all or some of the files in the archive and re-run the function.