Backing up Spotify

annas-archive.li/blog, 2025-12-20

We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB), grouped by popularity.

This release includes the largest publicly available music metadata database with 256 million tracks and 186 million unique ISRCs.

It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.

Anna’s Archive normally focuses on text (e.g. books and papers). We explained in “The critical window of shadow libraries” that we do this because text has the highest information density. But our mission (preserving humanity’s knowledge and culture) doesn’t distinguish among media types. Sometimes an opportunity comes along outside of text. This is such a case.

A while ago, we discovered a way to scrape Spotify at scale. We saw a role for us here to build a music archive primarily aimed at preservation.

Generally speaking, music is already fairly well preserved. There are many music enthusiasts in the world who digitized their CD and LP collections, shared them through torrents or other digital means, and meticulously catalogued them.

However, these existing efforts have some major issues:

  1. Over-focus on the most popular artists. There is a long tail of music which only gets preserved when a single person cares enough to share it. And such files are often poorly seeded.
  2. Over-focus on the highest possible quality. Since these collections are created by audiophiles with high-end equipment and by fans of a particular artist, they chase the highest possible file quality (e.g. lossless FLAC). This inflates the file size and makes it hard to keep a full archive of all music that humanity has ever produced.
  3. No authoritative list of torrents aiming to represent all music ever produced. An equivalent of our book torrent list (which aggregates torrents from LibGen, Sci-Hub, Z-Lib, and many more) does not exist for music.

This Spotify scrape is our humble attempt to start such a “preservation archive” for music. Of course Spotify doesn’t have all the music in the world, but it’s a great start.

Before we dive into the details of this collection, here is a quick overview:

  • Spotify has around 256 million tracks. This collection contains metadata for an estimated 99.9% of tracks.
  • We archived around 86 million music files, representing around 99.6% of listens. It’s a little under 300TB in total size.
  • We primarily used Spotify’s “popularity” metric to prioritize tracks. View the top 10,000 most popular songs in this HTML file (13.8MB gzipped).
  • For popularity>0, we got close to all tracks on the platform. The quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify, as well as a metadata file with original hashes and checksums).
  • For popularity=0, we got files representing about half the number of listens (either original or a copy with the same ISRC). The audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.
  • The cutoff is 2025-07; anything released after that date may not be present (though in some cases it is).
  • This is by far the largest music metadata database that is publicly available. For comparison, we have 256 million tracks, while others have 50-150 million. Our data is well-annotated: MusicBrainz has 5 million unique ISRCs, while our database has 186 million.
  • This is the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space).

The data will be released in different stages on our Torrents page:

  • [X] Metadata (Dec 2025)
  • [ ] Music files (releasing in order of popularity)
  • [ ] Additional file metadata (torrent paths and checksums)
  • [ ] Album art
  • [ ] .zstdpatch files (to reconstruct original files before we added embedded metadata)

For now this is a torrents-only archive aimed at preservation, but if there is enough interest, we could add downloading of individual files to Anna’s Archive. Please let us know if you’d like this.

Please help preserve these files:

  1. Donate to Anna’s Archive. Any amount helps!
  2. Seed these torrents (on the Torrents page of Anna’s Archive). Even seeding a few torrents helps!

With your help, humanity’s musical heritage will be forever protected from destruction by natural disasters, wars, budget cuts, and other catastrophes.

In this blog post we analyze the data and look at the details of the release. We hope you enjoy.

— Volunteer “ez” of Anna’s Archive team

♫ ♫ ♫ ♫ ♫

Data Exploration

Let’s dive into the data! Here are some high-level statistics pulled from the metadata:

Songs / Tracks

Spotify has around 256 million tracks.

The most convenient available way to sort songs on Spotify is using the popularity metric, defined as follows:

The popularity of a track is a value between 0 and 100, with 100 being the most popular. The popularity is calculated by algorithm and is based, in the most part, on the total number of plays the track has had and how recent those plays are.

Generally speaking, songs that are being played a lot now will have a higher popularity than songs that were played a lot in the past. Duplicate tracks (e.g. the same track from a single and an album) are rated independently. Artist and album popularity is derived mathematically from track popularity.

If we group songs by popularity, we see that there is an extremely large tail end:

≥70% of songs are ones almost no one ever listens to (stream count < 1000). To see some detail, we can plot this on a logarithmic scale:

The top 10,000 songs span popularities 70-100. You can view them all in this HTML file (13.8MB gzipped).

Additionally, we can estimate the number of listens per track and total number per popularity. The stream count data is estimated since it is difficult to fetch at scale, so we sampled it randomly.

As we can see, most of the listens come from songs with a popularity between 50 and 80, even though there are only 210,000 songs with popularity ≥50, around 0.1% of songs. Note the huge (subjectively estimated) error bar on pop=0: the reason is that Spotify does not publish stream counts for songs with < 1000 streams.
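As a quick sanity check, the "around 0.1%" share can be recomputed from the two rounded counts quoted in this post. This is plain arithmetic on those rounded figures, not a query against the database:

```python
# Rounded figures quoted in this post (not read from the database).
total_tracks = 256_000_000      # tracks on Spotify
tracks_pop_ge_50 = 210_000      # tracks with popularity >= 50

# Share of all tracks that have popularity >= 50.
share = tracks_pop_ge_50 / total_tracks
print(f"{share:.4%}")
```

The exact ratio of the rounded counts is a little under 0.1%, which is consistent with the rounded figure in the text.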

We can also estimate that the top three songs (as of writing) have a higher total stream count than the bottom 20-100 million songs combined:

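+-----------------------+--------------------+------------+---------------+
| Artists               | Name               | Popularity | Stream Count  |
+-----------------------+--------------------+------------+---------------+
| Lady Gaga, Bruno Mars | Die With A Smile   | 100        | 3.075 Billion |
| Billie Eilish         | BIRDS OF A FEATHER | 98         | 3.137 Billion |
| Bad Bunny             | DtMF               | 98         | 1.124 Billion |
+-----------------------+--------------------+------------+---------------+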
SQLite Query
select json_group_array(artists.name), tracks.name, tracks.popularity
    from tracks
    join track_artists on track_rowid = tracks.rowid
    join artists on artist_rowid = artists.rowid
    where tracks.id in (select id from tracks order by popularity desc limit 3)
    group by tracks.id;

Note that the popularity is very time-dependent and not directly translatable into stream counts, so these top songs are basically arbitrary.

Songs

We have archived around 86 million songs from Spotify, ordering by popularity descending. While this only represents 37% of songs, it represents around 99.6% of listens:

Put another way, for any random song a person listens to, there is a 99.6% likelihood that it is part of the archive. We expect this number to be higher if you filter to only human-created songs. Do remember though that the error bar on listens for popularity 0 is large.
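The coverage figure is weighted by listens rather than by track count. A minimal sketch of that calculation is shown here, assuming the catalogue is split into buckets with an estimated number of listens and an archived share per bucket. The `Bucket` class, the `listen_coverage` function, and all numbers in `toy` are hypothetical illustrations, not the real distribution or our actual code:

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """One popularity bucket; the values used below are made up."""
    est_listens: float        # estimated total listens of tracks in this bucket
    archived_fraction: float  # share of those listens covered by archived files


def listen_coverage(buckets):
    """Probability that a randomly chosen listen lands on an archived track."""
    total = sum(b.est_listens for b in buckets)
    covered = sum(b.est_listens * b.archived_fraction for b in buckets)
    return covered / total


# Hypothetical toy data, NOT the real numbers from the scrape.
toy = [
    Bucket(est_listens=990.0, archived_fraction=1.0),  # popularity > 0
    Bucket(est_listens=10.0, archived_fraction=0.5),   # popularity = 0
]
coverage = listen_coverage(toy)
print(f"{coverage:.1%}")
```

Because the popularity=0 bucket carries so few listens, even a large error in its estimate shifts the overall coverage only slightly.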

For popularity=0, we ordered tracks by a secondary importance metric based on artist followers and album popularity, and fetched in descending order.

We have stopped here due to the long tail end with diminishing returns (700TB+ additional storage for minor benefit), as well as the bad quality of songs with popularity=0 (many AI generated, hard to filter).

Torrents

Before diving into more fun stats, let’s look at how the collection itself is structured. It’s in two parts: metadata and music files, both of which are distributed through torrents.

The metadata torrents contain, based on statistical analysis, around 99.9% of artists, albums, and tracks. The metadata is published as compact, queryable SQLite databases. To make sure that there is (almost) no data loss in the conversion from the API JSON, the original API responses were reconstructed from the inserted rows.

The metadata for artists, albums, tracks is less than 200 GB compressed. The secondary metadata of audio analysis is 4TB compressed.

We look in more detail at the structure of the metadata at the end of this blog post.

Music Files

The data itself is distributed in the Anna’s Archive Containers (AAC) format. This is a standard which we created a few years ago for distributing files across multiple torrents. It is not to be confused with the Advanced Audio Coding (AAC) encoding format.

Since the original files contain zero metadata, as much metadata as possible was added to the OGG files, including title, url, ISRC, UPC, album art, replaygain information, etc. The invalid OGG data packet Spotify prepends to every track file was stripped — it is present in the track_files db.

For popularity>0, the quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify).

For popularity=0, the audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.

There is a known bug where the REPLAYGAIN_ALBUM_PEAK vorbiscomment tag value is a copy-paste of REPLAYGAIN_ALBUM_GAIN instead of the correct value for many files.
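If you want to flag affected files yourself, a check along the following lines may help. It is a sketch that works on a plain dictionary of tag names to values. How you read the tags from a file is left out, the function name `has_album_peak_bug` is made up for this example, and the tag values shown are hypothetical:

```python
def has_album_peak_bug(tags):
    """Return True if REPLAYGAIN_ALBUM_PEAK is a copy of REPLAYGAIN_ALBUM_GAIN.

    `tags` is a dict mapping Vorbis comment field names to string values.
    """
    # Vorbis comment field names are case-insensitive, so normalise them first.
    upper = {key.upper(): value.strip() for key, value in tags.items()}
    gain = upper.get("REPLAYGAIN_ALBUM_GAIN")
    peak = upper.get("REPLAYGAIN_ALBUM_PEAK")
    if gain is None or peak is None:
        return False  # nothing to compare
    return peak == gain


# Hypothetical tag values for illustration only.
buggy = {"REPLAYGAIN_ALBUM_GAIN": "-7.12 dB", "REPLAYGAIN_ALBUM_PEAK": "-7.12 dB"}
fine = {"replaygain_album_gain": "-7.12 dB", "replaygain_album_peak": "0.988553"}
print(has_album_peak_bug(buggy), has_album_peak_bug(fine))
```

This only detects the problem. The correct peak value is not recoverable from the tags alone.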

The True Shuffle

Many people complain about how Spotify shuffles tracks. Since we have metadata for 99.9+% of tracks on Spotify, we can create a true shuffle across all songs on Spotify!

Example True Shuffle Playlist
$ sqlite3 spotify_clean.sqlite3
  sqlite> .mode table
  sqlite> with random_ids as (
     ...>   select value as inx,
     ...>          1 + abs(random()) % (select max(rowid) from tracks) as trowid
     ...>   from generate_series(0)
     ...> )
     ...> select inx, tracks.id, tracks.popularity, tracks.name
     ...> from random_ids
     ...> join tracks on tracks.rowid = trowid
     ...> limit 20;
  +-----+------------------------+------------+--------------------------------------------------------------+
  | inx |           id           | popularity |                             name                             |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 0   | 7KS7cm2arAGA2VZaZ2XvNa | 0          | Just Derry                                                   |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 1   | 1BkLS2tmxD088l2ojUW5cv | 0          | Kapitel 37 - Aber erst wird gegessen - Schon wieder Weihnach |
  |     |                        |            | ten mit der buckligen Verwandtschaft                         |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 2   | 5RSU7MELzCaPweG8ALmjLK | 0          | El Buen Pastor                                               |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 3   | 1YNIl8AKIFltYH8O2coSoT | 0          | You Are The One                                              |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 4   | 1GxMuEYWs6Lzbn2EcHAYVx | 0          | Waorani                                                      |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 5   | 4NhARf6pjwDpbyQdZeSsW3 | 0          | Magic in the Sand                                            |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 6   | 7pDrZ6rGaO6FHk6QtTKvQo | 0          | Yo No Fui                                                    |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 7   | 15w4LBQ6rkf3QA2OiSMBRD | 25         | 你走                                                         |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 8   | 5Tx7jRLKfYlay199QB2MSs | 0          | Soul Clap                                                    |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 9   | 3L7CkCD9595MuM0SVuBZ64 | 1          | Xuân Và Tuổi Trẻ                                             |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 10  | 4S6EkSnfxlU5UQUOZs7bKR | 1          | Elle était belle                                             |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 11  | 0ZIOUYrrArvSTq6mrbVqa1 | 0          | Kapitel 7.2 - Die Welt der Magie - 4 in 1 Sammelband: Weiße  |
  |     |                        |            | Magie | Medialität, Channeling & Trance | Divination & Wahrs |
  |     |                        |            | agen | Energetisches Heilen                                  |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 12  | 4VfKaW1X1FKv8qlrgKbwfT | 0          | Pura energia                                                 |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 13  | 1VugH5kD8tnMKAPeeeTK9o | 10         | Dalia                                                        |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 14  | 6NPPbOybTFLL0LzMEbVvuo | 4          | Teil 12 - Folge 2: Arkadien brennt                           |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 15  | 1VSVrAbaxNllk7ojNGXDym | 3          | Bre Petrunko                                                 |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 16  | 4NSmBO7uzkuES7vDLvHtX8 | 0          | Paranoia                                                     |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 17  | 7AHhiIXvx09DRZGQIsbcxB | 0          | Sand Underfoot Moments                                       |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 18  | 0sitt32n4JoSM1ewOWL7hs | 0          | Start Over Again                                             |
  +-----+------------------------+------------+--------------------------------------------------------------+
  | 19  | 080Zimdx271ixXbzdZOqSx | 3          | Auf all euren Wegen                                          |
  +-----+------------------------+------------+--------------------------------------------------------------+
Or, filtering to only somewhat popular songs:
sqlite> with random_ids as (
   ...>   select value as inx,
   ...>          1 + abs(random()) % (select max(rowid) from tracks) as trowid
   ...>   from generate_series(0)
   ...> )
   ...> select inx, tracks.id, tracks.popularity, albums.name as album_name, tracks.name
   ...> from random_ids
   ...> join tracks on tracks.rowid = trowid
   ...> join albums on albums.rowid = album_rowid
   ...> where tracks.popularity >= 10
   ...> limit 20;
  +-----+------------------------+------------+--------------------------------------+-------------------------------+
  | inx |           id           | popularity |                      album_name      |             name              |
  +-----+------------------------+------------+--------------------------------------+-------------------------------+
  | 32  | 1om6LphEpiLpl9irlOsnzb | 23         | The Essential Widespread Panic       | Love Tractor                  |
  | 47  | 2PCtPCRDia6spej5xcxbvW | 20         | Desatinos Desplumados                | Sirena                        |
  | 65  | 5wmR10WloZqVVdIpYhdaqq | 20         | Um Passeio pela Harpa Cristã - Vol 6 | As Santas Escrituras          |
  | 89  | 5xCuYNX3QlPsxhKLbWlQO9 | 11         | No Me Amenaces                       | No Me Amenaces                |
  | 96  | 2GRmiDIcIwhQnkxakNyUy4 | 16         | Very Bad Truth (Kingston Universi... | Kapitel 8.3 - Very Bad Truth  |
  | 98  | 5720pe1PjNXoMcbDPmyeLW | 11         | Kleiner Eisbär: Hilf mir fliegen!    | Kapitel 06: Hilf mir fliegen! |
  | 109 | 1mRXGNVsfD9UtFw6r5YtzF | 11         | Lunar Archive                        | Outdoor Seating               |
  | 110 | 5XOQwf6vkcJxWG9zgqVEWI | 19         | Teenage Dream                        | Firework                      |
  | 125 | 0rbHOp8B4CpPXXZSekySvv | 15         | Previa y Cachengue 2025              | Debi tirar mas fotos          |
  | 145 | 4RGj8KyWGMjrUEseDTc3MO | 19         | High Noon over Camelot               | "The Hierophant"              |
  | 158 | 1MebBcPcUNgdVRMSfzJIyS | 21         | RBS                                  | Estar Vivo                    |
  | 176 | 0E6h47PjbHJFno9IImwFFm | 17         | The Raga Guide                       | Bilaskhani Todi               |
  | 196 | 1QcziEkM8mZSm0hJ1rC2Ft | 14         | Meu Abraço                           | Meu Abraço                    |
  | 204 | 33vRjP0CI7krO2KQ6YS1u7 | 14         | Joan Shelley                         | Pull Me Up One More Time      |
  | 231 | 3rnTIldZ0uHr5aooIwJjvF | 12         | Stjörnulífið                         | Illuminati                    |
  | 246 | 6aVxXv5ywGL2xc2dg0I5jT | 10         | Family                               | Hana no Youni                 |
  | 252 | 3ESGm5fRIOtzA7BfKlNIZy | 10         | Out Of Control                       | Let's Try Love Again          |
  | 297 | 4jZmhTVjIWBmFfnolYLmD5 | 18         | Blood Brothers                       | Faster and Louder             |
  | 298 | 0ebW1CJ4tYRx3VHfqbWzUh | 19         | Vibe da Faixa Rosa                   | Vibe da Faixa Rosa            |
  | 299 | 5xuK0SlWkAqs0w1sq6BZSk | 15         | Swingin Hammers                      | Hangman                       |
  +-----+------------------------+------------+--------------------------------------+-------------------------------+
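The queries above draw random rowids independently, so in principle the same track can come up twice. If you want a shuffle without repeats, one option is to do the sampling in a small script. The following is a sketch against a toy in-memory table that only mimics the relevant columns of tracks. The function name `true_shuffle` and the toy ids are made up for this example:

```python
import random
import sqlite3


def true_shuffle(conn, n, seed=None):
    """Pick up to n distinct tracks uniformly at random by rowid."""
    rng = random.Random(seed)
    max_rowid = conn.execute("select max(rowid) from tracks").fetchone()[0]
    if max_rowid is None:
        return []  # empty table
    picked, seen = [], set()
    attempts = 0
    # Rejection sampling: draw rowids in 1..max_rowid, skip repeats and gaps.
    while len(picked) < n and attempts < 100 * n:
        attempts += 1
        rowid = rng.randint(1, max_rowid)
        if rowid in seen:
            continue
        seen.add(rowid)
        row = conn.execute(
            "select id, popularity, name from tracks where rowid = ?", (rowid,)
        ).fetchone()
        if row is not None:
            picked.append(row)
    return picked


# Toy stand-in for the tracks table (hypothetical ids and names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tracks (rowid integer primary key, id text, popularity integer, name text)"
)
conn.executemany(
    "insert into tracks (id, popularity, name) values (?, ?, ?)",
    ((f"toy{i:02d}", i % 5, f"Toy track {i}") for i in range(50)),
)
playlist = true_shuffle(conn, 10, seed=1)
for track_id, popularity, name in playlist:
    print(track_id, popularity, name)
```

Looking rows up by rowid keeps each draw cheap, which matters on a table with hundreds of millions of rows where `order by random()` would have to touch every row.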

More Stats

Here are some more statistics:

Tracks

We're curious about the peaks at whole minutes (particularly 2:00, 3:00, 4:00). If you know why this is, please let us know!

Some songs, especially popular songs, have 2, 3, or even 20 different versions. We can quantify this by counting the number of songs per ISRC:
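A count like this can be expressed as a nested aggregate over the external_id_isrc column of tracks. The sketch below runs it against a toy in-memory table with made-up ISRC values. It is an illustration of the idea, not necessarily the exact query used for the graph:

```python
import sqlite3

# Toy stand-in for the tracks table; ids and ISRC values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tracks (rowid integer primary key, id text, name text, external_id_isrc text)"
)
conn.executemany(
    "insert into tracks (id, name, external_id_isrc) values (?, ?, ?)",
    [
        ("t1", "Song (single)", "ISRC-A"),
        ("t2", "Song (album)", "ISRC-A"),
        ("t3", "Song (compilation)", "ISRC-A"),
        ("t4", "Other song", "ISRC-B"),
        ("t5", "Third song", "ISRC-C"),
        ("t6", "No ISRC", None),
    ],
)

# Inner query: tracks per ISRC. Outer query: how many ISRCs have that many tracks.
versions_histogram = conn.execute(
    """
    select versions, count(*) as isrc_count
    from (
        select external_id_isrc, count(*) as versions
        from tracks
        where external_id_isrc is not null
        group by external_id_isrc
    )
    group by versions
    order by versions
    """
).fetchall()
print(versions_histogram)
```

On the full database the inner `group by` can use the `tracks_isrc` index listed in the schema further down.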

Each song on Spotify is only available on a specific set of markets. Most songs are available on most markets, but if we filter to popular songs, you can see a difference in availability (without filtering, the graph is almost flat):

Artists

Spotify provides a list of genres per artist (not per song). If we count the artists for each genre, we get this result:
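Counting artists per genre maps directly onto the artist_genres table. Here is a sketch on toy data. The genre names and rows are made up, and only the columns needed for the query are created:

```python
import sqlite3

# Toy stand-in for artist_genres; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table artist_genres (artist_rowid integer not null, genre text not null)")
conn.executemany(
    "insert into artist_genres (artist_rowid, genre) values (?, ?)",
    [(1, "pop"), (1, "dance pop"), (2, "pop"), (3, "jazz")],
)

# One row per genre with the number of distinct artists tagged with it.
artists_per_genre = conn.execute(
    """
    select genre, count(distinct artist_rowid) as artist_count
    from artist_genres
    group by genre
    order by artist_count desc, genre
    """
).fetchall()
print(artists_per_genre)
```

Since an artist can carry several genres, the per-genre counts add up to more than the number of artists.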

Since each genre is very specific, we can also group genres and count the results:

We can also group artists by popularity. The resulting graph looks very similar to the tracks popularity graph:

The equivalent graph for albums looks much the same.

Albums

If we group albums by release year, we see that more and more new music is added to Spotify, a lot of it likely automatically generated:
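Grouping by release year needs a little care because release_date is stored as text with varying precision (see release_date_precision in the schema). Assuming the value always starts with a four-digit year, which matches the year, month, and day formats of the Spotify API, the first four characters are enough. A sketch on toy rows with made-up dates:

```python
import sqlite3

# Toy stand-in for the albums table; dates are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table albums (rowid integer primary key, release_date text not null, "
    "release_date_precision text not null)"
)
conn.executemany(
    "insert into albums (release_date, release_date_precision) values (?, ?)",
    [
        ("2019-05-01", "day"),
        ("2019", "year"),
        ("2024-11", "month"),
        ("2024-01-15", "day"),
        ("2024-03-02", "day"),
    ],
)

# Assumption: release_date always begins with YYYY, whatever the precision.
albums_per_year = conn.execute(
    """
    select substr(release_date, 1, 4) as year, count(*) as album_count
    from albums
    group by year
    order by year
    """
).fetchall()
print(albums_per_year)
```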

The amount of procedurally and AI generated content makes it hard to find what is actually valuable.

You can see that most songs on Spotify are singles, not part of an album.

Audio Features

We also scraped audio features generated by Spotify. They can be analyzed to find interesting trends.

This chart contains a lot of information. You can for example see that loudness correlates with energy and that BPM is normally distributed with a mean around 120.
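A relationship such as the one between loudness and energy can be quantified with a Pearson correlation coefficient over the track_audio_features table. The sketch below computes it by hand on a few made-up rows. The `pearson` helper is written for this example and the values are not taken from the real data:

```python
import math
import sqlite3


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy stand-in for track_audio_features; values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table track_audio_features (rowid integer primary key, "
    "null_response integer not null, loudness real, energy real)"
)
conn.executemany(
    "insert into track_audio_features (null_response, loudness, energy) values (?, ?, ?)",
    [(0, -20.0, 0.2), (0, -12.0, 0.5), (0, -8.0, 0.7), (0, -5.0, 0.9), (1, None, None)],
)

# Skip rows without features (null_response = 1 or NULL columns).
rows = conn.execute(
    "select loudness, energy from track_audio_features "
    "where null_response = 0 and loudness is not null and energy is not null"
).fetchall()
r = pearson([row[0] for row in rows], [row[1] for row in rows])
print(f"r = {r:.3f}")
```

On the full table you would probably want to sample rows first, since pulling every row into Python for this is slow.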

A scrape of the three main APIs of Spotify: artists, albums, tracks. Each track exists in exactly one album, but each track and each album can have multiple artists.

The tables are an almost lossless representation of the Spotify API JSON responses. During creation, care was taken to always reconstruct the original JSON from each inserted row, with minor exceptions.

CREATE TABLE `artists` (
  `rowid` integer PRIMARY KEY NOT NULL,
  `id` text NOT NULL,
  `fetched_at` integer NOT NULL,
  `name` text NOT NULL,
  `followers_total` integer NOT NULL,
  `popularity` integer NOT NULL
);
CREATE TABLE `artist_genres` (
  `artist_rowid` integer NOT NULL,
  `genre` text NOT NULL,
  FOREIGN KEY (`artist_rowid`) REFERENCES `artists`(`rowid`)
);
CREATE TABLE `artist_images` (
  `artist_rowid` integer NOT NULL,
  `width` integer NOT NULL,
  `height` integer NOT NULL,
  `url` text NOT NULL,
  FOREIGN KEY (`artist_rowid`) REFERENCES `artists`(`rowid`)
);
CREATE TABLE "artist_albums" (
  `artist_rowid` integer NOT NULL,
  `album_rowid` integer NOT NULL,
  `is_appears_on` integer NOT NULL,
  `is_implicit_appears_on` integer NOT NULL,
  `index_in_album` integer,
  FOREIGN KEY (`artist_rowid`) REFERENCES `artists`(`rowid`),
  FOREIGN KEY (`album_rowid`) REFERENCES `albums`(`rowid`)
);
CREATE TABLE `available_markets` (
  `rowid` integer PRIMARY KEY NOT NULL,
  `available_markets` text NOT NULL
);
CREATE TABLE `albums` (
  `rowid` integer PRIMARY KEY NOT NULL,
  `id` text NOT NULL,
  `fetched_at` integer NOT NULL,
  `name` text NOT NULL,
  `album_type` text NOT NULL,
  `available_markets_rowid` integer NOT NULL,
  `external_id_upc` text,
  "external_id_amgid" text,
  `copyright_c` text,
  `copyright_p` text,
  `label` text NOT NULL,
  `popularity` integer NOT NULL,
  `release_date` text NOT NULL,
  `release_date_precision` text NOT NULL,
  `total_tracks` integer NOT NULL,
  FOREIGN KEY (`available_markets_rowid`) REFERENCES `available_markets`(`rowid`)
);
CREATE TABLE "album_images" (
  `album_rowid` integer NOT NULL,
  `width` integer NOT NULL,
  `height` integer NOT NULL,
  `url` text NOT NULL,
  FOREIGN KEY (`album_rowid`) REFERENCES `albums`(`rowid`)
);
CREATE TABLE `tracks` (
  `rowid` integer PRIMARY KEY NOT NULL,
  `id` text NOT NULL,
  `fetched_at` integer NOT NULL,
  `name` text NOT NULL,
  `preview_url` text,
  `album_rowid` integer NOT NULL,
  `track_number` integer NOT NULL,
  `external_id_isrc` text,
  `external_id_ean` text,
  `external_id_upc` text,
  `popularity` integer NOT NULL,
  `available_markets_rowid` integer NOT NULL,
  `disc_number` integer NOT NULL,
  `duration_ms` integer NOT NULL,
  `explicit` integer NOT NULL,
  FOREIGN KEY (`available_markets_rowid`) REFERENCES `available_markets`(`rowid`)
);
CREATE TABLE `track_artists` (
  `track_rowid` integer NOT NULL,
  `artist_rowid` integer NOT NULL,
  FOREIGN KEY (`track_rowid`) REFERENCES `tracks`(`rowid`),
  FOREIGN KEY (`artist_rowid`) REFERENCES `artists`(`rowid`)
);
CREATE INDEX `artist_genres_artist_id` ON `artist_genres` (`artist_rowid`);
CREATE INDEX `artist_genres_genre` ON `artist_genres` (`genre`);
CREATE INDEX `artist_images_artist_id` ON `artist_images` (`artist_rowid`);
CREATE UNIQUE INDEX `artists_id_unique` ON `artists` (`id`);
CREATE INDEX `artists_name` ON `artists` (`name`);
CREATE INDEX `artists_popularity` ON `artists` (`popularity`);
CREATE INDEX `artists_followers` ON `artists` (`followers_total`);
CREATE INDEX `artist_album_artist_id` ON "artist_albums" (`artist_rowid`);
CREATE INDEX `artist_album_album_id` ON "artist_albums" (`album_rowid`);
CREATE UNIQUE INDEX `albums_id_unique` ON `albums` (`id`);
CREATE INDEX `album_name` ON `albums` (`name`);
CREATE INDEX `album_popularity` ON `albums` (`popularity`);
CREATE UNIQUE INDEX `available_markets_available_markets_unique` ON `available_markets` (`available_markets`);
CREATE INDEX `track_artists_artist_id` ON `track_artists` (`artist_rowid`);
CREATE INDEX `track_artists_track_id` ON `track_artists` (`track_rowid`);
CREATE UNIQUE INDEX `tracks_id_unique` ON `tracks` (`id`);
CREATE INDEX `tracks_popularity` ON `tracks` (`popularity`);
CREATE INDEX `tracks_album` ON `tracks` (`album_rowid`);
CREATE INDEX `album_images_album_id` ON `album_images` (`album_rowid`);
CREATE INDEX tracks_isrc on tracks(external_id_isrc);

A scrape of AudioFeatures objects from the Spotify API. One row per track.

CREATE TABLE `track_audio_features` (
  `rowid` integer PRIMARY KEY NOT NULL,
  `track_id` text NOT NULL,
  `fetched_at` integer NOT NULL,
  `null_response` integer NOT NULL,
  `duration_ms` integer,
  `time_signature` integer,
  `tempo` integer,
  `key` integer,
  `mode` integer,
  `danceability` real,
  `energy` real,
  `loudness` real,
  `speechiness` real,
  `acousticness` real,
  `instrumentalness` real,
  `liveness` real,
  `valence` real
);
CREATE UNIQUE INDEX `track_audio_features_track_id_unique` ON `track_audio_features` (`track_id`);

A scrape of Playlist objects from the Spotify API. Requires the spotify_clean.sqlite3 database to map track_rowid to track_id.

Most playlists with < 1000 followers were excluded. Completeness unknown.

Row count: 6.6 million playlists with 1.7 billion playlist tracks.

CREATE TABLE "playlists" (
  `rowid` integer PRIMARY KEY NOT NULL,
  `id` text NOT NULL,
  `snapshot_id` text NOT NULL,
  `fetched_at` integer NOT NULL,
  `name` text NOT NULL,
  `description` text,
  `collaborative` integer NOT NULL,
  `public` integer NOT NULL,
  `primary_color` text,
  `owner_id` text,
  `owner_display_name` text,
  `followers_total` integer,
  `tracks_total` integer NOT NULL
);
CREATE UNIQUE INDEX `playlists_id_unique` ON `playlists` (`id`);
CREATE INDEX `playlists_name` ON `playlists` (`name`);
CREATE INDEX `playlists_owner_id` ON `playlists` (`owner_id`);
CREATE INDEX `playlists_snapshot_id` ON `playlists` (`snapshot_id`);
CREATE INDEX `playlists_followers` ON `playlists` (`followers_total`);
CREATE TABLE `playlist_images` (
  `playlist_rowid` integer NOT NULL,
  `width` integer,
  `height` integer,
  `url` text NOT NULL,
  FOREIGN KEY (`playlist_rowid`) REFERENCES `playlists`(`rowid`)
);
CREATE INDEX `playlist_images_playlist_id` ON `playlist_images` (`playlist_rowid`);
CREATE TABLE "playlist_tracks" (
  `playlist_rowid` integer NOT NULL,
  `position` integer NOT NULL,
  `is_episode` integer NOT NULL,
  `track_rowid` integer,
  `id_if_not_in_tracks_table` text,
  `added_at` integer NOT NULL,
  `added_by_id` text,
  `primary_color` text,
  `video_thumbnail_url` text,
  `is_local` integer NOT NULL,
  `name_if_is_local` text,
  `uri_if_is_local` text,
  `album_name_if_is_local` text,
  `artists_name_if_is_local` text,
  `duration_ms_if_is_local` integer,
  PRIMARY KEY(`playlist_rowid`, `position`),
  FOREIGN KEY (`playlist_rowid`) REFERENCES `playlists`(`rowid`)
) WITHOUT ROWID;
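Because playlist_tracks stores track_rowid rather than the Spotify track id, resolving ids means joining across two database files. One way to do that in SQLite is ATTACH. The sketch below uses two in-memory databases as stand-ins and only the columns needed for the join. The schema alias `clean` and all rows are made up for this example. With the real files you would attach the path of spotify_clean.sqlite3 instead:

```python
import sqlite3

# Main connection plays the role of the playlists database.
conn = sqlite3.connect(":memory:")
# Stand-in for spotify_clean.sqlite3, attached under the (arbitrary) alias "clean".
conn.execute("attach database ':memory:' as clean")

conn.execute(
    "create table clean.tracks (rowid integer primary key, id text not null, name text not null)"
)
conn.execute(
    "create table playlist_tracks (playlist_rowid integer not null, position integer not null, "
    "track_rowid integer, id_if_not_in_tracks_table text, "
    "primary key (playlist_rowid, position)) without rowid"
)

# Hypothetical toy rows.
conn.executemany(
    "insert into clean.tracks (rowid, id, name) values (?, ?, ?)",
    [(1, "trackA", "Song A"), (2, "trackB", "Song B")],
)
conn.executemany(
    "insert into playlist_tracks values (?, ?, ?, ?)",
    [(1, 0, 2, None), (1, 1, None, "trackZ"), (1, 2, 1, None)],
)

# Resolve each playlist entry to a track id, falling back to the stored id
# for entries that are not present in the tracks table.
resolved = conn.execute(
    """
    select pt.position, coalesce(t.id, pt.id_if_not_in_tracks_table) as track_id
    from playlist_tracks pt
    left join clean.tracks t on t.rowid = pt.track_rowid
    where pt.playlist_rowid = ?
    order by pt.position
    """,
    (1,),
).fetchall()
print(resolved)
```

The left join keeps playlist entries whose track is missing from the tracks table, so the playlist order stays intact.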

This is an almost lossless representation of the original Spotify API response. JSON reconstruction was tested during creation of all tables, with minor exceptions. For example, the original response JSON for playlists can be reconstructed with code similar to the following:

Playlist Reconstruction Code (Example)

  function escapeURI(s) {
    return encodeURIComponent(s).replace(
      /[!()*]/g,
      (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
    );
  }
  function booleanToTrackType(is_episode) {
    return is_episode ? "episode" : "track";
  }

  function reconstructTrack(track, track_id) {
    let innerTrack = null;

    if (track.is_local) {
      innerTrack = {
        album: {
          album_type: null,
          available_markets: [],
          external_urls: {},
          href: null,
          id: null,
          images: [],
          name: track.album_name_if_is_local ?? "",
          release_date: null,
          release_date_precision: null,
          type: "album",
          uri: null,
          artists: [],
        },
        artists: [
          {
            external_urls: {},
            href: null,
            id: null,
            name: track.artists_name_if_is_local ?? "",
            type: "artist",
            uri: null,
          },
        ],
        available_markets: [],
        explicit: false,
        preview_url: null,
        type: "track",
        disc_number: 0,
        external_ids: {},
        external_urls: {},
        href: null,
        id: null,
        duration_ms: track.duration_ms_if_is_local,
        name: track.name_if_is_local,
        uri: track.uri_if_is_local,
        popularity: 0,
        track_number: 0,
        is_local: true,
        tags: null,
      };
    } else if (track_id) {
      innerTrack = {
        type: booleanToTrackType(track.is_episode),
        id: track_id,
      };
    } else if (track.id_if_not_in_tracks_table) {
      innerTrack = {
        type: booleanToTrackType(track.is_episode),
        id: track.id_if_not_in_tracks_table,
      };
    }
    return {
      added_at: new Date(track.added_at).toISOString().replace(".000Z", "Z"),
      added_by: {
        id: track.added_by_id,
        type: "user",
        uri: track.added_by_id
          ? `spotify:user:${escapeURI(track.added_by_id)}`
          : null,
        href: `https://api.spotify.com/v1/users/${track.added_by_id}`,
        external_urls: {
          spotify: `https://open.spotify.com/user/${track.added_by_id}`,
        },
      },
      is_local: track.is_local,
      primary_color: track.primary_color,
      video_thumbnail: {
        url: track.video_thumbnail_url,
      },
      track: innerTrack,
    };
  }
  function reconstructPlaylist({
    inserted_playlist,
    inserted_images,
    inserted_tracks,
  }) {
    return {
      id: inserted_playlist.id,
      snapshot_id: inserted_playlist.snapshot_id,
      name: inserted_playlist.name,
      description: inserted_playlist.description || "",
      collaborative: inserted_playlist.collaborative,
      public: inserted_playlist.public,
      primary_color: inserted_playlist.primary_color,
      owner: {
        id: inserted_playlist.owner_id,
        display_name: inserted_playlist.owner_display_name,
        type: "user",
        uri: inserted_playlist.owner_id
          ? `spotify:user:${escapeURI(inserted_playlist.owner_id)}`
          : null,
        href: `https://api.spotify.com/v1/users/${inserted_playlist.owner_id}`,
        external_urls: {
          spotify: `https://open.spotify.com/user/${inserted_playlist.owner_id}`,
        },
      },
      followers: {
        href: null,
        total: inserted_playlist.followers_total,
      },
      images:
        inserted_images.length > 0
          ? inserted_images.map((img) => ({
              url: img.url,
              height: img.height,
              width: img.width,
            }))
          : null,
      tracks: {
        href: "unimportant",
        total: inserted_playlist.tracks_total,
        limit: 100,
        offset: 0,
        next: null,
        previous: null,
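        // map() would pass the array index as the second argument, so pass the
        // track id explicitly. `t.track_id` is an assumed field holding the id
        // looked up from spotify_clean.sqlite3 via track_rowid.
        items: inserted_tracks.map((t) => reconstructTrack(t, t.track_id)),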
      },
      external_urls: {
        spotify: `https://open.spotify.com/playlist/${inserted_playlist.id}`,
      },
      href: `https://api.spotify.com/v1/playlists/${inserted_playlist.id}?additional_types=episode&locale=*`,
      type: "playlist",
      uri: `spotify:playlist:${inserted_playlist.id}`,
    };
  }

The link between the tracks table in spotify_clean and the actual files we have.

Row count:

CREATE TABLE IF NOT EXISTS "track_files" (
  `rowid` integer PRIMARY KEY NOT NULL,
  `track_id` text NOT NULL,
  `filename` text,
  `status` text NOT NULL,
  `reencoded_kbit_vbr` integer,
  `fetched_at` integer,
  `session_country` text,
  `sha256_original` text,
  `sha256_with_embedded_meta` text,
  `isrc_has_download` integer,
  `track_popularity` integer,
  `secondary_priority` real,
  `prefixed_ogg_packet` blob,
  `alternatives` text,
  `file_id_ogg_vorbis_96` text,
  `file_id_ogg_vorbis_160` text,
  `file_id_ogg_vorbis_320` text,
  `file_id_aac_24` text,
  `file_id_mp3_96` text,
  `language_of_performance` text,
  `artist_roles` text,
  `has_lyrics` integer,
  `licensor` text,
  `original_title` text,
  `version_title` text,
  `content_ratings` text
);
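As a rough illustration of how this table could be used, the sketch below builds an in-memory SQLite database with a trimmed-down copy of the schema and lists the tracks that have a file attached. It assumes the database is SQLite and that a non-NULL `filename` means a file is present; the helper name `downloaded_files`, the sample rows, and the status strings are invented for the example and are not taken from the release.

```python
import sqlite3

# Trimmed-down copy of the track_files schema: only the columns used below.
SCHEMA = """
CREATE TABLE IF NOT EXISTS track_files (
    rowid integer PRIMARY KEY NOT NULL,
    track_id text NOT NULL,
    filename text,
    status text NOT NULL,
    reencoded_kbit_vbr integer,
    track_popularity integer
)
"""


def downloaded_files(conn, min_popularity=0):
    """Return (track_id, filename, track_popularity) for rows that have a file.

    Assumption: a row without a file has filename NULL.
    """
    return conn.execute(
        "SELECT track_id, filename, track_popularity FROM track_files "
        "WHERE filename IS NOT NULL AND track_popularity >= ? "
        "ORDER BY track_popularity DESC, track_id",
        (min_popularity,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Placeholder rows: the IDs, filenames and status strings are invented.
conn.executemany(
    "INSERT INTO track_files (track_id, filename, status, track_popularity) "
    "VALUES (?, ?, ?, ?)",
    [
        ("track_a", "track_a.ogg", "placeholder_ok", 80),
        ("track_b", None, "placeholder_missing", 10),
        ("track_c", "track_c.ogg", "placeholder_ok", 35),
    ],
)

rows = downloaded_files(conn, min_popularity=30)
for row in rows:
    print(row)
```

Against the real database, the same query could be joined to the tracks table in spotify_clean on `track_id` to pull in titles and other metadata.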

Raw JSON API responses for audiobooks on Spotify. Contains around 700 thousand rows. Incomplete.

Raw JSON API responses for audiobook chapters on Spotify. Contains around 20 million rows. Incomplete.

Raw JSON API responses for shows (podcasts) on Spotify. Contains around 5 million rows. Incomplete.

Raw JSON API responses for (podcast) episodes on Spotify. Contains around 54 million rows. Incomplete.

Redirect responses for the artist endpoint.

annas_archive_spotify_2025_07_audio_analysis.torrent/##.json.zst

Raw JSON API responses for audio analysis on Spotify. Contains around 40 million rows, fetched in descending priority order. Many songs do not have an audio analysis (404). Incomplete.

annas_archive_spotify_2025_07_coverart.tar.torrent

Album art files. Filenames correspond to the last part of the url from album_images. Files are indexed into directories named after filename prefixes (taken after stripping the first 16 chars from the filename).
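The lookup described above might be sketched as follows. This is only an illustration under assumptions: the text does not say how many characters make up the directory prefix, so `prefix_len` is a required parameter rather than a known constant, the layout "prefix directory, then the full filename" is a guess, and the URL is a made-up placeholder.

```python
from urllib.parse import urlparse


def coverart_relative_path(image_url, prefix_len, strip_chars=16):
    """Guess the path of a cover art file inside the extracted tar.

    Assumptions (not confirmed by the release notes): the directory name is
    the first `prefix_len` characters of the filename once its first
    `strip_chars` characters are removed, and the file keeps its full name.
    """
    # The filename is the last path segment of the album_images url.
    filename = urlparse(image_url).path.rsplit("/", 1)[-1]
    remainder = filename[strip_chars:]
    if prefix_len <= 0 or len(remainder) < prefix_len:
        raise ValueError(f"filename too short for this layout: {filename!r}")
    return f"{remainder[:prefix_len]}/{filename}"


# Hypothetical URL; a real one would come from the album_images table.
example_url = "https://example.invalid/image/0123456789abcdefcafe0001"
print(coverart_relative_path(example_url, prefix_len=2))
```

Checking the result against a directory listing of the tar would be the quickest way to confirm or correct the assumed prefix length.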


Comments

  • By crazygringo 2025-12-20 19:50 (25 replies)

    This is insane.

    I definitely was not aware Spotify DRM had been cracked to enable downloading at scale like this.

    The thing is, this doesn't even seem particularly useful for average consumers/listeners, since Spotify itself is so convenient, and trying to locate individual tracks in massive torrent files of presumably 10,000's of tracks each sounds horrible.

    But this does seem like it will be a godsend for researchers working on things like music classification and generation. The only thing is, you can't really publicly admit exactly what dataset you trained/tested on...?

    Definitely wondering if this was in response to desire from AI researchers/companies who wanted this stuff. Or if the major record labels already license their entire catalogs for training purposes cheaply enough, so this really is just solely intended as a preservation effort?

    • By Aurornis 2025-12-20 19:55 (11 replies)

      > The thing is, this doesn't even seem particularly useful for average consumers/listeners, since Spotify itself is so convenient, and trying to locate individual tracks in massive torrent files of presumably 10,000's of tracks each sounds horrible.

      I wouldn’t be so sure. There are already tools to automatically locate and stream pirated TV and movie content on demand. They’re so common that I had non-technical family members bragging at Thanksgiving about how they bought a box at their local Best Buy that has an app which plays any movie or TV show they want on demand without paying anything. They didn’t understand what was happening, but they said it worked great.

      > Definitely wondering if this was in response to desire from AI researchers/companies who wanted this stuff.

      The Anna’s Archive group is ideologically motivated. They’re definitely not doing this for AI companies.

      • By jsheard 2025-12-20 21:13 (5 replies)

        > The Anna’s Archive group is ideologically motivated. They’re definitely not doing this for AI companies.

        They have a page directly addressed to AI companies, offering them "enterprise-level" access to their complete archives in exchange for tens of thousands of dollars. AI may not be their original/primary motivation but they are evidently on board with facilitating AI labs piracy-maxxing.

        • By toomuchtodo 2025-12-20 22:13 (2 replies)

          You go where the money is. Infra isn’t free. Churches pass the plate every Sunday. Perhaps one day we’ll exist in a more optimal socioeconomic system; until then, you do what you have to do to accomplish your goals (in this context, archivists and digital preservation).

          • By lurk2 2025-12-20 23:25 (6 replies)

            > Infra isn’t free.

            There is a certain irony in people providing copyrighted works for free justifying profiting from these copyrights on the basis that providing the works to others isn’t free.

            • By xmcp123 2025-12-21 0:06 (2 replies)

              I'd have a lot more sympathy if the music industry didn't try all of the worst available options to handle piracy for years and years.

              They had many opportunities to get out ahead of it, and they squandered it trying to cling to album sales where 11/13 tracks were trash. They are in a bed of their own making.

              • By raw_anon_1111 2025-12-21 0:45 (5 replies)

                You have been able to buy DRM free digital music from all of the record labels since 2009 from Apple and other stores.

                • By bradleybuda 2025-12-21 17:19 (1 reply)

                  “I only pirate because evil corporations make it too hard to pay for my favorite content” is a multi-decade ever-shifting goalpost. Some people just like to steal shit and will justify it to themselves on the thinnest of pretenses.

                  • By irilesscent 2025-12-21 17:53 (2 replies)

                    It is factually true, though: music piracy DID drop once ad-supported music streaming became available. The opposite is also true: video/movie piracy is now on the rise due to the number of streaming subscriptions one has to juggle and their rising prices. Of course there will always be those who yearn for the pirate's life, but the vast majority just do it for convenience.

                    • By Sohcahtoa82 2025-12-22 18:42 (1 reply)

                      I don't even know the last time I pirated music. Gotta be at least 10 years.

                      Meanwhile, I pirate movies/TV on a regular basis for the reasons you gave. At one point, I was subbed to 5 services, and decided enough was enough. Cancelled all but Netflix and went back to torrenting anything they didn't have.

                      • By mrguyorama 2025-12-23 21:55

                        I've used Spotify for a decade. But the other day I opened one of my playlists and noticed that almost all the songs were greyed out as "unavailable" despite a quick search showing those songs still existed.

                        Spotify rotted my playlists because it didn't feel like updating a database row somewhere when some licensing agreement got updated. Apple will do the opposite: Rot your music collection by replacing songs with "identical" songs that aren't at all.

                        So I'm thinking it's time to buy music again.

                    • By raw_anon_1111 2025-12-22 17:05

                      And Netflix’s profits have been on the rise for over a decade. I retired my plex server over six years ago. It just wasn’t worth the hassle of finding decent quality torrents. Everything ends up on streaming anyway.

                • By lmm 2025-12-21 11:58 (3 replies)

                  Is that still the case? The option to do that quietly disappeared from Amazon Music a couple of months ago, for example, and they were one of the last few holdouts where you still could. It might be only Apple now?

                  • By neobrain 2025-12-21 12:50

                    There's still plenty of options around, Qobuz and 7digital in particular offer drm-free flac downloads.

                  • By bpfrh 2025-12-21 16:26 (1 reply)

                    Qobuz, Bandcamp, etc.

                    • By accrual 2025-12-21 17:19

                      Bandcamp is still my go to for owning music. Nice platform, just works.

                  • By ocdtrekkie 2025-12-21 16:13

                    I still buy DRM free music from Amazon.

                • By vel0city 2025-12-21 3:52 (1 reply)

                  You've been able to buy DRM free digital music since the 1980s.

                • By jMyles 2025-12-21 0:57 (3 replies)

                  > DRM free digital music from all of the record labels

                  Is this true? Can you show me where I can get DRM-free releases from Mountain Fever?

                  Better yet, can you add that information here? https://pickipedia.xyz/wiki/DRM-free

                  • By raw_anon_1111 2025-12-21 1:10 (1 reply)

                    Your link doesn’t work. But I assume you are talking about this label? I looked at the first artist and I found the artist’s music on iTunes. Everything that Apple sells on the iTunes Music Store has been DRM free AAC or ALAC (Apple lossless) since 2009.

                    https://mountainfever.com/colin-kathleen-ray/

                    While ALAC is an Apple proprietary format, it is DRM free and can be converted to FLAC using ffmpeg. AAC is not an Apple format.

                    • By lurk2 2025-12-21 6:19 (1 reply)

                      I remember trying to use music I had bought in a slideshow that year and finding out that I couldn’t load tracks with DRM into the editor I was using; it was very frustrating.

                      • By raw_anon_1111 2025-12-21 7:02 (1 reply)

                        A way to strip the DRM was built into the iTunes app - burn the song to a CD and rip it.

                        • By pcthrowaway 2025-12-21 17:24 (2 replies)

                          Is burning to a CD and ripping it lossless?

                          • By whstl 2025-12-22 11:22

                            If the source and target are both lossless, then yes. ALAC was available in iTunes since 2004 AFAIK.

                            Caveat: CDs were 44.1/16 so if the original files had more bit depth, they would require downsampling. Technically lossy, but not "compression" per se. But AFAIK, iTunes was also 44.1/16.

                  • By Yodel0914 2025-12-21 2:06

                    I don’t know about Mountain Fever, but for anything I haven’t been able to find on Bandcamp, I’ve been able to find on Qobuz.

                • By adrianN 2025-12-21 18:07

                  Piracy has gone down quite a bit since that became possible.

              • By potatoicecoffee 2025-12-21 0:54

                they made cd singles and single song purchases long before streaming

            • By toomuchtodo 2025-12-20 23:34 (6 replies)

              Cost recovery isn’t profit. Copyright is just a shared delusion, like most laws. They’re just bits on a disk we’re told are special for ~100 years (or whatever the copyright lockup length is in your jurisdiction), after which they’re no longer special (having entered the public domain).

              I think what is more ironic is we somehow were comfortable being collectively conditioned (manufactured consent?) with the idea that you could lock up culture for 100 years or more just to enable maximum economic extraction from the concept of “intellectual property” and that to evade such insanity is wrong in some way. “You can just do things” after all.

              • By noduerme 2025-12-21 1:06 (1 reply)

                It's not the bits that are copyrighted, it's the performance and the creative work.

                Your savings account is just bits on a disk, yet presumably it represents value that you worked for and which belongs to you to do with what you wish.

                • By komali2 2025-12-21 4:56 (3 replies)

                  > Your savings account is just bits on a disk, yet presumably it represents value that you worked for and which belongs to you to do with what you wish.

                  That's another example of the shared delusion, since yes, we tell each other it represents labor and resources, and the market engages in allocation somewhat efficiently, and so the money is a pretty accurate representation of the value of labor and the value of resources.

                  In reality, that's not true, because the most highly compensated jobs are some of the least valuable, such as investment bankers, landlords, or being born rich (which isn't even a job, but is compensated anyway). Rent seeking is one of the most highly compensated things you can do under this system, but also one of the most parasitic and least valuable things.

                  Your savings account's number is totally detached from accurately representing value. It's mostly a representation of where you were born.

                  • By noduerme 2025-12-23 5:11

                    Value is subjective. Ownership is not. You're attempting to perform a sleight of hand by conflating the two.

                    It doesn't matter whether you personally find some creative material to be worthless, or you personally think someone doesn't generate sufficient value to deserve their bank balance. The reason it doesn't matter is that societies cannot run on an individual's opinion about whether other people deserve ownership over what they legally own. Because if it did, that society would quickly disintegrate into anarchy.

                    Speaking personally, as someone who once was on course to make 9 figures and now makes a low 6, I think it's sort of a pathology to spend your time worrying about how much less you have than other people. What matters is whether you can be recognized for your work and earn from it. I don't care that some people just inherited what they have, while I had to struggle as a taxi driver and waiter and minimum wage intern. That's annoying, but it's not as bad as living in a society where I can't capture the value of what I produce creatively. Having ownership of my work is far more important to me than money. But I have a right to expect that e.g. code I develop in my toolkit will remain my own to provide me an income.

                  • By felixg3 2025-12-21 9:57 (1 reply)

                    „Shared delusion“ - just another term for „social contract“?

                    • By komali2 2025-12-21 10:10

                      Sort of? The contract doesn't mention that "value" and "price" are just as often negatively correlated as positively so, though, and claims the opposite (always positive correlation), hence where the shared delusion comes in.

                  • By gosub100 2025-12-21 9:22 (1 reply)

                    > Your savings account's number is totally detached from accurately representing value. It's mostly a representation of where you were born

                    This could also be true because the number of dollars in circulation is "just bits on a disk" that politicians can manipulate for various reasons.

                    Someone can work very hard and save their earnings, only to have the value diluted in the future. Isn't that also a delusion?

                    • By komali2 2025-12-21 10:09 (1 reply)

                      > Someone can work very hard and save their earnings, only to have the value diluted in the future. Isn't that also a delusion?

                      Yes, it is.

                      It's one of my pet peeves about the cryptocurrency movement vs neoliberal institutional types. "Bitcoin is just bits on a disk!" is always answered with "well, dollars are too!" To which the institutionalist can only say, "no, that's different." But really, it isn't.

                      What the cryptocurrency people get wrong is that replacing one shared delusion with another isn't a useful path to go down.

                      • By dagss 2025-12-21 12:22 (1 reply)

                        Unless you do subsistence farming, you would not last a month without "shared delusions" in place to make sure farmers supply you with food, getting nothing in return except a promise that they can go somewhere to pick up something someone other than you made in the future.

                        Money isn't "only bits"; it is also an encoding of social contracts.

                        You use the word delusion like it also includes a) things everyone fully agrees only exist in people's minds as intersubjective reality (no deceit going on really) and b) things you depend on for your survival.

                        You talk like getting rid of "delusions", as you call them, is a goal in itself. Why? It is part of human technology. (Just like math, which also only exists in people's minds.) Humans have had contracts since we were hunter gatherers in groups...

                        I would recommend Yuval Harari's "Sapiens" for you, you would probably like it. It talks about the history of "shared delusions" as you call them, as a critical piece for development of society.

                        • By komali2 2025-12-21 17:47 (1 reply)

                          > would recommend Yuval Harari's "Sapiens" for you, you would probably like it. It talks about the history of "shared delusions" as you call them, as a critical piece for development of society.

                          Already read it. Counter: read "Debt: The First 5,000 Years" by Graeber for, finally, a non-"Chicago school of economics" take on the history of trade amongst humans.

                          • By dagss 2025-12-21 18:34 (1 reply)

                            Thanks for the tip.

                            Just to be clear, I agree the money abstraction is not working particularly well. And that in the age of computers something that is more directly linked to the underlying economy could have worked better. But what needs to replace it is a better and improved "delusion", not a lack of it.

                            • By komali2 2025-12-22 1:21 (1 reply)

                              But, why? Regarding your farmer example, there are examples throughout history of farming that fed many without the involvement of currency or the paying off of debt. Take a look into syndicalized Spain if you ever get a chance (~1936-1939). Farms were collectivized and worked on by volunteers, distributions done by need with some bookkeeping to track how many people were in certain regions. Worked pretty well until the communists decided it needed to be centrally controlled and kicked out the anarchists!

                              Everyone always starts every future speculation assuming capitalism, or at least, currency. Isn't it worth challenging these core baseline assumptions? At the very least, the other ground is well covered, so we might come up with something a little more interesting.

                              • By motoxpro 2025-12-22 19:42

                                Currency (or IOUs, handshakes, pieces of green paper, bits on a disc, etc) is just an abstraction that allows one to have choice.

                                The political systems that get built on top of that are just a downstream effect of the incentives that arise. Communism thinking it would be good to centralize the control, capitalism thinking it would be good to allow the incentives to rule, Marxism thinking that labor rules, etc.

                                What I do for work is SO far away from any sort of tangible production, it makes sense to have a way to go straight from Work -> Food, rather than 50-100 trades so I can eat every day. Again, the choice not to have to trade at all, or to trade exactly what I want, when I want, is enabled by currency.

                                You can make the argument things shouldn't be so easy, that I shouldn't be able to choose to go to play pinball and drink a vanilla milkshake at 11am, but if that's possible, currency (in whatever form you want) has to exist.

              • By lurk2 2025-12-21 6:42 (2 replies)

                > that to evade such insanity is wrong in some way.

                There’s a commons problem at play here. Most habitual pirates couldn’t pay for what they are pirating even if they wanted to, so restricting their access just makes the world worse-off; but who is going to finance the creation of new content if everything is just reliant on completely optional donations?

                The 100 year period is absurd and does nothing to incentivize art, but there are costs involved in production of these works. People are always going to make music and write books regardless of the economic outcome; far fewer are going to write technical manuals or act as qualified reporters without being compensated.

                • By thisisabore 2025-12-21 19:07 (1 reply)

                  There are several labs and researchers with ideas on how to do this and published books on the subject (https://www.sharing-thebook.com/).

                  Long story short: workable solutions exist, it is entirely a question of political will and lack thereof.

                  • By aqeelat 2025-12-21 19:32

                    This would work on niche segments and not for the masses. Look up the YouTube subscribers to Patreon ratio.

                • By 0x3f 2025-12-21 10:16 (2 replies)

                  > Most habitual pirates couldn’t pay for what they are pirating

                  Seems questionable. You can cover almost everything with a handful of monthly subscriptions these days. In fact I often pirate things that I otherwise have access to via e.g. Amazon Prime.

                  > but who is going to finance the creation of new content if everything is just reliant on completely optional donations?

                  Well this is an appeal to consequences, right? It's probably true that increased protectable output is a positive of IP law, but that doesn't mean it's an optimal overall state, given the (massive) negatives. It's a local maximum, or so I would argue.

                  Plus it's a bit of a strange argument. It seems to claim that we must protect Disney from e.g. 'knock offs', and somehow if we didn't, nobody would be motivated to create things. But then who would be making the knock-offs and what would be motivating them?

                  • By klez 2025-12-21 12:56 (1 reply)

                    > You can cover almost everything with a handful of monthly subscriptions these days.

                    Maybe for you that's something you can afford. I can't. I just consume less music. Or sail the high seas if I really want something.

                    • By 0x3f 2025-12-21 21:19

                      If we're purely talking about music then almost everything is on YouTube, which has a subscription cost of $0/mo.

                  • By lurk2 2025-12-21 13:16 (1 reply)

                    > You can cover almost everything with a handful of monthly subscriptions these days.

                    The majority of people on earth cannot afford more than two or three of these subscriptions.

                    > But then who would be making the knock-offs and what would be motivating them?

                    Ten years ago there was a popular blog that got posted on /r/anarcho_capitalism with some frequency. IP was a contentious topic among the then-technologically literate userbase. At some point, a spammer began copying articles from the blog and posting them to /r/anarcho_capitalism himself. This caught the attention of some users and the spammer was eventually banned. A few days later, I followed a link back to his site and found all the articles he had stolen now linked back to a page featuring the cease and desist letter he had received from the original blog, the URL being something like: “f*-statists-and-such-and-such.”

                    Without any* copyright law, any content that is generated effectively gets arbitraged out to the most efficient hosts and promoters. This might be a win for readers in the short term, but long-term tends towards commodification that simply won’t sustain specialized subject matter in the absence of a patronage model. YouTube and the wave of Short Form Video Content are the two most obvious case studies, though it happens on every social platform that moves faster than infringement notices can be sent.

                    • By 0x3f 2025-12-21 21:26 (3 replies)

                      > The majority of people on earth cannot afford more than two or three of these subscriptions.

                      I would guess the majority of people on earth don't even have good enough internet to pirate HD video, nor the technical skills to do it, so we're not really talking about global averages here.

                      > Without any* copyright law, any content that is generated effectively gets arbitraged out to the most efficient hosts and promoters. This might be a win for readers in the short term, but long-term tends towards commodification that simply won’t sustain specialized subject matter in the absence of a patronage model.

                      I don't think you understand my argument. I don't deny that this may be true. I deny that it is ipso facto the best outcome to have high-quality creator content, or whatever we are talking about here, at the cost of the massive benefits of free use. You might as well tell me New Jersey gas pumping laws lead to nicer service experiences, and getting rid of them would ruin that.

                      We can arbitrarily prop up any industry to make it cushy and a 'nice experience'. That doesn't make doing so the greatest overall good.

                      I would argue that even if all that we achieved with the abolition of IP law was the provision of cheap generic drugs, long out of research, it'd be worth far more than the YouTube creator economy.

                      • By lurk2 2025-12-22 10:55

                        > I would guess the majority of people on earth don't even have good enough internet to pirate HD video

                        Why is that the qualification you’re using? There are plenty of people in the developing world who have benefitted from access to e.g. LibGen who would never be able to afford to legally access the materials hosted there.

                        My point is that under the abolitionist model there is no financial incentive to create anything because the profits get arbitraged away by the most efficient copy services. This wouldn’t be relevant for saturated mediums like music or literature, but it does create a free rider problem in scenarios where the intellectual property has a high cost of production and not many people qualified to produce it (e.g. technical manuals, pharmaceutical research, well-produced films, etc.)

                        Pirates effectively have their usage subsidized by those who actually pay for the content. A huge amount of human potential is unlocked when works are freely available through legitimate platforms; neither of us are disputing this. The reason I can’t get on board with copyright abolitionism over copyright term reduction is because I don’t see how certain works will be produced at all under an abolitionist model that can only sustain itself via voluntary donations.

                      • By woctordho 2025-12-22 7:11

                        The worldwide median internet download bandwidth is about 100 Mbps, which is more than enough for HD or Blu-ray video. The technical barrier can be as low as 'click to search, click to download' in some user-friendly BT clients. That being said, the price of these subscriptions is a problem that actually needs to be solved.

                      • By _DeadFred_ 2025-12-22 3:07

                        Anyone is free to release under free use in our current system. You already can live with the benefits of no IP law by just limiting yourself to those people who choose to release this way.

              • By verisimi 2025-12-21 9:32

                I agree completely. Parasites with money like to keep open the legal loopholes for their clever wheeze.

              • By dagss 2025-12-21 11:54

                Sure. But in addition to copyright you might add the concept of money, or the concept of any property rights and ownership of physical things, and...

                Calling such things "shared delusions" is missing the point...it's not that it's wrong, but it is not a very useful way to look at it.

                There is such a thing as intersubjective (as opposed to objective) reality. Physically it exists as a shared pattern in the brains of humans, but that is seldom useful to reflect on. Language-wise it is much more convenient and useful to talk about copyright as something, you know, existing.

                Everyone knows these are just human agreements... it is not exactly deep thinking to point it out.

                You may not agree to some laws. You can then seek to have the laws overturned (I agree patents and copyright are... counterproductive, at this point). Luckily many parts of the world have democracy to decide what laws to force on people, as opposed to a dictator.

              • By bekindtoartists 2025-12-22 9:25 (2 replies)

                Are you an artist? Have you ever created a piece of work that has a copyright attached? You might be anti-establishment but ultimately you are anti-creation. Artists are finding it harder and harder to live and create, artists are vital proponents and voices in changing culture - for you to take away their ability to live in a financially viable way says more about you and how you have conflated big business and an artist who is trying to make art and live.

                • By themusicgod1 2025-12-22 17:41

                  I am. Copyright is fucking cancer and is one of the worst things if not the worst things that exists to make creating new things harder.

                  Making bits available isn't "taking artists ability to live in a financially viable way" any more than radio, LPs and player pianos were. If you are an artist who is trying to make art and live, do more of that and don't waste people's time arguing for copyright restricting other people's activity on websites like this one.

                • By toomuchtodo 2025-12-22 17:00

                  I pay artists directly, and know they receive almost nothing from Spotify and other Big Tech platforms, ymmv. Artists good, big business bad.

            • By jasonvorhe 2025-12-21 0:01

              Everyone is doing it, who cares anymore. Genie's out of the bottle, we could've tried to solve this for decades and yet we didn't so now we reap what we sowed. Happens, move on.

            • By hamdingers 2025-12-21 1:42 (2 replies)

              Do you have evidence they are profiting? I'm genuinely curious how these kinds of archives sustain themselves.

              • By lurk2 2025-12-21 6:32

                I don’t think any of them are breaking even when you consider the maintenance costs, I just thought it was kind of funny considering the nature of the line of work they are in.

                This was a different group of people but when some of the old LibGen domains got seized the FBI uploaded photos of the owners and the things they had spent their money on; a crappy old boat, what looked like a trailer in rural Siberia, and a vacation somewhere in the Mediterranean. It honestly read like sketch comedy, because the purchases didn’t appear remotely ostentatious.

                Z-library also supposedly caps downloads at 5 per day and offers more and faster downloads to paying subscribers.

              • By djeastm 2025-12-21 2:35 (1 reply)

                They take donations.

                • By cwnyth 2025-12-21 2:58

                  Just to nitpick, that doesn't imply profit. They could be breaking even (and probably are working at a loss).

            • By woctordho 2025-12-22 2:19

              Data are basically free. Infra to store and transfer data is not.

            • By emaro 2025-12-21 9:11

              I admit the irony, but also funny reminder that Spotify started with a pirated catalogue back in the day.

          • By onion2k 2025-12-21 16:15

            > You go where the money is.

            That is the opposite of being ideologically motivated unless your ideology happens to be 'capitalism'.

        • By j_w 2025-12-20 21:57

          Or they know that those parties are going to hammer their servers no matter what so they will at least try and get some money out of it.

        • By BonoboIO 2025-12-20 22:07 (1 reply)

          That made me chuckle, Enterprise Level Access. I mean, as an AI company, that’s incredibly cheap, and instead of torrenting something, why not get it? That price is just a fraction of an engineer’s salary.

          • By gmueckl 2025-12-20 22:53 (3 replies)

            But then you have a money trail connecting the company unambiguously to copyright violations on a scale that is arguably larger than Napster.

            • By ls612 2025-12-211:06

              I mean Facebook and Anthropic both torrented LibGen in its entirety.

            • By scratchyone 2025-12-214:22

              I believe they're largely targeting foreign companies who don't care much about US copyright law.

            • By amitav1 2025-12-2023:412 reply

              Yeah, how devastating it would be for Anna's Archive to be found skirting copyright laws. Their reputation may never recover.

              /s

              • By amitav1 2025-12-2621:54

                Ah, yikes, just ignore this comment, my literacy skills failed me here.

              • By hkt 2025-12-2023:551 reply

                He meant the AI companies

                • By pbhjpbhj 2025-12-210:38

                  I mean, the same comment applies mutatis mutandis.

        • By ThinkBeat 2025-12-2112:43

          I think there is a big legal difference between helping preserve books and papers with little regard for copyrights, and then turning around and selling access to large companies.

        • By wartywhoa23 2025-12-2118:47

          So either these folks, who are admittedly living targets of all the world's copyright lawyers, have means to receive tens of thousands of USD anonymously and stealthily,

          or they are totally immune to deanon / getting tracked down,

          or they are stupid enough to allow their greed to become their downfall,

          or this legend about underground warriors of light fighting against evil copyrighters is utter bullshit.

      • By cryzinger 2025-12-2022:331 reply

        > I had non-technical family members bragging at Thanksgiving about how they bought a box at their local Best Buy that has an app which plays any movie or TV show they want on demand without paying anything. They didn’t understand what was happening, but they said it worked great.

        Sounds like one of these: https://krebsonsecurity.com/2025/11/is-your-android-tv-strea...

        Probably not your problem to play tech support for these people and explain why being part of a botnet is bad, but mildly concerning nonetheless!

        • By shaky-carrousel 2025-12-215:141 reply

          Who cares, today it's pretty easy to be part of a botnet. Having a slightly outdated lightbulb qualifies, so I'd not bother.

          • By Aurornis 2025-12-2115:331 reply

            Having an IoT device with security vulnerabilities does not automatically make you vulnerable to botnets because it’s behind your router’s NAT under normal conditions.

            Botnet infections occur primarily in one of two ways: vulnerable devices exposed directly to the Internet, or app downloads and installs on personal computing devices.

            The TV box appears to be a rare hardware version of convincing someone to bring something into their network that compromises it. Usually it’s a software package that they’re convinced to install which brings along the botnet infection.

            Regardless, it’s a weird and dangerous mentality to believe that being part of a botnet is a “who cares” level of concern. Having criminal traffic originate from your network is a problem, but they might also decide to exploit other vulnerabilities some day and start extracting even more from your internal network.

            • By shaky-carrousel 2025-12-2116:19

              Nope, many IoT devices open ports via UPnP. The biggest botnets are composed of (among other things) smart plugs, baby monitors, doorbells, IP cameras...

      • By crazygringo 2025-12-2020:061 reply

        > The Anna’s archive group is ideologically motivated.

        Very interesting, thank you. So using this for AI will just be a side effect.

        And good point -- yup, can now definitely imagine apps building an interface to search and download. I guess I just wonder how seeding and bandwidth would work for the long tail of tracks rarely accessed, if people are only ever downloading tiny chunks.

        • By nutjob2 2025-12-2020:36

          I think the people seeding these are also ideologues and so would be interested in also supporting the obscure stuff, maybe more than the popular. There is no way any casual listeners would go to the quite substantial trouble of using these archives.

          Anyone who wants to listen to unlimited free music from a vast catalog with a nice interface can use YouTube/Google Music. If they don't like the ads they can get an ad blocker. Downloading to your own machine works well too.

      • By varenc 2025-12-210:275 reply

        Spotify is $12/month at most to get unlimited ad-free access to virtually all music.

        To get access to "all" TV content legally would be hundreds of dollars a month. And for many movies you must buy/rent each individually. And legal TV and movies are much more encumbered by DRM and lock in, limiting the way you can view them. (like many streaming apps removing AirPlay support, or limiting you to 720p in some browsers)

        I think Spotify wins over pirating because of its relatively low cost and convenience. Pirating of TV/movies has increased as the cost to access them has.

        • By gorbachev 2025-12-2110:12

          It's not even close to virtually all music. 256M songs doesn't come even close.

          It's virtually all popular music recently published commercially in the world.

          It's missing large portions of bootlegs, old music, foreign music, radio shows, mixtapes and live streaming music to list a few prominent categories from music in my private archive of cultural works. Those categories, btw, are well represented by torrents on tracker sites.

        • By figmert 2025-12-2112:56

          Barely all. I have so many songs in my playlist that have randomly become unavailable. It's quite frustrating to be honest.

        • By themusicgod1 2025-12-2217:452 reply

          > Spotify is $12/month at most to get unlimited ad-free access to virtually all music.

          Until they decide to silence the artist you want to listen to because emperor god trump decides to unperson them.

          Putting what music you listen to in the hands of a US corporation is such a dangerously stupid idea that it is amazing to me that there are people here who are OK with it.

          >I think Spotify wins over pirating because of its relatively low cost and convenience

          Spotify isn't "convenient" if you want to control and understand the media and software in your life. https://www.defectivebydesign.org/spotify

          • By brailsafe 2025-12-231:18

            > Putting what music you listen to in the hands of a US corporation is such a dangerously stupid idea that it is amazing to me that there are people here who are OK with it.

            Thankfully Spotify isn't primarily a U.S. company.

          • By mlrtime 2025-12-2312:27

            Godwin's Law recreated, the new version is Trump.

        • By OJFord 2025-12-2114:00

          It's absolutely not all, I'm an extremely casual listener, not 'into' music or anything, and I have plenty in a playlist that have disappeared (mostly I don't even know what they are, it's just greyed out with no information) for whatever reason. And that's just the stuff that was there at some point that I liked.

          One of them has come back recently. It's still listed as by the wrong artist (same name, but dead, vs. the active artist who actually performed it) but I'm not reporting it again because I suspect I may have made it disappear for a couple of years in doing so before.

          It's kind of crap and disorganised after anything more than barely glancing at it really, must be infuriating for (or just not used by) people who actually are into it.

        • By tsukikage 2025-12-219:571 reply

          Spotify used to be good, but have enshittified their UI past the point of usability for me. It really wants to play me tracks that are profitable for Spotify, not tracks I want to hear.

          What you say is still true of the Amazon and Apple offerings, though. Haven't tried Youtube Music, so can't comment on that.

      • By delusional 2025-12-2114:06

        > There are already tools to automatically locate and stream pirated TV and movie

        Before we had spotify we had grooveshark. Streaming pirated content came first, and everything old is new again.

      • By sneak 2025-12-2022:37

        They’re doing it for everyone, so, yes, they are doing it for AI companies.

      • By wartywhoa23 2025-12-2118:11

        > They’re definitely not doing this for AI companies.

        So it's just yet another instance of enormous luck / annuit coeptis for the wealthy and powerful, then.

        Such lucky bastards. Whatever happens, does so to their benefit, and all inconvenient questions about the nature of their luck automatically recede into the conspiracy theory domain.

        And let's not forget that Anna's Archive is also the host to the world's largest pirate library of books and articles.

      • By 5- 2025-12-2020:081 reply

        [flagged]

        • By ronsor 2025-12-2020:56

          They know about AI companies and don't mind AI companies, but they're not doing it because AI companies.

      • By silcoon 2025-12-211:263 reply

        > The Anna’s archive group is ideologically motivated.

        Anna’s archive business is stealing copyrighted content and selling access to it. It's not ideologically motivated.

        What ideology is about pirating books and music where most of the people producing this stuff cannot afford to do it full-time? It's not like pirating movies, software and large videogame studios, which is still piracy, but they also make big money and they don't act all the time in the interests of the users.

        Writers and musicians are mostly broke. If we sum the rising cost of living, AI generated content and piracy, there's almost no reward left for their work. Anna’s archive is contributing to the decadence of art and culture. They sell you premium bandwidth for downloading and training your AIs on copyrighted content, so soon we can all generate more and more slop.

        • By vintermann 2025-12-217:14

          > Anna’s archive business is stealing copyrighted content and selling access to it.

          There is not enough profit in that compared to the risk. They're also not exactly aggressive about it (there are groups which host mirrors who charge far more/finance it in the usual criminal way of getting people to install malware).

          To me, there's a "motivation gap" between what they get out of this and the effort it takes, so there's some kind of "ideology". Whether it's 100% what they say it is, is another question.

        • By frm88 2025-12-2110:52

          > Writers and musicians are mostly broke. If we sum the rising cost of living, AI generated content and piracy, there's almost no reward left for their work.

          For authors (books) ~70% of all the book sales go to the publisher, not the author (trad pub): https://reedsy.com/blog/how-much-do-authors-make/

          For musicians: depending on how big a name you are and which publisher you chose, the publisher's compensation ranges from 15% (small name/indie) to 60% (big name/Universal, Sony) https://www.careersinmusic.com/music-publishing/

          This is an industry with profit maximising as its goal like every other industry. If artists are broke, first take a look at the publishers.

        • By avoutos 2025-12-214:422 reply

          Agreed. I see far too many people rationalizing piracy as a principled thing to do. Instead of finding ways to improve the market such that the control of content isn't siloed in monopolistic corporations, many celebrate Anna's Archive, which is itself a more or less monopolistic profit-interested entity. The major difference being that we don't have to pay directly. The cost continues to fall on the writers and artists and the industry suffers.

          • By ptero 2025-12-219:39

            Nothing wrong in rationalizing content sharing; as in rationalizing copyright. But IMO the current form of the copyright for both the technical and the creative works is a cure that is worse than the disease.

            Recommending to an individual to work on changing copyright from within the system is, IMO, naive.

          • By komali2 2025-12-215:081 reply

            > Instead of finding ways to improve the market such that the control of content isn't siloed in monopolistic corporations

            I always assumed the "Anna" in the name was for "Anarchist." My assumption about the archive is that they don't believe there's an ethical solution to the restriction of access to data that involves a capitalist market.

            • By silcoon 2025-12-2110:094 reply

              I get your point but then let's not complain if creativity dies and things all look the same. Creative people don't have motivation to produce if they can't make a living out of it.

              • By moritzruth 2025-12-2111:561 reply

                > Creative people don't have motivation to produce if they can't make a living out of it.

                That is simply not true. Most artists do what they do without ever seeing any money for it.

                • By _DeadFred_ 2025-12-223:101 reply

                  Under the current system people can release everything they want as free use.

                  How much of the media that the average person chooses to consume is this 'free use' media? How much is media that artists choose to make money from?

                  • By komali2 2025-12-223:511 reply

                    This doesn't do much for the argument that artists only do art for money. Everyone knows what happens to free use art, same as what happens to FOSS: corpos bundle it up and sell it back to people.

                    By the way, I do know a lot of artists that just give their work away for free. Hell, any Burn is just a bunch of free art that usually gets lit on fire or destroyed after a week. There's also graffiti art which is uncompensated and usually painted over within a month.

                    • By _DeadFred_ 2025-12-2223:10

                      Great. So you already have a firehose of free art, no need/benefit to change copyright for those that want to release that way then.

              • By pavlovsfrog 2025-12-2221:23

                fwiw, the vast majority of my working musician friends (who do also hold day jobs) would rather you pirate their music than stream it on spotify. they make basically all of their income from music via touring, streaming income might pay for a coffee or two a month.

              • By komali2 2025-12-2110:112 reply

                > Creative people don't have motivation to produce if they can't make a living out of it.

                I challenge you to ask 10 creative people in your life if they would stop doing whatever it is they do if they had a billion dollars.

                • By lotsofpulp 2025-12-2111:11

                  The desire to create something does not seem like an immutable characteristic.

                • By v9v 2025-12-2118:142 reply

                  Would they do what they do if they had zero dollars?

                  • By komali2 2025-12-221:15

                    > Would they do what they do if they had zero dollars?

                    No, probably not. Isn't it a shame we live in a world where we have the technology to automate all meaningful production, but people still need to justify their existence through often meaningless labor?

                    That said, I know artists that make the bare minimum to survive, on purpose, so they have more time to focus on art.

                  • By nani8ot 2025-12-2121:52

                    Yes, as long as they have enough to survive, people generally have some free time. I know someone who's living paycheck to paycheck and they make music as a hobby. Obviously, if they had to work 16 hours a day to survive they wouldn't do it – or at least they wouldn't have the capacity to share it.

              • By lukifer 2025-12-2115:25

                "I'm not a capitalist, I am a creativist... Capitalists make things to make money, I like to make money to make things." - Eddie Izzard

                It's more about the viability of making any kind of living from one's creative work, not motivation to create. (Though for creative works with large upfront costs, eg films, ROI motivation is relevant for backers.)

      • By shevy-java 2025-12-210:22

        > I wouldn’t be so sure. There are already tools to automatically locate and stream pirated TV and movie content automatic and on demand.

        It may be relevant for those people, but I lost all interest in current TV or streaming stuff. I just watch youtube regularly. What's on is on; what is not on is not really important to me. My biggest problem is lack of time anyway, so I try to reduce the time investment if possible, which is one huge reason why I have zero subscriptions. I just could not keep up with them.

    • By gorbachev 2025-12-219:232 reply

      Flippant response: If it's ok for Meta for commercial use, why not for researchers for legitimate research work?

      More serious response: research is explicitly included in fair use protections in US copyright law. News organizations regularly use leaked / stolen copyrighted material in investigative journalism.

      • By zouhair 2025-12-2323:05

        Because the laws are there to protect people with money from people who don't have money.

    • By VanTheBrand 2025-12-2020:472 reply

      The metadata is probably more useful than the music files themselves arguably

      • By vintermann 2025-12-217:21

        Self-supplied metadata in music catalogs is notoriously shit. The degree to which most rights owners don't give a damn is telling.

        Spotify's own metadata is not particularly sophisticated. "Valence", "Energy", "Danceability", etc. You can see from a mile away that these are assigned names to PCA axes which actually correspond pretty poorly to musical concepts, because whatever they analyzed isn't nicely linearly separable.

      • By cm2012 2025-12-2022:021 reply

        Especially since they scraped Spotify's popularity rating as well

        • By input_sh 2025-12-2022:532 reply

          I can't think of many situations where that would be particularly valuable, considering it favours recent plays and the cutoff date is already almost half a year old.

          • By cm2012 2025-12-2022:57

            Helps train an algorithm to figure out which music is popular, as a training signal

          • By skrtskrt 2025-12-212:35

            If that's all the issues there are with the dataset, it is probably far and away the best dataset any researcher has ever used.

    • By zuspotirko 2025-12-2111:071 reply

      > The thing is, this doesn't even seem particularly useful for average consumers/listeners, since Spotify itself is so convenient, and trying to locate individual tracks in massive torrent files of presumably 10,000's of tracks each sounds horrible.

      Are you aware Anna's Archive already solved the exact same problem with books?

      • By writebetterc 2025-12-229:181 reply

        I am not, how did they solve that?

        • By zuspotirko 2025-12-2519:50

          Basically there is an entity between Anna's Archive and the torrents: hosters. AA has searchable metadata and a hash value. The hosters keep track of the hash values, the cached files, and which torrents they are backed up in, and take on almost the entire legal liability. Users search AA for what they are looking for but ultimately download it from a hoster.
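The split described in the comment above (a searchable index that only knows metadata and hashes, and a separate hoster that resolves a hash to a cached file or a backing torrent) can be sketched roughly as below. This is a toy model, not Anna's Archive's actual code: the class names `Index` and `Hoster`, the record fields, and the torrent names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Hypothetical search index: holds metadata and hashes, never file contents."""
    records: list[dict]

    def search(self, query: str) -> list[dict]:
        # Case-insensitive substring match on the title field.
        q = query.lower()
        return [r for r in self.records if q in r["title"].lower()]


@dataclass
class Hoster:
    """Hypothetical hoster: resolves a content hash to a cached file or a torrent."""
    cached: dict[str, bytes] = field(default_factory=dict)
    torrent_of: dict[str, str] = field(default_factory=dict)

    def fetch(self, content_hash: str) -> bytes:
        if content_hash in self.cached:
            return self.cached[content_hash]
        if content_hash in self.torrent_of:
            # A real hoster would pull the file out of the torrent here.
            raise LookupError(
                f"not cached; backed up in torrent {self.torrent_of[content_hash]}"
            )
        raise KeyError(content_hash)


index = Index(records=[{"title": "Example Book", "hash": "abc123"}])
hoster = Hoster(
    cached={"abc123": b"file contents"},
    torrent_of={"abc123": "torrent-0001", "def456": "torrent-0002"},
)

hit = index.search("example")[0]   # the user searches the index...
data = hoster.fetch(hit["hash"])   # ...and downloads from the hoster by hash
```

The point of the design is that the index never touches the files themselves; only the hash travels between the two parties.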

    • By thiht 2025-12-2022:32

      > this doesn't even seem particularly useful for average consumers/listeners

      I can imagine this making it wayyy easier to build something like Lidarr but for individual tracks instead of albums.

    • By IshKebab 2025-12-2021:031 reply

      I dunno, if they publish like a 10 TB torrent of the most popular music I can see people making their own music services. A 10 TB hard disk is easily affordable, and that's about 3 million songs, which is way more than anyone could listen to in a lifetime, even if you reduce that by 100x to account for taste.

      It's probably going to make the AI music generation problem worse anyway...
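A quick back-of-the-envelope check of the 10 TB / 3 million songs figure above. The assumptions here are mine, not the commenter's: decimal terabytes, and a constant 160 kbit/s bitrate (the free-tier figure mentioned elsewhere in this thread).

```python
TB = 10**12                      # decimal terabyte, in bytes (assumption)
BITRATE = 160_000                # bits per second (assumed constant)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

disk_bytes = 10 * TB
songs = 3_000_000

bytes_per_song = disk_bytes / songs              # about 3.3 MB per song
seconds_per_song = bytes_per_song * 8 / BITRATE  # implied track length
listening_years = songs * seconds_per_song / SECONDS_PER_YEAR

print(f"{bytes_per_song / 1e6:.2f} MB per song")
print(f"{seconds_per_song / 60:.1f} minutes per song")
print(f"{listening_years:.1f} years of continuous playback")
```

Under these assumptions each song comes out at a bit under three minutes, and the whole disk at roughly sixteen years of non-stop listening. The per-song size is also in line with the post's own totals (~300 TB across 86 million files is about 3.5 MB per file).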

      • By justatdotin 2025-12-210:042 reply

        I would expect more data to make ai music generation better

        • By jen729w 2025-12-217:39

          The problem isn't the generation, it's the taste of the generators.

          An earnest young lady with a guitar can already sing a light jazz version of 'Highway to Hell' or whatever. Just go to your local cafe to hear it. The objective quality is terrific.

          In the past, this wouldn't have been made because the end result is subjectively banal. But now people with no taste can churn it out by the thousands of hours for free.

        • By cakealert 2025-12-212:391 reply

          When they say "worse" they do mean the AI will get better which will be worse because they are ideologically opposed to AI.

          • By IshKebab 2025-12-2114:44

            I'm not ideologically opposed to AI. The problem will get worse because while the quality of the music will improve, it will still be bad and there will also be a lot more of it.

            We aren't really short on music. Diluting the good stuff with 100x more mediocre filler is not a good thing.

            If AI generated music ever actually becomes good then that's another story but that is quite a way off.

    • By sowbug 2025-12-2217:43

      A little off topic, but I remain naively hopeful that the horror you describe will keep Spotify from going down the same road Netflix did once content owners decided to get into the streaming business themselves, so that streaming a movie today requires you to "change the channel" to whichever service offers that movie.

      Can you imagine your favorite playlist needing to swap among 10 apps, each requiring a $10/month subscription?

    • By fsckboy 2025-12-2022:512 reply

      >The thing is, this doesn't even seem particularly useful for average consumer

      it's an archive to defend against Spotify going away. Remember when Netflix had everything, and then that eroded and now you can only rely on stuff that Netflix produced itself?

      the average consumer will flock when Spotify ultimately enshittifies

      • By troupo 2025-12-2023:204 reply

        Netflix didn't lose content by choice. Actual right holders decided to pull their content and create rival services.

        Has nothing to do with perceived enshittification by Netflix (even though they have enshittification too).

        Spotify is under the same threat: they have no content that they own. Everything is licensed.

        • By LunaSea 2025-12-2112:162 reply

          Spotify is banking on AI music which is enough to tell you everything you need to know about the company, their C-suite and their opinion on music.

          • By sbarre 2025-12-2114:251 reply

            The bit in the blog post about the amount of music uploaded yearly to Spotify was shocking.

            I'm sure there's lots of unsigned self-published artists uploading their music in there, but so much of that has to be auto-generated and AI-generated slop.

            • By troupo 2025-12-2115:46

              > but so much of that has to be auto-generated and AI-generated slop.

              There is. And most people would not even recognize a lot of AI music without multiple listens and digging through things like "is there any online presence (which can also be easily spoofed)".

              I've fallen into the trap myself with some (pretty generic) blues music

          • By troupo 2025-12-2115:401 reply

            > Spotify is banking on AI music

            Are they?

            • By LunaSea 2025-12-2121:401 reply

              Yes, they actively promote playlists with AI music to corner the "chill work" music without having to pay anything to musicians.

              • By troupo 2025-12-227:45

                Ah yes. The reason is because all the money is in the chill music. And not in the fact that the most formulaic genres get flooded with easily generated music ("chill work" music especially doesn't even need sophisticated AI, a random number generator would work).

                And that's before we ask the question of how to identify AI-generated music (no one asks that question, but everyone wants it removed).

        • By nimih 2025-12-2023:411 reply

          But, Netflix did lose their content by choice! Way back in the 00s, you could pay Netflix something like $5 a month, and they would mail you physical DVDs of almost any movies you could ever want to watch. In fact, my recollection is that the physical library was generally much more extensive than the streaming library, at least through the early ‘10s.

          Sure, they had the rug yanked out from under them with digital streaming, but they very deliberately put themselves into that position when they pivoted to streaming in the first place.

          • By troupo 2025-12-218:23

            > In fact, my recollection is that the physical library was generally much more extensive than the streaming library, at least through the early ‘10s.

            Because streaming licences are different from DVD licences for example. Hell, even 4k streaming licenses and lossless audio streaming licenses are different (and significantly more costly) than streaming 1080p and compressed audio.

            > put themselves into that position when they pivoted to streaming in the first place.

            As we all know physical DVD businesses are thriving

        • By nsteel 2025-12-210:021 reply

          I thought they started producing their own podcasts. Can't bring in much though.

          • By troupo 2025-12-218:252 reply

            260+ million songs they don't own vs a dozen or so podcasts

            • By kasabali 2025-12-218:49

              They also have fake artists they put on playlists :P

            • By nsteel 2025-12-218:44

              Yes, but it's still the required correction to your claim. I actually don't know how many podcasts are using their publishing platform. I imagine it's considerably more than a dozen.

              They want to own something but it's always going to be a drop in the ocean. They have a small new music label thing called RADAR but I imagine the failure rate on that is very high. They need to buy a label if they want to meaningfully change this. Just like Amazon now owns MGM and Netflix maybe getting Warner Bros. Presumably they can't afford to do this, and I don't think that integration would work as well in the music industry.

        • By zouhair 2025-12-2323:10

          Dude, as of now Netflix cancelled 263 shows[0], I have no idea where this idea of not having a choice is coming from.

          [0]: https://simkl.com/5743957/list/59981/cancelled-tv-shows-netf...

      • By raw_anon_1111 2025-12-210:551 reply

        There was never a time that Netflix had the majority of popular movies on their streaming service.

        • By kodt 2025-12-213:52

          For their mail service they did

    • By basisword 2025-12-2019:561 reply

      >> But this does seem like it will be a godsend for researchers working on things like music classification and generation. The only thing is, you can't really publicly admit exactly what dataset you trained/tested on...?

      Didn't Meta already publicly admit they trained their current models on pirated content? They're too big to fail. I look forward to my music Slop.

      • By VanTheBrand 2025-12-2020:511 reply

        They are too big to fail but they aren’t too big to have to pay out a huge settlement. Facebook's annual revenue is about twice that of the entire global recording industry. The strategy these companies took was probably correct but that calculation included the high risk of ultimately having to pay out down the line. Don’t mistake their current resistance to paying for an internal belief they never will have to.

        • By palata 2025-12-2023:201 reply

          > They are too big to fail but they aren’t too big to have to pay out a huge settlement. Facebook [...]

          I think it's pretty clear from history that they are too big to have to pay out a huge settlement.

          First, they never had to. There was never a "huge" settlement, nothing that actually did hurt.

          Second, the US don't do any kind of antitrust, and if a government outside the US tries to fine a US TooBigTech, the US will bully that government (or group of governments) until they give up.

          • By codersfocus 2025-12-211:221 reply

            Anthropic had to pay $1.5 billion recently so you're incorrect. I'm sure more of such cases will come up against big tech too.

            • By palata 2025-12-2110:40

              It's obviously more profitable to pay the fine than to not do the illegal thing in the first place, so I am correct.

    • By hugholousk 2025-12-276:31

      This makes me think that after the crack, they probably had to come up with a formula that can statistically calculate how fast they could download Spotify songs without letting Spotify realize that they're scraping the company's data and block their access. Reminds me of Alan Turing's formula after cracking the Enigma.

    • By Forgeties79 2025-12-211:18

      Just cite facebook getting busted training its AI on torrents proven to contain unlicensed material lol

    • By stefan_ 2025-12-2021:441 reply

      DRM aside, Spotify clearly should have logic that throttles your account based on requests (only so many minutes in a day..), making it entirely impractical to download the entirety of it unless you have millions of accounts.
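The throttle suggested above ("only so many minutes in a day") could look something like this per-account daily playback budget. It is a minimal sketch of the idea only; `DailyPlaybackBudget` and its parameters are hypothetical and say nothing about how Spotify actually limits accounts.

```python
from collections import defaultdict

SECONDS_PER_DAY = 24 * 60 * 60


class DailyPlaybackBudget:
    """Hypothetical throttle: an account may fetch at most `budget_seconds`
    of audio per calendar day (by default, no more than a day holds)."""

    def __init__(self, budget_seconds: float = SECONDS_PER_DAY):
        self.budget_seconds = budget_seconds
        # (account_id, day) -> seconds of audio already served
        self._used = defaultdict(float)

    def allow(self, account_id: str, day: str, track_seconds: float) -> bool:
        key = (account_id, day)
        if self._used[key] + track_seconds > self.budget_seconds:
            return False  # over budget: refuse (or delay) the request
        self._used[key] += track_seconds
        return True


# Example with a deliberately tiny 10-minute budget.
limiter = DailyPlaybackBudget(budget_seconds=600)
first = limiter.allow("acct-1", "2025-12-20", 300)     # within budget
second = limiter.allow("acct-1", "2025-12-20", 300)    # exactly fills budget
third = limiter.allow("acct-1", "2025-12-20", 1)       # rejected
next_day = limiter.allow("acct-1", "2025-12-21", 300)  # budget resets per day
```

A cap like this bounds what one account can pull, so total throughput scales only with the number of accounts.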

      • By reactordev 2025-12-2021:551 reply

        >unless you have millions of accounts.

        Challenge accepted…

        This is probably how they did it: over time, they used a few thousand accounts, queued up all the things, and downloaded everything over the course of a year.

        • By Retr0id 2025-12-2022:16

          Notably 160kbit is the free-tier bitrate, so they presumably used unpaid accounts.
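For a rough sense of scale, here is how many accounts a year-long scrape would need if every account were capped at real-time playback, 24 hours a day. The inputs are simplifying assumptions: the post's ~300 TB total is treated as if it were all 160 kbit/s audio, and the real-time cap is the hypothetical throttle discussed above, not a known Spotify limit.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def accounts_needed(total_bytes: float, bitrate_bps: float,
                    days: float, speedup: float = 1.0) -> int:
    """Accounts required to fetch `total_bytes` of audio in `days`, when each
    account can fetch at `speedup` times real-time playback speed."""
    audio_seconds = total_bytes * 8 / bitrate_bps
    seconds_per_account = days * SECONDS_PER_DAY * speedup
    return math.ceil(audio_seconds / seconds_per_account)


realtime = accounts_needed(300e12, 160_000, days=365)
faster = accounts_needed(300e12, 160_000, days=365, speedup=10)
print(realtime, faster)  # 476 48
```

So even under a strict real-time cap, a few hundred accounts running for a year would be enough under these assumptions, far short of "millions".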

    • By troupo 2025-12-2022:58

      Just like with anything digital you (and Spotify) are fully at the mercy of the rights holders. When (not if) they pull their stuff, or replace their stuff, or change their stuff, you can never get the original back unless you preserve it.

      Largest example: a lot of Russian music is not available on Spotify because of the Russia-Ukraine war, and Spotify pulling out of Russia. So they don't have the licenses to a lot of stuff because that belongs to companies operating within Russia.

    • By larodi 2025-12-229:07

      This, indeed, mostly has implications for ML, training, etc. Otherwise the whole catalog is available to partners, but costs a lot. So Anna did indeed liberate the content, but I'm definitely not switching off my Spotify subscription, even though, for my personal taste, neither the quality nor the UI matches Apple Music. It is still useful to have someone serve the content for you.

    • By firefax 2025-12-2023:174 reply

      >I definitely was not aware Spotify DRM had been cracked to enable downloading at scale like this.

      What's stopping someone from sticking a microphone next to their speaker?

      Slow, but effective.

      • By michaelmior 2025-12-2023:21

        > Slow, but effective.

        I wouldn't call this very effective. It would take an impractically long amount of time to capture a meaningful fraction of the collection and quality would suffer greatly.

      • By coppsilgold 2025-12-212:363 reply

        Even if you plug the audio output into the input you would still be taking a quality loss by passing the audio through a DAC and then an ADC. Maybe if the quality of your hardware is good enough it wouldn't matter, but then you would be limited to only ripping 24 hours of audio per day...
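
        A rough sketch of what the 24-hours-per-day limit implies for real-time capture. The 86 million file count comes from the article; the 3.5 minute average track length is an assumption made purely for illustration.

```python
def capture_years(tracks: int, avg_minutes: float) -> float:
    """Years needed to record every track in real time on a single stream."""
    total_hours = tracks * avg_minutes / 60
    days = total_hours / 24  # at most 24 hours of audio per day
    return days / 365

# 86 million files (from the article), assumed 3.5 minutes each
years = capture_years(86_000_000, 3.5)
print(f"{years:.1f} years")  # on the order of 570 years
```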

        • By Clamchop 2025-12-2215:10

          You don't have to pass it through a DAC. There's no equivalent of HDCP for protecting digital audio end to end. Crudely, you could capture S/PDIF but really, skip that and just output to a virtual audio device for recording. No DAC in the path either way.

          But yes, it is inconvenient and slow.

        • By firefax 2025-12-2113:06

          They recently started offering lossless, could you get down to the equivalent of 320kbps?

          I grew up on sites like Suprnova, and quickly found I could not discern the difference between 320 mp3s and lossless.

          Even now, I only seem to notice if I use a very high end pair of headphones, and mostly with electronic music that has a lot of soft parts with sounds that are in the low or high end of the spectrum.

        • By yungwarlock 2025-12-2110:31

          Bro. Who cares. I've got a bunch of songs like this. The loss makes it more nostalgic

      • By layman51 2025-12-2023:451 reply

        Audio fingerprinting?

        • By firefax 2025-12-210:07

          >Audio fingerprinting?

          Bought a spotify card with cash, email was registered on public wifi.

          Who cares? :-)

      • By dbalatero 2025-12-212:31

        They'd probably do a shit job of capturing it?

    • By thaumasiotes 2025-12-2021:085 reply

      > I definitely was not aware Spotify DRM had been cracked to enable downloading at scale like this.

      Do they have DRM at all? Youtube and Pandora don't.

      • By Retr0id 2025-12-2022:18

        Spotify has DRM, and you can find open-source reimplementations of it on github.

        Their native clients use a weak hand-rolled DRM scheme (which is where the ogg vorbis files come from), whereas the web player uses Widevine with AAC.

      • By ale42 2025-12-2021:15

        Yes they do use DRM. I know they are using Widevine on the web player, but possibly other ones too (never looked very far). Not sure for the app, it might be that it is using OGG streams with a custom DRM (which is probably the one some existing downloaders actually (ab)use).

      • By nsteel 2025-12-2022:33

        It's called playplay. It's used for protecting their new lossless files. But the first rule of playplay is you can't talk about playplay. https://torrentfreak.com/spotify-dismantles-spotifydl-track-...

      • By Mindwipe 2025-12-2021:201 reply

        YouTube Music uses Widevine.

        • By thaumasiotes 2025-12-2021:251 reply

          If it's on YouTube Music, it's also on... YouTube.

          • By charcircuit 2025-12-2021:451 reply

            Not necessarily at the same quality though.

            • By thaumasiotes 2025-12-2022:071 reply

              I assume in most cases they're literally the same files. Youtube runs "topic" channels for music that distributors have sent it.

              https://www.youtube.com/channel/UCYOa-hi751OKY2zGJJv6V2A

              https://www.youtube.com/watch?v=MSSxnv1_J2g (same thing, but on an official channel instead)

              • By charcircuit 2025-12-2022:521 reply

                You can load any youtube music song on youtube by just removing the "music" subdomain.

                • By thaumasiotes 2025-12-2023:092 reply

                  Then why do you say they might not be the same files?

                  • By charcircuit 2025-12-210:24

                    Let me start over. Youtube itself has DRM required for certain videos, and certain formats of videos.

                    The 256 kbps format for music will be protected by DRM. If you do not have DRM available youtube will fall back to a lower quality format to play the audio.

                  • By sgtlaggy 2025-12-2023:58

                    Music might have higher quality audio-only files as provided where Youtube might have it combined with video and a generic compression algorithm applied as with all other uploaded videos.

    • By cm2012 2025-12-2022:005 reply

      This leak will also be really useful to bad actors who will resell the music from this list without paying royalties to the artists.

      • By lkramer 2025-12-2022:092 reply

        Which is how Spotify started... And is still carrying on. So nothing has changed.

        • By troupo 2025-12-2023:212 reply

          Spotify pays 70% of revenue to rights holders.

          Why don't you ask them where the money intended for artists is going? You know? The small insignificant companies of Sony, Warner Music, EMI that own the vast majority of music and own all the contracts?

          • By lkramer 2025-12-2312:37

            They have also arbitrarily decided not to pay out if you fall below a certain threshold, which hits smaller artists as well. Of course part of the problem is that the pay out is so low, so if you don't have millions of streams it's not worth it.

          • By injidup 2025-12-216:252 reply

            That is the decision of artists to sign with a mega corp. Any tom dick or harry can create a Spotify account, load their warbling autotuned ditty written by themselves ( or AI ) on any theme, in any genre and wait for fame or fortune to appear or not. You can take your 70% or whatever the exact number is with no middleman if you like.

            Unfortunately the number of people producing music and the quantity of it is much higher than the number of people able to consume it. And culture is simply network effects. You listen to what your friends or family listen to. Thus there are only a small number of artists who make it big in a cultural sense.

            And one of the cheat codes for cracking the cultural barrier is to use a mega corp to advertise for you but of course the devil takes his cut.

            Anyway AI is coming for all these mega corps. If you haven't tried Suno (and many of you have), it's amazing how convincingly it can crack specific genres and churn out quality music. Call it slop if you like but the trajectory is obvious.

            As a consumer you will get your own custom music feed singing songs about YOUR life or desired life and you will share those on your social media account and some of those will go viral, most will die.

            Content creation as a career is probably dead.

            • By saaaaaam 2025-12-219:591 reply

              (a) you can’t directly upload to Spotify. You need an intermediary in the shape of a distributor. Whether that’s a label or a DIY platform like DistroKid.

              (b) Spotify introduced a threshold of 1000 streams before they pay anything. This disincentivises low quality warbling autotuned ditties as they are unlikely to pass that threshold. (It’s more nuanced - you don’t just need 1000 streams from a handful of accounts as that could easily be gamed.)

              (c) Suno and Udio have been forced into licensing deals with the major record companies. The real threat will be when we see an open sourced Qwen or DeepSeek style genAI for music creation.

              • By woctordho 2025-12-228:441 reply

                There is a pretty interesting open source music AI named ACE-Step. Currently its quality is at about the Stable Diffusion 1.0 level, and they'll release a new version soon (hopefully in January).

                • By saaaaaam 2025-12-2321:53

                  That’s very interesting, thank you! Do you have any info on how it compares to Suno/Udio etc? I don’t know if you saw the news about Anna’s Archive having effectively scraped the majority of the Spotify library. It will be very interesting to see how this impacts on the next generation of generative models for music. Any thoughts there?

            • By troupo 2025-12-218:24

              > Any tom dick or harry can create a Spotify account, load their warbling autotuned ditty written by themselves ( or AI ) on any theme, in any genre and wait for fame or fortune to appear or not

              No, you literally can't.

        • By dehrmann 2025-12-2023:311 reply

          I think they built the demo with pirated music, but it was licensed by the time customers started paying for it.

          • By ninjin 2025-12-214:461 reply

            Correct, the pirated music library was before they exited the closed Alpha.

            • By cess11 2025-12-2112:402 reply

              No, that's what they ran on when the general public could join on a referral basis. They called that "beta".

              The technology was already proven, i.e. The Pirate Bay and other torrent networks had already been a success for years. What Spotify likely aimed to show was that they could grow very fast and that their growth was too good to just shut down, like the entertainment industry tried to do with TPB.

              After they took in the entertainment oligarchs they cut out the warez and substituted with licensed material.

              • By ninjin 2025-12-2112:562 reply

                Not sure if it was called "beta" or "alpha" and "closed" is of course up to interpretation, but it was indeed by invitation. Swedish law at the time (still?) had a clause about permitting sharing copyrighted material within a limited circle, which I know Spotify engineers referred to as somewhat legitimising it. I also know for a fact that once the invite-only stage ended there was a major purge of content and I lost about half of my playlist content, which was the end of me having music "in the cloud". Still, this is nearly twenty years ago, so my memory could be foggy.

                • By grvbck 2025-12-2116:51

                  When I first started using Spotify, a lot of the tracks in my playlists had titles like "Pearl Jam - Even Flow_128_mp3_encoded_by_SHiLlaZZ".

                  Always made me chuckle, it looked like they had copied half of their catalogue from the pirate bay. It took them a few years to clean that up.

                • By cess11 2025-12-2122:16

                  Yes, when the entertainment industry came onboard they immediately made the service much worse. I reacted the same way you did.

                  IIRC, 2008, a little less than twenty years.

              • By dehrmann 2025-12-226:201 reply

                > The technology was already proven, i.e. The Pirate Bay and other torrent networks had already been a success for years.

                Spotify showed that you could have a local-like experience with something backed by the cloud. BitTorrent had never really done that. The client wasn't that good, and you couldn't double click and hear a song in two seconds.

                The way you said that made me think you might be remembering when it was partially P2P, but I don't remember the timeline, it was only used to save bandwidth costs, and they eventually dropped it because network operators didn't like it and CDNs became a thing.

                • By cess11 2025-12-2210:49

                  If you don't remember, why speculate?

                  Ek had been the CEO of µTorrent and they hired a person who had done research on Torrent technology at KTH RIT to help with the implementation. It was a proven technology that required relatively small adaptations.

                  They moved away from this architecture after the entertainment industry got involved. Sure, it was a cost issue until this point, but it also turned into a telemetry issue afterwards.

      • By cedws 2025-12-211:122 reply

        I just started DJing and something I quickly noticed is how garbage Spotify's music sounds compared to FLACs I've purchased. The max bitrate is very low.

        • By tandr 2025-12-213:131 reply

          Spotify just (last week or 2 weeks ago) introduced lossless compression (FLAC) and it sounds amazing.

          • By cedws 2025-12-2113:40

            Wow didn't know about that, thanks.

        • By ThatMedicIsASpy 2025-12-211:36

          tidal is a thing and can be scraped the same way. I wonder how big that collection would be as it can go from 50mb to 300mb for 3min

      • By hermanzegerman 2025-12-2022:522 reply

        Spotify fucks over most artists anyway, so who cares?

        • By raw_anon_1111 2025-12-210:532 reply

          Spotify pays the rightsholders. What are they supposed to do about the shitty contracts that the artists signs with the labels?

          • By Aldipower 2025-12-2113:432 reply

            I am providing my own music on Spotify via a distributor I paid 50 euros once. What do I get from Spotify? Basically nothing! It is not the rights holders, as I am the rights holder! Spotify is a scam for artists.

            • By Nemo_bis 2025-12-2218:04

              Copyright is a scam for artists.

          • By hermanzegerman 2025-12-2110:20

            They don't pay any artist who has fewer than 1000 streams per song per year.

            They also deliberately choose a model which favours big artists, where they split the compensation just by the plays instead of User Centric Payments.

            Either way I don't feel bad about the Labels or Spotify.

            If I want to support an artist I buy their music, go to a concert or buy merch.

            I've had a Spotify Subscription, but that got cancelled as I didn't agree to the recent Price Hike, as I wasn't interested in paying for AudioBooks I don't care about.

            Now I'm rolling with YouTubeMusic and I am looking for a less shitty alternative

        • By chrneu 2025-12-2023:42

          yeah it's wild to me how folks will defend the current status quo when it's clearly broken.

          people defend convenience way too much. spotify isn't good for us and spotify-like-streaming is destroying the music industry.

      • By chrneu 2025-12-2022:335 reply

        this argument is so tired.

        most artists dont really care about streaming or selling their music. most of their real money comes from touring, merch, and people somehow interacting with them.

        most musicians just want to make music, express themselves, and connect with folks who enjoy their stuff or want to make music with em.

        Even some of the largest artists in the world only receive a few grand a year from streaming. Only the top 1% or so of artists get enough streams to even come close to living off it. It isn't that big of a deal. Music piracy isn't the theft people think it is, lars.

        youtube is kind of the same way. the real money comes from sponsorships which come from engagement. nobody on youtube is upset that their video got stolen because that mentality was never sold to us to justify screwing us over. musicians, however, were used as pawns so music labels could get more money.

        now folks will say stuff like "this is theft" which is just a roundabout way of supporting labels who steal from the artists. so, it's just a weird gaslighting. there's a reason folks turned on metallica over the napster stuff. metallica were being used to further the interests of labels over the interests of fans. and now you're doing the same thing :) It's a script we hear over and over again yet people keep falling for it.

        • By nospice 2025-12-2022:453 reply

          > most artists dont really care about streaming or selling their music. most of their real money comes from touring, merch, and people somehow interacting with them.

          I think you have it the wrong way round. I'm sure that musicians would love to make money from album / song sales. It's just that between piracy and companies like Spotify, artists make pennies on these activities, so their only choice is to make money on more labor-intensive stuff where they retain more control.

          Note that Spotify, somehow, finds it profitable to be in the streaming business.

          • By anjel 2025-12-211:521 reply

            I think it was Les Claypool (of the band Primus) who said on some podcast that recording a studio album with its attendant very non-trivial costs is really just creating a very expensive business card to hand out to prospective clients.

            • By fragmede 2025-12-215:42

              Back then, that is. It probably cost $250k in 1990 for them to record Frizzle Fry in a studio, handwave $500k in 2025 dollars. But Bandcamp on MacBook and some gear from GuitarStudio, round to $15k and your time. Neither of which is trivial or cheap, but it's not 1990 no more.

          • By chrneu 2025-12-2023:32

            > I'm sure that musicians would love to make money from album / song sales.

            i think we're actually in agreement. I just don't see streaming as a "must". A lot of musicians I work with and follow also don't see streaming as a must. It's a necessary evil in today's convenience fixated life/culture.

            Most musicians I ask about this absolutely fucking hate streaming and don't view it as a real revenue stream.

            That's why nearly all merch tables still have CDs, bandcamp links or records for purchase. Artists make more money off a t-shirt sale than they do from 50,000 streams.

            I think you slightly misinterpreted what I meant by "selling their music". Or I might have said it poorly.

            also, piracy does not mean less money for small artists. evidence suggests the opposite, i think. I think piracy marginally harms record sales for the top 1% of artists while benefiting basically all other artists.

            piracy = free exposure. more exposure means more ticket sales, more merch sales, etc. most musicians i know just want people to hear their stuff. piracy enables that for the majority of folks who can't afford to buy every album. i think artists care more about their art being used in commercial stuff without permission/payment, not everyday people checking their shit out.

          • By woctordho 2025-12-228:47

            It takes time and effort to receive money, especially from consumers worldwide. Most hobbyists are not going to deal with all the complexity in it.

        • By cm2012 2025-12-2022:544 reply

          Spotify paid out ten billion dollars to artists in 2024. This is not small potatoes - total 2024 music industry merchandise sales was around $14b.

          Youtube also paid out literally 50x more to creators in 2024 than Patreon had total subscriptions on the platform.

          These big platform payouts matter a lot.

          • By cwillu 2025-12-2023:202 reply

            > This is not small potatoes

            Unless you're a small potato. Approximately 0% of what I pay for spotify goes to the artists I actually listen to. Fucking Taylor Swift and the Beatles estate don't need my money.

            • By jMyles 2025-12-210:591 reply

              As a reasonably known but not super popular bluegrass artist, I agree: please steal my music instead of paying Spotify for it.

              • By NoGravitas 2025-12-2122:35

                Hell, Weird Al himself only made $12 from Spotify views in 2023.

            • By lkramer 2025-12-2214:55

              And Joe Rogan...

          • By vintermann 2025-12-217:25

            To rights owners, not to artists. It's not a trivial difference. Ask Taylor Swift.

          • By cj 2025-12-2023:003 reply

            Some quick Googling shows 1 million streams pays approx $2000.

            You'd need 40,000,000 streams to earn $80,000.
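
            The arithmetic above as a minimal sketch, assuming a flat payout of $2000 per million streams. That rate is the rough figure quoted in this comment, not an official one, and the function name is made up for illustration.

```python
def streams_needed(target_income: float, payout_per_million: float) -> float:
    """Streams required to earn target_income at a flat per-million payout."""
    return target_income / payout_per_million * 1_000_000

needed = streams_needed(80_000, 2_000)
print(f"{needed:,.0f} streams")  # 40,000,000 streams
```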

            • By chrneu 2025-12-2023:384 reply

              be aware that payout rates change based on tiers and a bunch of other factors. So, it would likely take more than 40 million streams to earn $80k.

              I believe Weird Al posted his streaming revenue a few years ago. He had something like 80 million streams and said he earned about $12. https://www.billboard.com/music/pop/weird-al-yankovic-wrappe...

              There is a reason people like T Swift and whatnot tour constantly, it's how they make money. Weird Al is known for his amazing live shows, there's a reason for it: they make more money.

              • By vintermann 2025-12-217:311 reply

                Ad supported streams in Spotify are counted in a separate pool, and only get paid out of the ad revenue pool.

                Artists can of course complain that "they're selling our music for cheap!", especially in the ad pool. But what's worth remembering is that when it comes to setting optimal price points, Spotify's interest is almost perfectly aligned with the artists. And Spotify has a hell of a lot more data than artists (not to mention financial sense, which you probably didn't become an artist if you had a lot of).

                • By Dylan16807 2025-12-2116:531 reply

                  > Ad supported streams in Spotify are counted in a separate pool, and only get paid out of the ad revenue pool.

                  What are the rough rates for each pool? That's the important part here. And how many artists are far enough from the average ratio that the detail of two pools matters.

                  https://soundcamps.com/spotify-royalties-calculator/ This site says $0.00238 is typical for "worldwide" and a lot more than that for US and Europe specifically.

                  • By vintermann 2025-12-2118:241 reply

                    I'd be interested in knowing that too, as far as I know Spotify doesn't publish details to the public at least.

                    But I have no trouble believing some artists will be vastly overrepresented in the ad financed pool. Also, there are separate pools by country, and countries have different subscription prices - being big in Japan will be more profitable than being big in India.

                    Payout per stream is a terrible metric. It's almost like if you ranked grocery stores by payment per gram.

                    • By Dylan16807 2025-12-2118:551 reply

                      > Payout per stream is a terrible metric. It's almost like if you ranked grocery stores by payment per gram.

                      CDs are usually similar prices. Per-stream isn't nearly as bad as wildly different products sharing prices.

                      We could debate per stream versus per minute but I don't know if that's a particularly big effect. It causes some annoyance but it's mostly compensated for already.

                      Anything that gives different value to different artists is probably going to favor the big ones and just make things worse.

                      • By vintermann 2025-12-2119:291 reply

                        CDs get wildly different number of plays. But the number of plays, whether from a record or from a streaming service, isn't proportional to how glad you are that this music exists and you can listen to it.

                        The present system favors big artist rights owners a lot, but most of all it rewards owners of music played on repeat, i.e. background music.

                        • By Dylan16807 2025-12-2121:03

                          I do think allocating money per-account or something should be better. Don't let a constant listener allocate the royalties from ten other people.

                          Trying to measure importance feels like a lost cause.
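
                          A toy comparison of the two allocation schemes discussed here: one shared pool split by total plays versus each account's fee split only among the artists that account played. This is a simplified sketch, not Spotify's actual formula; it ignores the platform's cut, separate ad and country pools, and payout thresholds, and every name in it is hypothetical.

```python
from collections import defaultdict

def pro_rata(plays: dict, fee: float) -> dict:
    """Pool all fees, then split the pool by each artist's share of total plays."""
    pool = fee * len(plays)
    totals = defaultdict(int)
    for user_plays in plays.values():
        for artist, count in user_plays.items():
            totals[artist] += count
    all_plays = sum(totals.values())
    if all_plays == 0:
        return {}
    return {artist: pool * count / all_plays for artist, count in totals.items()}

def user_centric(plays: dict, fee: float) -> dict:
    """Split each account's own fee among the artists that account played."""
    payouts = defaultdict(float)
    for user_plays in plays.values():
        user_total = sum(user_plays.values())
        if user_total == 0:
            continue  # an idle account allocates nothing in this sketch
        for artist, count in user_plays.items():
            payouts[artist] += fee * count / user_total
    return dict(payouts)

# One constant listener loops artist A; ten light listeners each play artist B.
plays = {"heavy": {"A": 1000}}
plays.update({f"light{i}": {"B": 10} for i in range(10)})

print(pro_rata(plays, fee=10))      # {'A': 100.0, 'B': 10.0}
print(user_centric(plays, fee=10))  # {'A': 10.0, 'B': 100.0}
```

                          In this toy case the single heavy listener directs most of the pool under the shared-pool split, while the per-account split caps that listener's influence at one subscription fee.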

              • By xorcist 2025-12-2214:42

                > people like T Swift and whatnot tour constantly

                Maybe not the best comparison to anything. Swift is known for being an even better businesswoman than artist and obsessed with having control.

                She screwed her record company for profits, not the other way around. Not many people have done that. She's likely making money on both ends of the stick.

              • By a022311 2025-12-2111:07

                The Pudding had a nice article explaining how streaming revenue is distributed: https://pudding.cool/2022/06/streaming/

              • By Dylan16807 2025-12-213:29

                When he says "so if I'm doing the math right that means I earned $12" I interpret that as him exaggerating for effect. It's definitely not him citing the pay slip.

                "$2 or more per thousand streams, split across rightsholders" seems like an accurate estimate.

            • By cm2012 2025-12-210:321 reply

              That seems reasonable?

              Assume an artist (either directly or through a rights holder) makes 1/3 income from streaming, 1/3 from merch and physical albums, and 1/3 from live events.

              40m streams per year would be 800k per week. 200k fans worldwide playing 4 times per week on average could get you there. That's like a decent sized but not enormous youtube channel.

              200k fans worldwide would also support the ticket sales and merchandise sales aspects.
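
              A quick check of those numbers, assuming a 52-week year (the helper names are illustrative only). The weekly figure comes out slightly under the rounded 800k, and 200k fans at 4 plays a week slightly over 40m a year.

```python
WEEKS_PER_YEAR = 52

def weekly_streams(annual_streams: float) -> float:
    """Average streams per week needed to reach an annual total."""
    return annual_streams / WEEKS_PER_YEAR

def annual_streams_from_fans(fans: int, plays_per_week: int) -> int:
    """Annual streams generated by a fan base with a steady weekly play count."""
    return fans * plays_per_week * WEEKS_PER_YEAR

print(f"{weekly_streams(40_000_000):,.0f}")          # 769,231
print(f"{annual_streams_from_fans(200_000, 4):,}")   # 41,600,000
```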

              • By tayo42 2025-12-213:021 reply

                You only need 5000 fans to buy your CD/album/w.e at $15 to make 80k

                • By cm2012 2025-12-213:241 reply

                  Per year, which is a big lift compared to them pressing play on Spotify

                  • By tayo42 2025-12-2113:00

                    Yeah but you need a quarter million people every week according to that guy. That will drop off over time.

            • By edelhans 2025-12-2115:27

              But you only need to record your song once and get money forever. Nobody pays me per function invocation in production, that would be very nice

          • By chrneu 2025-12-2023:241 reply

            99% of that 10 billion went to a handful of artists. Actually, I'd wager nearly half of it went to labels and other middlemen, but that's beside the point. The vast majority of money in the music industry never trickles down, ever.

            edit: I looked it up, 70% of spotify's payouts go directly to labels, not artists. So...that $10 bil is nothing.

            This is by design and it's the same broken system that metallica defended in the 90s/00s because it benefits large artists while fucking over the other 99%.

            We keep repeating the same script using the same busted short term logic.

            • By Dylan16807 2025-12-213:31

              Labels suck but when we're considering the merits of Spotify it's not their fault and artists can put music on the service without an abusive label.

        • By NoGravitas 2025-12-2122:331 reply

          Weird Al pointed out in 2023 that his 80 million Spotify views that year netted him $12 - enough for a nice sandwich.

          • By unparagoned 2025-12-2515:33

            He admitted that he exaggerated for comic effect

        • By basisword 2025-12-211:23

          Ah so you're only stealing a bit of money from the artists. That's ok then.

        • By earthnail 2025-12-2023:012 reply

          Touring makes almost no money. Only concerts with >1000ppl make money. Below that you can assume not even the sound engineer gets paid.

          • By chrneu 2025-12-2023:24

            Not true at all. I support small artists and it's the only way they make money. Ticket sales and merch make up the vast majority of artist revenue for artists who arent in the top 1%. Most musicians don't make money if they aren't touring or selling merch somehow.

            there's also the invaluable aspect of networking that touring allows. bit of a tangent, but it's very important for musicians to network.

            The exception are musicians who do production stuff. Think movie/tv scores, commercials, etc. I actually know a handful of artists who used to tour quite a lot but eventually settled down to do production stuff. So they transitioned from touring to make money to production. Touring all year with no healthcare catches up to people.

          • By ChrisMarshallNY 2025-12-2023:16

            I know a number of musicians that tour nightclubs, small venues, and festivals.

            They make a living; not a luxurious one, but they do OK. They just enjoy making music, and feel that it's worth it. Many of them never even record their music.

    • By londons_explore 2025-12-2023:54

      > Spotify itself is so convenient, and trying to locate individual tracks in massive torrent files of presumably 10,000's of tracks each sounds horrible.

      Download the lot to a big NAS and get Claude to write a little frontend with song search and auto playlist recommendations?

    • By ccppurcell 2025-12-2221:00

      >The only thing is, you can't really publicly admit exactly what dataset you trained/tested on...?

      Curious why not? Assuming you only used the metadata. I think they would be considered raw facts and not copyrightable.

    • By madduci 2025-12-217:501 reply

      The first users of this dataset will be Big Tech corps. Meta, Alphabet, OpenAI, Microsoft, Apple will all be happy to use this dataset for training their LLMs.

      For them, 300TB is just cheap

      • By ipsum2 2025-12-219:16

        They already have this data. See jukebox from OpenAI, released before chatgpt.

    • By 1dry 2025-12-2023:074 reply

      Thank god we are taking care of the “researchers working on things like music classification and generation” ! As long as we can convince ourselves we have a sound analysis of it, no need to support and defend people making actual art right. So much already made, who needs more?

      This is not to defend Spotify (death to it), but to state that opening all of this data for even MORE garbage generation is a step in the wrong direction. The right direction would be to heavily legislate around / regulate companies like Spotify to more fairly compensate the musicians who create the works they train their slop generators with.

      • By nimih 2025-12-2023:301 reply

        What, precisely, is the point you’re trying to make here?

        • By 1dry 2025-12-210:052 reply

          Expressing frustration at the pervasive tendency of technologists to look at everything, including art which is a reflection of peoples' subjective realities, with an "at-scale" lens, e.g., "let's collect ALL of it, and categorize it, and develop technologies to mash it all together and vomit out derivative averages with no compelling humanist point of view"

          I hope readers will feel our frustration.

          • By nimih 2025-12-213:19

            Well, that seems like a pretty reasonable thing to be pissed off about, thanks for taking the time to elaborate.

            I think the overlap between the bureaucratic technologies developed by people who, by all accounts, are genuine lovers of the subjectivity and messiness of music qua human artistic production (e.g. the algorithmic music recommendation engines of the '00s and early '10s; public databases like discogs and musicbrainz; perhaps even the expansive libraries and curated collections in piracy networks like what.cd), and the people who mainly seem interested in extracting as much profit as possible from the vast portfolios of artistic output they have access to (e.g. all of Spotify's current business practices, pretty much), should probably prompt some serious introspection among any technologists who see themselves in that first category.

            I read an essay a number of years back, which raised the point that, if you're an academic or researcher working on computer vision, no matter how pure your motives or tall your ivory tower, what do you expect that research to be used for, if not surveillance systems run by the most evil people imaginable. And, thus, shouldn't you share some of that moral culpability? I think about that essay a lot these days, especially in relation to topics like this.

          • By flir 2025-12-2110:01

            I'm reminded of the Zero One Infinity rule (https://en.wikipedia.org/wiki/Zero_one_infinity_rule)

            We're very much trained to solve the most general case of any problem, for sensible reasons.

            I first learned about this formulation of the rule from a case study in Alan Cooper's The Inmates Are Running the Asylum, where breaking the rule resulted in a much better user experience.

      • By kachnuv_ocasek 2025-12-2023:381 reply

        How does Spotify defend people who actually make art? There's virtually no difference between pirating and streaming through Spotify for the vast majority of artists.

        • By Griffinsauce 2025-12-2112:26

          Personally as an artist I'd rather give it to people directly for free but I'll meet the audience where they are. The "compensation" does not factor into it at all.

          Interestingly, I'm seeing more and more small bands stepping off of Spotify, mainly because of AI clones and botted stream scams. Apparently they've decided losing that reach is acceptable. (anecdotal ofc. but even on local scale it's an interesting choice)

      • By 1dry 2025-12-2023:54

        updated - thank you commenters for making it clear that my sentiment was not clear

      • By fao_ 2025-12-2023:24

        Spotify doesn't take care of artists, if you knew any artists you'd understand that Spotify is atrocious for people who make music.

    • By robtherobber 2025-12-2217:07

      I believe that we need to distinguish between convenience and preservation here. It is indeed convenient for consumers to use Spotify now whilst it exists and operates the way it does. They could go under, they could change their business model, they could decide to purge everything that is not easily justifiable commercially.

      As a society, we should do our best to preserve this trove.

    • By hkt 2025-12-2023:53

      I'd be stunned if we didn't find out Anna's Archive is a front for a handful of shadier VCs who are into AI. Even if AA themselves don't know it and just take the cash.

    • By shevy-java 2025-12-210:20

      > The thing is, this doesn't even seem particularly useful for average consumers/listeners

      Yeah. To me it is not really relevant. I actually was not using spotify and if I need to have songs I use yt-dlp for youtube but even that is becoming increasingly rare. Today's music just doesn't interest me as much and I have the songs I listen to regularly. I do, however, also listen to music on youtube in the background; in fact, that is now my primary use case for youtube, even surpassing watching movies or anything else. (I do use youtube for getting some news too though; it is so sad that Google controls this.)

  • By Etheryte 2025-12-2020:0811 reply

    To put this into perspective, What.CD [0] was widely considered to be the music library of Alexandria, unparalleled in both its high quality standard and its depth. What had in the ballpark of a few million torrents when it got raided and shut down. Anna's rip of Spotify includes roughly 186 million unique records. Granted, the tail end is a mixed bag of bot music and whatnot, but the scale is staggering.

    [0] https://en.wikipedia.org/wiki/What.CD

    • By flxy 2025-12-2021:106 reply

      I think what earned what.cd that title wasn't necessarily just the amount but the quality, as you mentioned, as well as the obscurity of a lot of the offered material. I remember finding an early EP of an unknown local band on there, and I live in the middle of nowhere in Europe. There were also quite a few really old and niche records on there which possibly couldn't be put on streaming services due to the ownership of rights being unknown. It was the equivalent of vinyl crate digging without physical restrictions.

      Additionally there was a lot of discourse about music and a lot of curated discovery mechanisms I sorely miss to this day. An algorithm is no replacement for the amount of time and care people put into the web of similar artists, playlists of recommendations and reviews. Despite it being piracy, music consumption through it felt more purposeful. It's introduced me to some of my all time favourite artists, which I've seen live and own records and merchandise of.

      • By sbarre 2025-12-2114:35

        > I remember finding an early EP of an unknown local band on there

        So there was a clever trick that smaller artists did on what.cd: put up a really generous upload credit bounty for your own music, in order to sell digital copies.

        I knew a few bands in Toronto who did this as a way to make sales.

        They'd put up a big bounty right after setting up a webpage offering the album for sale via Paypal, then spend a few days collecting orders (and they would get a lot of them - hundreds sometimes - because What.cd had a lot of users looking for ratio credits) and then eventually email a link to the album after a few days.

        No idea what the scale of this trick/scam (call it whatever) was but anecdotally I heard about it enough.

      • By toast0 2025-12-2023:24

        > There were also quite a few really old and niche records on there which possibly couldn't be put on streaming services due to the ownership of rights being unknown.

        Music licensing (in the US at least) is actually pretty nice for this (from the licensee perspective anyway). There are mechanical licenses which allow you to use music for many uses without contracting with the rightsholders and clearinghouses whose job is to determine where to send royalties. So you can use the music and send reporting and royalties to the clearing houses and you're done.

        Of course, you may want to contract with the rightsholders if you don't like the terms of the mechanical license; maybe it costs too much, etc. If you're Spotify or similar and you have specific contracts for most of the music, and have to pay mechanical license rates for the tail, it might make sense to do so in order to boast of a larger catalog.

      • By some-guy 2025-12-2021:412 reply

        I’m still using the “successor” to what.cd and I usually discover artists through random lists, “related artists”, among other things on the platform.

        One interesting way of discovering artists is finding an artist that I already like on a compilation CD, and then seeing what else is on the CD.

        • By david_p 2025-12-2023:023 reply

          Would you share the name of that successor? I miss the old internet and would love to take a look.

          • By Narushia 2025-12-2023:542 reply

            It's Redacted.sh, a.k.a. RED. They have around three million torrents. But like What.CD, Redacted.sh is a private tracker, so you can't just jump in and see the content.

            • By bgbntty2 2025-12-2119:39

              How does it compare to rutracker, especially for electronic music? I've never used what.CD and rutracker seems to have lots of high quality music.

            • By david_p 2025-12-216:591 reply

              Thank you. I’m reading about them, cool project. I’ll try to join.

              • By SamuelAdams 2025-12-2419:13

                It is worth noting that RED is particularly difficult to get a decent ratio on. Spend some time googling reddit posts, there are plenty of examples of people not being able to build solid ratios due to competing with scripted bots.

          • By chrneu 2025-12-2023:43

            Another comment mentioned Redacted.sh as a successor. I haven't used it. I'm sure there's a subreddit around that can help. Looks like orpheus is another option if I'm reading correctly. You have to get an invite or pass an "interview" though, so be prepared to wait a while.

        • By chrneu 2025-12-2021:52

          the compilation album is a great idea. thanks for that. your comments in here have been helpful. have fun listening.

      • By paraknight 2025-12-2221:471 reply

        What.cd were extreme sticklers for quality! When you applied to get in, they did a live interview on IRC to test your knowledge of ripping, transcoding, and different kinds of compression, how torrents and private trackers work, and their code of conduct. I remember studying for it. They also had ways to make sure you weren't cheating like checking your screen, as well as very aggressive automated checks for VPNs and blocklisted IPs to prevent ban evasion and multiple accounts.

        They also had good incentive structures for keeping the bar high -- you could get kicked out for having a bad ratio, so the easiest way to pump your upload up was to fulfil obscure requests for FLACs you could purchase online but were extremely difficult to purchase (if you're lucky it's just an unknown artist on Bandcamp). I discovered a lot of obscure music this way, some that I'm still looking for to this day after it shut down.

        Because I cared so much about being part of that private tracker, this is what also prompted me to rent a seedbox for the first time. I paid in Bitcoin out of paranoia (I lived in Germany where the fines for piracy are HEFTY, and they actually do come after you) back when Bitcoin wasn't really worth that much, and later found that that old wallet suddenly had a couple thousand in it instead of the spare change I couldn't move!

        • By arm32 2025-12-2920:33

          I must have joined at a different time because all that I, a totally annoying script kiddie leech, needed was an invite code (or link? I forget)

      • By girvo 2025-12-2022:02

        Yeah, What.CD had a bunch of the local Brisbane post-rock bands from the 00s on there which was amazing to me. I at least have copies of a lot of their records!

      • By MarcelOlsz 2025-12-211:28

        email me please

    • By VanTheBrand 2025-12-2020:464 reply

      True but What.cd had a tremendous amount of notable music not available on Spotify though because it was also sourced from cds, bootlegs, vinyl, tape etc whereas Spotify only includes music explicitly licensed for streaming.

      • By Etheryte 2025-12-2020:502 reply

        This is true and a category of music that got hit notably hard was live recordings. What had a wide array of live recordings made by sound engineers straight from the mixer. This is something that you simply cannot find now unless you maybe know a guy.

        • By qingcharles 2025-12-2021:04

          That's why I use YouTube Music as my streamer as they allow damned near anyone to upload any old rare record and then figure out the royalties somehow.

        • By alxndr 2025-12-2120:57

          FWIW archive.org has a lot of live music as well

      • By leetbulb 2025-12-2021:171 reply

        Yes. RIP a ton of very rare material. What.cd has a special place in my heart.

        • By some-guy 2025-12-2021:393 reply

          Redacted.sh is a worthy successor, but the average person just doesn’t care about “which release is best” anymore. I use YT Music as a backup but Redacted is my main source of music these days.

          • By selectodude 2025-12-2022:352 reply

            At the end of the day it feels like the private trackers are such a nightmare to get invited to and maintain a ratio at that it’s just not worth the effort.

            I want this torrent though. It would be fun to stand up a NAS for this.

            • By some-guy 2025-12-2122:061 reply

              The private trackers are just as much about the community as they are about the content they host. Of course there are trade offs because communities can be very insular.

              I’ve noticed in the past 10 years or so private trackers have become less strict because the economics of ratios only works if either a) everyone is equally uploading new material or b) there are more and more signups. So now there is value in the amount of time you seed your content which lowers your “required” ratio.

              • By max51 2025-12-2317:52

                Generally speaking, trackers that require a ratio above 1.0 and don't have a freeleech/point system are designed so that you pay the website to fix your ratio and/or rent a seedbox from one of their partners.

                It's a 0 sum game; for every account with a >1.0 ratio, that implies other people will be <1.0.

                And when you compete with 10gb/s seedboxes that have scripts to automatically grab all the new torrents the second they get posted, it's extremely difficult to improve your ratio. Even for super popular torrents, you have a few minutes to seed as much as you can before upload speed goes to 0 forever. You can't slowly accumulate upload over time the same way you would with a torrent from a public tracker.
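
                The zero-sum point above can be checked with a toy simulation. This is only a sketch under stated assumptions: the peer count, transfer sizes and function names are made up for illustration, and it ignores freeleech or bonus-point credit. Every transferred megabyte is counted once as upload and once as download, so the totals always match, and a peer above 1.0 has to be balanced by a peer below 1.0.

```python
import random


def simulate_swarm(num_peers=50, num_transfers=10_000, seed=0):
    """Simulate random peer-to-peer transfers (hypothetical toy model)."""
    rng = random.Random(seed)
    uploaded = [0] * num_peers
    downloaded = [0] * num_peers
    for _ in range(num_transfers):
        # Pick two distinct peers: one sends, the other receives.
        sender, receiver = rng.sample(range(num_peers), 2)
        size_mb = rng.randint(1, 1000)
        # The same bytes are credited as upload on one side
        # and as download on the other.
        uploaded[sender] += size_mb
        downloaded[receiver] += size_mb
    return uploaded, downloaded


def ratio_summary(uploaded, downloaded):
    """Count peers above and below a 1.0 ratio (upload > download or the reverse)."""
    pairs = list(zip(uploaded, downloaded))
    return {
        "total_up": sum(uploaded),
        "total_down": sum(downloaded),
        "above_1": sum(1 for up, down in pairs if up > down),
        "below_1": sum(1 for up, down in pairs if up < down),
    }
```

                Calling `ratio_summary(*simulate_swarm())` shows equal totals, so a tracker-wide rule of "everyone above 1.0" cannot be met by transfers alone.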

            • By leetbulb 2025-12-284:17

              If you share your email somehow, I can invite you.

          • By karamanolev 2025-12-2023:032 reply

            Don't you consider it best to ... redact ... your post, as it's the only one mentioning it by name?

            • By sincerely 2025-12-219:30

              It's hardly a secret, you can go on r/trackers where people discuss private trackers for every media type

            • By fragmede 2025-12-216:02

              Some people just don't know when to shut the hell up.

          • By udev4096 2025-12-215:48

            [dead]

      • By BoingBoomTschak 2025-12-2111:121 reply

        Which also means almost always limited to the latest, almost always crappy (or blind to the original ambiance) remaster! One of the main reasons why I don't bother with streaming, really.

        (And because they lack much obscure stuff and I don't like being dependent on the Internet and a renter's whims for something as essential as music, I guess)

        • By TonyTrapp 2025-12-2111:22

          This, a thousand times this. I have gone back to collecting CDs because it's often the only remaining way (short of piracy) to get original masters of many artists. Even lossless download stores like Qobuz don't have them.

      • By tclancy 2025-12-2023:31

        Yeah, it was a great place. I have a paid Spotify account but finally got an ancient hard drive onto my network for all sorts of stuff Spotify doesn’t or can’t have (e.g., Coldcut: 70 Minutes of Madness).

    • By rckclmbr 2025-12-2021:172 reply

      You can’t talk about what.cd without talking about its precursor OiNks Pink Palace. Even Trent Reznor was public about what an amazing place it was. Music aside, the community existing just for the shared love of music and not for any other kind of monetary or influencer gain is what set it apart. We just don’t have those kinds of communities for music online anymore

      • By chrneu 2025-12-2021:502 reply

        >We just don’t have those kinds of communities for music online anymore

        They're still kind of around, but yeah, everything is very much on its way out in the music scene, at least in terms of that late 90s early 00s culture. Or has been until recently. There is a renewed interest in self-hosting and "offline" style music collections.

        It sucks too. The way folks discover music is important. The convenience of streaming has led to some interesting outcomes. When self-hosting music comes up this is always one of the top questions people have: How do you find new music?

        The answer isn't that hard and really hasn't changed much. People just don't want to spend any time or effort doing it. Music stores still exist, they're amazing. Lots of 2nd hand stores carry vinyl and CDs now, which can give you great ideas for new music. There are self-hosted AI solutions and tools. Last.fm and Scrobbling are still very much around. My scrobble history is so insanely useful. There are music discords. Friends. Asking people what they're listening to in public. Live shows with unique openers(I once went to a Ben Kweller show with 4 opening bands, I still listen to 3 of them.)

        • By pickledoyster 2025-12-2217:53

          One thing you haven't noted is radio.

          Some local radio DJs frequently play songs I enjoy that have under 1K plays on youtube. No algo or platform is surfacing those. Local radio gets me both local and international music. A friend of mine prefers critically acclaimed stuff, so he streams radio shows from NTS and the like.

        • By josteink 2025-12-229:30

          > It sucks too. The way folks discover music is important. The convenience of streaming has led to some interesting outcomes.

          I think carefully curating music was something we did when music was a scarce commodity. Our collection was limited by how much we could afford to acquire. As such, acquiring the right stuff became a valued skill, not only for DJs, but for music enthusiasts just playing music at home.

          Streaming killed all that. For 99.9% of the people out there, streaming has all they need and will ever need, at a fixed cost. It's absolute abundance.

          So the skill of curating music as a human activity went out the window as well, because there's no cost in playing the wrong track and deciding you didn't like it, before moving to the next item in your AI-generated playlist.

          Put bluntly: How people discover music isn't important. At least not anymore.

          (And I say this as a music enthusiast myself)

      • By SSLy 2025-12-2021:342 reply

        I mean, WCD has two healthy replacements, plus slsk

        • By tclancy 2025-12-2023:29

          I love that SoulSeek still exists in some format. My path was Napster (made me get cable Internet and a cd burner) > AudioGalaxy (learned how to path things on routers so I could download music to home from work) > SoulSeek. Plus it had some useful chat and people who cared about sound quality and metadata.

        • By platevoltage 2025-12-2023:293 reply

          Soulseek has to be the best kept secret on the internet. Even people my age who grew up with things like Napster, Limewire, and even soulseek, don’t know that it still exists.

          • By ZeWaka 2025-12-211:45

            The amount of extremely obscure music on there is crazy, stuff that exists nowhere else in the internet except maybe google drive links.

          • By parodysbird 2025-12-2517:08

            It's crazy how convenient and deep the library on soulseek is. I even use it all the time on mobile.

          • By lukaslalinsky 2025-12-2111:15

            Yeah, I was looking for some rare album I had in the past, and was shocked to realize that Soulseek is still active.

    • By josteink 2025-12-229:24

      > What.CD [0] was widely considered to be the music library of Alexandria, unparalleled in both its high quality standard and its depth.

      It was quality in the technical quality of the audio in the files, but also in the organization and sourcing of the material and the QA process of the encoding, down to the specific release the audio file was from.

      There was quantity, sure, but that was secondary to the quality. The quantity was just a side-effect of the place being known for quality, making it an attractive arena to participate in.

      And it also had all the "weird"/non-standard things you don't find on mainstream streaming-services precisely because that is what independent curators are good at and often driven by.

      This Anna's release... While in itself impressive in many ways does not compare to the things What.CD represented. It's almost the exact opposite:

      - focus on most popular content - niche content (even by mainstream Spotify-standards) is not included

      - quality is 160kbps ogg files, which is far from lossless; it's not tightly coupled to a release, and as far as audio grading goes, there's no transparent QA process for the content, nor is it available in audiophile fidelity.

      This is definitely Apples vs Oranges.

    • By layer8 2025-12-2022:18

      That being said, I have a lot of non-mainstream tracks in my playlists on YouTube Music that have YouTube comments along the lines of “I wish this was available on Spotify :’(”. I bet the same goes for What.CD.

      So there’s some way to go for a comprehensive music archive.

    • By b8 2025-12-211:53

      Redacted, their replacement, now has more records than they had.

    • By rldjbpin 2025-12-2119:141 reply

      about the scale, the same album in the tracker had several submissions, for dedicated format and regional editions.

      while one can compare in terms of number of tracks, the quality used to be on another level altogether. from the article:

      > The quality is the original OGG Vorbis at 160kbit/s.

      meanwhile the tracker had 16/24-bit flac rips of vinyl, with decent quality control where the track's metadata was verified for any artifacts. for the given quality, one could rip youtube music (maybe not as easily anymore) and achieve a larger scale in a similar quality level.

      now if hypothetically tidal had all the music of the world and was accessible this way, then it would be a comparable resource. insane regardless.

      • By uncivilized 2025-12-2618:551 reply

        It's 160kbit/s for popularity>0 and 75kbit/s for popularity=0. I'm surprised Anna's Archive went for this given that these are not archival quality bitrates. It appears they did this because they found a way, rather than seeking to create a library of music.

        • By rldjbpin 2025-12-2723:04

          my comparison was with source catalog in spotify compared to the private tracker. spotify is working on the high fidelity mode, but so far it is not rolled out worldwide.

          i completely understand the archive's decision on applying their own compression.

    • By WadeGrimridge 2025-12-2115:16

      anna's rip has ~86m tracks, not ~186. ~186m is metadata, specifically ISRCs.

    • By laughingcurve 2025-12-2119:39

      Wow, I have not thought about OiNK in ages... great memories! OiNK and WhatCD did something very special for the musical community

    • By SSLy 2025-12-2021:33

      Well, what.cd counted any album as one torrent. While current spotify has also podcasts and AI slop.

  • By virtualritz 2025-12-2023:256 reply

    I just found out that https://annas-archive.li/ is masked by my German internet provider (SIM.de/Drillisch). I usually use a VPN but I had it switched off temp. to watch Fallout (Prime Video won't let you watch through a VPN). Only when I switched Mullvad back on could I open the site.

    I didn't know German providers do this.

    • By oarfish 2025-12-216:291 reply

      Yeah this is actually quite nefarious, as it is a private organization that decides what sites get blocked, with no legal oversight.

      - https://de.wikipedia.org/wiki/Clearingstelle_Urheberrecht_im...

      - https://netzpolitik.org/2024/cuii-liste-diese-websites-sperr...

      It's a DNS-based block, so overriding your default DNS server is enough to circumvent it. I think DNS over HTTPS also works.
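
      One way to tell whether a block like this is DNS-level is to compare what the system resolver returns with what a public DNS-over-HTTPS resolver returns. The sketch below assumes Cloudflare's JSON DoH endpoint; the function names and the comparison heuristic are illustrative only. CDN-hosted sites can legitimately return different addresses per resolver, so a mismatch is a hint rather than proof.

```python
import json
import socket
import urllib.parse
import urllib.request

# Assumption: any resolver that speaks the DoH JSON format would work here.
DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"


def extract_a_records(doh_json):
    """Return IPv4 addresses from a DoH JSON response (record type 1 = A)."""
    return sorted(
        answer["data"]
        for answer in doh_json.get("Answer", [])
        if answer.get("type") == 1
    )


def resolve_via_doh(hostname, endpoint=DOH_ENDPOINT):
    """Resolve a hostname over HTTPS, bypassing the ISP's DNS server."""
    query = urllib.parse.urlencode({"name": hostname, "type": "A"})
    request = urllib.request.Request(
        f"{endpoint}?{query}", headers={"Accept": "application/dns-json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_a_records(json.load(response))


def resolve_via_system(hostname):
    """Resolve a hostname with whatever DNS server the OS is configured to use."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        return []  # NXDOMAIN or similar: the resolver gave no answer
    return sorted({info[4][0] for info in infos})


def looks_dns_blocked(system_ips, doh_ips):
    """Heuristic: the public resolver has answers the system resolver never returns."""
    return bool(doh_ips) and not set(system_ips) & set(doh_ips)
```

      Running `looks_dns_blocked(resolve_via_system(host), resolve_via_doh(host))` for a hostname, once on the ISP's default DNS and once after switching resolvers, shows whether the answers diverge.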

    • By croemer 2025-12-219:131 reply

      I think it's a DNS level block. I've been using NextDNS (free plan) and one side effect (besides auto ad block) is that it doesn't have those blocks. Highly recommend - there are alternative services as well, just saw NextDNS recommended here.

      Alternative: https://archive.ph/2025.12.21-050644/https://annas-archive.l...

      • By grumbelbart 2025-12-2110:22

        Someone compiled a list of blocked domains (by probing different DNS servers):

        https://cuiiliste.de/

        This is also how, for example, RT is blocked in Germany.

    • By iknowstuff 2025-12-2023:423 reply

      In that vein, I am trying to find out why searching for

          alextud popcorntime
      
      which should trivially yield http://github.com/alextud/PopcornTimeTV results in anything but that one particular URL in every search engine: Google, Kagi, DuckDuckGo, Bing

      They even find a fork of that particular repo, which in turn links back to it, but refuse to show the result I want. Haven't found any DMCA notices. What is going on?

      • By ticoombs 2025-12-211:47

        They have marked the repo as noindex (or GitHub is forcing a noindex header).

        It's returning a noindex flag so every serp is correctly doing what the repo has been asked.

        That is... except for brave! I checked on my searx instance and it still showed up in brave's results
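
        A noindex signal can arrive either as an `X-Robots-Tag` response header or as a `<meta name="robots">` tag in the page. The sketch below checks both on a response you have already fetched; the helper names are hypothetical, and it does not claim to show which of the two GitHub actually sends for this repo.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the content of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            self.directives.append(attributes.get("content", "").lower())


def is_noindex(headers, html):
    """Return True if the headers or the HTML ask search engines not to index the page."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in directive for directive in parser.directives)
```

        Feeding it the headers and body of the repo page would show which mechanism, if either, is in play.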

      • By Mythli 2025-12-2115:56

        Try Yandex search, trust me later.

        It has 0 censorship - regarding pirated content at least.

      • By ZeWaka 2025-12-211:48

        Very interesting. The security page does show up on kagi at #6.

        I wonder if GitHub flags it to not be indexed or something.

    • By polytely 2025-12-2111:16

      Also true in the Netherlands, I hate these copyright freaks constantly trying to restrict access.

    • By junon 2025-12-213:29

      Was also shocked to see that (Berlin, Telekom here).

    • By sva_ 2025-12-2116:55

      They also block some foreign "news" like Russia Today last time I checked.
