Nearly 300TB of Spotify's music library, including 86 million songs, has been archived and distributed via torrents by a preservation group.
An archival organization known for preserving books and research papers just made a bold move into music. Anna's Archive claims to have scraped nearly all of Spotify's catalog, capturing metadata for 256 million tracks and audio files for 86 million songs. The total haul weighs in at just under 300TB and is being distributed through torrents, organized by popularity. The group says the dataset covers about 99.6% of listens on the platform, positioning it as one of the largest publicly available music metadata collections online.
The archive uses a tiered approach. Popular tracks are stored in their original 160kbps format, while less-played songs have been re-encoded into smaller files to conserve space. The group says the full audio library will be released gradually, starting with the most popular songs, while the metadata set is already broadly available. Anything released after July 2025 appears to be missing from the collection.
Spotify acknowledged the incident in a statement, saying a third party scraped public metadata and used unauthorized methods to access some audio files. The company did not confirm the full scale described by the archivists, but framed the activity as a misuse of its systems rather than a legitimate preservation effort.
The community reaction has been mixed, reflecting a familiar tension between preservation and ownership. Some commenters treated it as a throwback to earlier eras of music downloading. Others took the preservation argument more seriously, pointing to how niche releases can disappear when licensing deals change, catalogs rotate, or services shut down. The same fragility shows up across other modern platforms, and it is a theme that security researchers and policy watchers track closely.
There is also a practical angle that gets lost in the culture-war framing. Scraping at this scale is a systems problem, involving automation, access controls, and the hard question of which defenses actually work when a service is built for global distribution. The more relevant conversation for builders is how platforms detect abuse, rate-limit scraping, and harden their pipelines, ideas adjacent to the kinds of tooling discussed in guides to open-source security testing.
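For readers unfamiliar with what rate-limiting looks like in practice, here is a minimal sketch of a per-client token bucket, one common technique for throttling automated requests. It is illustrative only: the class names, the capacity, and the refill rate are assumptions chosen for the example, and nothing here describes how Spotify actually protects its systems.

```python
import time


class TokenBucket:
    """One client's allowance: up to `capacity` requests in a burst,
    refilled at `refill_rate` tokens per second. (Hypothetical names.)"""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)
        self.tokens = float(capacity)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Add tokens for the time elapsed since the last request, up to capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        # Spend one token per request; refuse when the bucket is empty.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client identifier (for example an account or IP)."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.buckets = {}

    def allow(self, client_id, now=None):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_rate, now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    # Assumed policy: bursts of 3 requests, then 1 request per second.
    limiter = RateLimiter(capacity=3, refill_rate=1.0)
    for i in range(5):
        print(i, limiter.allow("client-a", now=0.0))
```

A limiter like this slows a single client but does little against a scraper that spreads requests across many accounts or addresses, which is part of why abuse detection at scale is the harder problem.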
Legally, the line is clearer than the rhetoric. Spotify licenses music under contracts with rights holders, and mass copying and redistribution of audio files generally violates both platform terms and copyright law. Even if the intent is framed as cultural preservation, the mechanism is still unauthorized distribution. That reality tends to push companies toward containment, audits, and tightened controls, including the kind of baseline requirements that show up in security standards checklists when teams review where data can leak.
In the long run, this is the paradox the industry keeps postponing. Streaming is great at access, weak at permanence. Preservation is culturally valuable, but today it is mostly outsourced to private platforms and legal agreements built for commerce, not continuity. If rights holders respond aggressively, the archive becomes another piracy case study. If they do not, it becomes an uncomfortable proof-of-concept that the "library" of modern music can be duplicated faster than the rules can adapt.
Reporting on the claims and the torrent rollout first surfaced via Android Authority.