If you ask Spotify’s website, it’ll send you a big ZIP file of “your personal data”.
I was curious to see what was in it, so I went to have a look. The file took four days to become available.
A bit about me first…
I’m a Premium subscriber, but use Spotify fairly sporadically. I live in Australia, which is not covered by the GDPR and does not have particularly strong privacy laws. I have used Spotify a little in the past for podcasts, but haven’t used it much over the last few months. (I prefer YouTube Music, which appears to have better algorithms for giving me automated music mixes.) I have also opted out of “tailored ads”, which probably means my data isn’t being matched with other databases (and I run quite fearsome blocking on most of my devices anyway). So perhaps I’m not a normal user. Who knows. But there are a few interesting things in the data.
So, first, there are a lot of JSON files in here, but they’re relatively easy to read. The “read_me_first” datafile contains the following (in many different languages):
If you are also interested in receiving technical log information we collect to provide and troubleshoot the Spotify service, extended streaming history, or other special data, please contact our Customer Service or email us at email@example.com to clarify your request.
…so this isn’t, actually, my full data: just the data Spotify thinks I might want to see. It also links to this page which explains what all the files are.
Most of them are, to be honest, quite dull. Here are a few, and we’ll get onto the juicy one in a minute.
StreamingHistory0 is just my streaming history from the past year. Each entry contains the total number of milliseconds played. No link back to the track’s ID, mind you (possibly deliberately).
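If you want to add up those milliseconds yourself, something like the following would do it. This is only a rough sketch: I’m assuming the file is a JSON array of objects, and the field names `msPlayed` and `artistName` (and the filename in the usage note) are assumptions rather than anything documented here, so check them against your own export. The sample entries are made up for illustration.

```python
import json
from collections import defaultdict


def load_history(path):
    """Load a streaming-history file, assumed to be a JSON array of entries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def total_listening_ms(entries, ms_field="msPlayed", group_field="artistName"):
    """Sum milliseconds played per group (per artist by default).

    The field names are assumptions about the export format.
    """
    totals = defaultdict(int)
    for entry in entries:
        key = entry.get(group_field, "unknown")
        totals[key] += int(entry.get(ms_field, 0))
    return dict(totals)


# Hypothetical sample entries, not real data from the export.
sample = [
    {"artistName": "Artist A", "trackName": "Song 1", "msPlayed": 180000},
    {"artistName": "Artist A", "trackName": "Song 2", "msPlayed": 60000},
    {"artistName": "Artist B", "trackName": "Song 3", "msPlayed": 30000},
]

totals = total_listening_ms(sample)

# Print artists by listening time, longest first, converted to minutes.
for artist, ms in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{artist}: {ms / 60000:.1f} minutes")
```

To run it on a real export, replace `sample` with `load_history("StreamingHistory0.json")`, adjusting the filename to match whatever is in your ZIP.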
This is supposed to show all the searches I’ve done over the past year, but it doesn’t show any searches I’ve made on Spotify Web or on the macOS search platform, so it seems a little incomplete. One nice thing here is that it keeps a log of what I clicked on after the search: they’ll presumably use this as a scoring method. That’s a neat trick.
An intriguing file, which isn’t mentioned in the help document. I’m not a “tasteMaker”, and nor am I “verified”: my suspicion is that this is for musicians and notable people who use the service, and a verified tick to show that this really is who you think it is.
This, though, is the fascinating one. The help file says:
We draw certain inferences about your interests and preferences based on your usage of the Spotify service and using data obtained from our advertisers and other advertising partners. This includes a list of market segments with which you are currently associated. Depending on your settings, this data may be used to serve interest-based advertising to you within the Spotify service.
Now, I’m a Spotify Premium user, so I never hear any advertising on the service. However, I bet that this data is still being kept up to date so that, if I stop paying, Spotify already has everything it knows about me ready to start serving up those ads.
So, let’s find out what Spotify has “inferred” about me.
The 1P_Custom values at the top are based on the devices that I’ve played music on. It has learnt (from the Bluetooth data) that I have a Toyota car, and learnt from other things that I have Google speakers (and one device which is a Spotify Connect internet radio).
It’s correctly noted that I use Spotify in the car. I’m curious as to what the podcast-audience-segmentation-rules entries denote: one says that I listen on my mobile, which is correct, and “format length short” suggests, I suspect, that it has decided I’d rather have shorter podcasts than longer ones. That would probably be true: a lot of my recent podcast listening on Spotify has been checking my own (3m) podcast.
The rest of these look like podcast categories that I’ve listened to. I thought that they might be based on music somehow, but I struggle to work out how.
Finally, no idea what the Nerdwallet entry is for: perhaps that is marking me down as someone who followed a promotion within Spotify. That’s a US personal finance site, so is almost entirely useless to someone tax-resident in Australia: and I know I’d not listen to their podcast, so that’s a bit of a mystery.
What can we learn from this? Not much, really, other than to note that Spotify knows what brand of car I drive, and that I like short podcasts.
As I noted above, I use YouTube Music for music listening. I also use Google for my main search engine, Google Podcasts as my main podcast app, Google Chrome as a browser, Google smart speakers, and Android devices as phones and tablets. It’s a fair bet to suggest that Google knows rather a lot more about me than Spotify does. And perhaps it’s no surprise that I find YouTube’s algorithms play me more of the sort of music I want to hear.
Much of the current privacy discussion is about third-party privacy: Spotify shipping my data to Nielsen, for example, to work out information about me. But Google or Apple need do none of that if you’re within their ecosystem.
It’s possibly understandable that Google and Apple have been keen to roll out privacy controls. Privacy controls in Android or iOS stop their competitors, like Spotify, from getting the data: but Google and Apple are unaffected.