Podcasting’s “Privacy Problem”

Seth Resler, who knows a great deal about social media strategy and is “Digital Dot Connector” at Jacobs Media, has written a piece claiming that podcasting has a “privacy problem”.

Podcasting Has a Privacy Problem
_Last August, I once again attended the Podcast Movement conference in Nashville. Every few years, I like to drop in on…_ (jacobsmedia.com)

In it, he talks about a speaker who discussed digital privacy, and claims that:

“several companies in the podcasting space rely on capturing the IP addresses of listeners in order to target them with ads. This is problematic for several reasons:

- It may be unethical.

- It may violate existing laws like the GDPR in the European Union.

- It might violate future privacy laws that are expected to be passed in places like the U.S.”

If you write about “podcasting”, then you’re writing about podcasting, and the ecosystem around podcasting.

A broad brush approach of “podcasting has a privacy problem” is incorrect, and this article repeats false assertions that really ought to be laid down, once and for all.

Podcasting doesn’t have a privacy problem

That’s right. Podcasting doesn’t have a privacy problem. If all you have is an IP address and a user-agent, which is all we know in podcasting, then you cannot take those two pieces of data and invade anyone’s privacy with them.

You simply can’t.

192.0.2.223 Podcasts/2.0 Dalvik 23433

The above is literally all a podcast hosting company knows about a user. And, as long as you don’t combine it with any data from anywhere else and don’t let anyone else do that either, there is nothing here that invades anyone’s privacy: there’s no personally-identifiable data. It’s a set of numbers and a user-agent string.

(“Personally-identifiable”: can I tell that this person is called Bill Smith? Can I tell he lives at 29 Acacia Avenue? Do I even know if he’s a boy or a girl?)
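To make the point concrete, here is a minimal sketch that splits the example line above into its two fields. The names `DownloadRequest` and `parse_request` are made up for illustration, and the "IP, space, user-agent" layout is simply assumed from that one line.

```python
import ipaddress
from typing import NamedTuple


class DownloadRequest(NamedTuple):
    """Hypothetical container for the two fields in the example line."""
    ip: str
    user_agent: str


def parse_request(line: str) -> DownloadRequest:
    """Split an 'IP user-agent' line into its two fields."""
    ip_text, _, user_agent = line.strip().partition(" ")
    ipaddress.ip_address(ip_text)  # raises ValueError if this is not an IP address
    return DownloadRequest(ip=ip_text, user_agent=user_agent)


request = parse_request("192.0.2.223 Podcasts/2.0 Dalvik 23433")
print(request.ip)          # the address
print(request.user_agent)  # the app's user-agent string
print(request._fields)     # the only two fields there are
```

There is no name, address or account field to read, because none was ever in the line.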

The GDPR, which is specifically about protecting personally-identifiable data, agrees. As Recital 30 says, online identifiers like IP addresses can be used to identify people when combined with other things (like a login).

A good piece of German case law says this too (and, as you’ll notice if you try Google Streetview in the country, Germans are very cautious about privacy). In this case, an internet service provider had additional information which allowed it to know the name of the person it had given 192.0.2.223 to. But the ruling also clearly says:

a dynamic IP address does not provide a website operator with sufficient information to directly identify an individual user, unless additional information is also available (e.g., the user logs into the website and provides information that enables the website operator to identify that user). The parties agreed that the IP address in question did not directly identify Mr Breyer.

Can IP addresses be personal data? Absolutely they can, when combined with other things. It’s why podcast hosting companies (and, since I’m one, me) want to keep them hidden away and treat them as securely as personal information, and it’s why VPNs and Apple’s Private Relay are a thing.

But are IP addresses personal data by themselves? No. It’s technically impossible for them to be so: there’s not enough data there.

Note: podcasting doesn’t use cookies either. The IP address (and user-agent) are the only things we have.

Podcast ad-targeting might have a privacy problem

Geo ad-targeting, at its most basic, doesn’t necessarily mean a privacy violation.

You can get databases containing the rough physical location of an IP address, like 192.0.2.223. Some are quite advanced, like the data you can pull out of Amazon CloudFront, a popular content delivery network.

Armed with that kind of database, you can work out that a listener is in Brookline, a suburb near Boston MA, USA. But, without other data (and without combining this with other requests from the same person), you know nothing else about this listener. There’s no personally-identifiable data here. Knowing that 192.0.2.223 is in the suburb of Brookline doesn’t tell you who that person is, which household they’re in, whether they’re looking for a new car, or if they have kids. It merely tells you a vague area where that person is (or might be: geolocation isn’t actually, whisper, that good).
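As a rough sketch of how that kind of lookup works, the example below maps a block of addresses to an area. The table is invented for illustration: the ranges used are reserved for documentation, so no real database would place them anywhere, and `GEO_TABLE` and `rough_location` are hypothetical names.

```python
import ipaddress
from typing import Optional

# Hypothetical rows, invented for illustration only.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Brookline, MA, USA"),
    (ipaddress.ip_network("198.51.100.0/24"), "Brisbane, QLD, Australia"),
]


def rough_location(ip_text: str) -> Optional[str]:
    """Return the area for the first network containing the address, if any."""
    address = ipaddress.ip_address(ip_text)
    for network, area in GEO_TABLE:
        if address in network:
            return area
    return None


print(rough_location("192.0.2.223"))  # Brookline, MA, USA
print(rough_location("192.0.2.7"))    # Brookline, MA, USA
print(rough_location("203.0.113.9"))  # None
```

Every address in the same block gets the same answer, which is the point: the lookup returns an area, not a person or a household.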

But many go further.

For many ad companies, geo-location isn’t enough; so they buy access to other big databases from Experian, Nielsen, Comscore, Oracle or others, as Sounds Profitable goes into in a long article about ketchup. And then they match up the IP address of the listener with anything it says in the databases about the same IP address. And then, knowing that there’s a child in the household, they advertise Disney World to me.

Is this personally-identifiable data? Yes, quite possibly. An IP address is often unique to a household, and who knows whether those databases have my name and address from the Amazon order I made last week, or the time I logged into eBay, or my partner’s Netflix account, or my daughter’s seemingly non-stop YouTube watching.
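The matching step itself is technically trivial, which is part of the worry. Here is a sketch, assuming a third-party table keyed by IP address; every record, field name and value is invented, and real data vendors will structure things differently.

```python
from typing import Dict, List

# Hypothetical third-party records keyed by IP address; all values invented.
HOUSEHOLD_DATA: Dict[str, Dict[str, object]] = {
    "192.0.2.223": {"children_in_household": True, "recent_retailer": "example-shop"},
}

download_log: List[Dict[str, str]] = [
    {"ip": "192.0.2.223", "user_agent": "Podcasts/2.0 Dalvik 23433"},
    {"ip": "198.51.100.14", "user_agent": "Podcasts/2.0 Dalvik 23433"},
]


def enrich(request: Dict[str, str]) -> Dict[str, object]:
    """Merge whatever the outside database holds for this IP into the request."""
    extra = HOUSEHOLD_DATA.get(request["ip"], {})
    return {**request, **extra}


for row in download_log:
    print(enrich(row))
```

The first request comes back with household details attached; the second, which has no match, stays as anonymous as it started.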

Is the podcast listener opting in to a podcast host running their IP address through all these databases to find out lots more detail about them? No. Is that against the terms of GDPR? Yes.

Does use of a VPN or Apple Private Relay get in the way and break all this? Yes. Even the benign physical location stuff.

Ad effectiveness measurement probably does have a privacy problem

Ad effectiveness, or attribution measurement, works by linking two things: my listen to Bill’s Amazing Podcast, in which he excitedly talks up his new razor, and my purchase of that razor from his sponsor after I heard how excited Bill was about it.

To make this link, the system needs to know that it’s “me” in both places. My IP address when I downloaded the podcast; and then my IP address when I bought the razor. (Not the user-agent in this case: podcast apps aren’t web browsers.)

As above, an IP address becomes personal information when you combine it with other data: in this case, the purchase of that nice razor, which, of course, comes with my name, address, credit card number, and the near-certainty that since it was a man’s razor, I’m probably a man.

This is combining that IP address (which, as we saw above, isn’t personal data) with additional data (which very definitely makes it so).
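In code, the link described above might look something like this. It is a sketch under stated assumptions, not how any particular attribution vendor works: the logs, the customer name, the dates and the seven-day window are all invented.

```python
from datetime import datetime, timedelta

# Hypothetical logs; every value is invented for illustration.
downloads = [
    {"ip": "192.0.2.223", "episode": "bills-amazing-podcast-42",
     "at": datetime(2022, 3, 1, 8, 0)},
]
purchases = [
    {"ip": "192.0.2.223", "customer": "J. Listener", "item": "razor",
     "at": datetime(2022, 3, 3, 19, 30)},
    {"ip": "198.51.100.14", "customer": "A. N. Other", "item": "razor",
     "at": datetime(2022, 3, 3, 20, 0)},
]

WINDOW = timedelta(days=7)  # assumed attribution window


def attribute(downloads, purchases):
    """Pair each purchase with an earlier download from the same IP within WINDOW."""
    matches = []
    for purchase in purchases:
        for download in downloads:
            same_ip = purchase["ip"] == download["ip"]
            gap = purchase["at"] - download["at"]
            if same_ip and timedelta(0) <= gap <= WINDOW:
                matches.append({"episode": download["episode"],
                                "customer": purchase["customer"]})
    return matches


print(attribute(downloads, purchases))
```

The download row holds nothing personal. The name only appears because the purchase row, collected on the razor website, carries it.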

It’s not podcasting’s fault, this, though. It’s the tracking stuff placed on the razor website that’s the privacy problem here.

Do Apple’s Private Relay or VPNs get in the way and break all of this? Probably so, though a little less than you’d think, since some purchases are made on the same device (and thus same IP) as the listen.

Resler goes on to discuss the IAB’s #concernedpodcaster paper, which they published last week.

As I understand it, this paper took twelve months of work. It is full of questions and statements, but no solutions. It has a message of “the podcast advertising world will fall in if people use VPNs” (and, yes, as above, that fear has some truth to it), but no clear strategy for what to do next other than “please, Apple, talk to us”.

I wish it contained constructive solutions. It would have been a rather more helpful paper.

James Cridland is Editor of Podnews, a daily podcast newsletter, but writes here when he wants to say stuff that he probably shouldn’t.
