James Cridland

Data mining the Podcast Index

James Cridland and Adam Curry

The Podcast Index - released by Adam Curry and Dave Jones more than five years ago - is a super-useful resource for data mining the wide world of podcasts.

There’s a weekly-updated file containing every public feed in the Podcast Index - it’s here, and is about 1.8GB. Unzip it, and it becomes a 5GB file, which you can “easily” interrogate using a thing like this SQLite browser.

Once you’ve installed the SQLite browser (if you have Homebrew installed, it’s just brew install --cask db-browser-for-sqlite):

  • Open the podcastindex_feeds.db file.
  • Hit the “Execute SQL” tab.
  • Paste one of the SQL statements below.
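If you'd rather script this than click around, Python's built-in sqlite3 module can open the same file. Here's a rough sketch that lists the columns available to query. It assumes the table is called podcasts (as in the SQL statements in this post); the function name list_columns is just something I made up for illustration.

```python
import sqlite3


def list_columns(conn, table="podcasts"):
    """Return the column names of a table, in schema order."""
    # pragma_table_info is SQLite's table-valued form of PRAGMA table_info;
    # it accepts a bound parameter, so the table name isn't pasted into the SQL.
    cur = conn.execute(
        "SELECT name FROM pragma_table_info(?) ORDER BY cid", (table,)
    )
    return [row[0] for row in cur]


# Usage, assuming the unzipped file is in the current directory:
#   conn = sqlite3.connect("podcastindex_feeds.db")
#   print(list_columns(conn))
```

Knowing the column names up front makes it easier to adapt the queries to your own questions.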

I’m mostly posting this so I have a note of them.

What are the latest episodes published by Inception Point AI?

While some shows by Inception Point AI are troubling, not all of them are. Short-casts with fishing reports or weather forecasts, or even crime reports for tiny suburbs, are a useful thing that humans can’t replicate at scale.

Inception Point AI uses Megaphone, which gives a five-letter prefix to every customer in their audio URLs; Inception Point AI’s is NPTNI, so armed with that, we can get a list of the latest shows.

CAUTION: Podcast Index flags some shows as AI and discards them, so these results will be incomplete.

SELECT datetime(newestItemPubdate, 'unixepoch', 'localtime') AS newestEpisodePublished,
  title, link, itunesAuthor, episodeCount, description,
  time(newestEnclosureDuration, 'unixepoch') AS latestEpisodeLength
FROM podcasts
WHERE newestEnclosureUrl LIKE '%NPTNI%'
ORDER BY newestItemPubdate DESC
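The same lookup can be run from a script, which is handy if you want to swap in a different Megaphone prefix. This is only a sketch: the function name latest_by_prefix and the limit parameter are my own additions, and I've trimmed the column list for brevity.

```python
import sqlite3

# The prefix is bound as a parameter and wrapped in % wildcards inside SQL.
QUERY = """
SELECT datetime(newestItemPubdate, 'unixepoch', 'localtime') AS newestEpisodePublished,
       title, link, itunesAuthor, episodeCount
FROM podcasts
WHERE newestEnclosureUrl LIKE '%' || ? || '%'
ORDER BY newestItemPubdate DESC
LIMIT ?
"""


def latest_by_prefix(conn, prefix, limit=20):
    """Return shows whose newest enclosure URL contains `prefix`, newest first."""
    cur = conn.execute(QUERY, (prefix, limit))
    columns = [d[0] for d in cur.description]
    # One dict per show, keyed by column name.
    return [dict(zip(columns, row)) for row in cur]


# Usage, assuming the unzipped file is in the current directory:
#   conn = sqlite3.connect("podcastindex_feeds.db")
#   for show in latest_by_prefix(conn, "NPTNI"):
#       print(show["newestEpisodePublished"], show["title"])
```

One thing to be aware of: SQLite's LIKE is case-insensitive for ASCII letters by default, so a lowercase nptni in a URL would match too.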

What are the newest shows published by Inception Point AI?

We use oldestItemPubdate here, which is the time/date for the first episode in the RSS feed. There’s a createdOn value, but that’s the date when the Podcast Index first added the feed.

CAUTION: Podcast Index flags some shows as AI and discards them, so these results will be incomplete.

SELECT datetime(oldestItemPubdate, 'unixepoch', 'localtime') AS createdDate,
  title, link, itunesAuthor, episodeCount, description,
  time(newestEnclosureDuration, 'unixepoch') AS latestEpisodeLength
FROM podcasts
WHERE newestEnclosureUrl LIKE '%NPTNI%'
ORDER BY oldestItemPubdate DESC
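To see the difference between oldestItemPubdate and createdOn for yourself, you can put the two side by side. A sketch only: it assumes createdOn is stored as a Unix timestamp like the pubdate columns (I haven't confirmed that here), and the function name first_episode_vs_indexed and the daysBeforeIndexed column are made up for this example.

```python
import sqlite3

# ASSUMPTION: createdOn is a Unix timestamp in seconds, like oldestItemPubdate.
QUERY = """
SELECT title,
       date(oldestItemPubdate, 'unixepoch') AS firstEpisode,
       date(createdOn, 'unixepoch') AS addedToIndex,
       (createdOn - oldestItemPubdate) / 86400 AS daysBeforeIndexed
FROM podcasts
WHERE newestEnclosureUrl LIKE '%' || ? || '%'
ORDER BY oldestItemPubdate DESC
"""


def first_episode_vs_indexed(conn, prefix):
    """Compare each show's first episode date with the date the index added it."""
    # Dates are shown in UTC; daysBeforeIndexed is whole days (integer division).
    return conn.execute(QUERY, (prefix,)).fetchall()


# Usage, assuming the unzipped file is in the current directory:
#   conn = sqlite3.connect("podcastindex_feeds.db")
#   for row in first_episode_vs_indexed(conn, "NPTNI"):
#       print(row)
```

A large daysBeforeIndexed suggests the feed existed for a while before the Podcast Index picked it up, which is why oldestItemPubdate is the better guide to when a show actually launched.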
