Data mining the Podcast Index

The Podcast Index - released by Adam Curry and Dave Jones more than five years ago - is a super-useful resource for data mining the wide world of podcasts.
There’s a weekly-updated file containing every public feed in the Podcast Index - it’s here, and is about 1.8GB. Unzip it, and it becomes a 5GB file, which you can “easily” interrogate using a thing like this SQLite browser.
Once you’ve installed the SQLite browser - if you have Homebrew installed, it’s just brew install --cask db-browser-for-sqlite …
- Open the podcastindex_feeds.db file.
- Hit the “Execute SQL” tab.
- Paste in one of the SQL statements below.
I’m mostly posting this so I have a note of them.
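If you'd rather script this than click around a GUI, the same statements should also run from Python's built-in sqlite3 module. Here's a minimal sketch, not a definitive tool: run_sql and demo_connection are hypothetical helper names, and the tiny in-memory table with made-up rows is only a stand-in so the example runs without the 5GB file. It uses column names that appear in the queries in this post.

```python
import sqlite3


def run_sql(conn, sql, params=()):
    """Run one statement and return the rows as a list of dicts."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(sql, params)]


def demo_connection():
    """Build a tiny in-memory stand-in for the real file (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE podcasts ("
        "title TEXT, episodeCount INTEGER, newestItemPubdate INTEGER)"
    )
    conn.executemany(
        "INSERT INTO podcasts VALUES (?, ?, ?)",
        [
            ("Example Show A", 12, 1700000000),
            ("Example Show B", 3, 1700086400),
        ],
    )
    return conn


# For the real data, swap this for: sqlite3.connect("podcastindex_feeds.db")
conn = demo_connection()

rows = run_sql(
    conn,
    "SELECT title, episodeCount, "
    "datetime(newestItemPubdate, 'unixepoch') AS newest "
    "FROM podcasts ORDER BY newestItemPubdate DESC LIMIT 5",
)
for row in rows:
    print(row)
```

The LIMIT is there because the real table is large; drop it once you know the query does what you want.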
What are the latest episodes published by Inception Point AI?
While some shows by Inception Point AI are troubling, not all of them are. Short-casts with fishing reports or weather forecasts, or even crime reports for tiny suburbs, are a useful thing that humans can’t replicate at scale.
Inception Point AI uses Megaphone, which gives a five-letter prefix to every customer in their audio URLs. Inception Point AI’s is NPTNI, so armed with that, we can get a list of the latest shows.
CAUTION: Podcast Index flags and discards some shows which are marked as AI, so this list will be incomplete.
SELECT datetime(newestItemPubdate, 'unixepoch', 'localtime') AS newestEpisodePublished,
       title, link, itunesAuthor, episodeCount, description,
       time(newestEnclosureDuration, 'unixepoch') AS latestEpisodeLength
FROM podcasts
WHERE newestEnclosureUrl LIKE '%NPTNI%'
ORDER BY newestItemPubdate DESC;
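One thing worth knowing about the query above: SQLite's LIKE is case-insensitive for ASCII letters by default, so a URL that happens to contain "nptni" in lower case would match too. If that turns out to matter, instr() does an exact, case-sensitive substring check. A small sketch of the difference, using an in-memory table and made-up example.com URLs (these are placeholders, not real Megaphone URLs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE podcasts (title TEXT, newestEnclosureUrl TEXT)")
conn.executemany(
    "INSERT INTO podcasts VALUES (?, ?)",
    [
        # Hypothetical URLs, for illustration only
        ("Upper-case match", "https://example.com/NPTNI0001/episode.mp3"),
        ("Lower-case lookalike", "https://example.com/shows/nptni-demo/episode.mp3"),
    ],
)

# LIKE ignores ASCII case by default, so both rows match
like_titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM podcasts "
        "WHERE newestEnclosureUrl LIKE '%NPTNI%' ORDER BY title"
    )
]

# instr() compares exactly, so only the upper-case prefix matches
instr_titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM podcasts "
        "WHERE instr(newestEnclosureUrl, 'NPTNI') > 0 ORDER BY title"
    )
]

print("LIKE: ", like_titles)
print("instr:", instr_titles)
```

In the SQLite browser, that would mean swapping the LIKE condition for instr(newestEnclosureUrl, 'NPTNI') > 0. Whether it changes the results on the real data is something to check rather than assume.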
What are the newest shows published by Inception Point AI?
We use oldestItemPubdate here, which is the time/date for the first episode in the RSS feed. There’s a createdOn value, but that’s the date when the Podcast Index first added the feed.
CAUTION: Podcast Index flags and discards some shows which are marked as AI, so this list will be incomplete.
SELECT datetime(oldestItemPubdate, 'unixepoch', 'localtime') AS createdDate,
       title, link, itunesAuthor, episodeCount, description,
       time(newestEnclosureDuration, 'unixepoch') AS latestEpisodeLength
FROM podcasts
WHERE newestEnclosureUrl LIKE '%NPTNI%'
ORDER BY oldestItemPubdate DESC;
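To see how far apart oldestItemPubdate and createdOn can be, you could put them side by side. This is a rough sketch on made-up rows in an in-memory table; it assumes createdOn is stored as Unix seconds like the pubdate columns, which is worth checking against the real file before trusting the numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE podcasts ("
    "title TEXT, oldestItemPubdate INTEGER, createdOn INTEGER)"
)
conn.executemany(
    "INSERT INTO podcasts VALUES (?, ?, ?)",
    [
        # Made-up rows: createdOn is assumed to be Unix seconds
        ("Example Show A", 1700000000, 1700086400),  # indexed 1 day later
        ("Example Show B", 1700000000, 1700604800),  # indexed 7 days later
    ],
)

rows = conn.execute(
    """
    SELECT title,
           datetime(oldestItemPubdate, 'unixepoch') AS firstEpisode,
           datetime(createdOn, 'unixepoch') AS addedToIndex,
           (createdOn - oldestItemPubdate) / 86400 AS daysBetween
    FROM podcasts
    ORDER BY daysBetween DESC
    """
).fetchall()

for row in rows:
    print(row)
```

A large gap would suggest a feed that existed for a while before the Podcast Index picked it up, which is why oldestItemPubdate is the better guide to when a show actually started.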