James Cridland

HTML5 audio player with VTT captions

I’ve spent much of the evening trying to get an audio player working with VTT captions.

The previous code I used worked on everything (Chrome on desktop, Chrome on mobile, Safari on mobile) but not on Safari on desktop, which was a bit annoying. It turns out that Safari on desktop sets the track to “disabled”, and needs to be told explicitly to show it. I suspect this is because there’s no space in the audio controls to show the captions.

Anyway, this is the code you can use, including the workaround for Safari.

Example

Here’s a short clip from the BBC’s PM programme, from earlier in 2025. Hit play!

The code

<div style="height:150px;width:100%;text-align:center;">
    <audio style="width:100%;" id="vtt-player-{{ .Get 0 }}" controls src="/uploads/{{ .Get 0 }}.m4a">
        <track default kind="captions" label="captions" srclang="en" src="/uploads/{{ .Get 0 }}.vtt" />
    </audio>
    <div style="display:none;text-align:center;font-family:monospace;font-weight:bold;text-wrap:balance" id="vtt-text-{{ .Get 0 }}"></div>
    <!-- Text-wrap:balance above helps text wrap without one word on the next line, making it easier to read
     https://developer.chrome.com/docs/css-ui/css-text-wrap-balance -->
</div>

<script>
// Safari needs to be specifically told to show this track, for some reason.
document.getElementById('vtt-player-{{ .Get 0 }}').textTracks[0].mode="showing";

document.getElementById('vtt-player-{{ .Get 0 }}').addEventListener('play', function() {
    //We hid the captions in the div above, since Safari shows them on page load
    //Now the listener has hit play, show them.
    document.getElementById('vtt-text-{{ .Get 0 }}').style.display="block";
});

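document.getElementById('vtt-player-{{ .Get 0 }}').textTracks[0].addEventListener('cuechange', function() {
    //Update the caption. cuechange also fires when a cue ends, and then
    //there may be no active cue at all, so clear the text in that case.
    var cue = this.activeCues && this.activeCues[0];
    document.getElementById('vtt-text-{{ .Get 0 }}').innerText = cue ? cue.text : '';
});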
</script>
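
If your VTT file has overlapping cues, or gaps between cues, the cuechange handler needs to cope with more than one active cue, or with none. Here’s a minimal sketch of a helper for that. The function name captionText and the IDs vtt-player-example and vtt-text-example are placeholders of mine, not part of the code above.

```javascript
// Join the text of every active cue; return "" when none are active.
// Accepts a TextTrackCueList or any array-like of objects with a .text property.
function captionText(activeCues) {
    if (!activeCues || activeCues.length === 0) {
        return "";
    }
    return Array.from(activeCues, function (cue) {
        return cue.text;
    }).join("\n");
}

// Browser-only wiring. The two IDs are placeholders: swap in your own.
if (typeof document !== "undefined") {
    var player = document.getElementById("vtt-player-example");
    var output = document.getElementById("vtt-text-example");
    if (player && output && player.textTracks.length > 0) {
        player.textTracks[0].addEventListener("cuechange", function () {
            output.innerText = captionText(this.activeCues);
        });
    }
}
```

Keeping the text-building part as a plain function means you can try it out without a browser.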

Notes on using this

Replace {{ .Get 0 }} with the name of your audio file and of your VTT captions file, without the extension. In my case, they’re always the same. You’ll notice it’s all over the place here: all the IDs are marked with it too, so you can have more than one player on the page. (You may wish to use .mp3 instead of .m4a, and that’s cool too.)

While audio files aren’t subject to CORS security, transcripts are, rather unfortunately. While it’s best practice for podcast hosting companies to make their VTT files (and RSS files) open to all, you might need to run a reverse proxy, and/or just copy the file.
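
A quick way to tell whether you’re going to hit this is to compare the origin of the VTT file with the origin of the page. This is only a sketch: isCrossOrigin is a name I’ve made up, and the example.com and example.net URLs are stand-ins for your own.

```javascript
// Returns true when resourceUrl is on a different origin
// (scheme + host + port) from the page at pageUrl.
function isCrossOrigin(resourceUrl, pageUrl) {
    // The second argument resolves relative paths like "/uploads/clip.vtt".
    var resource = new URL(resourceUrl, pageUrl);
    var page = new URL(pageUrl);
    return resource.origin !== page.origin;
}

// Hypothetical URLs, for illustration only.
var page = "https://example.com/posts/audio-player/";
console.log(isCrossOrigin("/uploads/clip.vtt", page));
console.log(isCrossOrigin("https://cdn.example.net/clip.vtt", page));
```

If it returns true, the browser will only load the track when the audio element has a crossorigin attribute and the host sends the right CORS headers. As far as I understand it, that attribute applies to the audio request as well, so the audio host needs to send CORS headers too. Otherwise it’s back to the proxy or the copy.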

If you’re using Hugo as a static site generator, then save this as audio-player.html in the shortcodes folder in your site template, and call it as:
{{< audio-player nameoffile >}}

Important: Add preload="none" if you’re using this for podcasts; otherwise you’re kicking off a download that an advertiser might be paying for, even if it’s never actually played.

How to produce the VTT file

You’ll want to install whisper-cpp, which is relatively easy on a Mac.

VTT files are also compatible with the podcast:transcript tag; they do support individual speakers, too, though the code above will not deal with that.
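
If you do want speakers, WebVTT marks them with a voice tag at the start of the cue, like <v Alice>Hello</v>, and the cue’s text property hands you that tag as-is. Here’s a rough sketch of pulling the speaker out. splitSpeaker is my own hypothetical helper, it only handles a single voice tag wrapping the whole cue, and it leaves any other tags inside the text alone.

```javascript
// Split a cue's raw text into a speaker name and the spoken text.
// Handles "<v Name>text</v>", "<v.class Name>text" and plain text.
function splitSpeaker(cueText) {
    var pattern = /^<v(?:\.[^\s>]+)*\s+([^>]+)>([\s\S]*?)(?:<\/v>)?$/;
    var match = pattern.exec(cueText.trim());
    if (!match) {
        // No voice tag: return the text unchanged with no speaker.
        return { speaker: null, text: cueText };
    }
    return { speaker: match[1].trim(), text: match[2].trim() };
}

// Example cue text, invented for illustration.
var parts = splitSpeaker("<v Alice>Good evening.</v>");
console.log(parts.speaker + ": " + parts.text);
```

You could call this inside the cuechange listener and show the speaker in a separate element, or in bold before the caption.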
