# Make Your Videos WCAG-Compliant: A Practical Transcript Guide

What WCAG 2.2 actually requires for video accessibility, and how to use transcripts to meet AA-level conformance without a six-figure budget.

If you publish video on a website your business legally has to make accessible — government, education, healthcare, finance, anything covered by the ADA, the European Accessibility Act, or the UK Public Sector Bodies Accessibility Regulations — you need to know which transcript and caption rules apply.

This guide covers the practical version: what WCAG 2.2 says, what's enforced, and how to actually meet the bar with SubGrab without hiring a transcription agency.

## The Two WCAG Criteria You Need to Know

The Web Content Accessibility Guidelines (WCAG) 2.2, the standard most accessibility laws point to, has two success criteria for prerecorded video that matter here. Both are Level A, which means they are also required for AA conformance:

### 1.2.2 Captions (Prerecorded) — Level A

> Captions are provided for all prerecorded audio content in synchronized media.

In plain English: every video on your site that has speech or important sound needs captions. Unreviewed auto-generated YouTube captions don't qualify. You need reviewed captions that include speaker identification and non-speech audio (e.g., "[door slams]", "[laughter]").

### 1.2.3 Audio Description or Media Alternative (Prerecorded) — Level A

> An alternative for time-based media or audio description of prerecorded video content is provided for synchronized media.

You either need (a) an audio description track that describes what's happening visually, or (b) a full text transcript that conveys the same information as the video.

The transcript route is dramatically cheaper and faster. It also doubles as SEO content. This is where SubGrab earns its keep.

## What Counts as a "Compliant" Transcript

WCAG's Understanding 1.2.3 document is specific. A compliant text alternative must include:

1. All spoken dialogue, with speaker identification when more than one person speaks

2. Important non-speech audio that contributes to meaning (sound effects, music cues with narrative significance)

3. Important visual information — what's happening on screen that isn't conveyed by the audio (charts, demonstrations, on-screen text, expressions)

4. A clear sequence so the reader experiences the same flow as the viewer

The spoken dialogue and its sequence come straight out of a SubGrab transcript. Speaker labels, sound cues, and visual descriptions you add by hand, but you only have to add them once per video, and it's far less work than generating audio descriptions or hiring a captioning service.
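
As a rough pre-publication check, the list above can be turned into a small lint pass. This is a minimal sketch, not a compliance test: the function name `review_flags` and the conventions it assumes (speaker labels written as `Name:` at the start of a line, sound and visual cues written in square brackets) are illustrative choices, not something WCAG or SubGrab prescribes.

```python
import re

# Assumed convention: a speaker label is a capitalised name followed by a colon,
# e.g. "Maria: Welcome back." Lines such as "Note: ..." will also match.
SPEAKER_LABEL = re.compile(r"^[A-Z][\w .'-]{0,40}:\s")
# Assumed convention: sound and visual cues are bracketed, e.g. "[applause]".
BRACKETED_CUE = re.compile(r"\[[^\[\]]+\]")


def review_flags(transcript: str) -> list[str]:
    """Return warnings for a human reviewer; an empty list means nothing obvious is missing."""
    lines = [ln.strip() for ln in transcript.splitlines() if ln.strip()]
    if not lines:
        return ["transcript is empty"]

    flags = []
    if not any(SPEAKER_LABEL.match(ln) for ln in lines):
        flags.append("no speaker labels found (fine only if one person speaks)")
    if not any(BRACKETED_CUE.search(ln) for ln in lines):
        flags.append("no bracketed cues found; check for missing sound or visual descriptions")
    return flags
```

A clean result only says the conventions are present somewhere in the text. Whether the descriptions actually cover what happens on screen still takes a person watching the video.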

## A Realistic Workflow

For a typical 10-minute corporate video:

1. Extract the transcript: paste the video URL into SubGrab and get a timestamped transcript in under a minute. Cost: 1 credit ($0.66 on the Starter pack, or free with one of the 2 credits you get on signup).

2. Add speaker labels: if your video has multiple speakers, edit the transcript to mark who's speaking. SubGrab's transcript export is plain text or SRT, so this is fast in any text editor.

3. Add non-speech audio cues: listen back once and bracket important sounds, such as "[applause]" or "[laughter]".

4. Add visual descriptions: add bracketed descriptions for anything important that's purely visual, such as "[Speaker holds up Product X box]", "[Chart appears showing 40% growth]", or "[On-screen text: $2.1M raised]".

5. Publish the transcript on the page that hosts the video, in a `<details>` element or below the player. Search engines will index it.

6. Upload the SRT as captions to whichever platform hosts the video.

The whole flow for a 10-minute video takes about 30 minutes the first time, less once you have a template.
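
The publishing step can be scripted once the SRT has been edited. The sketch below assumes a standard SRT layout (cue number, timing line, then text, with blank lines between cues). The helper names `srt_to_paragraphs` and `transcript_details` are hypothetical and not part of SubGrab.

```python
import html
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,000".
TIMING = re.compile(r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")


def srt_to_paragraphs(srt_text: str) -> list[str]:
    """Drop cue numbers and timing lines, returning one string per cue."""
    paragraphs = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if lines and lines[0].isdigit():
            lines = lines[1:]  # cue counter
        if lines and TIMING.search(lines[0]):
            lines = lines[1:]  # timing line
        if lines:
            paragraphs.append(" ".join(lines))
    return paragraphs


def transcript_details(paragraphs: list[str], summary: str = "Transcript") -> str:
    """Wrap the transcript in a <details> element, escaping any HTML in the text."""
    body = "\n".join(f"  <p>{html.escape(p)}</p>" for p in paragraphs)
    return f"<details>\n  <summary>{html.escape(summary)}</summary>\n{body}\n</details>"
```

A native `<details>`/`<summary>` pair is a reasonable default because browsers make the summary focusable and toggleable from the keyboard without extra scripting.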

## The Common Mistakes That Fail an Audit

In our experience helping users prepare for accessibility audits, these are what auditors flag most often:

- Auto-generated captions used as-is. YouTube and Vimeo auto-captions don't include speaker IDs or sound cues, and they're typically only ~85% accurate on conversational speech. Published without review, they fail 1.2.2.
- Transcript hidden behind a "Show transcript" button that can't receive keyboard focus. The transcript has to be reachable by keyboard navigation.
- Missing visual information. The most common transcript mistake is including all the dialogue but none of the on-screen demonstrations or charts.
- No transcript at all on a page that has video. Unless the video has an audio description track, this is a 1.2.3 violation even if the video has captions.

## Why a Transcript Beats Audio Description for Most Sites

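WCAG 1.2.3 lets you choose: audio description or a media alternative (a transcript). (At Level AA, 1.2.5 separately asks for audio description when a video's visuals carry information the soundtrack doesn't; the comparison here is about satisfying 1.2.3.) For a small or medium site, transcripts win on every dimension: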

| | Audio description | Transcript |
|---|---|---|
| Cost per minute | $5–$15 (recorded by a voice actor) | ~$0.07 (SubGrab) + your editing time |
| Turnaround | Days (recording, mixing) | Minutes |
| SEO value | None | Full-text indexed by Google |
| Updates when video changes | Re-record entire track | Edit the text |
| Works for users on mobile data | Increases video size | Tiny text file |
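
The per-minute figures in the table can be reproduced from numbers quoted in this guide. The sketch below assumes that one credit covers the whole 10-minute video from the workflow example. How credits scale with video length is not stated here, so treat the function as illustrative arithmetic rather than SubGrab's pricing model.

```python
def cost_per_minute(credit_price_usd: float, credits_used: int, video_minutes: float) -> float:
    """Extraction cost per minute of video, excluding your own editing time."""
    return credit_price_usd * credits_used / video_minutes


def audio_description_range(video_minutes: float, low: float = 5, high: float = 15) -> tuple:
    """Total cost range for audio description at the per-minute rates in the table."""
    return (low * video_minutes, high * video_minutes)


# Figures from this guide: 1 credit at $0.66 for a 10-minute video.
transcript_rate = cost_per_minute(0.66, 1, 10)       # roughly $0.07 per minute
description_total = audio_description_range(10)      # total for the same 10-minute video
```

For the 10-minute example, that is well under a dollar for extraction against $50 to $150 for audio description, before counting editing time on the transcript side.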

The only case where audio description wins is when the video is the *primary* user experience and reading isn't a substitute — high-end marketing videos, art films, or training where the visual demo is the whole point.

## A Note on AI-Generated Transcripts

SubGrab uses OpenAI's Whisper model (specifically whisper-large-v3-turbo), which currently runs at 95–98% accuracy on clear English speech. That's better than YouTube's auto-captions, but it's not 100%.

For WCAG compliance, you must review and correct the transcript before publishing. The Whisper output is a high-quality starting point that saves you from typing — it's not a finished compliance artifact on its own.

The good news: review takes about 2 minutes per minute of video, vs. 8–10 minutes per minute for typing from scratch.
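
To budget editing time for a specific video, the two rates above can be applied directly. This is a back-of-the-envelope sketch: the function name `effort_estimate` is made up for illustration, and the rates are the rough figures from this guide, not measurements.

```python
REVIEW_RATE = 2           # minutes of review per minute of video
TYPING_RATE_RANGE = (8, 10)  # minutes of typing per minute of video


def effort_estimate(video_minutes: float) -> dict:
    """Estimated working minutes for reviewing a draft versus typing from scratch."""
    low, high = TYPING_RATE_RANGE
    return {
        "review": video_minutes * REVIEW_RATE,
        "typing_low": video_minutes * low,
        "typing_high": video_minutes * high,
    }


estimate = effort_estimate(10)
# For a 10-minute video: 20 minutes of review versus 80 to 100 minutes of typing.
```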

## Getting Started

If you have a backlog of videos to make accessible:

1. List the videos in priority order (highest-traffic first)

2. For each, run it through SubGrab and download the SRT

3. Edit for speaker labels, sound cues, and visual descriptions

4. Publish the transcript on the page; upload the SRT as captions to the host

A small team can clear 10–20 videos per week this way once the workflow is set up.
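
The prioritisation in step 1 can be sketched as a sort followed by weekly batching. The input shape (title plus a traffic number) and the function name `plan_backlog` are assumptions for illustration. Use whatever traffic metric your analytics tool reports.

```python
def plan_backlog(videos: list[tuple[str, int]], per_week: int) -> list[list[tuple[str, int]]]:
    """Order videos by traffic, highest first, and split them into weekly batches."""
    if per_week < 1:
        raise ValueError("per_week must be at least 1")
    ordered = sorted(videos, key=lambda video: video[1], reverse=True)
    return [ordered[i:i + per_week] for i in range(0, len(ordered), per_week)]


# Hypothetical backlog: (title, monthly views).
backlog = [("Onboarding", 120), ("Product demo", 4300), ("Annual report", 950)]
weeks = plan_backlog(backlog, per_week=2)
# Week 1 holds the two highest-traffic videos; week 2 holds the remainder.
```

With 10 to 20 videos per week as the batch size, the length of the returned list is the number of weeks the backlog should take.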

Try it free with two credits, no credit card required.