Convert podcast audio into YouTube-ready video with waveform, transcript captions, chapter markers, and thumbnail strategy. Workflow and gotchas.
Podcasters who don't upload to YouTube are leaving 30-50% of their addressable audience on the table. YouTube has become the second-largest podcast platform globally, ahead of Apple Podcasts in some markets — but it punishes audio-only uploads. A static-image-with-waveform video is the minimum acceptable format; everything else (chapters, captions, search-optimized titles, thumbnails) is what actually moves discovery.
The technical conversion is the easy part. A 60-minute MP3 + a 1920x1080 still image + auto-captions = a publishable YouTube video. The harder part is the editorial layer: chapter markers, search-optimized titles, and thumbnails that compete with native YouTube content for click-through. This workflow covers both.
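As a rough illustration of the conversion step, the sketch below builds an ffmpeg command that loops a still image, draws an animated waveform from the audio, and muxes both into an MP4. It assumes ffmpeg is installed with libx264 and its `showwaves` and `overlay` filters. The function name, file names, waveform height, and bitrate are illustrative choices, not settings from this guide.

```python
def build_waveform_video_cmd(audio_path, image_path, out_path,
                             width=1920, height=1080, wave_height=200):
    """Return an ffmpeg argument list: still image + waveform + podcast audio.

    All sizes and codec settings here are assumptions chosen for illustration.
    """
    filter_graph = (
        # Scale the still image to the target frame size.
        f"[0:v]scale={width}:{height}[bg];"
        # Draw the audio as a line waveform strip.
        f"[1:a]showwaves=s={width}x{wave_height}:mode=line:colors=white[wave];"
        # Pin the strip to the bottom edge; stop when the audio runs out.
        f"[bg][wave]overlay=0:H-h:shortest=1[v]"
    )
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image_path,  # input 0: looped still image
        "-i", audio_path,                # input 1: podcast audio
        "-filter_complex", filter_graph,
        "-map", "[v]", "-map", "1:a",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-b:a", "192k",
        "-shortest",
        out_path,
    ]


if __name__ == "__main__":
    import shlex
    # Print the command so it can be inspected before running it.
    print(shlex.join(build_waveform_video_cmd("episode.mp3", "cover.png", "episode.mp4")))
```

Building the command as a list (rather than one shell string) lets it be passed directly to `subprocess.run` without quoting problems in file names.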
Podcasters with even a modest YouTube presence get the highest compounding return from the podcast-to-YouTube pair. YouTube search keeps surfacing an episode for years, while podcast-app discovery happens mostly in the first week. The cross-post captures long-tail traffic the original audio never sees.
Podcast source is typically MP3 or WAV, 30-90 minutes, stereo. Most podcasters have show notes and a transcript (or can generate one cheaply). The show notes feed the YouTube description; the transcript becomes the SRT side-car.
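A transcript with timings can be turned into an SRT side-car with a few lines of code. The sketch below assumes the transcript is already available as `(start_seconds, end_seconds, text)` tuples; that input shape and both function names are hypothetical, so adapt them to whatever your transcription tool exports.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Convert (start_seconds, end_seconds, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: sequence number, time range, caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 2.5, "Welcome to the show."), (2.5, 6.0, "Today's guest joins us.")]
    print(segments_to_srt(demo))
```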
YouTube long-form, 16:9, up to 12 hours. SRT side-car captions accepted. Chapters supported via timestamped lines in the description. Thumbnails are the single biggest CTR lever.
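Chapter lines for the description are easy to generate and sanity-check in code. The sketch below assumes chapters are given as `(start_seconds, title)` pairs and enforces the rules this guide mentions (first chapter at 00:00, at least five chapters) plus strictly ascending order. The function names and the `min_chapters` parameter are illustrative, and YouTube's own minimums may differ from the five used here.

```python
def chapter_timestamp(seconds):
    """Format seconds as MM:SS, or H:MM:SS once past the one-hour mark."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def chapter_lines(chapters, min_chapters=5):
    """Turn (start_seconds, title) pairs into description lines.

    min_chapters defaults to the "5+" recommendation in this guide.
    """
    if len(chapters) < min_chapters:
        raise ValueError(f"need at least {min_chapters} chapters")
    starts = [start for start, _ in chapters]
    if starts[0] != 0:
        raise ValueError("first chapter must start at 00:00")
    if any(later <= earlier for earlier, later in zip(starts, starts[1:])):
        raise ValueError("chapter start times must be strictly ascending")
    return "\n".join(f"{chapter_timestamp(start)} {title}" for start, title in chapters)


if __name__ == "__main__":
    print(chapter_lines([
        (0, "Intro"), (95, "Guest background"), (600, "Main topic"),
        (1200, "Listener questions"), (3700, "Wrap-up"),
    ]))
```

Paste the resulting lines into the video description, one chapter per line.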
| Issue | Fix |
|---|---|
| YouTube auto-captions mangle guest names | Upload SRT side-car from your show transcript. |
| Default podcast cover art as thumbnail = 60% CTR loss | Design a custom thumbnail with guest face + callout text. |
| No chapter markers = no chapter chips, lower watch-time | Add 5+ timestamped chapters in description, first at 00:00. |
| Podcast episode title under-performs YouTube search | Rewrite for search intent; lead with the value, not the episode number. |
| 90-minute static-image video tanks engagement | Add waveform animation or section b-roll cuts at minimum. |
| Audio mastered at podcast-app levels plays quieter than native YouTube content | Re-master to -14 LUFS for YouTube vs -16 LUFS for podcast apps. |
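For the loudness row above, one possible approach is ffmpeg's `loudnorm` filter, sketched below as a command builder. It assumes ffmpeg is installed. The -14 LUFS target comes from this guide, while the true-peak ceiling (-1.5 dBTP), loudness range (11 LU), 48 kHz output rate, and function name are assumptions. This is a single-pass run; a two-pass measurement is generally more accurate.

```python
def build_loudnorm_cmd(src, dst, target_lufs=-14.0,
                       true_peak_db=-1.5, loudness_range=11.0):
    """Return an ffmpeg argument list that re-masters src to target_lufs.

    true_peak_db and loudness_range are assumed defaults, not values from the guide.
    """
    audio_filter = (
        f"loudnorm=I={target_lufs}:TP={true_peak_db}:LRA={loudness_range}"
    )
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-af", audio_filter,
        "-ar", "48000",  # set an explicit output sample rate after filtering
        dst,
    ]


if __name__ == "__main__":
    import shlex
    # YouTube master at -14 LUFS; pass target_lufs=-16.0 for the podcast-app master.
    print(shlex.join(build_loudnorm_cmd("episode.wav", "episode_youtube.wav")))
```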
Following the workflow above by hand: trimming, reframing, captioning, writing copy, publishing.
Paste the source URL or upload the file. Kompozy handles transcript, scoring, reframe, captions, copy, and publish.
A video track is required: YouTube rejects pure audio, so a static image + waveform is the minimum format.
Episodes can also be cut into short clips. Each podcast episode typically yields 8-15 standalone moments suitable for Shorts/Reels/TikTok. Cross-post those separately.
Target -14 LUFS for YouTube and -16 LUFS for podcast apps. Re-master if you're uploading the same file to both.
Chapters are worth adding: they unlock YouTube's in-player chapter chips, which measurably lift watch-time and session retention.
Upload the audio and Kompozy generates the waveform video, exports the transcript as SRT, writes chapters from transcript breakpoints, drafts a search-optimized title, and designs a thumbnail. Expect about 5 minutes per episode.