How to clip podcasts for social (manual vs AI clip detection, 2026)
Clip podcast episodes into shareable short-form videos. Covers manual clip selection, Opus Clip / Submagic AI clip detection, and how to pick which approach for your podcast type.
Last verified 2026-05-22
Clipping podcasts means turning a 30-90 minute episode into 6-15 standalone 30-60 second clips that can each function as a TikTok, Reel, or Short. The clip set is typically a podcaster's entire short-form output — most podcasters stopped recording original short-form content years ago in favor of clipping the conversations they were having anyway.
Two approaches dominate in 2026: manual clip selection (you scrub the timeline, find the moments, cut them) and AI clip detection (an algorithm scans the transcript / video and surfaces moments scored by virality heuristics). Opus Clip and Submagic are the major AI tools; Descript and CapCut are the major manual tools.
Which approach wins depends on the podcast type. Interview-heavy podcasts with clear narrative beats work well with AI detection. Conversational / improvisational podcasts where the gold is in subtle context shifts tend to require manual selection because AI tools miss subtext.
The steps
Decide between manual and AI clip selection. AI clip detection (Opus Clip — opus.pro, Submagic, Vidyo.ai) takes a full episode video and outputs 8-15 ranked clip candidates with timestamps, captions, and proposed hooks within minutes. Works well for clear-narrative podcasts (Q&A interviews, single-topic deep dives). Manual selection (CapCut, Descript) is slower but catches subtle moments that AI scoring misses — best for conversational podcasts, improvisational shows, and content where the value is in unexpected moments.
Manual method — transcribe first. For manual clipping, transcribe the episode using Whisper, AssemblyAI, or Descript (built-in). Reading the transcript is roughly 3-6x faster than scrubbing video: you can spot clip candidates in 15-20 minutes of reading versus 60-90 minutes of scrubbing. Highlight 15-25 candidate moments in the transcript before opening the video editor.
Manual method — cut tight with Descript or CapCut. In Descript, the transcript IS the timeline — delete words to cut video. For each candidate moment, position the playhead at the start, select to the end, and cut. Aim for clips with a strong opening (the speaker hits a hook in the first 3-5 seconds) and a clean ending (no trailing words from the next thought). Export each clip as a separate MP4.
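If you would rather script the cuts than make them by hand, the manual method can be sketched in Python. This assumes the transcript is available as Whisper-style segments (dicts with start, end, and text keys) and that the ffmpeg command-line tool is installed. Candidate, candidate_from_segments, fits_target_length, and ffmpeg_cut_args are hypothetical helpers for illustration, not part of Descript or CapCut.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str    # hypothetical name, reused as the output file name
    start: float  # seconds from the start of the episode
    end: float


def candidate_from_segments(segments, first, last, label):
    """Span from the start of segment `first` to the end of segment `last`."""
    return Candidate(label, segments[first]["start"], segments[last]["end"])


def fits_target_length(clip, shortest=30.0, longest=60.0):
    """True when the clip falls inside the 30-60 second target range."""
    return shortest <= clip.end - clip.start <= longest


def ffmpeg_cut_args(source, clip, out_dir="clips"):
    """Build (but do not run) an ffmpeg command that re-encodes one clip."""
    duration = clip.end - clip.start
    return [
        "ffmpeg", "-y",
        "-ss", f"{clip.start:.2f}",   # seek to the clip start in the input
        "-i", source,
        "-t", f"{duration:.2f}",      # keep this many seconds
        "-c:v", "libx264", "-c:a", "aac",
        f"{out_dir}/{clip.label}.mp4",
    ]
```

To run a cut, pass the list to subprocess.run(args, check=True) after creating the output directory. Re-encoding rather than stream-copying is a deliberate choice here: it lets the cut land on the exact word boundary instead of the nearest keyframe.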
AI method — upload to Opus Clip. In Opus Clip, click New Project, upload your episode MP4, and configure: clip length (30-60s for most platforms; up to 90s for YouTube Shorts), aspect ratio (9:16 default), and template (background blur, captions, overlay style). Opus runs its ClipAnything algorithm, scores moments by retention shape + emotional intensity + speaker diarization, and outputs 8-15 ranked clips within 10-30 minutes. Pricing starts around $19/mo.
AI method — review and override. AI clip detection is not perfect. Review every suggested clip and override: discard clips that miss context (the speaker is responding to something said 30 seconds earlier that the clip omits), trim clips that include filler at the start or end, and accept clips that are genuinely strong. Most podcasters override 30-50% of AI clip outputs.
Add captions to every clip. Whether manual or AI-generated, every clip needs captions — 70%+ of short-form viewing is sound-off, and a clip without captions earns ~30% of the watch time of the same clip with captions. Submagic, CapCut, and Opus Clip all generate captions; pick one preset and apply it to every clip in the batch for visual brand consistency.
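If you already have a timestamped transcript of the episode, caption timing for a clip can be derived from it instead of re-transcribing. A minimal sketch, assuming Whisper-style segments (start, end, text) and the SubRip (SRT) caption format; srt_timestamp and clip_srt are illustrative names, and whether your editor imports SRT files is something to check in its own documentation.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    ms = max(0, round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def clip_srt(segments, clip_start, clip_end):
    """Return SRT text for one clip, with times relative to the clip start."""
    blocks = []
    for seg in segments:
        # Skip segments that do not overlap the clip at all.
        if seg["end"] <= clip_start or seg["start"] >= clip_end:
            continue
        # Trim partially overlapping segments to the clip boundaries.
        start = max(seg["start"], clip_start) - clip_start
        end = min(seg["end"], clip_end) - clip_start
        blocks.append(
            f"{len(blocks) + 1}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

Segment-level captions are coarse; the word-by-word styles common in short-form video need word-level timestamps, which this sketch does not attempt.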
Add a hook overlay if the spoken opening is weak. If the speaker takes 5+ seconds to land the hook, overlay a text hook on frame 1 ("Why I quit Twitter after 8 years…") that summarizes the clip's payoff. The hook overlay keeps the viewer through the slow opening so the spoken content can land. See write-viral-hooks for the framework.
Schedule clips with deliberate stagger. A 60-minute podcast typically yields 8-15 clips. Do not publish all of them in the same week; spread them across 3-4 weeks so the episode keeps generating impressions and the audience does not tire of it. Pair the clips with the full episode publish: clips 1-3 in week 1 (around the episode drop), clips 4-8 in weeks 2-3 (sustaining), clips 9-15 in week 4 (final push before the next episode).
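The staggered plan in the last step can be turned into concrete dates. The sketch below is one possible reading of it: clips are spread evenly inside each window, which is an assumption, since the step only says which weeks each group belongs to. WINDOWS and stagger_schedule are hypothetical names.

```python
from datetime import date, timedelta

# (first clip, last clip, day offset of the window, window length in days)
WINDOWS = [
    (1, 3, 0, 7),     # week 1, around the episode drop
    (4, 8, 7, 14),    # weeks 2-3, sustaining
    (9, 15, 21, 7),   # week 4, final push
]


def stagger_schedule(episode_drop, n_clips):
    """Return (clip number, publish date) pairs for up to 15 clips."""
    schedule = []
    for first, last, offset, days in WINDOWS:
        clips = [c for c in range(first, last + 1) if c <= n_clips]
        for i, clip in enumerate(clips):
            # Space the clips evenly across the window.
            day = offset + (i * days) // len(clips)
            schedule.append((clip, episode_drop + timedelta(days=day)))
    return schedule
```

For example, stagger_schedule(date(2026, 6, 1), 15) places clip 1 on the drop date and the last clip 27 days later, so the whole set fits inside four weeks.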
Common gotchas
AI clip tools score by surface-level signals (energy, keyword density, speaker diarization). They miss subtle context and improvisational gold. Always review.
Clips taken out of context can misrepresent the conversation. Verify each clip stands on its own meaning before publishing.
Episodes with multiple overlapping speakers (crosstalk is common in interviews) confuse AI clip tools more than single-speaker monologues do.
Opus Clip and similar tools charge per minute of input video. A weekly 90-minute podcast = ~360 input minutes/month, which puts most podcasters on the $19-49/mo plans.
Publishing all clips in the first week wastes the episode's long tail. Spread across 3-4 weeks.
Vertical aspect ratio (9:16) usually requires re-framing the original 16:9 podcast footage. Most clip tools auto-track the active speaker; manually verify the framing on each clip.
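The re-framing in the last gotcha comes down to choosing a crop rectangle. A sketch, assuming a single fixed horizontal position for the speaker (real clip tools track the speaker over time) and ffmpeg's crop filter syntax crop=w:h:x:y; vertical_crop is a hypothetical helper.

```python
def vertical_crop(src_w, src_h, center_x):
    """Return an ffmpeg crop filter for a 9:16 window centered on center_x."""
    # Full source height; width follows from the 9:16 ratio, rounded down
    # to an even number because common H.264 settings need even dimensions.
    crop_w = (src_h * 9 // 16) // 2 * 2
    # Center on the speaker, but keep the window inside the frame.
    x = center_x - crop_w // 2
    x = max(0, min(x, src_w - crop_w))
    return f"crop={crop_w}:{src_h}:{x}:0"
```

For 1920x1080 footage the window is 606 pixels wide, under a third of the frame, which is why a speaker sitting off-center can easily end up partly outside it.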
Where Kompozy fits
Kompozy clips podcasts as part of its automation pipeline. Connect your podcast RSS feed or drop in an episode MP4 / MP3; the engine transcribes, scores moments, generates 8-15 clip candidates with captions and hook overlays, and cross-posts to TikTok / Reels / Shorts / LinkedIn on a staggered schedule.
For a podcaster shipping 4+ episodes per month, Kompozy collapses the Opus + Submagic + CapCut + Buffer chain into a single configured workflow. For 1-2 episodes per month, Opus Clip ($19-49/mo) + Submagic ($25-45/mo) is fine — Kompozy would not save enough time to justify itself at that volume. Pro tier ($299/mo for 18,000 credits) covers ~4-6 podcast episodes per month including caption + hook + cross-platform fanout.
Frequently asked questions
Is Opus Clip or Submagic better for podcast clipping?
Opus Clip is more specialized for podcast / long-form video clip detection — its ClipAnything algorithm is the most-cited in 2026. Submagic is more focused on caption styling and works well on already-cut clips. Many podcasters use Opus for detection and Submagic for caption polish.
How many clips should I get from a 60-minute episode?
8-15 clips is typical. Fewer than 6 suggests the episode lacks short-form-friendly moments (consider tightening the episode's pacing); more than 15 suggests you are accepting low-quality clips that will underperform.
Can I clip a podcast that is audio-only (no video)?
Yes, but you need to build a visual track. Audiogram tools such as Headliner and Wavve pair the audio with a waveform, captions, and a static or animated background. Audiograms underperform actual video clips on TikTok and Reels but work fine on Twitter and LinkedIn.
Should I record podcast video specifically for clipping?
Yes if short-form is a meaningful traffic source. Two-camera setups (close-up + medium) give clip tools more visual variety to work with. Single-camera works but produces visually flatter clips.
Do clips link back to the full episode?
They should. Include a "Full episode in bio" CTA in the caption and link to the episode in your profile. Most podcasters see 8-15% click-through from clip viewers to full-episode listens, which is the conversion the whole workflow is optimizing for.
Can I clip a podcast I do not host?
Only with permission. Even if the podcast is public, the host or network typically owns the copyright, and clipping and posting it under your account is generally infringement. Some podcasts explicitly allow clipping with attribution; check the show notes or contact the host.