Generative AI B-roll vs stock libraries vs filming your own — the real cost, quality, and use-case math for 2026. The 70/30 mix rule, the generative workflow that avoids the uncanny look, verified per-second pricing, and the hidden costs nobody budgets for.
In 2026, the right B-roll source depends on the shot. Use free stock (Pexels, Mixkit, Coverr) for roughly 70% of cutaways — generic city, nature, office, and food shots where filmed footage looks better than rendered and costs nothing. Use generative video (Runway Standard $12/mo, Kling ~$10/mo, Pika ~$8/mo) for the ~30% of shots that need a specific subject, abstract concept, or brand-distinct moment no stock library carries. Generative-only edits look uncanny because the AI tells stack up across shots; stock-only edits look generic because competitors pull the same clips. The 70/30 mix wins on both authenticity and differentiation. Budget $0.04-0.30 per finished generative second and 1.4-1.8 generations per usable shot.
B-roll is the connective tissue of every short-form video — the cutaways between voiceover beats, the visual interest that holds attention while the narration carries the argument. Before generative AI, B-roll meant one of three things: a stock subscription (Pexels, Storyblocks, Artgrid), a camera and an afternoon, or a motion designer and an invoice. In 2026 a fourth option is real: text-to-video models that render shots which do not exist in any library — a specific product on a specific surface, an abstract concept made literal, a brand-styled moment shot to match a script beat exactly.
The naive question is "generative or stock?" The operator question is "which source wins which shot, and how do I mix them without the edit looking either commoditized or synthetic?" This spoke answers both with verified 2026 pricing, a three-way cost-quality-use-case table, the generative workflow that actually ships, and the hidden costs that turn a cheap-looking $8/mo tool into a real line item. For the model-by-model deep dive behind the providers named here, see [text-to-video-tools-2026](/ai-video-generation/text-to-video-tools-2026); for the full faceless pipeline that consumes this B-roll, see [faceless-video-creation](/ai-video-generation/faceless-video-creation). Third-party prices verified 2026-06-17.
B-roll has three supply chains in 2026 and they are not interchangeable. Stock libraries give you real footage at near-zero marginal cost but zero specificity — you write your script around what exists. Filming gives you exact, authentic, defensible footage at the cost of time, gear, and a location. Generative gives you any shot you can describe at a per-second price, but with consistency drift across shots and a resolution ceiling. Most creators default to one source and pay for it in either generic-looking edits or blown budgets. The honest mapping, with verified 2026 pricing:
| Source | Cost | Quality ceiling | Specificity | Best for |
|---|---|---|---|---|
| Free stock (Pexels, Mixkit, Coverr) | $0 | Real-camera, up to 4K | Low — only what was filmed | Generic context: city, nature, office, food, hands-on-keyboard |
| Paid stock (Storyblocks, Artgrid) | $30-65/mo | Real-camera, 4K, broader catalog | Low-medium — larger but still generic | Higher-volume channels needing variety and commercial clearance |
| Generative (Runway, Kling, Pika) | $8-76/mo + per-second compute | 1080p typical, drifts across shots | High — render to match any beat | Specific, abstract, or brand-distinct shots that do not exist in stock |
| Self-filmed | Time + gear + location | Whatever your camera shoots | Exact | Hero shots, product realism, anything authenticity-critical |
Read down the table and the strategy writes itself: no single source covers a real edit. A 60-second short might pull eight stock cutaways, render two generative shots for the moments stock cannot supply, and intercut one filmed product shot. Routing each beat to the right source is the entire skill — and it is what the 70/30 rule formalizes below.
Stock outperforms generative more often than the AI hype admits, because most B-roll is contextual rather than specific. Stock wins when:
The default posture should be stock-first. Reach for generative only when a specific beat cannot be served by anything in the library — which is exactly the 30% the next section defines.
Generative earns its cost on the shots stock structurally cannot supply. It wins when:
Provider fit matters here. Per the model deep-dive in [text-to-video-tools-2026](/ai-video-generation/text-to-video-tools-2026): Runway Gen-4 leads on filmic, cinematic B-roll; Kling 2.0 leads on motion fluidity for action cutaways; Pika 2.5 leads on stylized and abstract motion. For a single starter tool, Runway Standard ($12/mo) covers the widest range of B-roll shots; add Pika (~$8/mo) or Kling (~$10/mo) when stylized or high-motion beats become frequent. The shot-type-to-provider routing, with verified entry pricing:
| B-roll shot type | Primary pick | Secondary pick | Why |
|---|---|---|---|
| Cinematic / filmic context | Runway Standard ($12/mo) | Kling (~$10/mo) | Runway's filmic look and camera controls dominate slow, atmospheric cutaways. |
| High-motion / action cutaway | Kling (~$10/mo) | Runway Standard | Kling's motion fluidity holds together fast movement that other models smear. |
| Stylized / 2D / motion graphic | Pika (~$8/mo) | Runway Standard | Pika's effects library is purpose-built for stylized and abstract motion. |
| Abstract concept made literal | Runway Standard ($12/mo) | Pika (~$8/mo) | Concrete prompts with one motion verb render most reliably on Runway. |
| Animated still (photo to motion) | Luma Plus ($30/mo) | Runway Standard | Luma's reference-image keyframe lock is the cleanest still-to-motion path. |
| Product close-up (specific item) | Runway Standard ($12/mo) | Kling (~$10/mo) | Stock rarely carries your exact product; render to match, then grade. |
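
The routing table above can be written down as a small lookup. This is only a sketch: the shot-type keys and the `pick_provider` helper are hypothetical names chosen for illustration, not part of any provider's API, and the picks simply mirror the table rows.

```python
from typing import Iterable, Optional

# Hypothetical lookup mirroring the routing table: (primary pick, secondary pick).
ROUTING = {
    "cinematic": ("Runway Standard", "Kling"),
    "high_motion": ("Kling", "Runway Standard"),
    "stylized": ("Pika", "Runway Standard"),
    "abstract_concept": ("Runway Standard", "Pika"),
    "animated_still": ("Luma Plus", "Runway Standard"),
    "product_closeup": ("Runway Standard", "Kling"),
}


def pick_provider(shot_type: str, available: Iterable[str]) -> Optional[str]:
    """Return the primary pick if you subscribe to it, else the secondary, else None."""
    subscribed = set(available)
    for provider in ROUTING.get(shot_type, ()):
        if provider in subscribed:
            return provider
    return None


# With only a Runway subscription, a high-motion beat falls back to the secondary pick.
print(pick_provider("high_motion", ["Runway Standard"]))
```

A `None` result is a useful signal in its own right: it marks a beat that neither pick covers with your current subscriptions, so it is a candidate for stock or a filmed shot instead.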
The pattern most growing creator channels converge on after burning a few hundred dollars learning it the hard way:
Why the split works in both directions: a 100%-stock channel is visually indistinguishable from every competitor pulling the same Pexels clips — the audience has seen that exact drone shot of a city at dusk a hundred times. A 100%-generative channel reads uncanny, because the model's tells (warped hands, drifting backgrounds, off physics, inconsistent subjects) compound shot over shot until the whole edit feels synthetic. The 70/30 mix keeps stock's real-camera authenticity as the visual baseline and spends generative only where differentiation actually pays — which also happens to be where the editing tax is worth absorbing.
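
A quick way to keep an edit near the 70/30 split is to count shots by source before rendering anything. The sketch below assumes each shot in your list carries a source label (`"stock"`, `"generative"`, `"filmed"`); the function names and the default 30% cap are illustrative choices, not a fixed standard.

```python
from collections import Counter


def source_mix(shots: list[str]) -> dict[str, float]:
    """Return each source's share of the shot list as a fraction of 1."""
    total = len(shots)
    if total == 0:
        return {}
    return {source: n / total for source, n in Counter(shots).items()}


def over_generative_cap(shots: list[str], cap: float = 0.30) -> bool:
    """True when generative shots make up more than the cap (30% by default)."""
    return source_mix(shots).get("generative", 0.0) > cap


# The 60-second example: eight stock cutaways, two generative shots, one filmed shot.
example = ["stock"] * 8 + ["generative"] * 2 + ["filmed"]
print(source_mix(example))
print(over_generative_cap(example))
```

This counts by shot rather than by screen time. If your generative shots run longer than your stock cutaways, weighting by duration may be the more honest measure.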
The difference between generative B-roll that ships and generative B-roll that wastes an afternoon is process discipline. The workflow that works:
The final step, grading every clip to one shared look, is the one most creators skip, and skipping it is the single biggest tell of an unpolished AI edit. A mismatched color grade between a Pexels clip and a Runway render is more noticeable than the render itself.
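
One way to put stock and generative clips under a single grade is ffmpeg's `lut3d` filter. The sketch below only builds the command lines and does not run them. The clip names, the `brand.cube` LUT file, and the `graded` output folder are placeholders, and it assumes ffmpeg is installed if you choose to execute the commands.

```python
from pathlib import Path


def lut_command(clip: str, lut: str, out_dir: str = "graded") -> list[str]:
    """Build an ffmpeg command that applies one 3D LUT and copies the audio stream."""
    out_path = Path(out_dir) / Path(clip).name
    return [
        "ffmpeg", "-y",
        "-i", clip,
        "-vf", f"lut3d=file={lut}",  # same LUT for every clip, whatever its source
        "-c:a", "copy",
        str(out_path),
    ]


clips = ["pexels_city.mp4", "runway_product.mp4"]  # placeholder file names
commands = [lut_command(clip, "brand.cube") for clip in clips]
for cmd in commands:
    print(" ".join(cmd))
```

To execute them, create the output folder first and pass each list to `subprocess.run(cmd, check=True)`. Clips from different sources often need exposure or white-balance matching before a shared LUT lands the same way on each, so treat this as the last pass rather than the whole grade.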
The sticker price of a generative tool is the smallest part of its real cost. Budget for these before deciding generative is "cheap":
Kompozy is not a video model — it is the orchestration layer that calls the providers above on your behalf and assembles the result. The B-roll decision is handled inside the formats rather than left as nine separate API keys for the user to juggle. In Persona Shorts, B-roll pulls from Pexels by default (the free 70%); users can opt into generative B-roll, at which point the LLM extracts the shot intent from the script and routes it to Runway, Kling, or Luma depending on whether the beat is filmic, high-motion, or a reference-locked still — exactly the 70/30 routing logic this spoke describes, automated.
Kompozy pricing is independent of which model the orchestration layer picks: Creator $49/mo (2,500 credits) and Pro $299/mo (18,000 credits), with a BYO-key founding tier. A clipped short costs 14 credits and an AI-generated short 214 credits, so the B-roll-heavy generative path is the most credit-intensive format — which is the cost reality this spoke exists to make legible. See [pricing](/pricing) for the full per-format credit math, and [content-repurposing](/repurpose) for how B-roll feeds the broader multi-platform fan-out.
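
The credit figures above translate into a per-short cost with a few lines of arithmetic. This sketch uses only the plan prices and credit costs quoted in this section, and it assumes a month's credits are spent on a single format, which real usage rarely does; the function names are illustrative.

```python
# (USD per month, credits per month), as quoted above.
PLANS = {"Creator": (49, 2_500), "Pro": (299, 18_000)}
# Credits consumed per finished short, as quoted above.
CREDITS = {"clipped_short": 14, "ai_generated_short": 214}


def shorts_per_month(plan: str, fmt: str) -> int:
    """Whole shorts a plan's monthly credits cover if spent on one format."""
    _, credits = PLANS[plan]
    return credits // CREDITS[fmt]


def usd_per_short(plan: str, fmt: str) -> float:
    """Effective dollar cost of one short at the plan's price per credit."""
    price, credits = PLANS[plan]
    return round(price / credits * CREDITS[fmt], 2)


for plan in PLANS:
    for fmt in CREDITS:
        print(plan, fmt, shorts_per_month(plan, fmt), usd_per_short(plan, fmt))
```

On these numbers the Creator plan covers 11 AI-generated shorts or 178 clipped shorts a month, and an AI-generated short works out to about $4.19 on Creator versus about $3.55 on Pro.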
The honest limits matter as much as the wins, because believing generative can do more than it can is how creators ship edits that look off. As of mid-2026, generative B-roll still cannot reliably deliver real-world realism at extreme close-up (the texture and micro-motion tells appear), legible in-scene text, multi-shot subject continuity without disciplined reference workflows, or anything above its 1080p consumer ceiling. It also cannot replace the authenticity of a real filmed hero shot — for founder cameos, real customers, and product close-ups under two feet, the camera still wins.
Use generative for the 30% of shots that need specificity stock cannot provide, lean on free stock for the 70% that does not, film the handful of shots where authenticity is non-negotiable, and grade the whole thing to one LUT. That is the entire 2026 B-roll playbook — and it is cheaper, faster, and better-looking than committing to any single source. Start with [text-to-video-tools-2026](/ai-video-generation/text-to-video-tools-2026) to pick your generative provider, or [faceless-video-creation](/ai-video-generation/faceless-video-creation) for the full no-camera pipeline this B-roll plugs into.
Yes, for the ~30% of shots that need specificity stock cannot provide. For the ~70% of generic cutaways (city, nature, office, food), free Pexels footage delivers equivalent quality at zero cost. A generative subscription pays back if you produce 5+ videos a week with at least 5-10 specific, brand-distinct, or script-locked shots per video. Runway Standard ($12/mo) is the widest-coverage single tool; Pika (~$8/mo) and Kling (~$10/mo) add stylized and high-motion range.
There is no production-grade free generative-video tool in 2026 — free tiers carry watermarks and lag the paid models badly on quality. For genuinely free B-roll, the right answer is royalty-free stock: Pexels, Mixkit, and Coverr cover the generic 70% at zero cost and real-camera quality.
Budget $0.04-0.30 per finished 1080p second after the editing tax, on the value providers (Runway Gen-4 Turbo, Kling, Hailuo). For a 30% generative mix on a 60-second short (roughly 18 seconds of generative cutaways), that works out to about $0.70-5.40 of compute per video (18 × $0.04 at the low end, 18 × $0.30 at the high end), with stock supplying the rest for free. Per-second figures verified 2026-06-17.
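
The per-video figure is plain multiplication, so it is easy to rerun for your own video length and mix. A minimal sketch, assuming the $0.04-0.30 per finished second range quoted here already includes the editing tax (so retries are not multiplied in again); the function name and defaults are illustrative.

```python
def generative_cost(
    video_seconds: float,
    generative_share: float,
    low_rate: float = 0.04,
    high_rate: float = 0.30,
) -> tuple[float, float]:
    """Return the (low, high) compute cost in USD for the generative portion."""
    gen_seconds = video_seconds * generative_share
    return round(gen_seconds * low_rate, 2), round(gen_seconds * high_rate, 2)


# A 60-second short with a 30% generative mix is 18 generative seconds.
low, high = generative_cost(60, 0.30)
print(f"${low:.2f} to ${high:.2f}")
```

For the 60-second example this prints `$0.72 to $5.40`. Multiply by your weekly output to see whether per-second compute or the flat subscription is the larger line item.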
Yes — Pexels is royalty-free for commercial use with no attribution required. Paid libraries differ: Storyblocks requires a Business tier (around $30/mo+) for full commercial rights. Always confirm the license on any clip before shipping a monetized video.
Mix sources (Pexels + Mixkit + Coverr rather than one library), apply a unique shared color grade via a LUT so your edits do not look like everyone else's, and swap the stock clips competitors are most likely to reuse for generative shots, up to 20-30% of the edit. The color grade is the highest-leverage fix: it unifies the look and breaks the "same clip everywhere" pattern.
Yes — the leading 2026 models (Runway, Kling, Pika) support 9:16 1080x1920 output natively, which is the correct resolution for Shorts, Reels, and TikTok. Export at that resolution directly rather than rendering 16:9 and cropping, which wastes pixels and reframes awkwardly.
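
The cost of cropping is easy to quantify. The sketch below assumes the landscape render is 1920x1080, which is an assumption made here for illustration (the text only says 16:9), and computes how much of that frame survives a centre crop to 9:16 and how far the result must be enlarged to reach 1080x1920.

```python
from fractions import Fraction


def crop_to_vertical(src_w: int, src_h: int, target_w: int = 1080, target_h: int = 1920):
    """Crop a landscape frame to the target aspect at full height; report what survives."""
    aspect = Fraction(target_w, target_h)   # 9:16
    crop_w = src_h * aspect                 # widest 9:16 window that fits the source height
    kept = crop_w / src_w                   # share of source pixels retained
    upscale = Fraction(target_h, src_h)     # enlargement needed to reach the target height
    return float(crop_w), float(kept), float(upscale)


crop_w, kept, upscale = crop_to_vertical(1920, 1080)
print(f"crop width {crop_w:.1f}px, {kept:.1%} of pixels kept, {upscale:.2f}x upscale")
```

Under that assumption only a 607.5-pixel-wide strip survives, about 31.6% of the rendered pixels, and it then has to be scaled up roughly 1.78x to fill a 1080x1920 frame.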
Runway for filmic, cinematic shots and the widest general coverage. Kling for high-motion and action cutaways where motion fluidity matters. Pika for stylized, abstract, or 2D motion-graphic B-roll. For a single starter tool, Runway Standard ($12/mo) is the safest pick; add the others when stylized or high-motion beats become frequent. See [text-to-video-tools-2026](/ai-video-generation/text-to-video-tools-2026) for the full model breakdown.
Because the tells compound across shots. Each generative render drifts in subject, color, lighting, and style, so a sequence of individually-acceptable clips reads as synthetic in aggregate. The fixes: keep generative to ~30% of shots so real-camera stock carries the visual baseline, reuse tight prompts to limit drift, and apply one shared LUT across every clip so the mix grades as a single coherent edit.