Meta's multi-media ads let you upload up to 10 images and videos, then have its AI assemble and test combinations to find the winners. Here is what the format actually does, the disclosure rules you cannot skip, the creative practices that decide performance, and the supply problem the AI does not solve.
In June 2026 Meta published guidance on getting the most out of its AI-powered multi-media ads, the format that lets an advertiser upload a pool of assets and have Meta's system build the ad from them. Mechanically, you load up to 10 images and videos into a single ad, in mixed formats and aspect ratios, and Meta's AI sorts through them to generate and test multiple versions, picking the combinations most likely to perform for each placement and audience. The pitch from Meta is direct: "you don't need to create separate ads for different placements or audiences." You supply the raw material; the system handles the permutations.
It is worth being precise about what the AI is doing here, because the marketing language blurs it. This is largely a combination, cropping, and testing engine, not a creative-generation engine. It reframes your assets for different surfaces, mixes them, layers optional enhancements, and runs the variants against live performance to find winners. It does not conjure distinctive footage or a brand concept out of nothing — it optimizes the assets you give it. Meta attributes a roughly 25% increase in average revenue per ad since 2022 to this class of AI-driven ad serving, and the multi-media format is the current expression of that bet. The tools are rolling out in select regions through the updated creative workflow in Meta Ads Manager.
This page is the practical best-practice guide to that format and to AI-generated Meta creative generally: the disclosure rules that now ride along with it, the creative decisions that actually move performance, the controls worth keeping your hands on, and — the part the AI does not solve — where the creative supply comes from. For how this fits the wider industry move of ad generation shifting inside the platforms, see the guide on AI ad generation inside the ad platforms; for the cross-platform creative angle, the guide on AI ad creative generation for social platforms.
Before any creative advice, the compliance step, because it now gates whether your ad runs at all. In a March 2026 policy update, Meta made disclosure of AI-generated or AI-modified ad content mandatory — synthetic visuals, background replacements, AI voiceovers, and the like. Crucially, Meta does not rely only on your honesty: its systems scan ad submissions for C2PA Content Credentials embedded by tools such as DALL·E, Midjourney, and Stable Diffusion, as well as its own detection signals, and automatically apply a "Made with AI" label to creative flagged as photorealistic AI imagery. Advertisers cannot remove that label.
The operational consequences are concrete. Undisclosed AI content has become one of the more common reasons ads get rejected, so skipping disclosure is not a gray area you can quietly exploit — it is a rejection waiting to happen. Build disclosure into your QA: assume any model-generated or model-altered asset will be detected, label it proactively where the context calls for it, and keep a human approving final creative rather than letting an automated pipeline ship synthetic photorealism unreviewed. The defensible 2026 posture is to use AI for ideation, iteration, and production speed — not to manufacture deceptive realism — and to treat the label as a fact of the format rather than a penalty to dodge.
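As a rough illustration of building disclosure into QA, the gate below blocks any asset that a model touched but nobody disclosed, and any asset a person has not approved. It is a minimal sketch, not Meta's review logic: the `Asset` fields and the `qa_blockers` helper are hypothetical names for flags your own pipeline would have to track.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    ai_touched: bool      # generated or modified by a model (hypothetical flag)
    disclosed: bool       # AI disclosure was set for this asset
    human_approved: bool  # a person signed off on the final creative


def qa_blockers(asset: Asset) -> list[str]:
    """Return the reasons an asset should not ship yet (empty list = clear)."""
    blockers = []
    if asset.ai_touched and not asset.disclosed:
        blockers.append("AI content without disclosure")
    if not asset.human_approved:
        blockers.append("no human approval")
    return blockers
```

Running every asset through a check like this before upload makes a missing disclosure an internal failure rather than a platform rejection.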
The single most repeated best practice, and the one Meta states plainly, is that the more creative options you provide, the more opportunities the delivery system has to optimize. The multi-media format is built to be fed: it wants the image and video slots filled so it has permutations to test. An ad with two assets gives the AI almost nothing to optimize; one with a full, varied pool gives it room to find the winner you would never have guessed.
But volume without diversity is wasted. Ten near-identical crops of the same product photo do not give the system meaningfully different things to test — they give it the same ad ten times. The asset pool that performs is built from distinct creative angles: UGC-style clips that read as peer content rather than advertising, clean product demos, lifestyle shots, testimonial-style footage, and text-led explainer frames. Each angle is a different hypothesis about why someone buys, and the AI's job is to discover which hypothesis lands for which audience. You can only get that answer if you actually supplied competing hypotheses.
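One way to make "volume without diversity" measurable is to tag each asset with its creative angle and count the distinct tags before upload. The sketch below assumes you label assets yourself. The angle names, the `pool_report` helper, and the `min_angles` threshold of 3 are illustrative choices, not anything Meta specifies. Only the 10-asset limit comes from the format.

```python
from collections import Counter

MAX_ASSETS = 10  # pool size stated for the multi-media format


def pool_report(angles: list[str], min_angles: int = 3) -> dict:
    """Summarize an asset pool given one angle label per asset."""
    if len(angles) > MAX_ASSETS:
        raise ValueError(f"pool has {len(angles)} assets, limit is {MAX_ASSETS}")
    counts = Counter(angles)
    return {
        "assets": len(angles),
        "distinct_angles": len(counts),
        # min_angles is an assumed threshold, tune it to your own program
        "diverse_enough": len(counts) >= min_angles,
        # share of the pool taken by the most common angle
        "largest_share": max(counts.values()) / len(angles) if angles else 0.0,
    }
```

A pool of ten crops of one product photo reports a single angle with a `largest_share` of 1.0, which is the "same ad ten times" case in numbers.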
Diversity also buys durability. Creative fatigue is real on Meta — the same assets degrade as the audience sees them repeatedly, showing up as rising delivery costs and falling click-through over a week or two. A broad, varied pool that you refresh on a regular cadence keeps fresh signal flowing into the system instead of letting it grind down a stale set. The practical rhythm most performance teams settle on is a steady stream of new assets across several angles, retiring the tired ones, rather than one big creative drop followed by months of decay.
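The fatigue pattern described above, cost rising while click-through falls, can be checked with a simple comparison of the first and second half of a reporting window. This is a minimal sketch under stated assumptions: the daily CPM and CTR series are ones you export yourself, and the 15% thresholds are placeholders rather than recommended values.

```python
from statistics import mean


def is_fatigued(cpm: list[float], ctr: list[float],
                cost_rise: float = 0.15, ctr_drop: float = 0.15) -> bool:
    """Flag an asset whose cost rose and CTR fell between the two halves of a window."""
    if len(cpm) != len(ctr) or len(cpm) < 4:
        raise ValueError("need matching daily series of at least 4 days")
    half = len(cpm) // 2
    cpm_early, cpm_late = mean(cpm[:half]), mean(cpm[half:])
    ctr_early, ctr_late = mean(ctr[:half]), mean(ctr[half:])
    cost_up = cpm_late >= cpm_early * (1 + cost_rise)
    clicks_down = ctr_late <= ctr_early * (1 - ctr_drop)
    # Both signals must move together; one alone is usually just noise.
    return cost_up and clicks_down
```

Requiring both signals is a deliberate choice here: a cost increase on its own can come from auction pressure rather than a tired asset.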
Inside the asset pool, a handful of format-level practices consistently separate ads that work from ads that get scrolled past. None of them are Meta-specific magic; they are the physics of how the feed is actually consumed.
The opening of a video decides its scroll-stop rate — front-load the strongest visual or the value proposition before any branding or slow build. And design for silence: a large majority of feed video is watched without sound, so captions are not optional. Burn them in or use Meta's auto-captioning, and check legibility on a phone screen, because that is where the ad is seen.
The overwhelming share of Meta's ad inventory is vertical, so 9:16 is the primary canvas, with 4:5 a useful secondary for feed. Because the multi-media system reframes and crops assets across placements, keep your text and key visuals inside safe zones — anything pushed to the edges risks being cropped out when the AI reformats the asset for a placement you did not design it for. Composing with margin is what lets you hand the system cropping freedom without losing the message.
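The safe-zone arithmetic is easy to check before upload. The sketch below assumes a plain centered crop, which is a simplification, since how Meta actually reframes an asset is not specified here. `center_crop` and `survives` are hypothetical helper names, and all boxes are pixel coordinates in the form (left, top, right, bottom).

```python
def center_crop(width: int, height: int,
                ratio_w: int, ratio_h: int) -> tuple[int, int, int, int]:
    """Largest centered crop with the target aspect ratio, as (left, top, right, bottom)."""
    # Compare aspect ratios by cross-multiplying integers to avoid float error.
    if width * ratio_h > height * ratio_w:
        # Source is wider than the target: trim the sides.
        crop_w, crop_h = height * ratio_w // ratio_h, height
    else:
        # Source is taller than the target: trim top and bottom.
        crop_w, crop_h = width, width * ratio_h // ratio_w
    left, top = (width - crop_w) // 2, (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def survives(box: tuple[int, int, int, int],
             crop: tuple[int, int, int, int]) -> bool:
    """True if box lies fully inside the crop rectangle."""
    return (box[0] >= crop[0] and box[1] >= crop[1]
            and box[2] <= crop[2] and box[3] <= crop[3])
```

For a 1080×1920 vertical asset cropped to 4:5, `center_crop(1080, 1920, 4, 5)` returns `(0, 285, 1080, 1635)`, so under this assumption anything in the top or bottom 285 pixels is lost. A headline placed at the very top of the 9:16 frame would fail the `survives` check, while one placed 400 pixels down would pass.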
Meta's creative enhancements can add or reposition text overlays and restyle copy. That is helpful for fast iteration and a liability for brand campaigns with legally or editorially approved wording, because AI-adjusted text may not match your tone or claims. The right move is selective: leave enhancements on where you want the system to experiment, and toggle them off for the assets where the copy has to stay exactly as written. The preview-and-toggle controls exist precisely so you can draw that line.
It is easy to read "AI optimizes the ad" as "hand it over and walk away." The advertisers who get the most from the format do the opposite — they let the AI own the permutation-and-testing problem and keep a firm grip on the guardrails. You retain manual control over cropping, text overlays, the destination URL, and placement preferences, and you can preview AI-generated and enhanced versions before anything goes live. Those controls are where brand safety lives.
So the division of labor is clear: the system decides which combinations to serve and to whom; you decide what is acceptable to serve at all. Set the framing so a reframed crop never cuts off your product, lock the copy where it must be exact, disable enhancements that drift off-brand, and review the previews. Human-in-the-loop is not a hedge against the AI being bad — it is how you make a powerful optimizer work for your brand specifically instead of toward a generic high-performing average.
Strip the format down and the dependency is obvious. Meta's multi-media AI is excellent at finding the best combination of the assets you give it — and completely dependent on you giving it good, varied, on-brand assets in the first place. It optimizes supply; it does not produce it. The 10-slot pool, the several distinct angles, the regular refresh cadence to beat fatigue: every best practice above is, underneath, a demand for a steady stream of distinctive creative. That is a production problem, and it is the one the ad tool deliberately leaves to you.
For a single campaign you can shoot your way out of it. At the cadence Meta's system actually rewards (multiple fresh angles, refreshed every couple of weeks, held consistent to one brand across every asset), manual production becomes the bottleneck that starves the optimizer. And it compounds: the same campaign should be earning organic reach on Instagram and Facebook and beyond, not just running as paid, because the organic content is what warms the audience the ad later converts. Meta's multi-media ad has no opinion about any of that. It is a paid-placement optimizer inside one company's walls; the creative stream feeding it, and the organic presence around it, are yours to generate.
Kompozy is not an ad-platform tool and does not bid in Meta's auction or compete with the multi-media optimizer — it sits one layer up and solves the supply problem the optimizer assumes you have already solved. It is a generation engine, so it produces the distinctive, on-brand assets that fill the upload slots: Persona Shorts and Marketing Shorts give you vertical video and demo footage, Persona Photos and Photo Posts give you a clean branded image library, Carousel Posts and Quote Graphics give you text-led and lifestyle frames. That is the several-genuinely-different-angles requirement turned into a render queue instead of a shoot schedule.
It also fixes the consistency tax that a varied asset pool usually carries. Ten assets across four angles still have to read as the same brand, and that coherence is exactly what Meta's reframing and recombination can fracture when the assets come from different shoots, with different looks and different voices. Kompozy holds it together by design: the Persona Brief governs voice across every asset, Gemini face-lock keeps a persona's face identical from clip to clip, and HyperFrames renders pixel-exact brand styling. So you can hand Meta the creative volume and diversity it rewards without handing it ten variations that look like ten different companies: the optimizer recombines a coherent brand, not a grab bag.
And because it is a multi-platform publishing engine, it covers the half of the funnel the ad tool ignores. From one source, Kompozy fans out the organic spread — Clipped Shorts, Persona Tweets, Blog Articles, Email Newsletters — and schedules and publishes it across nine social platforms plus email and blog from one queue, on a cadence, behind a per-post review gate. The practical 2026 stack is both: run the paid multi-media ad on Meta with its native optimizer, and run the creative supply line and organic presence that feed it on Kompozy. Meta finds the winning combination; Kompozy produces a steady, on-brand stream worth combining and the organic reach that makes the ad land. For the wider tool map, see the 2026 AI content tool landscape, and for the related shift in how ads get made, the guide on AI ad creative generation for social platforms.
A Meta multi-media ad is a single ad unit where you upload up to 10 images and videos in mixed formats and aspect ratios, and Meta's AI automatically assembles and tests different versions to find the combinations that perform best across placements and audiences. The point, in Meta's words, is that you no longer need to build separate ads for different placements or audiences: you supply the assets and the system optimizes the delivery.
Meta's AI does not create the source creative. The multi-media format is mainly a combination, cropping, and testing engine: it works with the images and videos you upload, plus optional enhancements like reframing and text overlays. The distinctive footage, product shots, and concepts still have to be produced before Meta can optimize them. The more genuinely different, on-brand assets you feed it, the more the system has to work with.
Yes, AI content in Meta ads has to be disclosed. Since a March 2026 policy update, Meta requires disclosure of AI-generated or AI-modified ad content, and it automatically applies a "Made with AI" label to creative it detects as photorealistic AI imagery, whether through C2PA metadata from tools like DALL·E, Midjourney, and Stable Diffusion or through its own detection. Advertisers cannot remove those labels, and undisclosed AI is now a common rejection reason, so treat disclosure as a build step, not an afterthought.
Lean toward more rather than fewer, but make them genuinely different. Meta's guidance is that the more creative options you provide, the more opportunities the delivery system has to optimize. In practice that means filling the available image and video slots with several distinct angles — UGC-style clips, product demos, lifestyle shots, text-led explainers — rather than ten near-identical variants of one photo, and refreshing them on a regular cadence to fight fatigue.
You retain manual control over cropping, text overlays, the destination URL, and placement preferences, and you can preview and toggle specific creative enhancements off before the campaign runs. So the AI decides which combinations to test and serve, but you set the guardrails — which is exactly where you protect brand voice, approved copy, and the framing of key visuals.
Meta's AI multi-media ads let you upload up to 10 images and videos in mixed formats and aspect ratios; its AI then assembles and tests versions to find the combinations most likely to perform across placements and audiences. Best practice is to supply many genuinely different, on-brand assets — Meta optimizes what you give it but does not invent distinctive source creative — keep control over cropping, copy, and placement, refresh assets on a cadence, and disclose AI content, since Meta auto-applies an unremovable "Made with AI" label and rejects undisclosed AI.