A volumetric video format that stores a moving scene as a cloud of time-aware Gaussian "splats" — letting a viewer move the camera freely through recorded motion instead of watching a fixed 2D frame.
Last verified · 2026-07-04 · by Moe Ameen
A 4D splat format stores a dynamic scene not as a flat grid of pixels but as a cloud of soft, translucent 3D blobs ("splats"), each carrying a position, an ellipsoidal shape, a color, an opacity, and, in the 4D case, how all of those change over time. "3D" gives you geometry you can orbit around; the fourth dimension is time, so the whole scene moves. Rendering it back out is called splatting: the engine projects every blob to the screen and alpha-composites them in depth order. That is a rasterization pass with no per-pixel neural network query, which is why it runs in real time on a GPU. The technical name for the dominant approach is 4D Gaussian splatting (4DGS), an extension of the 3D Gaussian splatting method that broke out in 2023.
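To make the "alpha-composite in depth order" step concrete, here is a minimal sketch of what happens at a single pixel. It assumes each splat has already been projected and reduced to a depth, an RGB color, and an effective opacity at that pixel; the function name `composite_pixel` and the early-exit threshold are illustrative choices, not part of any particular renderer.

```python
import numpy as np

def composite_pixel(depths, colors, alphas):
    """Front-to-back alpha compositing of splat contributions at one pixel.

    depths: per-splat distance from the camera
    colors: per-splat RGB, shape (n, 3)
    alphas: per-splat opacity at this pixel, in [0, 1]
    """
    depths = np.asarray(depths, dtype=float)
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)

    order = np.argsort(depths)   # nearest splat first
    pixel = np.zeros(3)
    transmittance = 1.0          # fraction of light not yet blocked

    for i in order:
        pixel += transmittance * alphas[i] * colors[i]
        transmittance *= 1.0 - alphas[i]
        if transmittance < 1e-4:  # assumed cutoff: pixel is effectively opaque
            break
    return pixel, transmittance
```

A half-transparent red splat in front of an opaque blue one yields an even mix of the two, regardless of the order the splats are passed in, because the sort fixes the compositing order.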
The practical difference from a normal MP4 is that an MP4 is a recording from one camera position — the director already chose the shot. A 4D splat scene stores the geometry and appearance of the whole space, so at playback time the viewer (or a VFX artist, or a game engine) chooses the camera: orbit the subject, dolly in, freeze time and walk around a frozen moment, or re-light the scene. That "free-viewpoint" property is what makes people call it volumetric video or holographic video rather than just video.
Under the hood, each 4D Gaussian is a primitive with a 4-dimensional mean and a 4×4 covariance — an anisotropic ellipsoid that can stretch and rotate through space and time — and its color is encoded with 4D spherindrical harmonics so it can look different from different angles and at different moments. Because the primitives are explicit and optimizable (unlike a NeRF, which bakes the scene into an implicit neural network you have to query), 4DGS renders far faster: research implementations report real-time frame rates at HD, which is the whole point of the format existing.
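One way to see how a 4D mean and 4×4 covariance produce a moving blob is to slice the Gaussian at a given timestamp, which is ordinary multivariate Gaussian conditioning. The sketch below assumes the coordinates are ordered (x, y, z, t); the function name `slice_at_time` and that layout are assumptions made for illustration, not a description of any specific file format or codebase.

```python
import numpy as np

def slice_at_time(mean4, cov4, t):
    """Condition a 4D Gaussian over (x, y, z, t) on a timestamp t.

    Returns the 3D mean, the 3D covariance, and a temporal weight in (0, 1]
    that fades the splat out as t moves away from its temporal center.
    """
    mean4 = np.asarray(mean4, dtype=float)
    cov4 = np.asarray(cov4, dtype=float)

    mu_xyz, mu_t = mean4[:3], mean4[3]
    cov_xyz = cov4[:3, :3]   # spatial block
    cov_xt = cov4[:3, 3]     # space-time coupling: encodes motion
    var_t = cov4[3, 3]       # temporal extent of the splat

    # Standard conditional-Gaussian formulas.
    mean3 = mu_xyz + cov_xt * (t - mu_t) / var_t
    cov3 = cov_xyz - np.outer(cov_xt, cov_xt) / var_t
    weight = np.exp(-0.5 * (t - mu_t) ** 2 / var_t)
    return mean3, cov3, weight
```

The space-time coupling term is what makes the blob move: with zero coupling the sliced mean stays put and the splat only fades in and out; with non-zero coupling the mean drifts linearly as `t` changes.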
The honest caveat: there is no single agreed 4D splat file format yet. Static Gaussian splats are converging on standards — the Khronos glTF `KHR_gaussian_splatting` extension reached release-candidate status in early 2026, and OpenUSD and Cesium 3D Tiles support them — but the 4D (moving) case is still fragmented, with each research project and vendor using its own on-disk layout and compression scheme. Files are also large: streaming a moving volumetric capture is measured in tens to low-hundreds of megabits per second before compression, which is why most of the 2026 work is about hierarchical, progressively-streamed encodings rather than the primitive itself.
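The idea behind a hierarchical, progressively streamed encoding can be sketched as a base layer plus refinement layers, where the player decodes as many layers as the connection can sustain. The layer bitrates below are hypothetical numbers chosen for illustration, and `layers_for_bandwidth` is an invented helper, not an API from any shipping player.

```python
def layers_for_bandwidth(layer_mbps, available_mbps):
    """Return how many layers (base layer first) fit in the bandwidth budget.

    layer_mbps: bitrate cost of each layer in Mbit/s, ordered base -> finest
    available_mbps: measured or estimated throughput in Mbit/s
    """
    total = 0.0
    count = 0
    for cost in layer_mbps:
        if total + cost > available_mbps:
            break  # the next refinement layer would exceed the budget
        total += cost
        count += 1
    return count


# Hypothetical layering: an 8 Mbit/s base plus three refinement layers.
HYPOTHETICAL_LAYERS = [8.0, 12.0, 30.0, 60.0]
```

With these made-up numbers, a 25 Mbit/s connection would decode the base and the first refinement layer, while a 5 Mbit/s connection could not sustain even the base layer and would need a lower-rate fallback.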
The lineage starts with NeRF (Neural Radiance Fields, 2020), which proved you could reconstruct a photorealistic 3D scene from ordinary photos — but it was slow to train and slow to render because the scene lived inside a neural network. In 2023, "3D Gaussian Splatting for Real-Time Radiance Field Rendering" (Kerbl et al., SIGGRAPH 2023) replaced the network with millions of explicit Gaussian blobs and a fast rasterizer, collapsing render times from seconds-per-frame to real time. That paper is the reason "splat" entered the creator vocabulary.
The move to 4D followed almost immediately. Through 2024 a wave of methods — deformation-field approaches that warp a canonical 3D splat cloud over time, and native-4D approaches that treat time as a real fourth axis of each Gaussian — extended splatting to moving scenes. By 2026 the work had shifted from "can we do it" to "can we ship it": the research focus moved to compression and streaming (hierarchical bitstreams that decode more or less detail based on bandwidth) and the format began early commercial deployment, initially in film, VFX, and sports broadcast, with vendors like Gracia and Evercoast productizing capture-to-playback pipelines and browser/WebXR streaming.
| Platform | Behavior |
|---|---|
| Web / WebXR | Renders in-browser with no install via WebGL/WebGPU. The delivery target most creators will actually touch — a splat scene embedded on a page or in a WebXR experience. Progressive streaming lets it start rendering before the full file downloads. |
| VR / AR headset | The native home for the format — free-viewpoint and depth are the whole value proposition in a headset. Playback quality is gated by the headset GPU and the scene's splat count; heavy captures get decimated for standalone headsets. |
| Film / VFX pipeline | Imported into DCC tools (Cinema 4D, Unreal, Blender via plugins) as a relightable, re-frameable element. Editors treat a captured performer as a 3D asset they can shoot from any angle in post rather than a locked plate. |
| Social feeds (Instagram, TikTok, YouTube) | No feed renders a live 4D splat scene — they play flat 2D video. To post from a splat scene you render a fixed camera path out to an MP4 first, which throws away the free-viewpoint property but gives you a normal short. This export step is where a repurposing pipeline picks it up. |
| Game engines (Unreal, Unity) | Splat scenes drop in as a rendering layer via engine plugins, useful for photoreal environments and captured characters. Real-time budget and collision/interaction (splats have no mesh) are the practical limits. |
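The "render a fixed camera path out to an MP4" step in the social feeds row amounts to generating one camera pose per output frame and handing each pose to whatever splat renderer you use. A minimal sketch of an orbit path follows, assuming a right-handed, y-up world and a camera that looks down its negative z axis (the OpenGL-style convention); the function names are illustrative, and the renderer and video encoder are out of scope here.

```python
import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Build a 4x4 world-to-camera matrix for a camera at `eye` facing `target`."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    view = np.eye(4)
    view[0, :3] = right
    view[1, :3] = true_up
    view[2, :3] = -forward             # camera looks down -z
    view[:3, 3] = -view[:3, :3] @ eye  # move the world so the camera sits at the origin
    return view

def orbit_path(target, radius, height, n_frames):
    """Return one view matrix per frame for a full circle around `target`."""
    target = np.asarray(target, dtype=float)
    views = []
    for k in range(n_frames):
        angle = 2.0 * np.pi * k / n_frames
        offset = np.array([radius * np.cos(angle), height, radius * np.sin(angle)])
        views.append(look_at(target + offset, target))
    return views
```

For a 10-second orbit at 30 frames per second you would call `orbit_path(subject_center, radius, height, 300)`, where `subject_center`, `radius`, and `height` are placeholders for your scene's own values, then render frame `k` of the capture with pose `k`. Note that `look_at` is undefined when the camera sits directly above or below the target, because the viewing direction is then parallel to `up`.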
The 4D splat is one of the more genuinely exciting formats to appear recently, and also one of the most misunderstood by creators, because the thing that makes it special (the viewer picks the camera) is exactly the thing that disappears the moment you post to a feed. A 4D splat scene is a source asset, not a finished post. It lives in headsets, WebXR embeds, and VFX timelines. The second it needs to appear on Instagram, TikTok, LinkedIn, or YouTube, someone renders a fixed camera move out to a flat MP4, and from that point on it is just really good footage.
That render-to-flat moment is where a tool like Kompozy is relevant, and it is worth being precise about the boundary. Kompozy does not capture or render splat scenes; that is upstream, in the capture rig and the reconstruction pipeline. What Kompozy is built to do is take the flat deliverable that comes out the other end (the orbiting hero shot, the fly-through, the volumetric replay rendered to video) and turn it into a month of distribution: clip it into [vertical shorts](/glossary/vertical-video) with burned-in captions, cut stills into a carousel, draft a net-new launch blog and newsletter around it, and schedule the whole set across nine platforms from one source. So the useful mental model is a two-stage stack: the splat pipeline owns capture and free-viewpoint rendering; the content generation-and-publishing engine owns everything after the export. Confusing the two, by expecting a feed to render volumetric video or a publishing tool to reconstruct a scene, is where most of the 2026 hype gets it wrong.
A 4D splat format is a volumetric video format that stores a moving scene as a cloud of time-aware Gaussian "splats": soft 3D blobs that carry position, shape, color, opacity, and how those change over time. Because it stores geometry rather than a single camera's pixels, a viewer can move the camera freely through the recorded motion. The dominant technique is called 4D Gaussian splatting (4DGS).
A normal MP4 records one fixed camera view — the shot is already chosen. A 4D splat scene stores the geometry and appearance of the whole space over time, so playback can orbit, dolly, freeze a moment and walk around it, or re-light the scene. That free-viewpoint property is why it is called volumetric or holographic video.
3D Gaussian splatting captures a static scene — one frozen moment you can orbit around. 4D adds time as a fourth dimension, so the whole scene moves. If the subject is moving, you need 4D; a 3D capture of motion will ghost or smear.
No. Social feeds render flat 2D video, not live volumetric scenes. To post from a splat scene you render a fixed camera path out to a normal MP4 first, which gives up the free-viewpoint feature but produces a standard short you can distribute anywhere.
There is no single standard file format for the moving case yet. Static Gaussian splats are converging on standards (the Khronos glTF KHR_gaussian_splatting extension reached release-candidate status in early 2026), but 4D formats remain fragmented across research projects and vendors, each with its own on-disk layout and compression.
NeRF (2020) reconstructs a scene into an implicit neural network, which is slow to render. Gaussian splatting (2023) stores the scene as explicit, optimizable blobs and rasterizes them, which renders in real time. 4D splatting extends that explicit approach to moving scenes, keeping the real-time speed advantage over NeRF-style methods.
Kompozy does not capture or render splat scenes — that is the capture rig and reconstruction pipeline's job. Kompozy comes after the export: it takes the flat MP4 rendered from a splat scene and turns it into clipped vertical shorts, carousels, a blog, and a newsletter, then schedules them across nine platforms from that one source.