A research method that reconstructs a full 4D model of a moving object from a single phone video.
Last verified · 2026-06-23 · by Moe Ameen
Lift4D is a research method for 4D reconstruction — recovering the full geometry, appearance, and motion of a moving, non-rigid object over time from a single ordinary video. Its full title is "Lift4D: Harmonizing Single-View 3D Estimation for 4D Reconstruction In-the-Wild," and the notable claim is that it rebuilds the complete object, including parts the camera never directly sees, from one monocular clip shot in the wild — no multi-camera rig, no depth sensor, no studio.
The work comes from a group of academic researchers, including authors affiliated with Carnegie Mellon University (Yehonathan Litman, Xiaoxuan Ma, Manan Shah, Nicolás Ugrinovic, Kris Kitani, Fernando De la Torre, and Shubham Tulsiani). The paper was posted to arXiv on June 22, 2026 (arXiv:2606.23688).
Technically, Lift4D is a test-time optimization framework. It first adapts an existing single-view 3D reconstruction model so that its per-frame predictions are temporally consistent (via what the authors call causal latent conditioning). It then uses those predictions to initialize a deformable 3D Gaussian Splatting representation. Finally, it "sculpts" that representation to match the video, recovering visible surfaces faithfully while filling unobserved regions with a view-conditioned diffusion prior. The authors report better quality than prior 4D methods on hard in-the-wild clips with heavy occlusion and large, non-rigid motion.
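To make "test-time optimization" concrete, here is a deliberately tiny sketch of the general idea: parameters are fitted to one specific input, with a data term on what was observed and a prior that fills in what was not. It is not the Lift4D implementation, whose code is unreleased. Everything here is an assumption made for illustration: the function name `fit_track`, the 1D track standing in for a 3D Gaussian representation, and a simple temporal-smoothness penalty standing in for the view-conditioned diffusion prior.

```python
def fit_track(observations, smooth_weight=1.0, steps=2000, lr=0.05):
    """Fit one value per frame by gradient descent (toy, hypothetical).

    Minimizes  sum over observed t of (x_t - y_t)^2
             + smooth_weight * sum over t of (x_{t+1} - x_t)^2
    Frames whose observation is None have no data term, so only the
    smoothness prior decides their value.
    """
    observed = [y for y in observations if y is not None]
    if not observed:
        raise ValueError("need at least one observed frame")
    mean = sum(observed) / len(observed)
    # Initialize unobserved frames at the mean of the observed ones.
    x = [y if y is not None else mean for y in observations]

    for _ in range(steps):
        grad = [0.0] * len(x)
        # Data term: pull observed frames toward their measurements.
        for t, y in enumerate(observations):
            if y is not None:
                grad[t] += 2.0 * (x[t] - y)
        # Prior term: penalize jumps between neighboring frames.
        for t in range(len(x) - 1):
            diff = x[t + 1] - x[t]
            grad[t + 1] += 2.0 * smooth_weight * diff
            grad[t] -= 2.0 * smooth_weight * diff
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x


# The middle frame is unobserved; the prior places it between its neighbors.
track = fit_track([0.0, None, 2.0])  # approximately [0.5, 1.0, 1.5]
```

The point of the toy is the split of responsibilities: observed frames are held close to the evidence, while the unobserved frame is decided entirely by the prior.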
One thing to keep straight: as of this writing the project is a paper and a project page (lift4d.github.io) with an interactive 4D viewer — the code is marked "coming soon" and is not yet publicly released. So this is a research capability you can read about and watch demos of, not a product you can sign up for and run today. Treat any specific detail as a snapshot of an evolving project, and check the official page and repository for the current status.
Also keep its scope in mind: Lift4D reconstructs an object's shape and motion. It is not a text-to-video generator, not an editor, and not a publishing tool. What it produces is a 4D asset you can view from new angles across time — the raw material for a render, not a finished post.
Lift4D's output is a 4D asset — an object you can orbit, freeze, and re-shoot from camera angles that never existed in the original phone clip. Rendered out, that becomes footage: a slow product spin, a fly-around of a reconstructed prop, a motion study viewed from three sides. But a render is not a post, and one orbit clip is not a week of content. Kompozy is the layer that turns those renders into published content. Drop a Lift4D render into Kompozy and it cuts the orbit into a captioned vertical short, reframes it for each destination's aspect ratio, and stacks hook text or labels through HyperFrames so the silent-autoplay first second actually explains what the viewer is looking at. Then it schedules and publishes the same clip across TikTok, Reels, YouTube Shorts, X, LinkedIn, and the rest of the nine supported platforms from one queue.
The fan-out is where the pairing earns its keep. A single 4D reconstruction can seed a whole content unit in Kompozy: the orbit clip for short-form feeds, a carousel that walks through the reconstruction angle by angle, a Photo Post built on the cleanest render frame, and a blog or text post that explains the technique in your own voice through your Persona Brief. So one reconstruction — a product, a prop, a movement — becomes a short, a carousel, a graphic, and a written breakdown instead of a single render sitting on a drive. Lift4D owns the 3D-from-2D reconstruction; Kompozy owns the captions, the format fan-out, the schedule, and the publish.
Lift4D is a research method for 4D reconstruction: recovering the full geometry, appearance, and motion of a moving, non-rigid object over time from a single ordinary monocular video, including parts of the object the camera never directly sees. The paper was posted to arXiv on June 22, 2026 (arXiv:2606.23688) by a team including Carnegie Mellon researchers.
Lift4D is not yet available as a product. As of this writing the project is a paper and a project page with an interactive 4D viewer, and the code is marked "coming soon" rather than released. You can read the paper and watch the demos, but there is no hosted app or downloadable release to run yet. Check the official project page and GitHub repository for the current status.
Lift4D reconstructs a real object you filmed rather than inventing a scene from a prompt. You give it a single video and it recovers that subject's shape, appearance, and motion as a 4D representation you can view from new angles over time. It does not generate fictional footage, edit clips, or write copy; it is a reconstruction method, not a generative video tool.
A 4D reconstruction lets you render the subject from camera angles that did not exist in the original clip — orbit shots, fly-arounds, turntable spins, and views of previously occluded parts. Those renders are footage you can use in product spins, motion studies, and explainer clips.
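As a rough illustration of what an orbit or turntable shot involves, the sketch below computes evenly spaced camera positions on a circle around a subject, plus the direction each camera looks. It is generic camera-path geometry, not part of Lift4D or its viewer. The function names, the y-up coordinate convention, and the parameters are all assumptions chosen for this example.

```python
import math


def orbit_camera_positions(center, radius, height, num_views):
    """Camera positions evenly spaced on a horizontal circle (y is up).

    Hypothetical helper: `center` is the subject position, `radius` the
    horizontal distance to it, `height` the camera offset above it.
    """
    if num_views < 1:
        raise ValueError("num_views must be at least 1")
    cx, cy, cz = center
    positions = []
    for i in range(num_views):
        angle = 2.0 * math.pi * i / num_views
        positions.append((
            cx + radius * math.cos(angle),
            cy + height,
            cz + radius * math.sin(angle),
        ))
    return positions


def look_at_direction(camera_pos, target):
    """Unit vector pointing from the camera toward the target."""
    d = [t - c for c, t in zip(camera_pos, target)]
    norm = math.sqrt(sum(v * v for v in d))
    if norm == 0.0:
        raise ValueError("camera and target coincide")
    return tuple(v / norm for v in d)


# Eight views around a subject at the origin, slightly above it.
subject = (0.0, 0.0, 0.0)
views = orbit_camera_positions(subject, radius=2.0, height=0.5, num_views=8)
directions = [look_at_direction(p, subject) for p in views]
```

Each position and direction pair would be handed to whatever renderer displays the reconstruction. Raising `num_views` gives a smoother spin, and varying `height` per view turns the flat orbit into a fly-around.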
Lift4D produces renders but does not publish them. Bring a render into Kompozy to cut it into a captioned vertical short, reframe it per platform, and fan the same reconstruction out into a carousel, a Photo Post, and a written breakdown — then schedule and publish across TikTok, Reels, YouTube Shorts, X, LinkedIn, and more from one queue.