
Stability AI's Stable Video 4D pushes the boundaries of video generation with its unique ability to produce eight distinct perspectives from a single input, setting it apart in a crowded field of generative models.
Stability AI has introduced Stable Video 4D, a model that adds a new dimension to video generation. While OpenAI's Sora and the video tools from Runway, Haiper, and Luma AI have made strides in this area, Stable Video 4D stands out by generating novel-view videos of the same subject from eight different perspectives.
Stable Video 4D builds on the foundation of Stability AI’s existing Stable Video Diffusion model, which converts images into videos. However, it takes a significant step forward by accepting video input and producing dynamic 3D objects viewable from various camera angles at different timestamps. This is achieved through a combination of novel view synthesis and video generation within a single network.
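One way to picture "various camera angles at different timestamps" is as a grid with one row per camera and one column per frame. The sketch below is illustrative only: it does not call Stable Video 4D or any Stability AI API, the names `Frame`, `make_grid`, `novel_view_video`, and `multi_view_snapshot` are hypothetical, and the frame count of 5 is an arbitrary example value. Only the figure of eight views comes from the article.

```python
from dataclasses import dataclass

NUM_VIEWS = 8  # the article's figure: eight perspectives per input video


@dataclass(frozen=True)
class Frame:
    """Placeholder for one generated image, tagged by its grid position."""
    view: int       # camera index, 0..NUM_VIEWS-1
    timestamp: int  # frame index within the clip


def make_grid(num_frames: int, num_views: int = NUM_VIEWS) -> list[list[Frame]]:
    """Build a grid indexed as grid[view][timestamp] (hypothetical layout)."""
    return [[Frame(v, t) for t in range(num_frames)] for v in range(num_views)]


def novel_view_video(grid: list[list[Frame]], view: int) -> list[Frame]:
    """One row of the grid: the whole clip as seen from a single camera."""
    return list(grid[view])


def multi_view_snapshot(grid: list[list[Frame]], timestamp: int) -> list[Frame]:
    """One column of the grid: every camera's image at a single instant."""
    return [row[timestamp] for row in grid]


# Example: a 5-frame clip (arbitrary length) rendered from all eight views.
grid = make_grid(num_frames=5)
clip = novel_view_video(grid, view=3)              # camera 3 across time
snapshot = multi_view_snapshot(grid, timestamp=2)  # all cameras at frame 2
print(len(clip), len(snapshot))
```

Reading a row gives a conventional video from one new camera angle. Reading a column gives a multi-view capture of the object frozen at one moment. A model that fills the whole grid has to keep both directions consistent.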
Stable Video 4D is not just a technical marvel; it has practical applications in industries such as entertainment and gaming.
Stable Video 4D leverages a single network to handle both novel view synthesis and video generation. This is a significant departure from existing models, which typically use separate networks for these tasks.

The development of Stable Video 4D involved fine-tuning the combined strengths of previous models, specifically:
While other generative video offerings such as OpenAI's Sora and Runway's tools have made significant contributions to video generation, Stable Video 4D stands out for its ability to handle multiple perspectives and dynamic objects in a single network. This integration reduces the complexity and computational overhead typically associated with combining separate networks for novel view synthesis and video generation.
Stable Video 4D represents a significant advancement in generative AI, offering a unique solution for generating multidimensional videos. Its ability to handle multiple camera angles and dynamic objects within a single network makes it a valuable tool for various industries, from entertainment to gaming and beyond.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
29 July 2024