To accomplish this feat, the AI program employs two deep learning networks (or AI models) to carry out its conversion process. The first model examines the 2D clip and identifies specific features, such as characters, objects, and the environment. The second model then extrapolates from this initial data to generate an approximate 3D version of the scene.
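The two-stage flow described above can be illustrated with a toy sketch. This is not NVIDIA's code or architecture: both "models" here are hypothetical stand-ins (a brightness threshold in place of the feature-detection network, and a fixed depth guess in place of the 3D-generation network), and every name in it (`extract_features`, `lift_to_3d`, `convert_clip`, the threshold and depth values) is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A frame is a grid of grayscale pixel intensities in [0, 1] (assumed format).
Frame = List[List[float]]
Point3D = Tuple[float, float, float]


@dataclass
class Features:
    # True where a pixel is treated as part of a foreground object.
    mask: List[List[bool]]


def extract_features(frame: Frame, threshold: float = 0.5) -> Features:
    """Stage 1 stand-in: label pixels brighter than `threshold` as foreground.

    A real system would use a trained network here; the threshold is a
    hypothetical placeholder.
    """
    return Features(mask=[[pixel > threshold for pixel in row] for row in frame])


def lift_to_3d(
    frame: Frame,
    features: Features,
    fg_depth: float = 1.0,
    bg_depth: float = 5.0,
) -> List[Point3D]:
    """Stage 2 stand-in: turn each pixel into an (x, y, z) point.

    Foreground pixels get a near depth and background pixels a far one.
    The two depth constants are guesses, not learned values.
    """
    points: List[Point3D] = []
    for y, row in enumerate(frame):
        for x in range(len(row)):
            z = fg_depth if features.mask[y][x] else bg_depth
            points.append((float(x), float(y), z))
    return points


def convert_clip(frames: List[Frame]) -> List[List[Point3D]]:
    """Run both stages on every frame of a clip, one frame at a time."""
    return [lift_to_3d(frame, extract_features(frame)) for frame in frames]
```

Calling `convert_clip` on a list of frames returns one list of 3D points per frame. The point of the sketch is only the hand-off: the second stage consumes what the first stage detected rather than working from raw pixels alone.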
The end result is a convincing 3D representation of the clip, with its objects and characters rendered in three dimensions. This can be used to create high-quality animated graphics that retain the spontaneity and emotion of the original clip while adding depth and detail that the flat footage lacks.
A demonstration video released by NVIDIA shows this method transforming a range of videos, including clips from a classic cartoon and a dramatic rock-paper-scissors match. Because the system can quickly separate a scene’s objects from its background, it can be used for a variety of purposes, such as generating 3D models of characters for animation or creating cinematic effects with complex materials and physics.
The implications of this new technology are broad, as it opens up new possibilities for video and animation. And while NVIDIA’s technology requires some manual fine-tuning on a scene-by-scene basis, the results show that AI can be used to create impressive visuals from existing 2D content. Hopefully, this technology will lead to even more groundbreaking projects in the future.