The editor of Downcodes walks you through ReCapture, Google's innovative technology that upends traditional video editing. ReCapture lets users make professional-level camera-movement adjustments with ease, redefining the camera language of footage that has already been shot and lowering the barrier even for novice editors. How will this change the way we create videos? Let's explore ReCapture together.
ReCapture, the latest technology from the Google research team, is upending the traditional video-editing workflow. It allows ordinary users to make professional-level camera-movement adjustments and rework the camera language of videos that have already been captured.
In traditional video post-production, changing the camera angle of already-captured footage has long been a hard technical problem. Existing solutions often struggle to preserve both complex camera-motion effects and image detail across different types of video content. ReCapture takes a different approach: rather than building a traditional 4D intermediate representation, it exploits the motion knowledge already stored in generative video models and reframes the task as a video-to-video translation process using Stable Video Diffusion.
The system uses a two-stage workflow. The first stage generates the anchor video: an initial output version rendered from the new camera position. This can be produced either by creating multi-angle videos with diffusion models such as CAT3D, or by frame-by-frame depth estimation followed by point-cloud rendering. While this version may contain temporal inconsistencies and visual flaws, it lays the foundation for the second stage.
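To make the depth-estimation-plus-point-cloud route concrete, here is a minimal NumPy sketch of the underlying geometry: each pixel is unprojected to a 3D point using its estimated depth and the camera intrinsics, then reprojected through a new camera pose. This is a generic novel-view warping illustration, not Google's actual ReCapture code; the function name and the pinhole-camera setup are assumptions for the example.

```python
import numpy as np

def reproject(depth, K, T_new, h, w):
    """Warp pixel coordinates into a new camera via depth-based unprojection.

    depth: (h, w) per-pixel depth map (e.g. from a monocular depth estimator)
    K:     (3, 3) camera intrinsic matrix
    T_new: (4, 4) rigid transform from the old camera frame to the new one
    Returns (h*w, 2) pixel coordinates in the new view.
    """
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Homogeneous pixel coordinates (x, y, 1) for every pixel.
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).astype(float)
    rays = pix @ np.linalg.inv(K).T              # back-project to camera-space rays
    pts = rays * depth.reshape(-1, 1)            # scale rays by depth -> 3D point cloud
    pts_h = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=1)
    cam2 = (pts_h @ T_new.T)[:, :3]              # move points into the new camera frame
    proj = cam2 @ K.T                            # project with the same intrinsics
    return proj[:, :2] / proj[:, 2:3]            # perspective divide -> new pixel coords
```

Rendering the splatted points at these new coordinates yields the rough anchor frames; holes and disocclusions in that render are exactly the flaws the second stage is meant to repair.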
The second stage applies masked video fine-tuning, leveraging a generative video model trained on existing footage to produce realistic motion and temporal dynamics. The system introduces a temporal LoRA (Low-Rank Adaptation) layer to adapt the model so it can understand and replicate the specific dynamics of the anchor video without retraining the entire model. A spatial LoRA layer, meanwhile, keeps image detail and content consistent with the new camera movement. Together these let the generative video model perform operations such as zooming, panning, and tilting while preserving the characteristic motion of the original video.
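The key property of LoRA that the paragraph relies on is that the pretrained weights stay frozen while a small low-rank update W + (alpha/r)·B·A is trained. The NumPy sketch below illustrates that mechanism in isolation; it is a generic LoRA layer, not ReCapture's actual temporal or spatial adapter, and the class name and hyperparameters are assumptions for the example.

```python
import numpy as np

class LoRALinear:
    """A frozen linear layer plus a trainable low-rank update: W + (alpha/r) * B @ A."""

    def __init__(self, W, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                          # frozen pretrained weight, shape (out, in)
        self.A = rng.normal(0.0, 0.02, (rank, W.shape[1]))  # trainable down-projection
        self.B = np.zeros((W.shape[0], rank))               # trainable up-projection; zero-init
        self.scale = alpha / rank                           # so the adapter starts as a no-op

    def __call__(self, x):
        # Only A and B would receive gradients during fine-tuning; W never changes.
        return x @ (self.W + self.scale * self.B @ self.A).T
```

Because B is initialized to zero, the adapted model is exactly the pretrained model at the start of fine-tuning, and only the tiny A/B matrices (rank × in + out × rank parameters instead of out × in) need to be learned per video, which is what makes per-clip adaptation affordable.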
Although ReCapture marks important progress toward user-friendly video processing, it is still at the research stage and some way from commercial application. Notably, Google has many video AI projects but has yet to bring any of them to market; among them, the Veo project may be the closest to commercial release. Likewise, Meta's recently announced Movie Gen model and OpenAI's Sora, unveiled at the beginning of the year, have not been commercialized. The video AI market is currently led by startups such as Runway, which launched its latest Gen-3 Alpha model last summer.
The emergence of ReCapture points toward a shift in the field of video editing. Although it is still in research and development, its capabilities and ease of use open up broad possibilities for future video creation. The editor of Downcodes will continue to follow this technology's progress and bring you further reports.