The editor of Downcodes learned that the Disney research team has released a new image compression method based on the Stable Diffusion V1.2 model. This method can generate high-realism images at low bit rates, and its performance surpasses existing JPEG and AV1 codecs. decoder. This breakthrough technology, called a "codec," cleverly utilizes the denoising process of the diffusion model to treat quantization errors in image compression as noise, allowing for efficient image reconstruction. This method does not require additional fine-tuning of the model, greatly reduces training costs, and performs well in multiple dataset tests.
This study shows that the new method performs better in restoring image details, and the required training cost is also greatly reduced. The researchers found that quantization error (a core process in image compression) is so similar to noise (a core process in diffusion models) that a traditionally quantized image can be thought of as a noisy version of the original image. In this process, the denoising process of the diffusion model is used to reconstruct the image at the target bit rate.
In a series of tests, Disney's new approach surpassed previous image compression techniques in both accuracy and detail recovery. The researchers say their method does not require additional fine-tuning of the diffusion model and can effectively use existing base models. The advantage of this new codec lies in its excellent performance in photorealistic reconstruction, although in some cases it may suffer from hallucinations, that is, artifacts may appear in the generated image that were not present in the original image. details.
Although this compression method has a certain impact on the rendering of works of art and ordinary photos, in some application scenarios where details are important, such as forensic evidence, facial recognition data, and optical character recognition (OCR) scanning, the potential for hallucinations Risk becomes more important. Currently, although this technology is still in its infancy, with the development of AI-enhanced image compression technology, challenges in this field will gradually emerge.
In order to make image storage more efficient, the Disney team finally launched this new technology after long-term exploration. They trained on the Vimeo-90k dataset and tested on multiple datasets, and the results showed that the method outperformed previous methods on multiple image quality metrics. Finally, the researchers also confirmed the superiority of their method in practical applications through user research.
Paper: https://studios.disneyresearch.com/app/uploads/2024/09/Lossy-Image-Compression-with-Foundation-Diffusion-Models-Supplementary-1.pdf
Disney's image compression technology based on Stable Diffusion demonstrates the huge potential of AI in the field of image processing. Although there are challenges such as illusion, its improvement in image quality and efficiency is significant. In the future, with the continuous improvement of technology, this technology is expected to be applied in more fields, bringing revolutionary changes to image storage and transmission. It is expected that follow-up research can further solve the problem of illusion and make it useful in more detail-demanding scenes.