The Disney research team used the Stable Diffusion V1.2 model to develop a new image compression method that produces more realistic images at low bit rates. The method, which the researchers describe as a "codec", is reported to outperform traditional codecs such as JPEG and AV1 in image detail recovery while requiring lower training costs. The study relates quantization error to the noise in a diffusion model, reconstructs images using the denoising process, and validates the approach on multiple datasets.
The study shows that the new method recovers image details better while greatly reducing the required training cost. The researchers observed that quantization error (introduced by a core step in image compression) closely resembles noise (the core ingredient of a diffusion model), so a conventionally quantized image can be treated as a noisy version of the original image. The denoising process of the diffusion model is then used to reconstruct the image at the target bit rate.
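The idea that quantization error behaves like noise can be illustrated with a small sketch. This is not the researchers' code: it assumes a simple uniform scalar quantizer and uses random values as a stand-in for image data, and the names `quantize`, `step`, and `x` are hypothetical. It only shows that the error left by rounding looks like bounded random noise whose variance is close to the textbook value of `step**2 / 12`.

```python
import numpy as np


def quantize(values, step):
    """Uniform scalar quantization: snap each value to the nearest multiple of `step`.

    Hypothetical helper for illustration; the paper's actual quantizer may differ.
    """
    return np.round(values / step) * step


rng = np.random.default_rng(0)

# Stand-in for image or latent data (random values, not real image content).
x = rng.normal(0.0, 1.0, size=100_000)

step = 0.5  # a coarser step means fewer bits and a larger error
x_q = quantize(x, step)

# The difference between the quantized and original signal is the
# "noise" that quantization adds.
error = x_q - x

# When the step is small relative to the signal spread, the error is roughly
# uniform on [-step/2, step/2], so its variance is close to step**2 / 12.
empirical_var = error.var()
theoretical_var = step ** 2 / 12

print(f"largest absolute error:      {np.abs(error).max():.5f}")
print(f"empirical error variance:    {empirical_var:.5f}")
print(f"uniform-noise approximation: {theoretical_var:.5f}")
```

Seen this way, a quantized signal is the original plus a noise-like term of known strength, which is the kind of input a denoising model is built to clean up.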
In a series of tests, Disney's new approach surpassed previous image compression techniques in both accuracy and detail recovery. The researchers said their approach does not require additional fine-tuning of the diffusion model and can make effective use of existing foundation models. The main advantage of the new codec is the realism of its reconstructions, although in some cases it may hallucinate, that is, produce details in the generated image that do not exist in the original image.
Although hallucination has only a limited impact on artwork and ordinary photos, the potential risk is more serious in application scenarios where details matter, such as court evidence, facial recognition data, and optical character recognition (OCR) scans. The technology is still in its early stages, and challenges of this kind are likely to emerge gradually as AI-enhanced image compression develops.
To make image storage more efficient, the Disney team developed this technique after extended exploration. The model was trained on the Vimeo-90k dataset and tested on multiple datasets, and the results showed that the method outperformed previous methods on multiple image quality metrics. The researchers also confirmed the advantage of their method in practice through a user study.
Paper: https://studios.disneyresearch.com/app/uploads/2024/09/Lossy-Image-Compression-with-Foundation-Diffusion-Models-Supplementary-1.pdf
Key points:
1. Disney's new AI image compression technology can generate more realistic images at lower bitrates.
2. The method performs well in detail recovery and keeps training costs low, since it requires no additional fine-tuning of the diffusion model.
3. Despite the strong results, the method may generate details that do not match the original image, a risk known as "hallucination".
Although Disney's AI image compression technology still has problems such as hallucination, its ability to generate highly realistic images at low bit rates and its low training costs show considerable potential. As the technology matures, it may come to play an important role in image storage and transmission.