This article introduces a study by ByteDance and POSTECH researchers that significantly improves the computational efficiency of the text-to-image (T2I) model FLUX.1-dev through 1.58-bit quantization, allowing it to run on resource-constrained devices. The method relies only on self-supervision from the model itself and requires no access to image data. It compresses model storage by 7.7x and reduces inference memory usage by more than 5.1x, while maintaining generation quality comparable to the full-precision model. This research opens new possibilities for deploying high-performance T2I models on mobile devices and other resource-limited platforms, and offers valuable experience for future work on AI model optimization.
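The "1.58-bit" figure comes from restricting each weight to one of three values, {-1, 0, +1}, which carries log2(3) ≈ 1.58 bits of information. As a rough illustration of the idea (not the authors' released code), the sketch below quantizes a weight tensor to ternary values using a per-tensor absolute-mean scale; the scaling rule, its granularity, and the helper names `quantize_ternary` and `dequantize_ternary` are assumptions made here for clarity.

```python
import torch

def quantize_ternary(weight: torch.Tensor):
    """Quantize a weight tensor to ternary values {-1, 0, +1} plus a scale.

    Illustrative sketch only: the paper's exact rounding rule and scale
    granularity are assumptions here, not the authors' method.
    """
    # Scale by the mean absolute value so most entries round to -1, 0, or +1.
    scale = weight.abs().mean().clamp(min=1e-8)
    # Round the scaled weights and clip to the ternary set.
    q = (weight / scale).round().clamp(-1, 1)
    return q, scale

def dequantize_ternary(q: torch.Tensor, scale: torch.Tensor):
    # Reconstruct an approximate full-precision weight for inference.
    return q * scale

# Example: quantize a random linear-layer weight and measure the error.
w = torch.randn(1024, 1024)
q, s = quantize_ternary(w)
w_hat = dequantize_ternary(q, s)
print("unique levels:", q.unique().tolist())           # [-1.0, 0.0, 1.0]
print("mean abs error:", (w - w_hat).abs().mean().item())
```

Because each ternary weight needs only about 2 bits of storage instead of 16 or 32, packing them tightly is what yields the large reductions in model size and inference memory reported above.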
The rapid development of AI-driven text-to-image generation has brought new opportunities and challenges to many industries. The work by ByteDance and POSTECH offers an effective solution to the problem of deploying high-performance AI models on resource-constrained devices. Its significant improvements in model compression, memory optimization, and performance preservation lay a solid foundation for the popularization and development of future AI applications. Future research will explore how to overcome the limitations of 1.58-bit FLUX in inference speed and high-resolution image detail rendering to enable wider adoption.