Google's research team has released Alchemist, an image editing technology that lets users precisely control the material properties of objects in images, such as color, glossiness, and transparency, without professional software or skills. At its core, Alchemist is a fine-tuned text-to-image generation model. It achieves fine control over material parameters by training on a synthetic dataset and modifying the Stable Diffusion 1.5 model architecture. The technology could streamline image editing for professionals such as designers, artists, and architects.
The Google research team recently introduced Alchemist, a technology for editing the material properties of objects in a picture, such as color, glossiness, and transparency, without professional image editing software or skills.
At the core of Alchemist is a fine-tuned text-to-image (T2I) generation model. The research team achieved fine control over material parameters by creating a synthetic dataset and modifying the Stable Diffusion 1.5 model architecture.
Specifically, the researchers first generated a large number of synthetic images using computer graphics and physically based rendering. These images show various 3D models with randomly selected materials, camera angles, and lighting conditions. The researchers then varied one material attribute at a time in each image, producing multiple versions at different edit strengths.
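The sampling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's pipeline: the `Scene` fields, the attribute names, the [0, 1] value range, the strength values, and the clamping rule are all hypothetical, and no rendering is performed.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scene:
    """Hypothetical description of one synthetic render (not the paper's schema)."""
    model_id: int          # which 3D model to render
    camera_angle: float    # camera azimuth in degrees
    light_id: int          # which lighting setup to use
    glossiness: float      # material attributes, assumed to lie in [0, 1]
    metallic: float
    transparency: float


def random_scene(rng: random.Random) -> Scene:
    """Sample a base scene with a random model, camera, lighting, and material."""
    return Scene(
        model_id=rng.randrange(100),
        camera_angle=rng.uniform(0.0, 360.0),
        light_id=rng.randrange(10),
        glossiness=rng.random(),
        metallic=rng.random(),
        transparency=rng.random(),
    )


def edited_versions(scene: Scene, attribute: str, strengths):
    """Return (strength, scene) pairs that differ from `scene` in one attribute only."""
    base_value = getattr(scene, attribute)
    versions = []
    for strength in strengths:
        # Assumed rule: shift the attribute by the strength, then clamp to [0, 1].
        value = min(1.0, max(0.0, base_value + strength))
        versions.append((strength, replace(scene, **{attribute: value})))
    return versions


rng = random.Random(0)
base = random_scene(rng)
variants = edited_versions(base, "metallic", [-1.0, -0.5, 0.0, 0.5, 1.0])
```

Every variant shares the base scene's model, camera, and lighting, so the only difference a model could learn from a pair is the single edited attribute.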
By fine-tuning on this synthetic data, the model learns to change only the specified material property, given a context image, an instruction, and an edit strength value, while keeping the object's shape and the image lighting unchanged.
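To make the three inputs concrete, here is a minimal sketch of how one training record might be laid out. The `TrainingExample` class, the file paths, and the instruction wording are hypothetical, chosen only for illustration; the article does not describe the actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingExample:
    context_image: str   # render of the unedited scene (model input)
    instruction: str     # text naming the attribute to change (model input)
    strength: float      # scalar edit strength (model input)
    target_image: str    # render of the edited scene (training target)


def build_examples(scene_id: str, attribute: str, strengths):
    """Pair one base render with each of its single-attribute edits."""
    context = f"renders/{scene_id}/base.png"            # hypothetical path layout
    instruction = f"change the {attribute} of the object"  # hypothetical wording
    return [
        TrainingExample(
            context_image=context,
            instruction=instruction,
            strength=s,
            target_image=f"renders/{scene_id}/{attribute}_{s:+.2f}.png",
        )
        for s in strengths
    ]


examples = build_examples("scene_0001", "transparency", [-1.0, 0.0, 1.0])
for ex in examples:
    print(ex.strength, ex.target_image)
```

Because the context image and instruction are identical across the records for one scene, the strength value is the only input that distinguishes the targets.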
Experimental results show that the technology can effectively change the appearance of objects, for example by making them look more metallic or by adjusting their transparency. In user studies, the approach was rated higher than the baseline approach for both photorealism and user preference.
The technology has broad potential applications. It can help interior designers preview how a room will look when repainted, or help architects, artists, and designers quickly create design sketches for new products. In addition, because the edits are visually consistent, the technology can also be used for downstream 3D tasks such as NeRF (Neural Radiance Field) reconstruction.
Although Alchemist technology has made significant progress in material editing, the research team also pointed out some limitations. For example, the model still has room for improvement when it comes to handling hidden details in images.
However, the researchers are confident in the technology's potential for controlled material editing. With further research and optimization, Alchemist is expected to revolutionize the field of image editing, making complex material editing tasks simpler and more intuitive.
Google’s Alchemist is a notable advance in AI-based image processing. It simplifies a complex image editing process, opens new possibilities for the creative industry, and could have an impact in fields such as design, art, and virtual reality.
Project address: https://prafullsharma.net/alchemist/
Alchemist marks a milestone for artificial intelligence in image editing. Its efficient and accurate material editing could bring new vitality to the creative industry and encourage further development of related technologies.