Sun Yat-sen University and the Byte Digital Human Team have jointly developed a virtual try-on framework called MMTryon. Given clothing images and text instructions describing how to wear them, the framework generates high-quality model try-on results in one click, and it works for both real people and comic characters, greatly simplifying the virtual try-on process. The technology moves past the limits of traditional algorithms: it handles complex dressing scenes and arbitrary clothing styles precisely without fine-grained garment segmentation, greatly improving efficiency and convenience.
Recently, Sun Yat-sen University and the Byte Digital Human Team made a splash with a virtual try-on framework called MMTryon, and it is no small thing. Feed it a few clothing images plus a few text instructions on how to wear them, and it generates a high-quality model try-on result in one click.
Imagine selecting a coat, a pair of pants, and a bag, then with one click having them automatically put on a portrait. Whether the subject is a real person or a comic character, it works in a single click. That is simply too cool!
Moreover, MMTryon's strengths don't stop there. For single-image dress-up, it uses a garment encoder trained on large amounts of data to handle all kinds of complex dressing scenes and arbitrary clothing styles. For combined outfit changes, it drops the traditional algorithms' reliance on fine-grained garment segmentation: a text command is enough, and the generated results look both realistic and natural.
In benchmark tests, MMTryon set a new SOTA, which is no small feat. The research team also developed a multi-modal, multi-reference attention mechanism to make the dressing results more accurate and flexible. Previous virtual try-on solutions either let you try on only a single item or left you stuck with the garment's original style; MMTryon handles all of it.
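To make the idea of multi-reference attention concrete, here is a minimal numpy sketch of one attention pass in which person-image tokens act as queries over the concatenated tokens of several references (a coat, a bag, a text prompt, and so on). All names, shapes, and the random projection matrices are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_reference_attention(person_tokens, reference_sets, d_k=64, seed=0):
    """Attend from person-image tokens (queries) to the concatenated
    tokens of several references, so one pass can condition the output
    on a coat, pants, and a bag at once. Projections are random
    stand-ins for learned weights."""
    rng = np.random.default_rng(seed)
    d = person_tokens.shape[-1]
    W_q = rng.standard_normal((d, d_k))
    W_k = rng.standard_normal((d, d_k))
    W_v = rng.standard_normal((d, d_k))
    # Merge every reference's tokens into one key/value sequence.
    refs = np.concatenate(reference_sets, axis=0)
    Q, K, V = person_tokens @ W_q, refs @ W_k, refs @ W_v
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # each row sums to 1
    return attn @ V  # (num_person_tokens, d_k)
```

The point of the sketch is that adding another garment just means appending its tokens to `reference_sets`; the attention weights decide how much each garment influences each region of the person image.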
MMTryon is also clever in its design. It pairs a garment encoder with rich representation capabilities with a novel, scalable data generation pipeline, so the dressing process requires no segmentation at all: high-quality virtual try-on is driven directly by text and multiple try-on references.
Extensive experiments on open-source datasets and in complex scenarios show that MMTryon outperforms existing SOTA methods both qualitatively and quantitatively. The team also pre-trained a garment encoder that uses text as the query to activate the features of the region the text refers to, removing the dependence on garment segmentation.
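The "text as query" idea can be sketched in a few lines of numpy: a text embedding (say, for "the coat") is compared against image patch features, and the resulting soft weights highlight the matching region without any segmentation mask. The function name, shapes, and the cosine-similarity scoring are assumptions for illustration only.

```python
import numpy as np

def text_activated_features(text_query, patch_tokens, temperature=0.1):
    """Use a text embedding as a query over image patch features; the
    attention weights softly select the matching region, standing in
    for an explicit segmentation mask."""
    # Cosine similarity between the text query and every patch.
    q = text_query / np.linalg.norm(text_query)
    p = patch_tokens / np.linalg.norm(patch_tokens, axis=1, keepdims=True)
    sims = p @ q
    weights = np.exp(sims / temperature)
    weights /= weights.sum()  # a soft, mask-free region selection
    # Weighted pooling gives one garment feature vector for that text.
    return weights @ patch_tokens, weights
```

A lower `temperature` sharpens the selection toward the single best-matching patch; a higher one spreads it over a broader region, which is the knob such a mask-free scheme trades on.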
What's even more impressive: to train combined outfit changes, the team proposed a data amplification scheme based on large models and built an enhanced dataset of one million examples, giving MMTryon realistic virtual try-on results across all kinds of outfit changes.
MMTryon is like a piece of black technology for the fashion industry: it can try clothes on for you with one click and also act as a fashion assistant that helps you pick outfits. On quantitative metrics and in human evaluation, MMTryon surpasses the other baseline models with excellent results.
Paper address: https://arxiv.org/abs/2405.00448
All in all, with its efficient, accurate, and convenient virtual try-on capability, MMTryon shows great application potential in fashion, bringing real change to clothing design and the shopping experience. Its leading technology and strong performance make it a new benchmark in the field of virtual try-on.