ByteDance’s multi-modal understanding and image positioning model LEGO has the ability to accurately position
LEGO, a multi-modal understanding and image positioning model jointly developed by ByteDance and Fudan University, can process multiple types of input, including images, audio, and video. LEGO can not only understand multi-modal data,
2025-01-24