The architectural design of large language models (LLMs) is undergoing profound change, and the dominance of the Transformer architecture is being challenged. Liquid AI, a startup spun out of MIT, has launched an innovative framework called STAR (Synthesis of Tailored Architectures) that automatically generates and optimizes AI model architectures. Using evolutionary algorithms and a hierarchical encoding technique, STAR synthesizes and optimizes architectures to meet specific performance and hardware requirements, showing significant advantages in both efficiency and performance.
STAR automates the generation and optimization of model architectures through evolutionary algorithms and a numerical encoding system. Liquid AI's research team notes that STAR departs from traditional architecture design: it employs a hierarchical encoding called the "STAR genome" to explore a broad design space of candidate architectures. By recombining and mutating genomes, STAR synthesizes and optimizes architectures that satisfy specific performance and hardware requirements.
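To make the genome idea concrete, here is a minimal, hypothetical sketch of an evolutionary architecture search in the spirit described above. The operator names, genome layout, and fitness function are all invented for illustration; STAR's actual genome encoding and objectives are more elaborate and are not public in this form.

```python
import random

# Hypothetical operator pool: each genome is a fixed-length list of
# block choices. This is an illustrative stand-in, not STAR's encoding.
OPERATORS = ["attention", "gated_conv", "recurrence", "mlp"]
GENOME_LEN = 6

def random_genome():
    """Sample a random candidate architecture."""
    return [random.choice(OPERATORS) for _ in range(GENOME_LEN)]

def crossover(a, b):
    """Single-point recombination of two parent genomes."""
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome, rate=0.2):
    """Replace each block with a random operator at the given rate."""
    return [random.choice(OPERATORS) if random.random() < rate else g
            for g in genome]

def fitness(genome):
    # Toy objective: reward operator diversity while penalizing
    # cache-heavy attention blocks -- a crude stand-in for the
    # quality-versus-cache-size trade-off the article describes.
    diversity = len(set(genome))
    cache_cost = genome.count("attention")
    return diversity - 0.5 * cache_cost

def evolve(pop_size=20, generations=10):
    """Keep the fittest half each generation; refill via crossover + mutation."""
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
```

In a real system the fitness evaluation would train or estimate the quality of each candidate model and measure its cache footprint on target hardware, which is where nearly all of the computational cost lies.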
In tests on autoregressive language modeling, STAR outperformed both highly optimized Transformer++ models and hybrid models. When optimizing for quality and cache size, STAR-evolved architectures reduced cache size by up to 37% compared with hybrid models, and by 90% compared with traditional Transformers. This efficiency comes without sacrificing predictive performance; in some cases the evolved architectures outperform their competitors.
The research also shows that STAR's architectures scale well. A STAR-evolved model scaled from 125 million to 1 billion parameters performed as well as or better than existing Transformer++ and hybrid models on standard benchmarks, while significantly reducing inference cache requirements.
Liquid AI says STAR's design draws on principles from dynamical systems, signal processing, and numerical linear algebra to build a flexible search space of computational units. A distinctive feature of STAR is its modular design, which lets it encode and optimize architectures at multiple levels of granularity, giving researchers insight into which combinations of architectural components work well together.
Liquid AI expects STAR's efficient architecture synthesis to be useful across many domains, especially in scenarios that must balance quality against computational efficiency. Although the company has not announced specific commercial deployment or pricing plans, the research marks a notable advance in automated architecture design. As the AI field continues to evolve, frameworks like STAR may play an important role in shaping the next generation of intelligent systems.
Official blog: https://www.liquid.ai/research/automated-architecture-synthesis-via-targeted-evolution
In summary, Liquid AI's STAR framework offers a new, automated approach to AI model architecture design. Its gains in efficiency and performance are significant and open new possibilities for future AI systems, and the framework's modular design and scalability give it broad application prospects across domains.