Etched AI successfully burned the Transformer architecture directly into the chip, creating the world's most powerful AI inference server

Author: Eve Cole Update Time: 2025-01-15 11:32:01

Etched AI, an American chip start-up, has burned the Transformer architecture directly into a chip and built what it describes as the world's first server designed specifically for Transformer inference. The server's performance reportedly far exceeds that of comparable NVIDIA products, and it can run trillion-parameter models. Its features include real-time voice agents, better coding with tree search, and multicast speculative decoding, and it is equipped with 144GB of HBM3E high-bandwidth memory. The innovation is expected to change how the Transformer architecture is applied and to have a major impact on the field of artificial intelligence.

Key points of the article:

American chip start-up Etched AI has burned the Transformer architecture directly into a chip, creating what it bills as the world's most powerful server dedicated to Transformer inference. The technology can run models with trillions of parameters and is claimed to be far ahead of NVIDIA. The server's features include real-time voice agents, better coding with tree search, and multicast speculative decoding, and it is equipped with 144GB of HBM3E memory. The technology opens up new possibilities for applications of the Transformer architecture.

Etched AI's breakthrough marks a leap in artificial intelligence hardware. Its performance and feature set could bring significant improvements to many application scenarios and merit the industry's attention. More innovative applications built on this technology may follow.