At the end of July this year, the second China Computer Federation (CCF) Chip Conference was held in Shanghai. Nearly two thousand experts, scholars, researchers, and industry representatives from China and abroad, all working on chip-related research and technology development in computing and microelectronics, attended the conference. They discussed chip design and EDA, new system architectures, fault-tolerant computing, emerging computer engineering, and other topics spanning theoretical innovation, technology research and development, application demonstration, and industrial development.
In the "CCFTCarch Custom Computing Challenge" (accelerating LLM deployment on consumer-grade graphics cards), organized by the Architecture Committee at this chip conference, the research team led by Professor Feng Li of Southwest Jiaotong University won second place nationally for its excellent performance in optimizing edge inference for large models.
In this competition, Feng Li's team adopted a series of innovative techniques to address the challenges of edge inference for large models, including operator fusion. Operator fusion merges multiple independent computing operations into a single one, which reduces frequent reads and writes to video memory and significantly improves computing efficiency.
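The idea behind operator fusion can be illustrated with a minimal Python sketch. It is not the team's implementation: the three operators (bias add, GELU activation, scaling), the function names, and the traffic counter are all assumptions chosen for illustration, and the counter only models how many elements would be read from or written to memory rather than measuring real GPU traffic.

```python
import math


def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def unfused(x, bias, scale):
    """Three separate operators; each one materializes a full intermediate buffer."""
    n = len(x)
    traffic = 0
    t1 = [xi + bi for xi, bi in zip(x, bias)]  # reads x and bias, writes t1
    traffic += 3 * n
    t2 = [gelu(v) for v in t1]                 # reads t1, writes t2
    traffic += 2 * n
    out = [v * scale for v in t2]              # reads t2, writes out
    traffic += 2 * n
    return out, traffic


def fused(x, bias, scale):
    """One fused operator; intermediates stay in local variables (registers on a GPU)."""
    out = [gelu(xi + bi) * scale for xi, bi in zip(x, bias)]
    traffic = 3 * len(x)                       # reads x and bias, writes out
    return out, traffic
```

Both functions return the same values, but in this model the fused version moves 3 elements per input position instead of 7, because the two intermediate buffers are never written out and read back.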
According to Feng Li, operator fusion allowed the team to shorten calculation time by more than 80%, greatly improving the utilization of computing resources. In addition, the team applied precision quantization, converting some calculations from floating-point operations to fixed-point operations. Although this costs some accuracy, it yields a substantial increase in computing speed and a reduction in energy consumption.
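The trade-off described here can be sketched in a few lines of Python. The article does not state which quantization scheme or bit width the team used, so the symmetric 8-bit scheme, the function names, and the sample values below are assumptions for illustration only.

```python
def quantize(values, num_bits=8):
    """Symmetric quantization: map floats to signed integers that share one scale."""
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0      # float value of one integer step
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def int_dot(qa, scale_a, qb, scale_b):
    """Dot product accumulated in integers, rescaled to a float once at the end."""
    acc = sum(a * b for a, b in zip(qa, qb))            # integer-only arithmetic
    return acc * scale_a * scale_b


# Hypothetical sample data
weights = [0.42, -1.27, 0.05, 0.88]
activations = [1.5, 0.25, -0.75, 2.0]

exact = sum(w * a for w, a in zip(weights, activations))
qw, sw = quantize(weights)
qa, sa = quantize(activations)
approx = int_dot(qw, sw, qa, sa)
error = abs(approx - exact)                             # the accuracy lost to rounding
```

The multiply-accumulate loop touches only small integers, which is where the speed and energy savings come from on real hardware; the rounding in `quantize` is the source of the accuracy loss.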
Feng Li emphasized that the team's success in the competition was inseparable from close cooperation among its members. Although the participating students had little experience with large models, they showed an exceptional ability to learn and the courage to explore. Through collaboration with leading international scholars, in particular exchanges with the University of Bristol, the students broadened their horizons and gained valuable experience. Feng Li said: "Our students are very young, but they are full of enthusiasm and have the courage to face challenges. Through extensive literature reading and technical seminars, they quickly grasped the latest developments in related fields and turned theoretical knowledge into practical results."
Speaking about future directions, Feng Li said that as large model technology advances, and especially as demand grows for privacy protection and private deployment, the team's research results will have broad application prospects. In fields such as finance, transportation, and aviation, for example, much sensitive data is not suitable for transmission to the cloud and must instead be analyzed locally, which requires efficient edge inference for large models. Feng Li revealed that the team is actively cooperating with a number of companies and research institutions and is committed to applying its results to network security vulnerability analysis and traffic big data processing, with the aim of commercializing its research.