Cosine has launched Genie, an AI programmer that scored 30.08% on the SWE-Bench benchmark, ahead of Devin and SWE-agent with GPT-4. The editor of Downcodes walks you through the technical approach behind Genie and its future prospects.
AI startup Cosine has launched a new AI programmer, "Genie". On the reported benchmark results it outperforms both Devin and SWE-agent with GPT-4, which makes it the strongest AI programming assistant on that test so far.
Genie scored 30.08% on the SWE-Bench benchmark, well ahead of Devin's 13.8% and SWE-agent + GPT-4's 12.47%.
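To put "far exceeding" into numbers, the short sketch below computes the gap in percentage points and the ratio between the scores quoted above. The dictionary keys and the helper name `lead_over` are illustrative only; the figures are the ones reported in this article.

```python
# SWE-Bench scores as quoted in the article (percent of tasks resolved).
scores = {
    "Genie": 30.08,
    "Devin": 13.8,
    "SWE-agent + GPT-4": 12.47,
}


def lead_over(baseline: str, leader: str = "Genie") -> tuple[float, float]:
    """Return (percentage-point gap, ratio) of leader over baseline."""
    gap = scores[leader] - scores[baseline]
    ratio = scores[leader] / scores[baseline]
    return round(gap, 2), round(ratio, 2)


for name in ("Devin", "SWE-agent + GPT-4"):
    gap, ratio = lead_over(name)
    print(f"Genie vs {name}: +{gap} points, {ratio}x")
```

In other words, Genie's reported score is a little more than twice that of either system it is compared with.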
You may be wondering how Genie does it. As early as December 2022, Cosine co-founder Alistair Pullen demonstrated the project at the University of London. His goal was an AI program that can write, debug and optimize code automatically, the way a human does. After more than a year of development, Genie has entered the testing stage, and the company has raised US$2.5 million in seed funding.
Pullen attributes Genie's success to its training data and methods. Rather than conventionally fine-tuning a large model, the team trained Genie on a special dataset that captures the reasoning process of human programmers. The data records how engineers discover information step by step and how they make decisions case by case, which allows Genie to show judgment similar to that of a human engineer when faced with complex problems.
Genie also uses what the team calls a "self-improvement mechanism." Initially, Genie was trained only on high-quality data, in which everything was already in a "perfect" state. As a result, it could not recognize its own errors and had no way to improve on them. To overcome this, the developers used Genie itself to generate synthetic data that further enriched the training set. This is like a mother teaching her child to walk, giving the right guidance after every fall.
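The idea of feeding a model's own mistakes back into training can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not Cosine's actual pipeline: `ToyModel`, `Correction` and `collect_corrections` are hypothetical names, the "model" is a plain lookup table, and "training" simply memorizes corrected answers.

```python
from dataclasses import dataclass


@dataclass
class Correction:
    task: str
    attempt: str    # what the current model produced
    reference: str  # the known-good answer


class ToyModel:
    """Hypothetical stand-in for a code model: just a lookup table."""

    def __init__(self, known: dict[str, str]):
        self.known = dict(known)

    def solve(self, task: str) -> str:
        return self.known.get(task, "?")

    def train(self, data: list[Correction]) -> None:
        # "Training" here only memorizes the corrected answers.
        for item in data:
            self.known[item.task] = item.reference


def collect_corrections(model: ToyModel, tasks: dict[str, str]) -> list[Correction]:
    """Run the model and keep only its mistakes, paired with the fix."""
    corrections = []
    for task, reference in tasks.items():
        attempt = model.solve(task)
        if attempt != reference:
            corrections.append(Correction(task, attempt, reference))
    return corrections


tasks = {"add(2, 3)": "5", "upper('ab')": "AB", "len('abc')": "3"}
model = ToyModel({"add(2, 3)": "5"})

first_round = collect_corrections(model, tasks)   # the model's own failures
model.train(first_round)                          # feed them back as data
second_round = collect_corrections(model, tasks)  # nothing left to fix

print(len(first_round), len(second_round))
```

The point of the sketch is the shape of the loop: the training examples in each round come from the current model's failures rather than from a fixed set of already-perfect samples.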
After many training iterations, Genie's abilities have improved considerably, and it can even come up with creative solutions to problems it has not seen before. Functionally, Genie supports a range of development tasks, including feature development, bug fixing, code refactoring and code testing, across dozens of programming languages such as JavaScript, Python and Java.
Genie is now open for trial applications. You can register through the official website, and test access is expected to be granted over the next few weeks.
Official blog: https://cosine.sh/blog/state-of-the-art
Trial registration: https://cosine.sh/register
Highlights:
Genie scored 30.08% on the SWE-Bench test, the highest of the AI programmers compared here.
Using special data sets and self-improvement mechanisms, Genie excels in complex coding.
Applications for the trial are currently open, and more surprise features will be launched in the future!
The emergence of Genie marks a new breakthrough in the field of AI programming assistants. Its unique training method and self-improvement mechanism deserve the attention of the industry. The editor of Downcodes looks forward to Genie bringing more surprises to developers in the future!