In recent years, artificial intelligence (AI) has been used more and more widely in programming, but its capabilities remain limited. Recently, Max Woolf, a senior data scientist at BuzzFeed, found through a series of experiments that the quality of AI-generated code can be improved significantly by repeatedly prompting a large language model (LLM) to improve its own output. The finding sparked lively discussion in the tech community and drew the attention of many AI scientists, and it underscores the importance of iterative optimization and prompt design in AI-assisted programming.
In his experiment, Woolf chose the AI model Claude 3.5 Sonnet as the subject. The first step was to have the model solve a relatively simple programming problem: given one million random integers, find the difference between the smallest and the largest numbers whose digits sum to 30. Claude quickly generated code that met the requirements, but Woolf believed there was still room for improvement.
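To make the task concrete, here is a minimal sketch of a straightforward solution in Python. It is not the code Claude produced. The function names, the fixed seed, and the value range of 1 to 100,000 are assumptions chosen for illustration, since the article does not state them.

```python
import random


def digit_sum(n: int) -> int:
    """Return the sum of the decimal digits of a non-negative integer."""
    total = 0
    while n:
        n, digit = divmod(n, 10)
        total += digit
    return total


def min_max_difference(numbers, target=30):
    """Return largest minus smallest among the numbers whose digits sum to
    `target`, or None if no number qualifies."""
    matches = [n for n in numbers if digit_sum(n) == target]
    if not matches:
        return None
    return max(matches) - min(matches)


if __name__ == "__main__":
    # Assumed input: one million integers drawn uniformly from 1..100,000.
    rng = random.Random(0)
    numbers = [rng.randint(1, 100_000) for _ in range(1_000_000)]
    print(min_max_difference(numbers))
```

A version like this favors readability over speed: it calls digit_sum once per number and builds an intermediate list, which leaves the kind of headroom that later iterations can exploit.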
To optimize the code further, Woolf asked Claude to improve its own code again after each generation. After the first iteration, Claude refactored the code into an object-oriented Python class and implemented two significant optimizations, which made the code 2.7 times faster. In the second iteration, Claude introduced multi-threaded processing and vectorized computation, which ultimately made the code 5.1 times faster than the base version.
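The article does not say which two optimizations Claude applied in the first iteration. As an illustration of the kind of change that can speed up this task, the sketch below precomputes every digit sum once in a lookup table and tracks the minimum and maximum in a single pass. The names, the assumed upper bound of 100,000, and the technique itself are assumptions for this example, not a reconstruction of Claude's code, and no particular speedup is claimed.

```python
from array import array

MAX_VALUE = 100_000  # assumed upper bound of the input values


def build_digit_sum_table(limit: int) -> array:
    """Return a table where table[n] is the digit sum of n, for 0 <= n <= limit."""
    # Unsigned bytes are enough: the largest digit sum up to 100,000 is 45.
    table = array("B", [0]) * (limit + 1)
    for n in range(1, limit + 1):
        # The digit sum of n is the digit sum of n without its last digit,
        # plus that last digit.
        table[n] = table[n // 10] + n % 10
    return table


DIGIT_SUMS = build_digit_sum_table(MAX_VALUE)


def min_max_difference_fast(numbers, target=30):
    """Single pass with table lookups; returns None if no number qualifies."""
    lo = hi = None
    for n in numbers:
        if DIGIT_SUMS[n] == target:
            if lo is None or n < lo:
                lo = n
            if hi is None or n > hi:
                hi = n
    return None if lo is None else hi - lo
```

The table costs a one-time pass over the value range, after which each input number needs only an index lookup instead of repeated division. The trade-off is that inputs must stay within the assumed bound.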
However, as the number of iterations increased, the gains in code quality gradually slowed. Although the model tried more complex techniques such as JIT compilation and asynchronous programming in subsequent iterations, these optimizations did not deliver the expected performance improvements and in some cases even degraded performance. This suggests that although iterative prompting can significantly improve code quality in the early stages, its effect diminishes after a certain point.
Woolf's experiment demonstrates the considerable potential of AI in programming, but it also reveals its limitations in practice. Although AI can optimize code through iteration, how to balance performance and complexity when designing prompts remains an open question that deserves deeper discussion. The work offers new ideas for AI-assisted programming and is a reminder that AI is not omnipotent: using it sensibly and choosing sound optimization strategies are the key.