Google's Project Zero and DeepMind teams have used the AI model "Big Sleep" to discover and help fix a memory safety vulnerability in the SQLite database engine. The flaw, a stack buffer underflow, had eluded traditional fuzz testing, but Big Sleep found it. According to Google, this is the first time an AI agent has discovered a previously unknown vulnerability in widely used real-world software, opening new possibilities for software security and pointing toward the future of AI-assisted vulnerability discovery. The editor of Downcodes explains this breakthrough in detail below.
SQLite is an open source database engine. The vulnerability could allow an attacker to crash SQLite, or potentially achieve arbitrary code execution, via a maliciously crafted database or SQL injection. Specifically, the problem stems from a magic value of -1 being accidentally used as an array index. Although there is an assert() in the code to catch this case, that debug-level check is compiled out of release builds.
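To see why such a bug can survive into release builds, consider the following minimal C sketch. This is not SQLite's actual code; the function and variable names are invented for illustration. When compiled with -DNDEBUG, as C projects commonly are for release, the preprocessor removes the assert() entirely, and the -1 sentinel flows straight into the array index, writing one element before the start of a stack buffer:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical lookup that returns -1 as a "not found" magic value. */
static int find_column(const char *name) {
    (void)name;
    return -1;  /* pretend the name was not found */
}

int main(void) {
    int counts[8] = {0};  /* stack-allocated array */
    int idx = find_column("missing");

    /* In a debug build this aborts when idx == -1.
       Compiled with -DNDEBUG, the check disappears entirely. */
    assert(idx >= 0);

    counts[idx]++;  /* idx == -1: writes before the start of the
                       array -- a stack buffer underflow */
    printf("%d\n", counts[0]);
    return 0;
}
```

This is why a check that fires reliably in a debug build can offer no protection at all in the binaries users actually run.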
Google notes that exploiting the vulnerability would not be simple, but the more important point, in the team's view, is that this is the first time an AI has found a previously unknown vulnerability in real-world software. Traditional fuzzing methods had failed to surface the issue, but Big Sleep did: after analyzing a series of recent commits to the project's source code, it pinpointed the vulnerability in early October, and a fix landed the same day.
In an announcement on November 1, Google said the research has significant defensive potential. Fuzz testing has delivered strong results, but the team believes a complementary approach is needed to help developers find the vulnerabilities that fuzzing struggles to reach, and it is optimistic about what AI can contribute here.
Prior to this, Seattle-based Protect AI released an open source tool called Vulnhuntr, which it says uses Anthropic's Claude AI model to find zero-day vulnerabilities in Python codebases. The Google team emphasized, however, that the two tools serve different purposes: Big Sleep targets memory-safety vulnerabilities.
Big Sleep is still at the research stage and has so far been tested mainly on small programs with known vulnerabilities; this was its first trial against a real-world target. For the test, the research team collected a number of recent commits to the SQLite code base, analyzed them, tuned the model's prompt accordingly, and ultimately surfaced the vulnerability.
Despite the achievement, the Google team cautions that these results remain highly experimental, and that a target-specific fuzzer might currently be just as effective at finding such vulnerabilities.
Big Sleep's breakthrough offers new ideas and methods for future software security testing. Although still experimental, its potential is considerable and worth watching. The editor of Downcodes will continue to follow developments in this field and bring you more reports.