The Microsoft AI security team conducted a two -year safety test of more than 100 generation AI products, aiming to discover its weak links and moral risks. The test results subverted some traditional cognition about AI security, emphasizing the irreplaceable role of human professional knowledge in the field of AI security. Tests found that the most effective attack is not always a complicated attack at the technical level, but using a simple "fast engineering" method. For example, hiding malicious instructions in image text can bypass the security mechanism. This shows that AI security needs to take into account technical means and humanistic considerations.
Since 2021, Microsoft's AI security team has tested more than 100 types of AI products to find weak links and moral issues. Their discovery challenged some common assumptions about AI security and emphasized the continuous importance of human professional knowledge.
It turns out that the most effective attack is not always the most complicated attack. A study quoted in the Microsoft Report states: "Real hackers will not calculate the gradient, but use fast engineering." The study compared artificial intelligence security research with the practice of real world. In a test, the team successfully bypassed the security function of the image generator by hiding harmful instructions in the image text -no complex mathematical operations needed.
Human taste is still important
Although Microsoft has developed Pyrit, an open source tool that can automatically test safety testing, the team emphasized that human judgment cannot be replaced. This becomes particularly obvious when they test how the chat robot handles the sensitive situation (such as talking to people who are emotionally troubled). Evaluating these scenarios requires both psychological knowledge and a deep understanding of the impact of potential mental health.
When investigating artificial intelligence bias, the team also relied on human insight. In one example, they check the gender bias in the image generator by creating pictures of different occupations (no gender).
New security challenges appear
The fusion of artificial intelligence and daily applications brings new loopholes. In a test, the team successfully manipulated the language model and created a convincing fraud scene. When combined with text transitional technology, this creates a system that can interact with people in a dangerous way.
Risk is not limited to the unique issue of artificial intelligence. The team found a traditional security vulnerability (SSRF) in an artificial intelligence video processing tool, indicating that these systems face new and old security challenges.
Continuous security needs
This study pays special attention to the risk of "responsible artificial intelligence", that is, the artificial intelligence system may generate the content of harmful or moral issues. These problems are particularly difficult to solve because they usually depend on background and personal interpretation.
The Microsoft team found that ordinary users have no intention of exposing problems with problems than intentional attacks, because this shows that security measures have not played as expected during the normal use process.
The research results clearly show that artificial intelligence security is not solved at one time. Microsoft recommends to continue to find and repair loopholes, and then conduct more tests. They suggested that this requires the support of regulations and financial incentives to make successful attacks more expensive.
The research team stated that there are still several key issues that need to be solved: how do we recognize and control the artificial intelligence capabilities with potential danger, such as serving and deception? How can we adjust security testing according to different languages and culture? How can we share in a standardized manner. Their method and result?
All in all, Microsoft's research emphasizes the importance of continuously improving AI security measures. It is necessary to combine technical means and humanistic care in order to effectively respond to AI security challenges and promote the responsible development of AI technology.