Recently, the editor of Downcodes took note of a worrying piece of news: an AI transcription tool widely used in the medical industry, built on OpenAI's Whisper technology, has been found to produce "hallucinations" and may generate fabricated content. This has raised concerns about the safety and reliability of AI technology in the medical field. This article examines the incident in detail, exploring its potential risks and OpenAI's response.
Recently, an AI transcription tool powered by OpenAI's Whisper technology has gained popularity in the medical industry. Many doctors and healthcare organizations are using this tool to record and summarize patient encounters.
According to a report by ABC News, researchers found that the tool can "hallucinate" in some cases, sometimes fabricating content entirely.
The transcription tool, developed by a company called Nabla, has successfully transcribed more than 7 million medical conversations and is currently used by more than 30,000 clinicians and 40 health systems. Still, Nabla is aware of the potential for Whisper to hallucinate and says it's working to address the issue.
A study conducted by researchers from Cornell University, the University of Washington, and other institutions found that Whisper hallucinated in about 1% of its transcriptions. In these cases, the tool randomly generated meaningless phrases during silent portions of the recordings, some of which expressed violent sentiments. The researchers collected audio samples from TalkBank's AphasiaBank and noted that silences are particularly common in the speech of people with speech disorders.
Cornell University researcher Allison Koenecke shared examples of Whisper's hallucinated output on social media. The researchers found that the generated content also included made-up medical terms and even phrases like "Thanks for watching!" that sound as if they came from YouTube videos.
The research was presented at the Association for Computing Machinery's FAccT conference in Brazil in June, though it is unclear whether it has been peer-reviewed. In response, OpenAI spokesperson Taya Christianson told The Verge that the company takes the issue very seriously and is continuing to work on improvements, particularly in reducing hallucinations. She also noted that Whisper's usage policies on OpenAI's API platform explicitly prohibit using the tool in certain high-stakes decision-making contexts.
Highlights:
The Whisper-based transcription tool is widely used in the medical industry and has transcribed more than 7 million medical conversations.
⚠️ Research has found that Whisper "hallucinates" in about 1% of transcriptions, sometimes generating meaningless content.
OpenAI says it is working to improve tool performance, particularly in reducing hallucinations.
In summary, AI technology has broad prospects in the medical field, but it also faces many challenges. Whisper's "hallucination" problem is a reminder that AI technology must be treated with caution and that oversight of its safety and reliability must be strengthened, so that it can be applied safely and effectively in medicine while protecting patients' rights and safety. The editor of Downcodes will continue to follow developments in this story.