In recent years, the application of artificial intelligence in medicine has attracted wide attention, with chatbots such as ChatGPT carrying particularly high hopes. A new study, however, reveals the limitations of AI in medical diagnosis. The Downcodes editor interprets this research, published in the journal JAMA Network Open, and examines the current state and future direction of AI-assisted diagnosis.
In recent years, the application of artificial intelligence (AI) in medicine has received growing attention. In particular, many hospitals expect chatbots such as ChatGPT to serve as auxiliary tools that improve doctors' diagnostic efficiency. However, a newly released study shows that using ChatGPT did not significantly improve doctors' diagnostic performance. The study, published in JAMA Network Open, reveals both the potential and the limitations of AI in medical diagnosis.
Image note: The image is AI-generated; image licensing provided by Midjourney.
The study involved 50 physicians: 26 attending physicians and 24 residents. Each was asked to diagnose six real cases within one hour. To evaluate ChatGPT's auxiliary effect, the researchers split the doctors into two groups: one could use ChatGPT alongside traditional medical resources, while the other could rely only on traditional resources such as the clinical reference platform UpToDate.
The results showed that doctors using ChatGPT scored 76% on diagnosis, while those relying solely on traditional resources scored 74%. By comparison, ChatGPT on its own achieved a diagnostic score of 90%. Although ChatGPT performed well independently, pairing it with doctors produced no significant improvement, which surprised the research team.
Ethan Goh, co-first author of the study and a postdoctoral researcher at the Stanford Center for Clinical Excellence, said the study was not conducted in a real clinical setting but was based on simulated cases, so the applicability of the results is limited. He pointed out that the complexity doctors face when treating actual patients cannot be fully captured in an experiment.
Although the study shows that ChatGPT outperformed some doctors at diagnosis, this does not mean AI can replace physicians' decision-making. Rather, Goh emphasized that doctors must retain oversight and judgment when using AI tools. In addition, physicians may anchor on the preliminary diagnoses they have already formed, which can make them less receptive to AI recommendations; this is a direction future research needs to examine.
Even after a diagnosis is made, doctors must answer a series of follow-up questions, such as "What are the correct treatment steps?" and "What tests are needed to guide the patient's next steps?" This suggests that AI still has broad prospects in medicine, but its effectiveness and applicability in actual clinical practice require deeper exploration.
All in all, this study reminds us that applying AI in medicine will not happen overnight: its limitations must be carefully evaluated, along with how doctors actually use AI tools in practice. Going forward, how to better integrate AI technology into clinical practice will remain an important direction for the medical field to explore.