A recent study published in Scientific Reports has attracted attention for showing that some advanced AI chatbots outperform humans at evaluating complex social situations. The researchers compared several AI chatbots with human participants using situational judgment tests. Some of the chatbots were better at selecting the best behavioral response, which opens new possibilities for applying AI in customer service, mental health support, and other fields. The study has limitations, however, and further work is needed to understand how AI performs in real social interactions and how to address its lack of real emotions.
Recently, a study published in Scientific Reports showed that some advanced AI chatbots can perform better than humans in evaluating complex social situations.
Using a common psychological tool called the Situational Judgment Test, researchers found that three chatbots (Claude, Microsoft Copilot, and you.com's Intelligent Assistant) outperformed human participants at selecting the most effective behavioral responses.
The potential of AI in social interactions is becoming increasingly apparent, with applications in areas such as customer service and mental health support. Large language models, such as the chatbots tested in this study, can process language, understand context, and provide effective responses. Although previous research has demonstrated these models' capabilities in academic reasoning and language tasks, their effectiveness in complex social dynamics remains underexplored.
The research team tested 276 human participants, all of them highly qualified pilot applicants. The situational judgment test presented 12 situations, each with four possible behavioral options to evaluate. The researchers compared the performance of five AI chatbots against these participants and found that every chatbot performed at least as well as the humans, and some performed better. Claude performed best, followed by Microsoft Copilot and you.com's Intelligent Assistant.
Interestingly, when chatbots did not choose the best response, they often chose the second most effective option, a pattern similar to human decision-making. This suggests that although the AI systems are not perfect, they have some capacity for social judgment and probabilistic reasoning.
The study also found differences in reliability among the AI systems. Claude showed the highest consistency across multiple tests, while Google Gemini at times gave conflicting ratings across tests. Nonetheless, the overall performance of all the AI systems exceeded expectations, demonstrating their potential to offer advice on socially competent behavior.
The researchers note that while many people already use chatbots for everyday tasks, their performance in complex social interactions still needs further validation. The study shows that large language models perform well in simulated social situations, but they lack the real emotions that real social behavior requires.
All in all, this research reveals the considerable potential of AI in the social domain, but it is also a reminder that AI should be applied cautiously in real social settings and that further research is needed on AI's emotional understanding and real-world social capabilities.