Google's AI project Gemini has recently drawn controversy over new internal rules. According to reports, the contract workers responsible for evaluating Gemini's output are being asked to rate responses in areas where they have no expertise, including sensitive topics such as health care, raising concerns about the accuracy of the information Gemini provides. The policy change directly affects the contractors' work and, indirectly, the quality of Gemini's evaluations.
These contract workers come from GlobalLogic, a global technology services company. Google asks them to rate AI-generated responses against factors such as "authenticity". Previously, contractors could skip prompts outside their expertise; someone without a medical background, for example, could decline to evaluate a specialized question about cardiology. The skip option existed to keep scores accurate by ensuring that only people with the relevant background performed the assessments.
Last week, however, GlobalLogic relayed a new requirement from Google: contractors may no longer skip such prompts. Instead, they must rate the parts they do understand and add a note that they lack expertise in the area. The change has sparked widespread concern among contractors that the practice could hurt Gemini's accuracy on complex topics.
In internal communications, some contractors pointed out that the skip option had existed precisely to keep scoring accurate, and that the new rules force them to evaluate questions they have no experience with, such as rare diseases. Internal emails show the original guideline read: "If you do not have the necessary expertise for this task, please skip it." The new guideline reads: "Prompts that require expertise should not be skipped." The policy change has left contractors uneasy.
Under the new rules, contractors may skip an evaluation task in only two situations: when the prompt or response is completely missing information, or when the content may be harmful and requires special consent to evaluate. Although the rules are intended to improve Gemini's performance, in practice they may weaken the feedback Gemini receives on complex topics.
Google has not responded on the matter, and contractors' concerns continue to grow.
Highlights:
Contract workers are being asked to evaluate AI-generated responses on topics in which they have no expertise, especially sensitive areas such as health care.
The new rules eliminate the "skip" option, requiring contractors to rate responses even when they lack the relevant expertise.
The policy may hurt Gemini's accuracy on complex topics and has caused unease among contract workers.
With Google yet to respond and contractors' concerns mounting, Gemini's development faces new challenges. Methods for evaluating the accuracy of AI models still need improvement to ensure the reliability and safety of AI technology.