LINUX DO Channel

Carlxlx posted in "Be cautious when writing with DeepSeek-R1: its hallucination rate is too high"

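Here are the hallucination rates of the mainstream models, taken from HHEM (the AI hallucination benchmark published by Vectara, the most authoritative to date).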

OpenAI

GPT-4o: 1.5%
o1-mini: 1.4%
o1: 2.4%
o3-mini-high-reasoning: 0.8%


Google

Gemini-2.0-Flash-Exp: 1.3%
Gemini-2.0-Flash-Thinking-Exp: 1.8%
Gemini-2.0-Pro-Exp: 0.8%


Anthropic

Claude-3.7-Sonnet: 4.4%


DeepSeek

DeepSeek-V3: 3.9%
DeepSeek-R1: 14.3%
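
To put the gap in perspective, here is a rough Python sketch that ranks the figures quoted above and works out how many times higher DeepSeek-R1's rate is than DeepSeek-V3's. The numbers are copied from this post only; the helper names `ranked` and `ratio` are made up for illustration and are not part of any Vectara tooling.

```python
# Hallucination rates in percent, copied from the figures quoted in this post.
rates = {
    "GPT-4o": 1.5,
    "o1-mini": 1.4,
    "o1": 2.4,
    "o3-mini-high-reasoning": 0.8,
    "Gemini-2.0-Flash-Exp": 1.3,
    "Gemini-2.0-Flash-Thinking-Exp": 1.8,
    "Gemini-2.0-Pro-Exp": 0.8,
    "Claude-3.7-Sonnet": 4.4,
    "DeepSeek-V3": 3.9,
    "DeepSeek-R1": 14.3,
}


def ranked(table):
    """Return (model, rate) pairs sorted from lowest to highest rate."""
    return sorted(table.items(), key=lambda pair: pair[1])


def ratio(table, a, b):
    """Return how many times higher model a's rate is than model b's."""
    return table[a] / table[b]


if __name__ == "__main__":
    for model, rate in ranked(rates):
        print(f"{model:32s} {rate:5.1f}%")
    r1_vs_v3 = ratio(rates, "DeepSeek-R1", "DeepSeek-V3")
    print(f"DeepSeek-R1 vs DeepSeek-V3: {r1_vs_v3:.1f}x")
```

On these figures DeepSeek-R1 ranks last, at roughly 3.7 times the rate of DeepSeek-V3.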




GitHub - vectara/hallucination-leaderboard: Leaderboard Comparing LLM Performance at Producing H...