
AI industry battles rising hallucination rates

AI hallucinations

The newest and most powerful AI systems from companies like OpenAI, Google, and the Chinese start-up DeepSeek generate more errors, not fewer. Although their math skills have markedly improved, their handling of factual information has become less reliable. It is not entirely clear why this is happening.

According to OpenAI’s internal tests, its newest o3 and o4-mini reasoning models hallucinate between 33% and 79% of the time, depending on the benchmark. The o3 model hallucinated 33% of the time on PersonQA, a benchmark that asks questions about public figures.

https://x.com/natalieben/status/1919723316513763631

On SimpleQA, which focuses on short fact-based questions, o3’s hallucination rate was 51%. The o4-mini model performed worse, hallucinating 41% of the time on PersonQA and 79% on SimpleQA.

https://x.com/realDanWagner/status/1919473701541552253

Chinese company DeepSeek’s R1 reasoning model also hallucinates more than the company’s conventional models, according to tests by the AI research firm Vectara.

These models can hallucinate at each step of their multi-step “thinking” processes, compounding the chances of an incorrect final answer. Some suggest that the training behind these reasoning models may be the root cause. AI models are trained on large datasets and respond to queries with the most statistically likely answer.

Addressing AI hallucination challenges

When asked questions that fall outside their training data, models can fabricate information. Incomplete or biased datasets and flaws in the training process further exacerbate the issue.

OpenAI’s o3 model, for instance, is designed to maximize the chance of giving an answer, which makes it more likely to produce an incorrect response than to admit it doesn’t know. OpenAI has acknowledged its models’ hallucination rates in research papers and stated that more research is needed to address the problem. OpenAI’s CEO has even suggested that hallucinations add value to AI systems, though this perspective is widely debated.


Companies are actively working on potential fixes. Microsoft and Google have released products aimed at flagging potentially incorrect information in AI responses, though experts doubt these will fully eliminate hallucinations. Many researchers consider stopping AI bots from hallucinating impossible, but efforts are ongoing to reduce these rates.

Some propose teaching AI models to express uncertainty so they decline to answer rather than fabricate one. Others employ “retrieval-augmented generation,” a technique in which the AI retrieves relevant documents and uses them as reference material before generating an answer. Not all researchers agree on terminology: some criticize the term “hallucination,” arguing that it needlessly anthropomorphizes AI models by attributing intent and consciousness they do not possess.
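
To make the retrieval-augmented generation idea concrete, here is a minimal sketch in Python. It is not any vendor’s implementation: the document list, the word-overlap retrieval, and the prompt-printing step are toy stand-ins for a real vector database and language model API.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Illustrative only: retrieval uses simple word-overlap scoring instead of a
# real vector index, and the language-model call is left as a placeholder.

from collections import Counter

# A tiny in-memory "knowledge base" the model is allowed to cite from.
DOCUMENTS = [
    "PersonQA is a benchmark that asks models questions about public figures.",
    "SimpleQA is a benchmark of short, fact-seeking questions.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]

def score(query: str, doc: str) -> int:
    """Count how many query words also appear in the document (toy relevance)."""
    query_words = Counter(query.lower().split())
    doc_words = set(doc.lower().split())
    return sum(count for word, count in query_words.items() if word in doc_words)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    ranked = sorted(DOCUMENTS, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Assemble a prompt that tells the model to answer only from the context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return (
        "Answer the question using only the context below. "
        'If the context does not contain the answer, say "I don\'t know."\n\n'
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

if __name__ == "__main__":
    # In a real system this prompt would be sent to an LLM API;
    # here we simply print it to show the grounding step.
    print(build_prompt("What does the SimpleQA benchmark measure?"))
```

Note that the prompt also folds in the “express uncertainty” idea from the same paragraph, instructing the model to say “I don’t know” when the retrieved context does not contain the answer.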

Understanding and addressing AI hallucinations remains a critical challenge as these systems become more integrated into everyday life. Until these issues are resolved, users should approach AI-generated responses with caution and double-check critical information.

Image Credits: Photo by David Pupăză on Unsplash

Johannah Lopez is a versatile professional who seamlessly navigates two worlds. By day, she excels as a SaaS freelance writer, crafting informative and persuasive content for tech companies. By night, she showcases her vibrant personality and customer service skills as a part-time bartender. Johannah's ability to blend her writing expertise with her social finesse makes her a well-rounded and engaging storyteller in any setting.
