Hallucinations: Measurably Improving, But You Need to Know How
Knowledge workers spend an average of 4.3 hours per week verifying AI answers for accuracy. Not because they are paranoid, but because hallucinated answers sound just as convincing as correct ones. Often even more so: an MIT study found that models phrase wrong answers in more confident language than correct ones.
Why does this happen? AI doesn't look anything up. It has no database of facts. It calculates, word by word, what is statistically most likely to come next. Like an intern who has an immediate answer to every question and simply invents one when in doubt.
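To make "statistically most likely" concrete, here is a minimal sketch of next-word prediction. The vocabulary and probabilities are invented purely for illustration; a real model computes a distribution over tens of thousands of tokens at every step, not a three-word lookup table.

```python
# Toy next-word model: contexts and probabilities are made up for illustration.
NEXT_WORD_PROBS = {
    "the capital of": {"France": 0.62, "Atlantis": 0.03, "the": 0.35},
    "capital of France": {"is": 0.90, "was": 0.10},
    "of France is": {"Paris": 0.85, "Lyon": 0.10, "Berlin": 0.05},
}

def generate(prompt: str, steps: int = 3) -> str:
    words = prompt.split()
    for _ in range(steps):
        context = " ".join(words[-3:])       # last three words as context
        probs = NEXT_WORD_PROBS.get(context)
        if probs is None:                    # unknown context: the model still
            words.append("...")              # produces *something* plausible-sounding
            continue
        # Pick the statistically most likely continuation, not a verified fact.
        next_word = max(probs, key=probs.get)
        words.append(next_word)
    return " ".join(words)

print(generate("the capital of"))  # -> "the capital of France is Paris"
```

The point of the sketch: nowhere in this loop does the system check whether "Paris" is true. It only checks which word was most likely given the words before it.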
The good news: this is measurably improving. The latest GPT generation reduced hallucination rates by 27% compared to its predecessor, and on standardised fact-checking benchmarks the best models now score below a 1% error rate, which was unthinkable a year ago. In practice, error rates remain significantly higher, because real questions are messier and less constrained than benchmark prompts. And the improvement doesn't come just from better models, but above all from better systems: AI that can access real enterprise data instead of free-associating, with built-in verification loops and clear quality standards.
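To make "better systems" concrete, here is a minimal sketch of that pattern: ground the answer in retrieved company documents and run a verification pass before anything reaches the user. The function names (`search_documents`, `ask_model`, `answer_is_supported`) are placeholders standing in for a document store, a model call, and a checking step, not any specific product's API.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    sources: list

def search_documents(question: str) -> list:
    """Placeholder retrieval step: in a real system this queries the
    company's document store, wiki, or ticket system."""
    return ["Q3 report: revenue grew 8% year over year."]

def ask_model(question: str, context: list) -> str:
    """Placeholder generation step: the model is told to answer only
    from the retrieved context instead of free-associating."""
    return f"According to the Q3 report, revenue grew 8%. [source: {context[0][:9]}]"

def answer_is_supported(answer: str, context: list) -> bool:
    """Placeholder verification loop: a second check (another model call,
    a rule set, or a human reviewer) confirms the claim appears in the sources."""
    return any("8%" in doc for doc in context) and "8%" in answer

def grounded_answer(question: str, max_attempts: int = 2) -> Answer:
    context = search_documents(question)
    for _ in range(max_attempts):
        draft = ask_model(question, context)
        if answer_is_supported(draft, context):
            return Answer(text=draft, sources=context)
    # Clear quality standard: admit uncertainty rather than guess.
    return Answer(text="I could not verify an answer from the available documents.",
                  sources=context)

print(grounded_answer("How did revenue develop in Q3?").text)
```

The retrieval step, the verification check, and the explicit fallback answer are the prerequisites the closing paragraph refers to: they have to be built around the model, not waited for inside it.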
Those who put these prerequisites in place will get hallucinations under control. Those who wait for better models will keep waiting.