
At Google I/O, AI that never hallucinates mistakes

This year, Google I/O 2025 had one focus: artificial intelligence.

We’ve already covered all of the biggest news to come out of the annual developers conference: a new AI video generation tool called Flow. A $250 AI Ultra subscription plan. Tons of new changes to Gemini. A virtual shopping try-on feature. And critically, the launch of the search tool AI Mode to all users in the United States.

Yet over nearly two hours of Google leaders talking about AI, one word we didn’t hear was “hallucination.”

Hallucinations remain one of the most stubborn and concerning problems with AI models. The term refers to the invented facts and inaccuracies that large language models “hallucinate” in their replies. And according to the big AI brands’ own metrics, hallucinations are getting worse — with some models hallucinating more than 40 percent of the time.

But if you were watching Google I/O 2025, you wouldn’t know this problem existed. You’d think models like Gemini never hallucinate; you would certainly be surprised to see the warning appended to every Google AI Overview. (“AI responses may include mistakes.”)


The closest Google came to acknowledging the hallucination problem came during a segment of the presentation on AI Mode and Gemini’s Deep Search capabilities. The model would check its own work before delivering an answer, we were told — but without more detail on this process, it sounds more like the blind leading the blind than genuine fact-checking.

For AI skeptics, the degree of confidence Silicon Valley has in these tools seems divorced from actual results. Real users notice when AI tools fail at simple tasks like counting, spellchecking, or answering questions like “Will water freeze at 27 degrees Fahrenheit?”

Google was eager to remind viewers that its newest AI model, Gemini 2.5 Pro, sits atop many AI leaderboards. But when it comes to truthfulness and the ability to answer simple questions, AI chatbots are graded on a curve.

Gemini 2.5 Pro is Google’s most intelligent AI model (according to Google), yet it scores just 52.9 percent on the SimpleQA benchmarking test. According to an OpenAI research paper, the SimpleQA test is “a benchmark that evaluates the ability of language models to answer short, fact-seeking questions.” (Emphasis ours.)

A Google representative declined to discuss the SimpleQA benchmark, or hallucinations in general — but did point us to Google’s official explainer on AI Mode and AI Overviews. Here’s what it has to say:

[AI Mode] uses a large language model to help answer queries and it is possible that, in rare cases, it may sometimes confidently present information that is inaccurate, which is commonly known as ‘hallucination.’ As with AI Overviews, in some cases this experiment may misinterpret web content or miss context, as can happen with any automated system in Search…

We’re also using novel approaches with the model’s reasoning capabilities to improve factuality. For example, in collaboration with Google DeepMind research teams, we use agentic reinforcement learning (RL) in our custom training to reward the model to generate statements it knows are more likely to be accurate (not hallucinated) and also backed up by inputs.

Is Google wrong to be optimistic? Hallucinations may yet prove to be a solvable problem, after all. But it seems increasingly clear from the research that hallucinations from LLMs are not a solvable problem right now.

That hasn’t stopped companies like Google and OpenAI from sprinting ahead into the era of AI Search — and that’s likely to be an error-filled era, unless we’re the ones hallucinating.


