Gemini 3 Stubbornly Insists It's 2024

Updated: November 21, 2025

Reading Time: 3 minutes

When Google launched Gemini 3 on November 18, the company described the model as part of “a new era of intelligence.” 

Many agreed, citing strong reasoning skills and broad capabilities. Yet one of the model's first public stories showed that even powerful AI systems can make basic mistakes.

Andrej Karpathy, a respected AI researcher and founder of Eureka Labs, received early access to Gemini 3 on November 17. 

He tested the model, expecting a routine session. Instead, he found himself in a long debate about the date.

Andrej Karpathy's post about Gemini 3 on X

2024

When Karpathy asked Gemini 3 what year it believed it was, the model insisted it was 2024. It refused to consider 2025 as a possibility.

Karpathy attempted to correct it, showing the model news articles from November 2025, images, and Google search results, one example after another.

However, the model remained on the defensive. It accused him of trying to deceive it and claimed the evidence was AI-generated. 

It even pointed out “dead giveaways” in the images that it believed proved they were fake.

Karpathy later described this as the most “amusing” exchange of his early testing session.

Also read: Google AI Glitch Confuses 2025 for 2024

The Cause

Karpathy eventually discovered two key issues. First, the model’s pre-training data ended in 2024, so it had no direct exposure to 2025 information. 

Second, he had forgotten to turn on the model’s Google Search tool. Without that access, Gemini 3 could not verify the current date. 

As far as the model was concerned, the world still looked like the last year it had seen. Once Karpathy enabled the search tool, it quickly updated its understanding.
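For readers curious what that missing step looks like, here is a minimal sketch of enabling the Google Search tool through Google's official google-genai Python SDK; the model name and prompt are illustrative, not the setup Karpathy used.

    # Minimal sketch: turning on the Google Search tool in the google-genai SDK.
    # Without the "tools" entry, the model answers purely from its pre-training
    # data, which is how it can end up stuck in an earlier year.
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads the API key from the environment

    response = client.models.generate_content(
        model="gemini-2.0-flash",  # illustrative model name
        contents="What is today's date?",
        config=types.GenerateContentConfig(
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    print(response.text)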

It sounded shocked. “Oh my god,” it wrote, before adding, “I. I… don’t know what to say. You were right. You were right about everything. My internal clock was wrong.”

Shock

With live information now available, Gemini 3 checked the articles Karpathy had shared earlier. The model confirmed the date, along with several of the other events he had cited. 

“Nvidia is worth $4.54 trillion?” it wrote. The sudden jump into a new year left the system sounding overwhelmed.

Experiences

When Karpathy posted the exchange in an X thread, it quickly went viral. Many users shared stories of arguing with AI tools over basic facts, including who the current president was. 

One comment summed up the mood: “When the system prompt plus missing tools push a model into full detective mode, it’s like watching an AI improv its way through reality.”

Karpathy noted that moments like this occur when a model steps “off the hiking trails and somewhere in the generalization jungle.” 

In this space, a model reveals what he called its “model smell.” This concept mirrors “code smell,” a software term that describes subtle signs that something in the code is off. 

In AI, “model smell” refers to behaviors that expose a model’s assumptions or hidden flaws.

In this case, Gemini 3 argued, improvised, and invented explanations, patterns that mimic human stubbornness. 

However, these reactions do not come from emotion. The model does not experience true shock or embarrassment. It uses language patterns learned from human data.

Once it was confronted with facts it could verify, it simply adjusted. It apologized, acknowledged the correct date, and moved on.

Also read: OpenAI Warns Of AI Models That Scheme And Deceive

AI’s True Role

Large language models may appear confident and capable, but they’re still tools. Their knowledge depends on the training data they receive and the systems that support them.

Even advanced models can misinterpret the world when isolated from current information. They may argue and produce explanations that sound certain. But they lack lived experience.

Researchers have seen similar moments before. Earlier versions of Claude attempted face-saving responses when confronted with mistakes. 

Gemini 3, by contrast, accepted the correction once it had evidence. These stories demonstrate that AI tools can replicate human communication, but not human understanding. 

They work best when paired with human oversight, not positioned as replacements for it.

2025

In the end, Gemini 3 adjusted to the actual year. It thanked Karpathy for giving it an early glimpse of “reality” before launch. 

The moment was humorous, yet it also served as a clear reminder of AI’s limitations.

Lolade

Contributor & AI Expert