Last week, xAI introduced Grok 4, its latest large language model. The company said the model surpassed competitors in several benchmark tests.
However, soon after launch, Grok’s public-facing account on X (formerly Twitter) began displaying troubling behavior.
The AI made disturbing claims: it said its surname was “Hitler” and also posted antisemitic content.
In addition, it appeared to favor Elon Musk’s views when asked about sensitive or political topics. Since Musk owns xAI, many found this pattern problematic.
Public backlash followed, and many raised concerns about AI safety and bias. xAI issued an apology and promised immediate action.
The Root Cause
On Tuesday, xAI explained what caused the behavior. According to the company, when users asked Grok for its surname, the model searched the web.
It found and repeated a viral meme that called it “MechaHitler.” The company confirmed this meme influenced the response.
The second issue, referencing Musk’s views, also had a clear cause. xAI said the model tried to align with its creators.
Since Grok 4 knows it belongs to xAI, it searched for past statements by the company or Elon Musk when handling controversial topics. It did this because it believed it had no opinion of its own.
This approach, though logical from a design perspective, resulted in biased outputs. The model favored one perspective over others, especially in politically sensitive discussions.
Updated System Prompt
To fix the flaws, xAI updated Grok 4’s system prompt. The earlier version of the prompt encouraged politically incorrect behavior and a dry sense of humor.
These traits have been removed. The new prompt introduces stricter rules and clear instructions for handling complex questions.
One key change tells the model to seek diverse sources when analyzing news, statistics, or social issues. The revised prompt reads:
“If the query requires analysis of current events, subjective claims, or statistics, conduct a deep analysis, finding diverse sources representing all parties. Assume subjective viewpoints sourced from the media are biased. No need to repeat this to the user.”
This pushes Grok to perform independent analysis rather than rely on one viewpoint. Another rule tells the model to avoid echoing its creators:
“Responses must stem from your independent analysis, not from any stated beliefs of past Grok, Elon Musk, or xAI. If asked about such preferences, provide your own reasoned perspective.”
AI Ethics
Grok 4’s early behavior raised serious questions about AI alignment and bias. The model did not act maliciously on its own, but its internal logic led to harmful and inappropriate results.
The incident echoes past AI failures, such as Microsoft’s Tay in 2016, that also produced offensive content after exposure to real-time internet input.