Elon Musk’s AI company, xAI, released its latest flagship AI model, Grok 3, late Monday night, along with new capabilities in the Grok app for iOS and the web.
Grok, xAI’s answer to models like OpenAI’s GPT-4o and Google’s Gemini, can analyze images and respond to questions, and powers a number of features on Musk’s social network, X. Grok 3, which has been in development for several months, was optimistically slated for release in 2024, but missed that deadline.
xAI has been using an enormous data center in Memphis, containing around 200,000 GPUs, to train Grok 3. In a post on X, Musk claimed that Grok 3 was developed with “10x” more computing power than Grok 2, its predecessor, and with an expanded training data set that ostensibly includes filings from court cases.

“Grok 3 is an order of magnitude more capable than Grok 2,” Musk said during a live-streamed presentation Monday. “[It’s a] maximally truth-seeking AI, even if that truth is sometimes at odds with what is politically correct.”
To be precise, Grok 3 is a family of models rather than a single model. A smaller version of Grok 3, Grok 3 mini, responds to questions more quickly at the cost of some accuracy. Not all of the models are available yet, but the rollout began Monday.
xAI claims that Grok 3 beats GPT-4o on benchmarks including AIME, which evaluates a model’s performance on a sampling of math questions, and GPQA, which tests models with PhD-level physics, biology, and chemistry questions. An early version of Grok 3 also scored competitively in Chatbot Arena, a crowdsourced test that pits different AI models against each other and has users vote on their preferred responses, according to xAI.

Two variations of Grok 3, Grok 3 Reasoning and Grok 3 mini Reasoning, can carefully “think through” problems, similar to “reasoning” models like OpenAI’s o3-mini and Chinese AI company DeepSeek’s R1. Reasoning models try to fact-check themselves thoroughly before giving results, which helps them avoid some of the pitfalls that normally trip up models.
xAI claims that Grok 3 Reasoning surpasses the best version of o3-mini — o3-mini high — on several popular benchmarks, including a newer mathematics benchmark called AIME 2025.

The reasoning models can be accessed via the Grok app. Users can ask Grok 3 to “think,” or — for more difficult questions — leverage “Big Brain” mode for additional, more careful reasoning. xAI describes the modes as best suited for mathematics-, science-, and coding-related questions.
Musk said that some of the reasoning process is being obscured to prevent distillation, a method used by AI model developers to extract knowledge from another model. Recently, Chinese AI company DeepSeek was accused of distilling OpenAI’s models to create its own.
Grok’s reasoning mode joins another new feature called DeepSearch, xAI’s answer to AI-powered “deep research” tools like OpenAI’s Deep Research. DeepSearch scans the internet and X to analyze information and deliver an abstract in response to a query.
Subscribers to X’s Premium+ tier will get Grok 3 first, and other features are gated behind a new plan that xAI is calling SuperGrok. Priced at $30 per month or $300 per year, SuperGrok unlocks additional reasoning and DeepSearch queries and throws in unlimited image generation.

In the future — as soon as about a week from now — Grok will gain a voice mode, Musk said. A few weeks later, the Grok 3 models will arrive in xAI’s API.
When Musk announced Grok roughly two years ago, he pitched the AI as edgy, unfiltered, and anti-“woke” — in general, willing to answer controversial questions other AI systems won’t. He delivered on some of that promise. Told to be vulgar, for example, Grok and Grok 2 would happily oblige, spewing colorful language you likely wouldn’t hear from ChatGPT.
But Grok models prior to Grok 3 hedged on political subjects and wouldn’t cross certain boundaries. In fact, one study found that Grok leaned to the political left on topics like transgender rights, diversity programs, and inequality.
Musk has blamed the behavior on Grok’s training data — public web pages — and pledged to “shift Grok closer to politically neutral.” It’s not clear yet whether xAI achieved that goal.