Elon Musk’s AI venture, xAI, has launched Grok 4, its most advanced artificial intelligence model to date. According to xAI, the model leads the industry in academic performance, reasoning, and coding.
During a livestream on X (formerly Twitter) late Wednesday night, Musk called it the “smartest AI in the world.”
In addition to Grok 4, xAI has introduced Grok 4 Heavy, a sophisticated variant that employs multiple AI agents working together like a virtual “study group” to tackle intricate tasks. This launch is complemented by a premium subscription option: SuperGrok Heavy, providing access to this formidable model for just $300 a month.
Benchmark Showdown: Grok 4 vs. Competitors
According to xAI, both Grok 4 and its multi-agent counterpart, Grok 4 Heavy, outperformed major competitors such as Google’s Gemini 2.5 Pro and OpenAI’s o3-high on several widely cited AI benchmarks:
- Humanity’s Last Exam (HLE):
  - Grok 4 scored 25.4% without tools, ahead of Gemini 2.5 Pro (21.6%) and o3-high (21%).
  - With tools, Grok 4 Heavy achieved 44.4%, well ahead of Gemini’s 26.9%.
- ARC-AGI-2 (Pattern Recognition Test):
  - Grok 4 scored 16.2%, nearly double the score of the next best model, Claude Opus 4.
- MMLU (Massive Multitask Language Understanding):
  - Grok 4 achieved 86.6% accuracy and an Intelligence Index score of 73, leading the pack.
“Grok 4 represents a significant leap forward, as it is the first AI model capable of solving complex, real-world engineering problems where solutions cannot be found online or in existing literature. And it will only improve,” Musk shared on X.
In the realms of STEM and coding, Grok 4 demonstrates unparalleled strength:
- Grok 4 Heavy aced the AIME, a challenging high school-level math test, with a perfect score of 100%, while Grok 4 followed closely with 98.8%.
- On the GPQA, Grok 4 scored 87.5%, with Grok 4 Heavy slightly ahead at 88.9%.
- For developers, xAI announced the upcoming Grok 4 Code, set to debut in August 2025, which xAI says already reaches accuracy rates between 72% and 75% on SWE-bench.
When comparing Grok 4 to PhD-level expertise, Musk stated, “Grok 4 operates at a postgraduate level, even exceeding PhD capabilities — no exceptions. Most PhDs would struggle where Grok 4 would excel.”
Musk acknowledged that Grok 4 still has room for improvement in common-sense reasoning and has not invented new technologies or discovered new physics. “Yet,” he added, pointing to future potential.
Pricing Updates
The Grok 4 API pricing remains consistent with its predecessor: $3 per million input tokens and $15 per million output tokens ($0.75 for 1M cached input tokens).
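To put those per-token rates in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration built from the prices quoted above: the function name `estimate_cost` is hypothetical, and the sketch assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate. Actual billing rules may differ.

```python
# Prices in USD per one million tokens, as quoted in the article.
INPUT_PER_M = 3.00
CACHED_INPUT_PER_M = 0.75
OUTPUT_PER_M = 15.00


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of a request (hypothetical helper).

    Assumption: cached_input_tokens is the part of input_tokens that is
    billed at the cached rate rather than the standard input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed total input tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input * INPUT_PER_M
        + cached_input_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# 200,000 input tokens (50,000 of them cached) and 20,000 output tokens.
print(f"${estimate_cost(200_000, 20_000, cached_input_tokens=50_000):.4f}")
```

Under these assumptions, one million uncached input tokens plus one million output tokens would come to $18, with output tokens accounting for most of the bill.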
However, the real surprise lies in xAI’s revised subscription options:
- Free Tier: Provides limited access to Grok 3.
- SuperGrok Plan ($30/month): Grants access to both Grok 3 and the new Grok 4.
- SuperGrok Heavy ($300/month): Offers full access to Grok 4 Heavy, Grok 4, and Grok 3, along with early previews of upcoming features.
Is Grok 4 Ready to Compete with GPT-5?
xAI is making a bold push to dominate the AI landscape just as OpenAI gears up for the anticipated launch of GPT-5 later this summer. Even with Grok 4’s reported benchmark results, it remains to be seen whether businesses and consumers will overlook recent controversies and choose Musk’s platform for their AI needs.