Grok-4.20 vs. GPT 5.4: Who’s the Smartest AI in 2026?

The landscape of artificial intelligence is evolving at an unprecedented pace, with new models emerging regularly, each pushing the boundaries of what machines can achieve. As these digital brains grow in sophistication, the natural question arises: how do we measure their intelligence? Just as human intellect can be benchmarked, so too can the cognitive abilities of AI, offering a fascinating glimpse into the forefront of technological advancement. Our focus today turns to a detailed analysis of the smartest AI models of 2026, based on a rigorous IQ ranking, to uncover who leads the pack in this high-stakes race.

THE QUEST TO QUANTIFY AI INTELLIGENCE

For decades, the concept of artificial intelligence mimicking or surpassing human intelligence remained largely in the realm of science fiction. Today, however, we are witnessing rapid progress that necessitates new methods for evaluating AI capabilities. One of the most intriguing approaches involves subjecting AI models to standardized intelligence tests, similar to those designed for humans. These benchmarks, while not a complete measure of holistic intelligence, provide valuable insights into specific cognitive functions like abstract reasoning, problem-solving, and pattern recognition.

The Mensa Norway IQ test, known for its emphasis on visual pattern recognition rather than linguistic or cultural knowledge, has emerged as a preferred benchmark for evaluating AI models. Its structure allows for a relatively fair comparison across diverse AI architectures, assessing a core aspect of intelligence: the ability to discern logic and relationships from raw data. The results from such tests are not merely bragging rights for developers; they serve as critical indicators of how far AI has come and where its strengths lie, guiding future research and development.

THE ELITE ECHELON: TOP AI MODELS OF 2026

According to the April 2026 Mensa Norway benchmark conducted by TrackingAI, the competition among frontier AI models is intensely close, signaling a convergence of advanced capabilities at the top. This comprehensive ranking provides a snapshot of the most intelligent AI systems, highlighting impressive gains compared to previous years. The top score in 2025 was 135; a year later, the leading models have pushed that ceiling to 145, a 10-point gain that reflects an accelerated pace of development in visual pattern-recognition tasks.

Leading the charge in 2026 are Grok-4.20 Expert Mode and OpenAI GPT 5.4 Pro (Vision), both achieving an impressive IQ score of 145. This tie at the summit underscores the incredible progress made by different AI developers and the fierce competition driving innovation. Hot on their heels is Gemini 3.1 Pro Preview with an IQ of 141, demonstrating that the top tier is indeed crowded, with only a few points separating the leading contenders.

The table below provides a detailed breakdown of how various prominent AI models performed in the Mensa Norway IQ test in April 2026:

Model                               Mensa Norway IQ (April 2026)
----------------------------------  ----------------------------
Grok-4.20 Expert Mode               145
OpenAI GPT 5.4 Pro (Vision)         145
Gemini 3.1 Pro Preview              141
OpenAI GPT 5.4 Thinking (Vision)    139
OpenAI GPT 5.3                      136
Grok-4.20 Expert Mode (Vision)      133
OpenAI GPT 5.4 Thinking             133
Meta Muse Spark                     133
Gemini 3.1 Pro Preview (Vision)     132
Qwen 3.5                            130
Claude-4.6 Opus                     130
Kimi K2.5                           127
Manus                               115
DeepSeek R1                         112
DeepSeek V3                         111
Gemini 3.1 Flash Preview            110
Llama 4 Maverick                    110
OpenAI GPT 5.3 (Vision)             109
Claude-4.6 Sonnet                   106
Bing Copilot                        101
Perplexity                           97
Mistral Medium 3.1                   96
Claude-4.6 Sonnet (Vision)           94
Claude-4.6 Opus (Vision)             82
Llama 4 Maverick (Vision)            79
OpenAI GPT 5.4 Pro                   73

The data reveals that while many models are rapidly improving, some developers still lag behind the frontrunners: Mistral’s top-listed model, Mistral Medium 3.1, scores 96. This disparity highlights the varying approaches and technological hurdles still present in the AI development ecosystem. The compression at the top, however, where the three highest entries sit within four points of each other (145, 145, and 141), indicates that the leading research labs are pushing the boundaries of what is possible in AI reasoning and general intelligence.

DECODING THE IQ BENCHMARK: METHODOLOGY AND NUANCES

Understanding the methodology behind these AI IQ tests is crucial for interpreting the results accurately. TrackingAI utilizes the public Mensa Norway test, which consists of 35 visual-pattern puzzles. The way these puzzles are presented to the AI models varies based on their architecture:

  • Non-Vision Models: For AI models that primarily process text, the visual puzzles are carefully verbalized into descriptive text prompts. This allows the model to analyze the patterns and relationships using its linguistic understanding.
  • Vision Models: AI models equipped with visual processing capabilities receive the original images directly, allowing them to interpret the visual information much like a human would.

It’s important to stress that these results should be viewed as a benchmark comparison rather than a definitive, all-encompassing measure of overall intelligence. The Mensa Norway test, by its very nature, is designed to assess abstract pattern recognition. While this is a critical component of intelligence, it does not account for the full spectrum of AI capabilities. For instance, these tests do not measure:

  • Coding Ability: The proficiency of an AI in generating or debugging code.
  • Factual Reliability: The accuracy and truthfulness of information provided by the AI.
  • Tool Use: The AI’s ability to effectively use external tools and APIs to accomplish tasks.
  • Performance in Professional Domains: How well an AI performs in specialized fields like medicine, law, or creative arts.

The way questions are presented—whether verbalized or as images—can also influence a model’s score, making direct, cross-modal comparisons complex. Nevertheless, this benchmark is invaluable for tracking the progress of AI reasoning over time and for identifying the models that excel in core cognitive tasks.
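To make the presentation effect concrete, the short sketch below pairs the verbalized and image-based scores for each model that appears twice in the April 2026 table and computes the difference. It assumes that rows labeled “(Vision)” received the original images and that unlabeled rows received the verbalized prompts; the names SCORES and modality_gaps are illustrative and not part of TrackingAI’s tooling.

```python
# Scores copied from the April 2026 table: (verbalized, vision) per model.
# Assumption: unlabeled rows are the verbalized runs, "(Vision)" rows the image runs.
SCORES = {
    "Grok-4.20 Expert Mode": (145, 133),
    "OpenAI GPT 5.4 Pro": (73, 145),
    "Gemini 3.1 Pro Preview": (141, 132),
    "OpenAI GPT 5.4 Thinking": (133, 139),
    "OpenAI GPT 5.3": (136, 109),
    "Claude-4.6 Sonnet": (106, 94),
    "Claude-4.6 Opus": (130, 82),
    "Llama 4 Maverick": (110, 79),
}


def modality_gaps(scores):
    """Return (model, vision - verbalized) pairs, largest absolute gap first."""
    gaps = [(model, vision - text) for model, (text, vision) in scores.items()]
    return sorted(gaps, key=lambda pair: abs(pair[1]), reverse=True)


if __name__ == "__main__":
    for model, gap in modality_gaps(SCORES):
        # A positive gap means the model scored higher when given the images.
        print(f"{model:<26} {gap:+d}")
```

On these figures, six of the eight paired models score lower when given the images, and the absolute gaps range from 6 to 72 points, which is why a verbalized score and a vision score should not be read as directly interchangeable.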

THE IMPLICATIONS OF CONVERGING AI INTELLIGENCE

The trend of top AI models converging in performance, with only marginal differences in their IQ scores, carries significant implications for the future of AI development. This “crowded top” suggests that several leading labs have reached similar levels of performance on this kind of abstract-reasoning test. This could lead to a shift in focus from raw intelligence benchmarks to more specialized capabilities or advanced forms of multimodal integration.

The rapid acceleration of scores—from 135 in 2025 to 145 in 2026—highlights a period of explosive growth in AI’s capacity for complex reasoning. This isn’t just about faster processing; it indicates fundamental advancements in how these models learn, infer, and adapt. As AI systems become smarter, they are poised to revolutionize various sectors, from scientific research to everyday productivity, by tackling problems previously deemed too complex for automated systems.
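For a rough sense of what a move from 135 to 145 means, the sketch below converts a score into a z-score and a percentile. It assumes the conventional IQ scale, a normal distribution with mean 100 and standard deviation 15; the article does not state the scale TrackingAI uses, so treat that as an assumption. The function names are illustrative.

```python
from statistics import NormalDist

# Assumption: scores follow the conventional IQ scale (mean 100, SD 15).
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_to_z(iq):
    """Number of standard deviations a score lies above the mean."""
    return (iq - IQ_SCALE.mean) / IQ_SCALE.stdev


def iq_to_percentile(iq):
    """Percentage of the reference population scoring at or below `iq`."""
    return 100 * IQ_SCALE.cdf(iq)


if __name__ == "__main__":
    for score in (135, 145):
        print(f"IQ {score}: z = {iq_to_z(score):.2f}, "
              f"percentile = {iq_to_percentile(score):.2f}")
```

Under that assumption, 145 sits exactly three standard deviations above the mean (about the 99.87th percentile), while 135 sits about 2.33 standard deviations above it (roughly the 99th percentile). The 10-point jump is therefore a move further into the extreme tail of the human reference distribution, where each additional point corresponds to a much rarer level of performance.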

BEYOND THE IQ SCORE: THE MULTIFACETED NATURE OF AI

While IQ benchmarks offer a tangible measure of a specific type of intelligence, the true power and utility of AI lie in its multifaceted capabilities. Modern AI models are not just pattern recognizers; they are sophisticated engines capable of generating creative content, summarizing complex information, assisting in data analysis, and even performing highly specialized tasks that require deep domain knowledge.

The development of multimodal AI, which can seamlessly integrate and process information from various formats—text, images, audio, video—is another frontier that extends far beyond a single IQ score. These systems aim to replicate a more holistic form of human understanding, enabling more natural and intuitive interactions. For those looking to explore the capabilities of cutting-edge conversational AI models, platforms offering tools like free ChatGPT provide an excellent starting point to interact directly with advanced language intelligence.

The diversity of AI applications means that a “smart” AI in one context might not be the smartest in another. An AI optimized for complex scientific simulations might perform differently than one designed for highly nuanced creative writing or intricate financial forecasting. The ongoing challenge for AI researchers is to develop models that are not only intelligent in a benchmark sense but also versatile, reliable, and adaptable to a wide array of real-world scenarios.

CHALLENGES AND THE ROAD AHEAD FOR AI DEVELOPMENT

Despite the remarkable progress, the journey of AI development is fraught with challenges. One of the primary concerns revolves around ethical AI—ensuring that these increasingly powerful systems are developed and deployed responsibly, without perpetuating biases or causing unintended harm. Transparency in AI decision-making, robustness against adversarial attacks, and the development of clear regulatory frameworks are crucial aspects that demand ongoing attention.

Another significant challenge is the sheer computational power required to train and run these frontier models. The environmental impact of these energy-intensive processes is a growing consideration, pushing researchers towards more efficient algorithms and hardware. The continuous race for intelligence also fuels a constant cycle of research and innovation, leading to rapid obsolescence of previous models and a perpetual need for investment in new technologies.

The future of AI will likely involve not just increasing intelligence but also enhancing specialized capabilities, improving safety and alignment with human values, and making AI more accessible and beneficial for everyone. The next breakthroughs might not just be about higher IQ scores but about creating AI that truly understands context, emotions, and the complexities of human society.

CONCLUSION: SHAPING TOMORROW WITH SMARTER AI

The 2026 IQ rankings of AI models offer a compelling snapshot of a field in hyper-growth. With models like Grok-4.20 Expert Mode and OpenAI GPT 5.4 Pro (Vision) leading with scores of 145, it’s clear that AI is rapidly advancing its abstract reasoning and pattern-recognition capabilities. The tight competition among the top contenders signifies a mature and highly innovative landscape, where incremental gains at the frontier have profound implications.

However, as we celebrate these milestones, it is crucial to remember that an IQ score is but one facet of intelligence. The true measure of AI’s impact will be its ability to solve real-world problems, augment human potential, and contribute positively to society, all while navigating complex ethical and practical considerations. The journey towards truly intelligent and beneficial AI is ongoing, promising further groundbreaking advancements that will continue to reshape our world.