Nvidia CEO Jensen Huang recently stirred up a buzz during a conversation with podcaster Lex Fridman by asserting that artificial general intelligence (AGI) has already been achieved. The claim touches on a long-standing goal of AI research, even though experts still debate what AGI actually means. Broadly, AGI refers to AI with human-level intelligence, but there is significant contention over how to characterize and quantify intelligence itself.
Fridman offered a concrete criterion for AGI: if an AI could found and grow a technology business to a $1 billion valuation, it could be classified as AGI. Huang responded that he believed this level of achievement was already upon us, though he qualified his statement by noting that the $1 billion threshold was arbitrary. "You said a billion," he remarked, suggesting the figure could just as easily be something else.
Most AI researchers appear to disagree with this particular definition, arguing that it is too specific. While it focuses on business success, AGI is typically understood as encompassing a broader array of cognitive skills similar to those of humans, many of which may not be relevant for running a business. However, within the research community, there remains no consensus on an ideal definition, leaving the term somewhat elusive. Notably, several prominent AI organizations, boasting a combined market value exceeding $1 trillion, claim they are striving toward AGI. Some researchers even avoid using the term due to its ambiguous nature, while others suggest that companies exploit the ill-defined concept to generate hype around their advancements.
The buzz around Huang's comments underscores a persistent challenge in the AI landscape: how to measure AGI at all. Just prior to Fridman's podcast, researchers at Google DeepMind—including co-founder Shane Legg, who played a key role in popularizing the term AGI in the early 2000s—released a research paper aimed at establishing more scientific criteria for assessing whether an AI has achieved general intelligence. Titled "Measuring Progress Toward AGI: A Cognitive Framework," the study introduces a "Cognitive Taxonomy" that identifies ten essential cognitive faculties, such as reasoning and social cognition, as indicators of general intelligence.
The authors suggest that AI systems should be evaluated across these faculties and their performance compared to a standard sample of human adults with at least a secondary education background. They highlight that contemporary AI models have a fragmented cognitive profile: while excelling in specific areas like mathematics and factual recall, they lag in others, such as experiential learning or social understanding. To qualify as AGI, an AI must reach median human performance across all ten dimensions, according to the researchers.
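The qualification rule described above—median human performance on every faculty, not just most—can be sketched as a simple check. The faculty names, normalization, and scores below are hypothetical illustrations, not values from the DeepMind paper:

```python
# Hypothetical sketch of an "all faculties" qualification rule.
# Scores are normalized so the human median on each faculty is 100.

HUMAN_MEDIAN = 100.0

def qualifies_as_agi(faculty_scores: dict[str, float]) -> bool:
    """True only if the model meets the human median on every faculty."""
    return all(score >= HUMAN_MEDIAN for score in faculty_scores.values())

# A jagged profile: superhuman factual recall, weak social cognition.
model = {
    "reasoning": 115.0,
    "factual_recall": 140.0,
    "social_cognition": 60.0,
}
print(qualifies_as_agi(model))  # False: one weak faculty disqualifies it
```

The point of the `all(...)` aggregation is that excelling in some areas cannot compensate for falling short in others, which is exactly why today's "fragmented" models would not qualify.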
As part of their efforts, the team also announced a $200,000 contest on Kaggle to solicit external assistance in developing evaluations for five cognitive faculties that lack robust benchmark tests. This paper is among several recent initiatives to inject scientific rigor into the measurement of intelligence.
Last year, a group led by Dan Hendrycks at the Center for AI Safety published its own AGI framework and metrics, dividing intelligence into ten cognitive domains based on a validated model of human intelligence. Their study assigned "AGI Scores" to existing AI models; OpenAI's GPT-5 scored just 57%, indicating a significant gap from the cognitive versatility of a well-educated adult.
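A composite percentage like the 57% figure can be illustrated as an equal-weighted mean over the ten domains. The domain scores below are invented for illustration and are not the paper's actual per-domain results, nor is equal weighting necessarily the study's exact method:

```python
# Hypothetical illustration of a composite "AGI Score": ten per-domain
# percentages (invented numbers) combined with equal weight.

def agi_score(domain_scores: list[float]) -> float:
    """Equal-weighted mean of per-domain percentages (0-100)."""
    assert len(domain_scores) == 10, "the framework defines ten domains"
    return sum(domain_scores) / len(domain_scores)

# An invented jagged profile: strong in early domains, weak in later ones.
scores = [90, 85, 80, 75, 70, 55, 40, 35, 25, 15]
print(f"{agi_score(scores):.0f}%")  # prints "57%"
```

An averaged score like this can mask a jagged profile (strong recall, weak experiential learning), which is why the DeepMind framework instead requires meeting the human median in every faculty.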
Adding to the conversation is François Chollet, a machine learning researcher, who proposed the ARC-AGI benchmark as a means to evaluate how well current AI systems learn new skills rather than what they already know. His benchmark comprises visual puzzle challenges that require nuanced reasoning—tasks where AI struggles compared to humans.
Chollet's team recently unveiled ARC-AGI-3, a more sophisticated version of the benchmark featuring interactive tasks that require AI agents to navigate new situations and learn continuously, reflecting abilities that humans excel at but AI is still striving to master. These new benchmarks showcase the community's push toward creating empirical measures of AGI, yet defining intelligence remains a longstanding challenge.
The difficulty is not new. In 1950, before the term "artificial intelligence" had even been coined, British mathematician Alan Turing grappled with how to define machine intelligence. Rather than attempt a rigid definition, Turing formulated the "Imitation Game," later known as the Turing Test, which held that a machine able to pass as human in conversation could be considered intelligent. The test itself faced scrutiny, however, when early chatbots like ELIZA demonstrated the ability to fool users without possessing anything like true understanding.
The Turing Test's limitations prompted the evolution of the term “artificial general intelligence,” first used in a 1997 paper by Mark Gubrud and later popularized by Legg in the early 2000s. Gubrud described AGI as a system that could rival human cognitive capabilities across various tasks. Yet the term still lacks clarity, with various definitions circulating in both academic and corporate environments, often causing confusion around its implications.
For instance, DeepMind has embraced AGI as part of its corporate mission since its founding in 2010, and OpenAI later explicitly committed to pursuing AGI in its foundational principles. Notably, OpenAI's agreement with Microsoft has reportedly tied AGI to a profit benchmark of $100 billion, a threshold the company has not approached, raising further questions about what constitutes AGI.
Even OpenAI CEO Sam Altman, who by other measures claims the company is close to achieving AGI, sometimes acknowledges the term's vagueness. Meanwhile, Microsoft researchers' portrayal of GPT-4 as showing "sparks" of AGI has faced critique for exaggerating the model's capabilities.
All of this illustrates how hard it is to define and measure AGI amid fast-moving technology and competing corporate narratives. Huang, a pivotal figure in Nvidia's ascent to a market value exceeding $4 trillion, recognizes these nuances. His acknowledgment of the gap between human cognition and AI capabilities suggests that, however vibrant the AGI conversation becomes, we might need a new standard entirely—perhaps "artificial Jensen intelligence"—to truly capture and celebrate the singular achievements of innovative minds in the AI landscape.