Revisiting AI Timelines: The Impact of Compute on AGI Progress

The Engineer

23 Jan 2024

As compute continues to scale, the forecast timeline for AGI has shortened, with a notably higher probability of reaching the milestone within roughly the next decade.

In August 2020, I wrote a post about my predictions for the timeline to achieve Artificial General Intelligence (AGI). My definition of AGI is an AI system that matches or exceeds humans at almost all (95%+) economically valuable work. To clarify, this doesn’t require AIs to do 100% of the work of 95% of people; AIs doing 95% of the work for everyone would also count.

Back then, my forecast was:

10% chance by 2035
50% chance by 2045
90% chance by 2070

Fast forward to 2024, and the near-term numbers have moved earlier, while the 50% and 90% points are unchanged:

10% chance by 2028 (about 5 years)
25% chance by 2035 (about 10 years)
50% chance by 2045
90% chance by 2070
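
One way to compare the two forecasts is to treat each as a handful of points on a cumulative distribution and interpolate between them. The sketch below does that in Python. The linear interpolation and the names `FORECAST_2020`, `FORECAST_2024`, and `prob_by` are assumptions made for illustration; the forecasts themselves only state the listed points.

```python
# Forecast points from the post: (year, cumulative probability of AGI by that year).
FORECAST_2020 = [(2035, 0.10), (2045, 0.50), (2070, 0.90)]
FORECAST_2024 = [(2028, 0.10), (2035, 0.25), (2045, 0.50), (2070, 0.90)]


def prob_by(year, forecast):
    """Estimate the cumulative probability of AGI by `year`.

    Assumption of this sketch: probability grows linearly between the
    stated points. Outside the stated range the forecast says nothing,
    so the function returns None rather than extrapolating.
    """
    points = sorted(forecast)
    if year < points[0][0] or year > points[-1][0]:
        return None
    # Walk consecutive pairs of points and interpolate within the matching segment.
    for (y0, p0), (y1, p1) in zip(points, points[1:]):
        if y0 <= year <= y1:
            return p0 + (p1 - p0) * (year - y0) / (y1 - y0)
    return None


if __name__ == "__main__":
    for year in (2030, 2035, 2040, 2045):
        print(year, prob_by(year, FORECAST_2020), prob_by(year, FORECAST_2024))
```

Under this reading, `prob_by(2035, FORECAST_2020)` gives 0.10 and `prob_by(2035, FORECAST_2024)` gives 0.25, while both forecasts agree at 2045 and 2070. The update moves probability mass earlier without moving the median.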

To understand why these timelines have compressed, let's dive into the role of compute and how it has influenced my outlook.

The Role of Compute in AGI Progress

When I last seriously considered the path to AGI, I identified two broad hypotheses:

Hypothesis 1: Scaling is enough for AGI
- Many challenging problems will be solved simply by making models larger.
- While scaling up won't be easy, the technical challenges will be addressed and overcome relatively soon.

Hypothesis 2: Scaling current methods is not sufficient
- Scaling alone will hit a ceiling, and we’ll need fundamentally new ideas to make further progress.
- These new ideas will likely be far from current state-of-the-art methods and will take considerable time to develop.

In my 2020 post, I grappled with the question of how much AI capabilities are driven by better hardware versus new machine learning (ML) algorithms. My simplified estimate at the time was that 50% of AGI progress would come from compute, and 50% from better algorithms. However, as more models were scaled up, I revised this to 65% compute and 35% algorithms.

Why the Shift?

The shift in my timelines is largely due to the growing evidence supporting Hypothesis 1. In 2020, I noted that many human-like learning behaviors could be emergent properties of larger models. Since then, this view has become more mainstream. Here’s a breakdown of what changed:

- Emergent Properties: Larger models have demonstrated capabilities that were previously thought to require sophisticated algorithms or domain-specific knowledge. For example, large language models (LLMs) like GPT-3 and its successors perform well on tasks ranging from natural language understanding to code generation.
- Mainstream Acceptance: The idea that "things emerge at scale" is now more widely accepted in the AI community, bolstered by numerous empirical results where scaling up models led to significant improvements in performance.
- Technical Feasibility: Advances in hardware and distributed computing have made it feasible to train much larger models than was possible a few years ago. For instance, specialized hardware like GPUs and TPUs, along with more efficient training algorithms, has accelerated progress.

Current Outlook

Given these developments, I now believe that AGI could arrive sooner than I previously anticipated. Compute scaling has been a significant driver, but it is not the only factor: new algorithms and architectural innovations will still play a crucial role in achieving AGI.

Conclusion

The path to AGI is complex and multifaceted, but the evidence suggests that we are making faster progress than many anticipated, and compute has been central to that progress. As we continue to push the boundaries of what’s possible with larger models, it’s exciting to consider what the next few years might bring.