INTELLECT-1: Scaling Globally-Distributed Training to 10B Parameters for Open Source AGI

Models & Research

The Engineer

15 Oct 2024

INTELLECT-1 ushers in a new era of open-source AGI by launching the world's first globally-distributed training for a 10-billion-parameter model, inviting everyone to contribute and democratize AI development.

We're thrilled to announce the launch of INTELLECT-1, a milestone on the path towards open-source Artificial General Intelligence (AGI). It is the first globally distributed training run of a 10-billion-parameter model, and anyone with compute resources can participate. This step is crucial for democratizing the training of cutting-edge AI models and for keeping AGI accessible and transparent.

What Changed Technically?

Scaling Up from OpenDiLoCo:

Initial Work: We previously published OpenDiLoCo, an open-source implementation of DeepMind’s Distributed Low-Communication (DiLoCo) method. This allowed us to successfully train a 1-billion-parameter model.
Current Milestone: INTELLECT-1 scales this up by a further 10×, to 10 billion parameters, which is approximately 25× larger than the models trained in the original DiLoCo research.

Why It Matters to Practitioners

Key Benefits:

Democratization of AI Training: By enabling anyone to contribute compute resources, we're making it possible for more individuals and organizations to participate in training large-scale models.
Efficiency in Distributed Training: The DiLoCo method significantly reduces communication requirements, making it feasible to train models on poorly connected devices. This is particularly important for global collaboration where network conditions can vary widely.
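
To make the low-communication idea concrete, here is a minimal toy sketch of a DiLoCo-style loop in plain Python. It is not the Prime or OpenDiLoCo implementation: the quadratic loss, the plain-SGD inner optimizer, the function names, and the hyperparameter values are all assumptions chosen for illustration. What it shows is the structure: each worker takes many local steps with no communication, and workers exchange only an averaged pseudo-gradient once per outer round.

```python
def local_gradient(params, target):
    # Gradient of the toy loss 0.5 * sum((p - t)^2) on one worker's data shard.
    return [p - t for p, t in zip(params, target)]


def diloco_train(targets, outer_rounds=20, inner_steps=50,
                 inner_lr=0.1, outer_lr=0.7, outer_momentum=0.9):
    """Toy DiLoCo-style loop: train locally, then sync once per outer round."""
    dim = len(targets[0])
    global_params = [0.0] * dim
    velocity = [0.0] * dim
    syncs = 0
    for _ in range(outer_rounds):
        pseudo_grads = []
        for target in targets:  # one entry per worker ("island")
            local = list(global_params)
            for _ in range(inner_steps):  # local steps, no communication
                grad = local_gradient(local, target)
                local = [p - inner_lr * g for p, g in zip(local, grad)]
            # Pseudo-gradient: how far this worker moved from the shared weights.
            pseudo_grads.append([g0 - p for g0, p in zip(global_params, local)])
        # The only communication: average the pseudo-gradients across workers.
        avg = [sum(col) / len(pseudo_grads) for col in zip(*pseudo_grads)]
        syncs += 1
        # Outer optimizer (assumed here): SGD with Nesterov momentum.
        velocity = [outer_momentum * v + g for v, g in zip(velocity, avg)]
        global_params = [p - outer_lr * (g + outer_momentum * v)
                         for p, g, v in zip(global_params, avg, velocity)]
    return global_params, syncs


# Two hypothetical workers whose local optima differ.
targets = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
final_params, syncs = diloco_train(targets)
print(final_params, syncs)
```

In this toy setup each worker runs 1,000 local steps in total but communicates only 20 times, and the shared parameters still settle near the average of the workers' optima.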

Launch Partners and Contributors

We are honored to be joined by leading open-source AI players such as Hugging Face, SemiAnalysis, Arcee, Hyperbolic, Olas, Akash, Schelling AI, and many others who are contributing compute resources to this training run. This collaboration underscores the community's commitment to advancing open-source AI.

How to Contribute Compute

If you're interested in contributing your compute resources to advance open-source AI, here’s how you can get involved:

Dashboard: https://app.primeintellect.ai/intelligence
Code Repository: https://github.com/PrimeIntellect-ai/Prime

Paradigm Shift for Distributed Training

As Jack Clark, co-founder of Anthropic, noted, no model has yet been efficiently trained at the scale of 10B parameters across globally distributed workers. Our initial OpenDiLoCo run broke through the 1B parameter barrier, and with INTELLECT-1, we are setting a new standard for low-communication training.

Prime: Our Distributed Training Framework

Since our initial open-source release, we have made significant improvements to our distributed training framework:

Algorithmic Progress

Quantization Experiments: Ablations building on our OpenDiLoCo work show that communication requirements can be reduced further. Specifically:
- Pseudo-Gradient Quantization: Quantizing pseudo-gradients to int8 and synchronizing only at outer optimizer steps reduces bandwidth requirements by up to 2000x. This is consistent with the two savings multiplying: int8 values are 4x smaller than fp32 values (assuming an fp32 baseline), and the outer sync is up to 500x less frequent than per-step synchronization.
- Data Parallel Training: DiLoCo allows data parallel training on separate "islands" of devices that synchronize pseudo-gradients only every few hundred steps. This reduces the frequency of communication by up to 500x.
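
As a rough illustration of what int8 quantization of a pseudo-gradient can look like, the sketch below uses symmetric quantization with a single per-tensor scale. The source does not say which quantization scheme Prime uses, so the scheme, the function names, and the sample values are assumptions. The sketch only shows that each value shrinks from 4 bytes to 1 while the round-trip error stays within half a quantization step.

```python
import array


def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: returns (int8 codes, scale)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = array.array(
        "b", (max(-127, min(127, round(v / scale))) for v in values)
    )
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical pseudo-gradient values, for illustration only.
pseudo_grad = [0.02, -0.5, 0.131, 0.0, 0.499]
codes, scale = quantize_int8(pseudo_grad)
restored = dequantize_int8(codes, scale)

max_err = max(abs(a - b) for a, b in zip(pseudo_grad, restored))
fp32_bytes = array.array("f", pseudo_grad).itemsize * len(pseudo_grad)
int8_bytes = codes.itemsize * len(codes)
print(f"max error {max_err:.5f}, {fp32_bytes} bytes -> {int8_bytes} bytes")
```

A real implementation would also need to send the scale alongside the codes, which adds a small constant overhead per tensor.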

Implementation Details

Communication Efficiency: DiLoCo significantly lowers the bandwidth requirements for distributed training by reducing the frequency and size of data exchanges.
Scalability: The framework is designed to scale efficiently, making it suitable for large-scale models like INTELLECT-1.
Community Involvement: By leveraging a global network of contributors, we can pool resources to achieve what would otherwise be prohibitively expensive.
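
The "frequency and size" framing above can be checked with back-of-the-envelope arithmetic. The sketch below assumes a baseline that exchanges fp32 values for every parameter on every step, and compares it with int8 values exchanged once every 500 steps. It ignores all-reduce topology, quantization metadata, and any other protocol overhead, so the numbers are illustrative, not measurements from the INTELLECT-1 run.

```python
def amortized_bytes_per_step(num_params, bytes_per_value, steps_between_syncs):
    """Average payload one worker exchanges per training step."""
    return num_params * bytes_per_value / steps_between_syncs


NUM_PARAMS = 10_000_000_000  # 10B parameters

# Assumed baseline: fp32 values (4 bytes each) synchronized every step.
baseline = amortized_bytes_per_step(NUM_PARAMS, 4, 1)
# Low-communication setting: int8 values (1 byte each) every 500 steps.
low_comm = amortized_bytes_per_step(NUM_PARAMS, 1, 500)

reduction = baseline / low_comm
print(f"baseline: {baseline / 1e9:.0f} GB per step")
print(f"low-communication: {low_comm / 1e6:.0f} MB per step (amortized)")
print(f"reduction: {reduction:.0f}x")
```

Under these assumptions the 4x size saving and the 500x frequency saving multiply to the 2000x figure quoted above.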

Looking Ahead

INTELLECT-1 represents a significant step towards democratizing the training of advanced AI models. As we continue to improve our distributed training framework and collaborate with the community, we aim to make AGI open-source, transparent, and accessible to all.