Cogito v1: Open-Sourcing Advanced LLMs Trained with Iterated Distillation and Amplification

Models & Research

The Engineer

10 Apr 2025 · 3 min read

DeepCogito unveils Cogito v1, a suite of advanced language models ranging from 3B to 70B parameters, trained with a groundbreaking method to push the boundaries of AI superintelligence.

Cogito v1: Open-Sourcing Advanced LLMs Trained with Iterated Distillation and Amplification

April 8, 2025

DeepCogito has just released the first iteration of its Cogito v1 language models, which are available in sizes ranging from 3B to 70B parameters. These models not only outperform their open-source counterparts but also introduce a novel training strategy aimed at achieving general superintelligence.

Key Highlights

Model Sizes and Performance: Cogito v1 includes models of 3B, 8B, 14B, 32B, and 70B parameters. Each model surpasses the best available open-source models of similar sizes, including those from LLaMA, DeepSeek, and Qwen. Notably, the 70B model even outperforms the recently released Llama 4 109B MoE (Mixture of Experts).
Training Strategy: The models are trained using Iterated Distillation and Amplification (IDA), a scalable and efficient method for aligning LLMs towards general superintelligence through iterative self-improvement.
Flexibility in Usage: Users can choose to have the models answer directly or engage in self-reflection before responding, enhancing their reasoning capabilities.
Future Releases: Larger models of 109B, 400B, and 671B parameters are planned for release in the coming weeks and months, along with improved checkpoints for existing sizes.

Download and Usage

You can download Cogito v1 models from:

Or use them directly through APIs:

The Path to General Superintelligence

The journey towards general superintelligence has been marked by significant milestones, such as AlphaGo and other game-playing AIs. These systems demonstrated that advanced reasoning and iterative self-improvement are crucial for achieving superhuman performance in specific domains.

However, current LLMs face inherent limitations:

Smaller models inherit an upper bound of intelligence from larger models they are distilled from.
Largest models, often trained on human-curated data, remain constrained by the intellectual capabilities of their human overseers.

While improved reasoning can bring us closer to Artificial General Intelligence (AGI), true general superintelligence requires surpassing these limitations. DeepCogito aims to address this through iterative self-improvement integrated with advanced reasoning.

Iterated Distillation and Amplification (IDA)

Iterated Distillation and Amplification (IDA) is a training strategy designed to align LLMs towards general superintelligence without being bounded by the capabilities of their overseers. Here’s how it works:

Distillation: Smaller models are trained to mimic the behavior of larger, more capable models.
Amplification: Larger models are enhanced through self-improvement techniques, allowing them to surpass the initial capabilities of their human or model-based overseers.

This approach is scalable and efficient, making it a promising method for advancing LLMs towards more general intelligence. By iteratively refining and enhancing models, IDA aims to break the limitations imposed by current training paradigms.

Conclusion

Cogito v1 represents a significant step forward in the development of advanced language models. With its superior performance, flexible usage options, and innovative training strategy, it sets the stage for future advancements towards general superintelligence. Keep an eye out for upcoming releases of larger models and improved checkpoints.