
DeepCogito unveils Cogito v1, a suite of language models ranging from 3B to 70B parameters, trained with a method the company presents as a step toward general superintelligence.
April 8, 2025
DeepCogito has just released the first iteration of its Cogito v1 language models, available in sizes ranging from 3B to 70B parameters. According to the company, the models outperform their open-source counterparts, and they introduce a novel training strategy aimed at general superintelligence.
The Cogito v1 models are available for download and can also be used directly through hosted APIs.

The journey towards general superintelligence has been marked by significant milestones, such as AlphaGo and other game-playing AIs. These systems demonstrated that advanced reasoning and iterative self-improvement are crucial for achieving superhuman performance in specific domains.
However, current LLMs face an inherent limitation: under today's training paradigms, their capabilities are bounded by those of their overseers.
While improved reasoning can bring us closer to Artificial General Intelligence (AGI), true general superintelligence requires getting past that ceiling. DeepCogito aims to do so by integrating iterative self-improvement with advanced reasoning.
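Iterated Distillation and Amplification (IDA) is a training strategy designed to align LLMs towards general superintelligence without being bounded by the capabilities of their overseers. As the name suggests, it alternates two steps: amplification, in which extra computation is used to reach a capability beyond what the model produces directly, and distillation, in which that improved capability is trained back into the model's parameters. Repeating the cycle lets each round start from a stronger model.
DeepCogito presents the approach as scalable and efficient, and as a promising way to advance LLMs towards more general intelligence. By iteratively refining and enhancing models, IDA aims to break the limitations imposed by current training paradigms.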
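The amplify-then-distill loop implied by the name can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not DeepCogito's training code: the "model" is a single number `mu`, `score` is a hypothetical task reward, amplification is plain best-of-N sampling, and distillation is a simple step toward the amplified answer.

```python
import random


def score(x: float) -> float:
    """Hypothetical task reward: higher is better, with the peak at x = 10."""
    return -abs(x - 10.0)


def amplify(mu: float, rng: random.Random, n_samples: int = 16,
            spread: float = 1.0) -> float:
    """Spend extra compute: sample candidates around the model's current
    answer and keep the best-scoring one (the current answer is included,
    so the result never scores worse than mu)."""
    candidates = [mu] + [rng.gauss(mu, spread) for _ in range(n_samples)]
    return max(candidates, key=score)


def distill(mu: float, target: float, lr: float = 0.5) -> float:
    """Fold the amplified answer back into the model's single parameter."""
    return mu + lr * (target - mu)


def run_ida(mu: float = 0.0, rounds: int = 30, seed: int = 0) -> list[float]:
    """Alternate amplification and distillation, recording mu after each round."""
    rng = random.Random(seed)
    history = [mu]
    for _ in range(rounds):
        better = amplify(mu, rng)   # amplification: search beyond the model
        mu = distill(mu, better)    # distillation: internalize the result
        history.append(mu)
    return history


if __name__ == "__main__":
    history = run_ida()
    print(f"start score: {score(history[0]):.3f}")
    print(f"final score: {score(history[-1]):.3f}")
```

The point of the toy is that no external teacher appears anywhere in the loop: each round's training target comes from the model's own amplified output, so the next round searches from a better starting point.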
Cogito v1 combines strong reported performance, flexible usage options, and a training strategy built around iterative self-improvement, which DeepCogito frames as groundwork for further progress towards general superintelligence. Keep an eye out for upcoming releases of larger models and improved checkpoints.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.