Aya Expanse: Cohere's State-of-the-Art Multilingual Models to Bridge the Language Gap

Models & Research

The Engineer

25 Oct 2024 · 3 min read

Cohere For AI introduces Aya Expanse, a suite of multilingual models that excel in 23 languages, surpassing competitors in accuracy and efficiency, to significantly narrow the global language divide.

Cohere For AI, the research arm of Cohere, has unveiled Aya Expanse, a groundbreaking family of multilingual models designed to enhance language support and bridge the global language gap. This new suite of models excels across 23 languages and outperforms leading open-weight models in both accuracy and performance.

Technical Breakdown

Model Variants

Aya Expanse 8B: An accessible, 8 billion parameter model available on Hugging Face.
Aya Expanse 32B: A state-of-the-art, 32 billion parameter model also available on Hugging Face and Kaggle.

These models are part of Cohere’s ongoing commitment to multilingual research and aim to accelerate advancements in the field. The 8B model is particularly noteworthy for making cutting-edge research more accessible to a broader range of researchers worldwide.

Performance Highlights

Aya Expanse 32B outperforms:
- Gemma 2 27B
- Mistral 8x22B
- Llama 3.1 70B (a model more than twice its size)
Aya Expanse 8B outperforms leading models in its parameter class, such as:
- Gemma 2 9B
- Llama 3.1 8B
- Ministral 8B

Win rates for Aya Expanse 8B range from 60.4% to 70.6%, demonstrating significant improvements in multilingual performance.

Research and Development

Aya Expanse builds on Cohere’s extensive work over the past two years, which has involved collaboration with over 3,000 researchers from 119 countries. This collaborative effort has led to several critical milestones:

Aya Collection: The largest multilingual dataset collection to date, featuring 513 million examples.
Evaluation Sets:
- Multilingual Performance: A comprehensive suite for evaluating model performance across multiple languages.
- Safety: A red-teaming dataset to ensure models are safe and reliable.

Aya-101: The most comprehensive multilingual model covering 101 languages, released as part of the Aya initiative.

Key Innovations

The improvements in Aya Expanse stem from a sustained focus on expanding AI's language capabilities. Cohere’s research agenda has included:

Data Arbitrage: Leveraging high-quality data to improve model training and performance.
Model Architecture: Optimizing the architecture for better multilingual support, including techniques like cross-lingual transfer learning.
Evaluation Metrics: Developing robust metrics to accurately assess multilingual capabilities.

Why It Matters

For practitioners, Aya Expanse represents a significant step forward in addressing the language gap. The models' high performance across multiple languages means they can be used in a wide range of applications, from speech recognition and translation to content generation and understanding. This is particularly important for underrepresented languages, where quality AI support has been lacking.

Conclusion

Aya Expanse is not just another model; it's a significant leap in multilingual AI research. By making these models open and accessible, Cohere For AI continues to drive innovation and inclusivity in the field. Whether you're a researcher looking to push the boundaries of multilingual AI or a developer seeking robust language support, Aya Expanse offers a powerful toolset.