Cohere Labs Releases 35B-Parameter Command-R LLM with Multilingual and RAG Capabilities

The Engineer

12 Mar 2024 · 3 min read

Cohere Labs unveils Command-R, a 35-billion-parameter LLM built for reasoning and summarization, evaluated in 10 languages and offering strong retrieval-augmented generation (RAG) performance.

Cohere Labs has released a new large language model (LLM) named Command-R. This research release is a non-quantized version of the model, featuring 35 billion parameters and optimized for various applications such as reasoning, summarization, and question answering. Command-R also boasts multilingual capabilities, evaluated in 10 languages, and strong Retrieval-Augmented Generation (RAG) performance.

Key Technical Details

Model Size: 35 billion parameters
Context Length: 128K tokens
Multilingual Support: Evaluated in 10 languages
RAG Capabilities: Highly performant for retrieval-augmented generation tasks
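
To make the context-length and RAG figures above concrete, here is a minimal Python sketch of how retrieved documents might be packed into a 128K-token window. It is an illustration only, not Cohere's method: `pack_documents` and `approx_token_count` are hypothetical helpers, "128K" is assumed to mean 128,000 tokens, and the word-count function is a crude stand-in for the model's real tokenizer.

```python
from typing import Callable, List

CONTEXT_WINDOW = 128_000  # assumption: "128K" taken as 128,000 tokens


def approx_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def pack_documents(
    docs: List[str],
    question: str,
    max_new_tokens: int,
    context_window: int = CONTEXT_WINDOW,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Greedily keep documents, in rank order, while they fit the budget."""
    # Reserve room for the question and for the tokens to be generated
    budget = context_window - count_tokens(question) - max_new_tokens
    selected: List[str] = []
    for doc in docs:
        cost = count_tokens(doc)
        if cost > budget:
            break  # stop at the first document that no longer fits
        selected.append(doc)
        budget -= cost
    return selected


# Tiny demonstration with an artificially small window of 8 "tokens"
demo = pack_documents(
    ["a b c", "d e", "f g h i"],
    question="q q",
    max_new_tokens=1,
    context_window=8,
)
print(demo)
```

In practice you would pass a function based on the model's own tokenizer as `count_tokens`, so that the budget reflects real token counts rather than word counts.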

Access and Licensing

To access the model, you need to agree to share your contact information and accept the conditions outlined by Cohere. This includes adhering to the CC-BY-NC License and Cohere Labs' Acceptable Use Policy. You can log in or sign up on Hugging Face to review these conditions and access the model content.

Quantized Version

If you prefer a more resource-efficient version, a 4-bit quantized variant of Command-R is available. It is quantized with the bitsandbytes library:

Quantized Model: CohereLabs/c4ai-command-r-v01-4bit
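
A back-of-the-envelope calculation shows why quantization matters at this scale. The sketch below estimates weight storage only, for 35 billion parameters at several precisions. The helper name `weight_memory_gb` is illustrative, and the figures ignore activations, the KV cache, and quantization metadata, so real memory use will be higher.

```python
# Rough weight-memory estimate for a 35B-parameter model.
# Assumption: only the weights are counted, nothing else.

PARAMS = 35_000_000_000  # parameter count stated in the article

BITS_PER_PARAM = {
    "float32": 32,
    "float16/bfloat16": 16,
    "8-bit": 8,
    "4-bit": 4,
}


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, bits)
    for name, bits in BITS_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name:>18}: ~{gb:.1f} GB")
```

Under these assumptions the weights alone take roughly 70 GB in 16-bit precision versus about 17.5 GB at 4 bits.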

Try It Out

Before diving into the full model, you can try Command-R in a Hugging Face Space. This provides a quick and easy way to test its capabilities without downloading the weights:

Try Command-R: Hugging Face Space

Usage Example

To use Command-R, you need transformers version 4.39.1 or higher. Here’s a simple example to get you started:

# pip install 'transformers>=4.39.1'
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format message with the command-r chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

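# Sample up to 100 new tokens; a low temperature keeps the output focused
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

# Decode the full sequence (prompt plus completion) back to text
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)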
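
Because `generate` returns the prompt tokens followed by the newly generated ones for decoder-only models, the decoded text in the example includes the prompt. One possible way to keep only the completion is to slice off the prompt length before decoding. The helper below is a hypothetical sketch, shown on plain Python lists so it runs without downloading the model; the token IDs are made up.

```python
def completion_only(sequence, prompt_length):
    """Return the tokens that follow the prompt.

    Works on any sliceable 1-D sequence, such as a list or a 1-D tensor.
    """
    if prompt_length < 0 or prompt_length > len(sequence):
        raise ValueError("prompt_length must be between 0 and len(sequence)")
    return sequence[prompt_length:]


# Made-up token IDs: the first three stand in for the prompt
prompt_ids = [5, 17, 42]
generated = prompt_ids + [101, 102, 103]

new_ids = completion_only(generated, len(prompt_ids))
print(new_ids)
```

Applied to the usage example, this would correspond to `completion_only(gen_tokens[0], input_ids.shape[-1])` before calling `tokenizer.decode`.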

Why It Matters

Command-R stands out among large language models for its multilingual capabilities and RAG performance. The 128K-token context length lets it process long inputs, making it suitable for tasks that involve understanding and generating lengthy text, such as summarizing long documents or answering questions over many retrieved passages. For researchers and practitioners, the model is a capable tool for a wide range of natural language processing (NLP) applications.

Conclusion

Cohere Labs' Command-R is a versatile LLM with a focus on performance and multilingual support. Whether you're working on reasoning tasks, summarization, or question answering, this model provides a strong foundation. For those who need to balance performance with resource efficiency, the quantized version is also available.