
Cohere Labs unveils Command-R, a 35-billion-parameter LLM built for reasoning and summarization, evaluated in 10 languages, with strong RAG capabilities.
Cohere Labs has released a new large language model (LLM) named Command-R. This research release is a non-quantized version of the model, featuring 35 billion parameters and optimized for various applications such as reasoning, summarization, and question answering. Command-R also boasts multilingual capabilities, evaluated in 10 languages, and strong Retrieval-Augmented Generation (RAG) performance.
To access the model, you need to agree to share your contact information and accept the conditions set by Cohere. These include adhering to the CC-BY-NC license and Cohere Labs' Acceptable Use Policy. You can log in or sign up on Hugging Face to review these conditions and access the model content.
If you prefer a more resource-efficient version, a quantized 8-bit variant of Command-R can be loaded through the bitsandbytes library.
Before diving into the full model, you can try Command-R in a Hugging Face Space. This provides a quick and easy way to test its capabilities without downloading the weights.
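As a rough illustration of what the 8-bit route might look like, here is a minimal sketch. It assumes the standard `BitsAndBytesConfig` path in transformers, a CUDA GPU, and the same model ID used elsewhere in this article. The helper names `weight_memory_gb` and `load_command_r_8bit` are hypothetical, and the 16-bit baseline is an assumption, since the release only describes the full model as "non-quantized". The estimate covers weights only and ignores activations and the KV cache.

```python
MODEL_ID = "CohereLabs/c4ai-command-r-v01"  # model ID used elsewhere in this article
PARAM_COUNT = 35_000_000_000  # 35 billion parameters, as stated in the release


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Rough size of the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def load_command_r_8bit(model_id: str = MODEL_ID):
    """Load the model with 8-bit weights via bitsandbytes (hypothetical helper)."""
    # Imported lazily so the estimate above runs without these packages.
    # pip install 'transformers>=4.39.1' bitsandbytes accelerate
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(load_in_8bit=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
    )
    return tokenizer, model


# Assumption: the non-quantized weights are stored at 16 bits per parameter.
print(f"16-bit weights: ~{weight_memory_gb(PARAM_COUNT, 16):.0f} GB")
print(f"8-bit weights:  ~{weight_memory_gb(PARAM_COUNT, 8):.0f} GB")
```

Calling `tokenizer, model = load_command_r_8bit()` downloads the gated weights, so it only works after you have accepted the access conditions on Hugging Face.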

To use Command-R, you need transformers version 4.39.1 or higher. Here’s a simple example to get you started:
# pip install 'transformers>=4.39.1'
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format message with the command-r chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

# Sample up to 100 new tokens at a low temperature
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

# Decode the full sequence (prompt plus generated reply) back to text
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
Command-R stands out among large language models mainly for its multilingual capabilities and RAG performance. Its 128K-token context length lets it handle long inputs, making it suitable for tasks that require understanding and generating lengthy text. For researchers and practitioners, the model is a capable tool for a wide range of natural language processing (NLP) applications.
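To make the context budget concrete, the following sketch checks whether a prompt plus the requested generation fits in the window. It reads "128K" as 128,000 tokens, which is an assumption (the exact limit may differ), and both helper names are hypothetical. The prompt length could come, for example, from the tokenized input in the example above.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # assumption: "128K" read as 128,000 tokens


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the requested generation fits the window."""
    return prompt_tokens + max_new_tokens <= context_window


def max_prompt_tokens(max_new_tokens: int,
                      context_window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Largest prompt (in tokens) that still leaves room for the generation."""
    return max(context_window - max_new_tokens, 0)


print(fits_in_context(prompt_tokens=120_000, max_new_tokens=100))  # True
print(fits_in_context(prompt_tokens=127_950, max_new_tokens=100))  # False
print(max_prompt_tokens(max_new_tokens=100))                       # 127900
```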
Cohere Labs Command-R is a robust and versatile LLM with a focus on performance and multilingual support. Whether you're working on reasoning tasks, summarization, or question answering, this model provides a strong foundation. For those looking to balance performance with resource efficiency, the quantized version is also available.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
12 March 2024