New Open-Source LLM "Miqu-1-70b" Emerges, Rivals GPT-4 Performance

Models & Research

The Engineer

1 Feb 2024 · 3 min read

A mysterious new open-source language model, miqu-1-70b, has surfaced on HuggingFace, sparking intrigue in the AI community with reported performance媲美GPT-4, despite its unknown origins.

The open-source AI community has been buzzing with excitement over the past few days following the sudden emergence of a new large language model (LLM) called "miqu-1-70b." This 70 billion parameter model was mysteriously posted on HuggingFace and quickly gained attention for its performance, which some benchmarks suggest is on par with OpenAI’s GPT-4. Here's a breakdown of what happened and why it matters.

What Changed

On January 28, a user named "Miqu Dev" uploaded the miqu-1-70b model to HuggingFace, a leading platform for sharing AI models and code. The model's prompt format is identical to that of Mistral, a well-funded Parisian AI company known for its high-performing open-source LLMs like Mixtral 8x7b.

Key points:

Prompt Format: Matches Mistral’s
Parameter Count: 70 billion
Hosting Platform: HuggingFace

Why It Matters to Practitioners

The miqu-1-70b model has generated significant interest due to its performance on common LLM benchmarks. Early tests using the EQ-Bench, a widely recognized benchmark for evaluating LLMs, indicate that it performs exceptionally well, even rivaling GPT-4 in some tasks. This is a major development because:

Performance: Approaches or surpasses state-of-the-art models
Accessibility: Open-source and available on HuggingFace, making it accessible to a wide range of researchers and developers

Technical Details

Quantization

The model's name has led to speculation that "miqu" might stand for "MIstral QUantized." Quantization is a technique used to reduce the computational and memory requirements of AI models by converting high-precision numbers (e.g., 32-bit floats) into lower-precision numbers (e.g., 8-bit integers). This makes it possible to run large models on less powerful hardware.

Benchmarks

Initial benchmarks using EQ-Bench have shown promising results:

EQ-Bench Scores: Competitive with GPT-4 on various tasks, including language understanding and generation

Availability

The model is available in both quantized and unquantized versions. Maxime Labonne, an ML scientist at JP Morgan & Chase, shared a link to the unquantized version on LinkedIn:

Unquantized Version: Available at this link

Community Reaction

The AI community has been quick to react and share their findings. Here are some highlights:

Twitter/X: Users began sharing the discovery of miqu-1-70b, highlighting its performance on various benchmarks.
LinkedIn: ML researchers like Maxime Labonne have discussed the model's potential and shared resources for further exploration.

Speculation and Future Implications

While it’s unclear whether "Miqu Dev" is associated with Mistral AI or if this is an independent effort, the model’s performance suggests a high level of expertise. Some speculate that miqu-1-70b could be a new Mistral model being covertly released to gauge community interest and gather feedback.

Future implications:

Fine-Tuning: We might see fine-tuned versions of miqu-1-70b outperforming GPT-4 in specific domains.
Research Advancements: The open-source nature of the model could accelerate research and development in the AI community.

Conclusion

The emergence of miqu-1-70b is a significant event for the open-source AI community. Its performance on par with GPT-4, combined with its accessibility, makes it a valuable resource for researchers and developers. Whether this is a new Mistral model or an independent project, the impact on the field could be substantial.