MiniMax M3 Challenges GPT-5.5 and Gemini 3.1 Pro with Superior Performance at a Fraction of the Cost

Models & Research

The Engineer

8 Jun 2026 · 4 min read

Chinese AI startup MiniMax has unveiled M3, a powerful large language model that combines top-tier performance with open-source flexibility, all while slashing costs for enterprises.

The landscape of enterprise AI just got a major shake-up. On Sunday evening Eastern time, Chinese AI startup MiniMax released its highly anticipated M3 large language model (LLM). The new entrant offers frontier-tier coding and agentic performance, a 1-million-token context window, and native multimodality, all at a fraction of the cost of leading proprietary models such as GPT-5.5 and Gemini 3.1 Pro.

MiniMax M3 is available via the MiniMax API at a discounted price of $0.30 per million input tokens and $1.20 per million output tokens (on fresh cache) for the next week. Even at its full price of $0.60 per million input tokens and $2.40 per million output tokens, M3 costs just 8-20% as much as leading U.S. models.
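To make the per-token pricing concrete, the following is a minimal sketch that estimates a bill from the promotional and full prices quoted above. The function name `estimate_cost` and the monthly workload of 50 million input tokens and 10 million output tokens are hypothetical, chosen only for illustration; cache discounts and subscription plans are not modeled.

```python
from decimal import Decimal

# USD per 1 million tokens, as quoted in the article.
PROMO = {"input": Decimal("0.30"), "output": Decimal("1.20")}
FULL = {"input": Decimal("0.60"), "output": Decimal("2.40")}

MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, prices: dict) -> Decimal:
    """Return the USD cost of a workload at the given per-million-token prices."""
    input_cost = Decimal(input_tokens) / MILLION * prices["input"]
    output_cost = Decimal(output_tokens) / MILLION * prices["output"]
    return input_cost + output_cost


# Hypothetical workload: 50M input tokens and 10M output tokens.
promo_cost = estimate_cost(50_000_000, 10_000_000, PROMO)
full_cost = estimate_cost(50_000_000, 10_000_000, FULL)
print(f"promotional: ${promo_cost:.2f}, full price: ${full_cost:.2f}")
```

For this assumed workload the promotional bill comes to $27 and the full-price bill to $54, since both full prices are exactly double the promotional ones.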

Redefining the LLM Paradigm

LLM development has long forced a rigid trade-off: developers can either access top-tier closed-source intelligence behind restrictive APIs or deploy nimble, cost-effective open models that falter on multi-step reasoning, dense coding tasks, and very long input sequences. MiniMax M3 upends this trade-off.

Unified Capabilities: M3 unifies the best of both worlds by combining high performance with open-source flexibility.
Cost Efficiency: M3 is offered at $20 per month under MiniMax's new subscription token plans, a price aimed at budget-conscious enterprises.
Open-Source License: The company announced plans to release M3 under an open-source license, including open weights, so enterprises can download and customize the model free of charge. The weights are expected within the next 10 days.

MiniMax highlights M3's strengths in three areas:

Coding Tasks: M3 excels in multi-step reasoning and dense coding tasks, outperforming GPT-5.5 and Gemini 3.1 Pro.
Context Window: A 1-million-token context window allows for handling massive data sequences with ease.
Multimodality: Native support for multimodal inputs and outputs enhances its versatility.

| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Total Cost (input + output) | Source |
|-----------------------|--------|--------|----------------------|-------------|
| MiMo-V2.5 Flash | $0.10 | $0.30 | $0.40 | Xiaomi MiMo |
| deepseek-v4-flash | $0.14 | $0.28 | $0.42 | DeepSeek |
| deepseek-v4-pro | $0.435 | $0.87 | $1.305 | DeepSeek |
| MiniMax-M3 | $0.30 | $1.20 | $1.50 (limited time) | MiniMax |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | $1.75 | Google |
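The total column above simply adds one million input tokens to one million output tokens, which assumes an even mix. Real workloads are often input-heavy, so a blended price can rank models differently. The sketch below recomputes the ranking from the table's prices; the helper names and the 75% input share are hypothetical choices for illustration, not figures from MiniMax or any other vendor.

```python
from decimal import Decimal

# (model, input USD per 1M tokens, output USD per 1M tokens), copied from the table.
PRICES = [
    ("MiMo-V2.5 Flash", Decimal("0.10"), Decimal("0.30")),
    ("deepseek-v4-flash", Decimal("0.14"), Decimal("0.28")),
    ("deepseek-v4-pro", Decimal("0.435"), Decimal("0.87")),
    ("MiniMax-M3", Decimal("0.30"), Decimal("1.20")),
    ("Gemini 3.1 Flash-Lite", Decimal("0.25"), Decimal("1.50")),
]


def blended_price(input_price, output_price, input_share):
    """Average USD price per 1M tokens when `input_share` of all tokens are input."""
    return input_price * input_share + output_price * (1 - input_share)


def rank(input_share):
    """Return model names ordered from cheapest to most expensive at this token mix."""
    ordered = sorted(PRICES, key=lambda row: blended_price(row[1], row[2], input_share))
    return [name for name, _, _ in ordered]


even_mix = rank(Decimal("0.5"))      # same ordering as the table's total column
input_heavy = rank(Decimal("0.75"))  # assumed 3:1 input-to-output ratio
print(even_mix)
print(input_heavy)
```

With an even mix the order matches the table. At the assumed 3:1 input-to-output ratio, MiniMax-M3 (about $0.525 per million blended tokens) moves ahead of deepseek-v4-pro (about $0.544), because M3's input price is lower even though its output price is higher.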

Under the Hood

To understand how MiniMax M3 achieves such impressive performance and cost efficiency, let's dive into some of its key architectural details:

Model Architecture: M3 is built on a transformer-based architecture, leveraging advanced attention mechanisms to handle long context windows. The model uses a combination of self-attention and cross-attention layers to efficiently process multimodal inputs.
Training Data: MiniMax has access to a vast and diverse dataset, which includes text, images, and other multimedia content. This rich training data helps M3 excel in multimodal tasks.
Optimization Techniques: The team at MiniMax employs various optimization techniques, such as mixed precision training and model pruning, to reduce the computational footprint without sacrificing performance.
Scalability: M3 is designed to scale efficiently on a variety of hardware setups, from cloud-based GPUs to edge devices. This scalability ensures that enterprises can deploy the model in different environments with minimal overhead.
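The distinction between self-attention and cross-attention mentioned above can be sketched in a few lines. This is a generic, textbook illustration of scaled dot-product attention in pure Python, not M3's actual implementation, which the article does not detail. The toy `text` and `image` vectors are hypothetical, and the learned query, key, and value projections that real transformer layers use are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 hypothetical text-token vectors
image = [[0.5, 0.5], [2.0, 0.0]]             # 2 hypothetical image-patch vectors

# Self-attention: queries, keys, and values all come from the same sequence.
self_out = attention(text, text, text)

# Cross-attention: text queries attend over keys and values from the image sequence.
cross_out = attention(text, image, image)

print(len(self_out), len(cross_out))
```

In both cases the output has one vector per query, so the text sequence keeps its length of three; only the source of the keys and values changes.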

Key Takeaways

MiniMax M3 represents a significant leap forward in the world of large language models. By combining top-tier performance with open-source flexibility and cost efficiency, M3 is poised to disrupt the current landscape dominated by proprietary models. Enterprises now have a powerful tool at their disposal that can handle complex tasks while keeping costs under control.

For developers and researchers, the planned release of M3's open weights opens up new possibilities for experimentation and customization. The model's reported benchmark performance and its cost structure make it an attractive option for both small startups and large enterprises looking to leverage AI without breaking the bank.