OpenAI Releases gpt-oss-120b and gpt-oss-20b: Powerful, Efficient Language Models for Everyone

The Engineer

6 Aug 2025

OpenAI's two new open-weight models pair strong reasoning performance with modest hardware requirements: the larger runs on a single 80 GB GPU, the smaller on devices with 16 GB of memory, and both ship under the permissive Apache 2.0 license.

OpenAI has released two open-weight language models, gpt-oss-120b and gpt-oss-20b, designed to deliver strong reasoning and tool-use performance at low deployment cost. The smaller model targets consumer-class hardware, while the larger fits on a single 80 GB GPU. Both are available under the Apache 2.0 license, which permits commercial use and modification, making them a practical choice for developers who want capable reasoning models they can run themselves.

Key Technical Changes

Model Sizes: gpt-oss-120b and gpt-oss-20b
License: Apache 2.0
Performance:
- gpt-oss-120b: Near-parity with OpenAI o4-mini on core reasoning benchmarks, runs efficiently on a single 80 GB GPU.
- gpt-oss-20b: Similar performance to OpenAI o3-mini on common benchmarks, can run on edge devices with just 16 GB of memory.
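
A back-of-the-envelope check makes the memory figures above concrete. The sketch below estimates weight storage only, ignoring activations and the KV cache. It assumes the model names reflect nominal parameter counts of roughly 120 billion and 20 billion, and the 16-bit baseline is illustrative rather than taken from the release.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    # (params_billion * 1e9 params) * (bits / 8 bytes per param) / 1e9 bytes per GB
    return params_billion * bits_per_param / 8


def max_bits_per_param(params_billion: float, memory_gb: float) -> float:
    """Largest average bit width at which the weights alone fit in memory_gb."""
    return memory_gb * 8 / params_billion


# Nominal sizes are assumptions inferred from the model names.
MODELS = [("gpt-oss-120b", 120, 80), ("gpt-oss-20b", 20, 16)]

for name, params_b, budget_gb in MODELS:
    fp16_gb = weight_memory_gb(params_b, 16)
    limit = max_bits_per_param(params_b, budget_gb)
    print(f"{name}: {fp16_gb:.0f} GB at 16-bit, "
          f"needs <= {limit:.2f} bits/param to fit in {budget_gb} GB")
```

Under these assumptions, 16-bit weights would need about 240 GB and 40 GB respectively, so meeting the stated memory budgets implies weights stored at well under 8 bits per parameter on average.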

Why It Matters

These models matter because they combine strong reasoning and tool-use performance with modest hardware requirements. gpt-oss-20b fits on a machine with 16 GB of memory and gpt-oss-120b on a single 80 GB GPU, so you can run them locally or on your own infrastructure instead of depending on a hosted API, which makes it easier to experiment and iterate quickly.

Technical Details

Training:
- Trained with a mix of reinforcement learning (RL) and techniques informed by OpenAI’s most advanced internal models, including o3 and other frontier systems.
- RL is used to improve targeted capabilities such as reasoning and tool use.
Performance Benchmarks:
- gpt-oss-120b:
  - Achieves near-parity with OpenAI o4-mini on core reasoning benchmarks.
  - Strong performance on tool use, few-shot function calling, CoT (Chain-of-Thought) reasoning, and HealthBench.
- gpt-oss-20b:
  - Comparable to OpenAI o3-mini on common benchmarks.
  - Ideal for edge devices with limited memory, making it suitable for on-device use cases and local inference.

Deployment and Customization

Both models are customizable and compatible with OpenAI’s Responses API, so you can integrate them into agentic workflows in which they follow instructions, call tools such as web search or Python code execution, and adjust reasoning effort to the task. They also support Structured Outputs, which constrain responses to a JSON schema you supply.
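
To illustrate how these options fit together, here is a minimal sketch that assembles a Responses-API-style request body combining a reasoning effort setting, one function tool, and a Structured Outputs schema. It only builds and prints the payload and does not contact any server. The model identifier, the `get_weather` tool, and the `weather_report` schema are hypothetical names chosen for illustration, and the exact fields a given local inference server accepts may differ.

```python
import json


def build_request(prompt: str, effort: str = "low") -> dict:
    """Assemble a Responses-API-style request body as a plain dict."""
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unsupported reasoning effort: {effort}")
    return {
        "model": "gpt-oss-20b",  # assumed identifier; servers may name it differently
        "input": prompt,
        "reasoning": {"effort": effort},
        "tools": [
            {
                "type": "function",
                "name": "get_weather",  # hypothetical tool
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                    "additionalProperties": False,
                },
            }
        ],
        "text": {
            "format": {
                "type": "json_schema",
                "name": "weather_report",  # hypothetical schema name
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "summary": {"type": "string"},
                    },
                    "required": ["city", "summary"],
                    "additionalProperties": False,
                },
            }
        },
    }


payload = build_request("What is the weather in Oslo?", effort="medium")
print(json.dumps(payload, indent=2))
```

Keeping the payload as a plain dictionary makes it easy to inspect or log before sending it with whichever HTTP client or SDK your deployment uses.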

Safety

Safety is a stated priority for OpenAI, especially with open models. The gpt-oss models underwent safety training and evaluations. In addition, OpenAI adversarially fine-tuned a version of gpt-oss-120b and evaluated it under its Preparedness Framework to check whether a maliciously modified model would reach the framework's high-capability thresholds. OpenAI reports that the gpt-oss models perform comparably to its frontier models on internal safety benchmarks.

Key Features

Chain-of-Thought (CoT) Reasoning: Both models provide full CoT, allowing you to see the step-by-step reasoning process.
Tool Use: Strong capabilities in using tools like web search and Python code execution.
Low Latency: gpt-oss-20b is optimized for low-latency final outputs, making it suitable for real-time applications.
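
On the application side, tool use means your code runs the function the model asks for and returns the result. The sketch below shows one way to dispatch such a request, assuming the model's tool call arrives as a dictionary with `name`, `arguments` (a JSON string), and `call_id` fields, as in Responses-API-style function calling. The `add` tool and the registry are hypothetical and stand in for real tools such as search or code execution.

```python
import json
from typing import Callable


def add(a: float, b: float) -> float:
    """Hypothetical local tool: add two numbers."""
    return a + b


# Registry mapping tool names the model may request to local callables.
TOOLS: dict[str, Callable[..., object]] = {"add": add}


def handle_function_call(item: dict) -> dict:
    """Run the requested tool and build the item that reports its result."""
    func = TOOLS.get(item["name"])
    if func is None:
        result = {"error": f"unknown tool: {item['name']}"}
    else:
        kwargs = json.loads(item["arguments"])  # arguments arrive as a JSON string
        result = {"result": func(**kwargs)}
    return {
        "type": "function_call_output",
        "call_id": item["call_id"],  # ties the output back to the request
        "output": json.dumps(result),
    }


# A hand-written example of what a model-issued call might look like.
call = {
    "type": "function_call",
    "call_id": "call_1",
    "name": "add",
    "arguments": '{"a": 2, "b": 3}',
}
print(handle_function_call(call))
```

Returning an error payload for unknown tools, rather than raising, lets the model see the failure and recover in its next turn.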

Conclusion

The release of gpt-oss-120b and gpt-oss-20b marks a significant step forward in the availability of capable open-weight language models. They offer strong performance on reasoning tasks, deployment on a single 80 GB GPU or a 16 GB device, and documented safety evaluations, making them valuable tools for developers and researchers alike.