gpt‑oss‑120b is one of OpenAI’s newly released open‑weight large language models (as of August 2025). It is designed to deliver strong reasoning and coding performance while remaining freely available to download and use, including for fine-tuning and commercial deployment.
🔍 Overview of gpt‑oss‑120b
| Feature | Description |
|---|---|
| Model type | Mixture-of-Experts (MoE) |
| Total parameters | ~117 billion |
| Active parameters | ~5.1 billion per forward pass |
| Number of layers | 36 transformer layers |
| Experts per layer | 128 experts |
| Experts activated | 4 per token (top-4 routing) |
| Context window | 128,000 tokens |
| License | Apache 2.0 (permissive, commercial use allowed) |
| Release date | August 5, 2025 |
| Performance | Comparable to OpenAI's proprietary o4-mini on core reasoning benchmarks |
| Hardware support | Fits on a single 80 GB GPU (e.g., NVIDIA H100); also available via Hugging Face, Databricks, and AWS |
⚙️ How It Works (Mixture-of-Experts)
In a Mixture-of-Experts architecture:

- Each layer contains 128 separate expert networks.
- For each input token, only 4 of those experts are activated.
- This makes the model more efficient (lower compute cost per token) while preserving high performance.

This sparse activation lets the model scale to ~120B total parameters without requiring the compute of a dense 120B model; a minimal routing sketch follows below.
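To make the sparse-activation idea concrete, here is a minimal, hypothetical top-k routing sketch in PyTorch. The class name, layer sizes, and routing details are illustrative assumptions rather than the actual gpt-oss-120b implementation; only the 128-experts / 4-active-per-token shape comes from the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Illustrative MoE feed-forward block: 128 experts, top-4 active per token."""

    def __init__(self, d_model=64, d_hidden=256, num_experts=128, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                      # x: (num_tokens, d_model)
        scores = self.router(x)                # (num_tokens, num_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)  # mix only the 4 chosen experts
        out = torch.zeros_like(x)
        for t in range(x.size(0)):
            for k in range(self.top_k):
                expert = self.experts[top_idx[t, k]]
                out[t] += weights[t, k] * expert(x[t])
        return out

# Per token, only 4 of the 128 expert MLPs run, so the compute per forward pass is a
# small fraction of what a dense model with the same total parameter count would need.
layer = TopKMoELayer()
tokens = torch.randn(3, 64)
print(layer(tokens).shape)  # torch.Size([3, 64])
```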
🧠 Capabilities
- High performance on benchmarks:
  - Reasoning: MMLU, ARC, and BIG-Bench Hard
  - Math: GSM8K
  - Coding: HumanEval, MBPP
  - Health: HealthBench
- Handles long documents and conversations (128K-token context)
- Effective at:
  - Chain-of-thought reasoning
  - Tool use
  - Instruction following
  - Summarization and question answering
🛠️ How You Can Use It
- Download and run it locally (with sufficiently powerful hardware); a minimal loading sketch follows this list
- Fine-tune or customize it on your own data
- Deploy it via cloud platforms such as Hugging Face, Databricks, and AWS
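As a rough sketch of what local use might look like with the Hugging Face `transformers` library (assuming the weights are published under the `openai/gpt-oss-120b` model ID and that you have enough GPU memory for a ~120B-parameter model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-120b"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # shard layers across available GPUs (requires accelerate)
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```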
🔐 Safety & Policy
- Released after extensive safety testing, including simulated misuse, red-teaming, and external evaluations.
- Not a fully open-source model (the training data and pretraining code are not released), but the open weights give you full access to the model for any use case under the Apache 2.0 license.
📦 Where to Get It
- Direct download from OpenAI’s GitHub (or via Hugging Face and other ML model hubs)
- Works with:
  - Transformers libraries (`transformers`, `vllm`)
  - Tools like LangChain and LlamaIndex
  - Popular inference backends like `vLLM` and `TGI` (a serving sketch follows below)
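A minimal serving-side sketch using vLLM's offline Python API (same assumed checkpoint name; adjust `tensor_parallel_size` to your GPU setup):

```python
from vllm import LLM, SamplingParams

# Assumed checkpoint name; point this at a local copy of the weights if needed.
llm = LLM(model="openai/gpt-oss-120b", tensor_parallel_size=2)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Write a haiku about sparse experts."], params)
print(outputs[0].outputs[0].text)
```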