~120B total / 5.1B active (MoE) and ~21B total / 3.6B active (MoE)
Apache 2.0
First OpenAI open-weight release since GPT-2 (2019). Both models use an MoE architecture with MXFP4 quantization: the 120B targets reasoning and complex tasks and fits on a single H100 (80GB), while the 21B targets lightweight applications and on-device deployment, running within 16GB of memory. Both offer a 128K context window, adjustable reasoning-effort levels, and strong performance on code generation and structured reasoning.