OLMo 3.1

Allen Institute for AI (USA), December 12, 2025

Parameters

32B Think / 32B Instruct / 7B RL-Zero (three model variants; all dense)

License

Apache 2.0

Key Features

Extended training of OLMo 3 with an additional 21 days on 224 GPUs. Highlights:

- Think 32B outperforms Qwen3-32B on AIME 2025 and performs close to Gemma 3 27B.
- Instruct 32B is the strongest fully open instruct model at the 32B scale.
- Substantial improvements over OLMo 3: +5 points on AIME, +4 on ZebraLogic, +4 on IFEval, and +20 on IFBench; beats Gemma 3 on the MATH benchmark.
- Updated RL-Zero 7B models for math and coding, with longer, more stable training runs.
- Demonstrates that continued training past the initial "completion" point yields significant gains.
- Maintains full model-flow transparency: all checkpoints, datasets, and training decisions are openly accessible (see the loading sketch below).
- Trained on Dolma 3 (6T tokens) with a 65K-token context window.
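Because the checkpoints are released openly, they can be loaded with standard tooling. Below is a minimal sketch using Hugging Face transformers; the repository ID is an assumption based on Ai2's naming conventions (check the release page for the exact name), and device_map="auto" requires the accelerate package.

```python
# Minimal sketch of loading an OLMo 3.1 checkpoint with Hugging Face
# transformers. The repo ID below is an ASSUMPTION, not confirmed by
# the release notes; substitute the actual ID from the release page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3.1-32B-Instruct"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load in the checkpoint's native precision
    device_map="auto",   # shard the 32B weights across available GPUs
)

# Chat-style prompting via the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```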

Paper / Source

https://allenai.org/blog/olmo3