3B (dense)
Apache 2.0
Small reasoning model with an exceptional performance-to-size ratio; outperforms Qwen3-32B on AIME 2024 (90.4 vs. 81.4) and GPQA-Diamond (82.2 vs. 68.7); trained on 23T tokens with a novel Fine-Grained Warmup-Stable-Decay (FG-WSD) schedule; ranks #11 on WritingBench and #15 on EQBench3; scores 60 on Arena-Hard V2; state-of-the-art among open-source models under 32B parameters on multiple benchmarks; trained with knowledge distillation and RL optimization; an enhanced iteration of the 2510 release; achieves its efficiency through training innovation rather than parameter scaling.
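The FG-WSD schedule is named above but not described; for intuition, here is a minimal sketch of a generic Warmup-Stable-Decay learning-rate schedule. All function names, phase fractions, and the cosine decay shape are illustrative assumptions; the "fine-grained" refinements of FG-WSD are not reproduced here.

```python
# Sketch of a generic Warmup-Stable-Decay (WSD) learning-rate schedule.
# The actual FG-WSD recipe is not public in this listing; every name,
# phase fraction, and the cosine decay shape below is an assumption.
import math

def wsd_lr(step: int, total_steps: int, peak_lr: float = 3e-4,
           warmup_frac: float = 0.01, decay_frac: float = 0.1,
           min_lr: float = 3e-5) -> float:
    warmup_steps = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1.0 - decay_frac))
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(warmup_steps, 1)
    if step < decay_start:
        # Stable phase: hold the peak learning rate flat.
        return peak_lr
    # Decay phase: cosine-anneal from peak_lr down to min_lr.
    t = (step - decay_start) / max(total_steps - decay_start, 1)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * t))

# Example: sample the schedule at a few points of a 100k-step run.
if __name__ == "__main__":
    for s in (0, 500, 1_000, 50_000, 90_000, 95_000, 100_000):
        print(s, f"{wsd_lr(s, 100_000):.2e}")
```

The appeal of WSD over a single cosine schedule is that the long stable phase lets training continue indefinitely from a checkpoint, with the decay applied only at the end; a "fine-grained" variant would presumably stage this decay across curated data phases, but that detail is not documented here.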