VibeThinker-3B, a 3B-parameter model, reaches highly competitive results on reasoning tasks

Why it was selected

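A small 3B model is closing in on large models in math and code reasoning, which makes it a candidate for low-compute deployment and worth watching.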

AI Summary

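VibeThinker-3B is a small model with only 3B parameters. It scores 94.3 on AIME26, reaches 80.2 Pass@1 on LiveCodeBench v6, and achieves 96.1% accuracy on unseen LeetCode contests. It is trained from Qwen2.5-Coder using curriculum SFT, multi-domain RL, offline self-distillation, and a final RL guidance stage. The results suggest that some verifiable reasoning capabilities can be compressed efficiently into small dense models.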

AI Translation · Chinese

kimmonismus: Crazy: A 3B model is now reaching highly competitive results on verifiable reasoning tasks. VibeThinker-3B scores 94.3 on AIME26, 80.2 Pass@1 on LiveCodeBench v6, and 96.1% on unseen LeetCode contests. The gains appear t…
