VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models

Why it was selected

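Curious how a 3B-parameter model can match models with hundreds of billions of parameters? With a score of 94.3 on AIME and 80.2% on LiveCodeBench, VibeThinker-3B shows that small models can also reach the top tier of reasoning performance.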

AI summary

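VibeThinker-3B is a small dense model with 3B parameters. Built on the Spectrum-to-Signal post-training paradigm, it is improved through curriculum supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. It scores 94.3 on AIME26 (97.1 with test-time scaling), reaches a Pass@1 of 80.2 on LiveCodeBench v6, and achieves a 96.1% acceptance rate on the latest LeetCode contests. Its performance matches or exceeds that of flagship large models such as DeepSeek V3.2, GLM-5, and Gemini 3 Pro. An IFEval score of 93.4 indicates that strong reasoning has not come at the cost of instruction following. The work proposes a parameter compression-coverage hypothesis: verifiable reasoning can be compressed into a compact reasoning core, whereas open-domain knowledge requires broad parameter coverage.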

AI translation · Chinese

arXiv: DeepSeek

This technical report introduces VibeThinker-3B, a compact dense model with 3B parameters developed to investigate how far verifiable reasoning can be pushed within a strictly small-model regime. Building upon the Spectr…
