Minimax M3 posts strong BU Bench results, on par with Claude 4.6-sonnet and Gemini 3.5 flash

Why it was selected

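Teams working on browser automation or agent development should take note: with a 26% improvement, Minimax M3 has shown it belongs in the top tier and is ready for head-to-head comparison testing.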

AI Summary

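Minimax M3 made significant gains on the BU Bench benchmark, improving 26% over the previous version. The benchmark uses the browsercode method to evaluate how models perform on browser automation tasks. M3's performance is now on par with mainstream models such as Claude 4.6-sonnet and Gemini 3.5 flash. The result suggests that Minimax has made substantial progress in browser agents and lays a foundation for further optimization.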

AI Translation · Chinese

Browser Use: We tested Minimax M3 on BU Bench!
Alexander Yue (@Alezander907): MiniMax m3 is a huge 26% improvement on BU Bench with browsercode, and shows promise for some potential future improvement. Now it is on the level of Claude 4…

