Gemini 3.5 Flash (Medium) Tops AutomationBench; Medium Reasoning Beats High

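Why it was selected

Developers building automation workflows now have a cost-effective option: Gemini 3.5 Flash leads on score and is reported to cost about 7x less. The medium setting is worth trying first.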

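AI Summary

Gemini 3.5 Flash (Medium) took first place on Zapier's AutomationBench with a score of 14.5%, ahead of GPT 5.5 (xhigh) at 12.9%. Notably, the medium reasoning setting outperforms the high setting, because high reasoning uses up the tool-call limit too quickly. The model also holds a cost advantage of roughly 7x, which makes it the most sustainable automation model to run at present. Google now recommends medium as the default API setting, suitable for most tasks.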
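To put the two quoted scores in proportion, the gap can be worked out directly. The sketch below is illustrative arithmetic only: the function name `score_gap` is hypothetical, and the inputs are the 14.5% and 12.9% figures from the summary, not an independent measurement.

```python
def score_gap(leader: float, runner_up: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gap in percent)."""
    absolute = leader - runner_up
    relative = absolute / runner_up * 100
    # Round to one decimal to match the precision of the quoted scores.
    return round(absolute, 1), round(relative, 1)


# Scores as quoted in the summary (assumed accurate).
gemini_flash_medium = 14.5
gpt_xhigh = 12.9

absolute, relative = score_gap(gemini_flash_medium, gpt_xhigh)
print(f"absolute gap: {absolute} points, relative gap: {relative}%")
# prints "absolute gap: 1.6 points, relative gap: 12.4%"
```

On these numbers the lead is 1.6 percentage points, or about 12% in relative terms, on a benchmark where both models score under 15%.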

Original post

Patrick Loeber: Gemini 3.5 Flash (Medium) is 🥇 on AutomationBench! Also note that medium thinking performs better than high, which matches our own evals. medium is the new default API setting and we recommend it for most tasks. more in…
