Claude Fable 5 tops the FrontierCode benchmark, setting a new record within a day of its release

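Why it was selected

Claude Fable 5 decisively outscores Opus on a benchmark of real-world engineering tasks. Developers who handle complex code merges can try it directly in Devin and may see efficiency gains right away.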

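AI summary

Just one day after the FrontierCode benchmark was released, Claude Fable 5 became its new top scorer, with especially strong results on the hardest tasks. On the FrontierCode Diamond subset, the score rose from 13.4% to 29.3% compared to Opus 4.8. The benchmark focuses on real-world engineering tasks and evaluates the mergeability and quality of the code produced. Fable 5 is now available in Devin, giving developers stronger coding assistance.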


Scott Wu: A new top scorer just one day after our benchmark released! Especially strong on the hardest tasks: 13.4% -> 29.3% on FrontierCode Diamond compared to Opus 4.8.

Cognition (@cognition): Claude Fable 5 is now available in …

Cognition · 06-09 17:25
Decoder · 06-09 18:25
berryxia · 06-09 22:47
Jerry Liu · 06-10 01:26
Alex Albert · 06-09 17:09
The Rundown AI · 06-09 17:09
OpenRouter · 06-09 17:13
宝玉 · 06-09 17:22
rohanpaul_ai · 06-09 17:53
Andrej Karpathy · 06-09 18:10
