AI Coding Assistants Find the Right File but Miss Key Lines, SWE-Explore Benchmark Reveals

Why It Was Selected

AI coding assistants fail to pinpoint the critical lines

AI Summary

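A new study shows that AI coding assistants such as Claude Code and Codex locate the right file with high accuracy but miss most of the critical lines of code within it. The newly released SWE-Explore benchmark is the first to test code search separately from the repair step, and it finds that even the best repair fails when the search has not supplied enough context. The benchmark evaluated multiple models, which on average found only about 30% of the critical lines. This suggests that AI coding agents still fall well short at understanding code logic precisely.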
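
The gap between finding the right file and finding the right lines is easier to see with a small calculation. The sketch below is only an assumption about how such a comparison could be scored, not SWE-Explore's actual metric; the function names, the data layout, and the example values are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical data shape: file path -> set of 1-based line numbers.
Locations = Dict[str, Set[int]]


def file_recall(gold: Locations, found: Locations) -> float:
    """Fraction of gold files that the agent retrieved at all."""
    if not gold:
        return 0.0
    hit = sum(1 for path in gold if path in found)
    return hit / len(gold)


def line_recall(gold: Locations, found: Locations) -> float:
    """Fraction of gold (file, line) pairs that the agent retrieved."""
    total = sum(len(lines) for lines in gold.values())
    if total == 0:
        return 0.0
    hit = sum(len(lines & found.get(path, set())) for path, lines in gold.items())
    return hit / total


# Made-up example: the right file is found, but only 3 of its 10 key lines.
gold = {"pkg/parser.py": set(range(40, 50))}
found = {"pkg/parser.py": {40, 41, 42, 100, 101}}

print(file_recall(gold, found))  # 1.0
print(line_recall(gold, found))  # 0.3
```

In this invented case the file-level score is perfect while the line-level score is 0.3, which is the kind of divergence the summary describes.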


Decoder: AI coding agents like Claude Code or Codex reliably find the right file but miss most of the critical lines within it. The new SWE-Explore benchmark is the first to test code search separately from the actual repair, and…
