AA-Briefcase scores show rapid AI model progress and a clear gap between open and closed models

Why it was selected

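A new benchmark has AI carry out complex, multi-week consulting engagements. The results show a sizable gap between open and closed models, along with rapid progress.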

AI summary

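The AA-Briefcase scores, published by @ArtificialAnlys, measure how well AI can complete complex, multi-week consulting tasks. The latest score curve shows that AI models have improved rapidly over a short period. There is a clear gap between open-weight and closed models, with closed models performing better overall. The benchmark highlights how current AI models differ in their ability to handle complex multi-step tasks.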

AI translation · Chinese

Ethan Mollick: I took the new AA-Briefcase scores from @ArtificialAnlys (basically having the AI do multi-week consulting gigs with a lot of complexity) and graphed the frontier curve for open and closed models: 1) Surprise, rapid gain…
