Temporal Failure Modes in LLM Legal Question Answering: Diagnosis and Mitigation

Why it was selected

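Legal practitioners and AI developers should care: temporal failures of LLMs in legal settings translate directly into compliance risk. In the study summarized here, retrieval-augmented generation (RAG) mitigated those failures effectively, which makes the approach worth trying in practice.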

AI summary

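Large language models face a fundamental tension between a static knowledge cutoff and statutory text that keeps changing. This produces two temporal failure modes: the model keeps applying an old rule after the legislation has been amended (post-cutoff failure), or it prefers the newest provision and ignores the historical version that actually applies (recency bias). The researchers built a German-language legal question-answering benchmark of 312 expert-validated items covering three types of time-sensitive questions, and evaluated five models from OpenAI, Anthropic, and DeepSeek. In the unaided inference setting, model performance dropped sharply on post-cutoff questions. A retrieval-augmented generation (RAG) approach that extracts the date of the facts and filters statute versions accordingly improved accuracy markedly on all question types, whereas web search was unstable and made recency bias worse. The study concludes that reliable legal question answering must treat temporal validity as a hard constraint.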
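The date-extraction and version-filtering step described in the summary can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `ProvisionVersion` record, the `EXAMPLE-1` identifier, the regex-based date extraction, and the fallback to today's date when the question names no date are all assumptions made for illustration. The point is only that the filter is applied as a hard constraint before anything reaches the model.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ProvisionVersion:
    """One version of a statutory provision (hypothetical record layout)."""
    provision_id: str
    text: str
    valid_from: date
    valid_to: Optional[date]  # None means the version is still in force


def _safe_date(year: int, month: int, day: int) -> Optional[date]:
    """Build a date, returning None for impossible dates such as 31.02."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def extract_fact_date(question: str) -> Optional[date]:
    """Return the first ISO (YYYY-MM-DD) or German (DD.MM.YYYY) date found."""
    iso = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", question)
    if iso:
        y, m, d = (int(g) for g in iso.groups())
        return _safe_date(y, m, d)
    german = re.search(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b", question)
    if german:
        d, m, y = (int(g) for g in german.groups())
        return _safe_date(y, m, d)
    return None


def versions_in_force(versions: List[ProvisionVersion], on: date) -> List[ProvisionVersion]:
    """Keep only versions whose validity interval contains the given date."""
    return [
        v for v in versions
        if v.valid_from <= on and (v.valid_to is None or on <= v.valid_to)
    ]


def retrieve(question: str, versions: List[ProvisionVersion], today: date) -> List[ProvisionVersion]:
    """Filter candidates by the fact date; assumption: no date means 'today'."""
    fact_date = extract_fact_date(question) or today
    return versions_in_force(versions, fact_date)
```

Only the versions returned by `retrieve` would be placed in the prompt, so a question about facts from 2019 never sees a provision that took effect later, and a question about current law never sees a repealed one.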


arXiv: Anthropic

Large language models are increasingly used for legal research, yet their fixed training cutoffs and reliance on static parametric knowledge are at odds with the evolving nature of statutory law. We study two temporal fa…
