From Drift to Consistency: A Study of Belief Stability in LLMs

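Why it was selected

The paper finds that an LLM's beliefs stabilize on their own when the model is repeatedly resampled on the same question, and it proposes two methods for making models more consistent. It suits readers who care about reasoning reliability.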

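AI summary

The paper finds that LLMs show belief drift in the early stages of multiple-choice question answering, which violates the martingale property. Under the proposed prompt-prediction resampling (PPR) method, the model's beliefs self-stabilize and converge after repeated resampling. Building on this observation, the authors further propose a seed-answer prompting strategy and a fine-tuning method based on a self-consistency loss. On multiple-choice QA benchmarks, these methods significantly reduce belief drift and improve prediction consistency without sacrificing accuracy.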
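For readers unfamiliar with the term, the martingale property mentioned in the summary can be written in its standard textbook form as below. This is a sketch only and not necessarily the paper's own notation: the symbols $p_n$, $x_{1:n}$ and $y$ are assumptions introduced here for illustration.

```latex
% p_n(y): predictive belief in answer y after seeing n observations x_{1:n}
% (notation assumed for illustration, not taken from the paper)
\mathbb{E}\left[\, p_{n+1}(y) \mid x_{1:n} \,\right] = p_n(y),
\qquad p_n(y) := p(y \mid x_{1:n})
```

Read this way, belief drift is a systematic gap between the two sides: the belief expected after one more sample differs from the belief held now.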

arXiv cs.LG

Large language models (LLMs) are often hypothesized to perform implicit Bayesian inference, yet a key coherence condition, the martingale property of predictive beliefs, has been shown to fail in controlled synthetic in-…
