Kolmogorov Regression for Robust Diffusion Policies

Why It Was Selected

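This paper uses the Kolmogorov equation to improve diffusion policies. On the PushT benchmark, reward rises by 17% and drift falls by about 67%. The method is also evaluated on a manufacturing line and supports fault detection, which makes it more dependable than standard diffusion methods.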

AI Summary

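This paper proposes using the backward Kolmogorov equation to lift diffusion policies into the Cameron-Martin space, replacing stochastic score matching with a deterministic PDE. On the PushT manipulation benchmark, the Cameron-Martin loss raises maximum episode reward by 17% (0.95 vs 0.78) and reduces step-to-step drift at inference time by 67.6%. On a six-station CONWIP manufacturing line, RMSE is 28.4% lower than with an LSTM, starvation-event recall reaches 1.0, bottleneck identification achieves Precision@1 = 1.0, and the signal-to-noise ratio is 13x. Combined with Hamilton-Jacobi reachability theory, the method reduces deadlock events by 96% (351 prevented). It provides convergence guarantees, trajectory regularity, and fault detection that needs no reward signal.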
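
For readers unfamiliar with the equation named in the summary, its textbook form is sketched below. The symbols b, σ, g, u, and T are generic notation assumed here for illustration. They are not taken from the paper, whose exact formulation may differ.

```latex
% Generic Itô diffusion (assumed notation, not the paper's):
dX_t = b(X_t, t)\,dt + \sigma(X_t, t)\,dW_t

% For u(x, t) = \mathbb{E}[\, g(X_T) \mid X_t = x \,], the backward Kolmogorov equation reads
\partial_t u + b \cdot \nabla_x u
  + \tfrac{1}{2}\operatorname{Tr}\!\left(\sigma \sigma^{\top} \nabla_x^2 u\right) = 0,
\qquad u(x, T) = g(x).
```

In this generic form the equation for u is deterministic: the noise of the underlying process enters only through the coefficients b and σσᵀ, which is the usual sense in which a PDE can stand in for a stochastic objective.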


arXiv cs.AI

Finite-dimensional (FD) diffusion policies exhibit temporal drift owing to discretization artifacts that degrade long-horizon performance (when deployed on physical systems). We introduce a backward Kolmogorov equation t…
