Preference-Coordinated Multi-Objective Multi-Agent Reinforcement Learning (PCMA)

Why it was selected

Teaches multiple agents to coordinate with one another on multi-objective tasks.

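AI summary

This work proposes Preference-Coordinated Multi-Agent Policy Optimization (PCMA) to resolve conflicts in cooperative multi-objective multi-agent reinforcement learning. PCMA learns coordinated, personalized preferences for each agent, so that agents form complementary trade-offs across multiple objectives (for example, efficiency and fairness). The authors prove that, under certain conditions, preference diversity drives team-level improvement through a first-order improvement decomposition. Across several cooperative multi-objective environments and a real-world traffic control scenario, PCMA improves both task performance and trade-off coordination.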
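
To make the idea of per-agent preferences concrete, the following is a minimal sketch of one common way to apply a preference to a vector-valued reward: linear scalarization with a normalized weight vector per agent. This is an illustrative assumption, not PCMA's actual procedure; the summary does not say how preferences are parameterized or learned. The function names, agent names, two-objective layout (efficiency, fairness), and all numbers are hypothetical.

```python
from typing import Dict, List


def normalize(weights: List[float]) -> List[float]:
    """Rescale non-negative weights so they sum to 1 (a point on the simplex)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def scalarize(reward_vector: List[float], preference: List[float]) -> float:
    """Linear scalarization: weighted sum of the per-objective rewards."""
    if len(reward_vector) != len(preference):
        raise ValueError("reward vector and preference must have equal length")
    return sum(w * r for w, r in zip(preference, reward_vector))


# Hypothetical preferences over (efficiency, fairness).
# agent_0 leans toward efficiency, agent_1 toward fairness, so the two
# agents pursue complementary trade-offs rather than identical ones.
preferences: Dict[str, List[float]] = {
    "agent_0": normalize([3.0, 1.0]),
    "agent_1": normalize([1.0, 3.0]),
}

# Hypothetical shared team reward for one step: (efficiency, fairness).
team_reward = [2.0, 4.0]

# Each agent sees the same reward vector through its own preference.
scalar_rewards = {
    name: scalarize(team_reward, pref) for name, pref in preferences.items()
}

for name, value in scalar_rewards.items():
    print(name, value)
```

In this sketch the preferences are fixed by hand. According to the summary, PCMA instead learns coordinated preferences for each agent, which this example does not attempt to reproduce.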


arXiv cs.AI

Cooperative multi-objective multi-agent reinforcement learning (MOMARL) models team decision making under multiple, potentially conflicting objectives. In this setting, conflicts arise not only across objectives but also…
