Much work has been dedicated to exploring Multi-Agent Reinforcement Learning (MARL) paradigms that implement centralized training with decentralized execution to improve human-like collaboration in cooperative tasks. Here, we introduce variations of centralized training that capture how shared or independent reward structures can be used to train agents more effectively, and we analyze the resulting cooperative behavior in multi-agent systems. This work also proposes a classification of recent MARL algorithms based on their information-sharing mechanism (e.g., reward, gradient, action, parameter, or observation/state-space sharing) and discusses the implications of each mechanism for cooperative behavior in multi-agent systems.