Publication: IEEE Robotics and Automation Letters
Abstract: Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omni-directional target with multiple homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers that is executed independently by each agent at run-time. The training benefits from curriculum learning, a sweeping-angle ordering that locally represents neighboring agents, and a reward structure that combines individual and group rewards to encourage good formations. Simulated experiments with a reactive evader and up to eight pursuers show that our learning-based approach, with non-holonomic agents, performs on par with classical algorithms with omni-directional agents, and outperforms their non-holonomic adaptations. The learned policy is successfully transferred to the real world in a proof-of-concept demonstration with three motion-constrained pursuer drones.
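The "sweeping-angle ordering" mentioned in the abstract refers to presenting each pursuer's teammates in a consistent angular order relative to its own heading, rather than in an arbitrary index order. The sketch below illustrates one way such an ordering could be computed; the function name, the (distance, relative bearing) feature layout, and the NumPy implementation are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def sweeping_angle_observation(self_pose, neighbor_positions):
    # self_pose: (x, y, heading) of the observing pursuer.
    # neighbor_positions: (N, 2) array of the other pursuers' positions.
    # Returns an (N, 2) array of (distance, relative_bearing) rows,
    # sorted by bearing so neighbors appear in a counter-clockwise sweep
    # starting from the agent's own heading.
    x, y, heading = self_pose
    deltas = np.asarray(neighbor_positions, dtype=float) - np.array([x, y])
    distances = np.linalg.norm(deltas, axis=1)
    # Bearing of each neighbor relative to the agent's heading, wrapped to [0, 2*pi).
    bearings = np.mod(np.arctan2(deltas[:, 1], deltas[:, 0]) - heading, 2 * np.pi)
    order = np.argsort(bearings)
    return np.stack([distances[order], bearings[order]], axis=1)

# Example: an agent at the origin facing along +x, observing two teammates.
obs = sweeping_angle_observation((0.0, 0.0, 0.0), [(1.0, -1.0), (0.5, 2.0)])

Ordering neighbors by sweep angle keeps the local observation layout consistent as teammates move around the observer, which is one common way to make a fixed-size policy input usable across formations.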
Bibtex:
@article{de2021decentralized,
  title     = {Decentralized Multi-Agent Pursuit Using Deep Reinforcement Learning},
  author    = {de Souza, Cristino and Newbury, Rhys and Cosgun, Akansel and Castillo, Pedro and Vidolov, Boris and Kuli{\'c}, Dana},
  journal   = {IEEE Robotics and Automation Letters},
  volume    = {6},
  number    = {3},
  pages     = {4552--4559},
  year      = {2021},
  publisher = {IEEE},
}