Model-aided federated reinforcement learning for multi-UAV trajectory planning in IoT networks

Chen, Jichao; Esrafilian, Omid; Bayerlein, Harald; Gesbert, David; Caccamo, Marco
GLOBECOM 2023, 6G Innovations and Emerging Technologies 1 Workshop, 4-8 December 2023, Kuala Lumpur, Malaysia

Deploying teams of cooperative unmanned aerial vehicles (UAVs) to harvest data from distributed Internet of Things (IoT) devices requires efficient trajectory planning and coordination algorithms. Multi-agent reinforcement learning (MARL) has emerged as an effective solution, but it often requires extensive and costly real-world training data. In this paper, we propose a novel model-aided federated MARL algorithm that coordinates multiple UAVs on a data harvesting mission with limited knowledge of the environment, significantly reducing the demand for real-world training data. The proposed algorithm alternates between learning an environment model from real-world measurements and federated QMIX training in the simulated environment. Specifically, measurements collected in the real-world environment are used to learn the radio channel and to estimate the unknown IoT device locations, which together define a simulated environment. Each UAV agent trains a local QMIX model in its simulated environment and continuously consolidates it with those of the other agents through federated learning, which accelerates learning and further improves training sample efficiency. Simulation results demonstrate that the proposed model-aided FedQMIX algorithm substantially reduces the need for real-world training experiences while attaining data collection performance similar to that of standard MARL algorithms.
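
The federated consolidation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which aggregation rule is used, so this example assumes plain FedAvg-style parameter averaging with equal weights. The names `Params`, `federated_average`, `federated_round`, and the toy `"mixer"` parameter are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Assumption: each agent's local model is a dict of named, flat parameter vectors.
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params]) -> Params:
    """Element-wise mean of each named parameter vector across agents."""
    if not local_params:
        raise ValueError("need at least one agent")
    n_agents = len(local_params)
    averaged: Params = {}
    for name in local_params[0]:
        vectors = [params[name] for params in local_params]
        # zip(*vectors) groups the i-th entry of every agent's vector together.
        averaged[name] = [sum(vals) / n_agents for vals in zip(*vectors)]
    return averaged


def federated_round(local_params: List[Params]) -> List[Params]:
    """Give every agent its own copy of the averaged parameters."""
    avg = federated_average(local_params)
    return [{name: list(vec) for name, vec in avg.items()} for _ in local_params]


# Toy example: three UAV agents, each holding a two-element parameter vector.
agents = [
    {"mixer": [1.0, 2.0]},
    {"mixer": [3.0, 4.0]},
    {"mixer": [5.0, 6.0]},
]
synced = federated_round(agents)
print(synced[0])  # {'mixer': [3.0, 4.0]}
```

In a full training loop, a round like this would be interleaved with local training steps in each agent's simulated environment. How often the agents synchronize and how their contributions are weighted are design choices this sketch leaves open.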


Type:
Conference
City:
Kuala Lumpur
Date:
2023-12-04
Department:
Communication Systems
Eurecom Ref:
7365
Copyright:
© 2023 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PERMALINK : https://www.eurecom.fr/publication/7365