Giorgia Ramponi
ETH Zurich
giorgia.ramponi@polimi.it
Bio
Giorgia is a postdoctoral researcher at the ETH AI Center, advised by Andreas Krause and Niao He. Her research interests lie in machine learning and mathematical modeling, with a focus on Reinforcement Learning and Multi-Agent Learning. Previously she worked on Social Network Analysis with Marco Brambilla and Stefano Ceri, and on Networking with Gaia Maselli. In June 2021 she completed her Ph.D. in Information Technology at Politecnico di Milano (with honors), advised by Marcello Restelli. In July 2017 she obtained a Master of Science in Computer Science with the Honours Programme (110/110 cum laude) at La Sapienza, advised by Flavio Chierichetti and Alessandro Panconesi. In 2015 she obtained a Bachelor of Science in Computer Science with the Honours Programme (110/110 cum laude) at La Sapienza, advised by Gaia Maselli.
Multi-agent RL: Truly batch model-free Inverse Reinforcement Learning about multiple intentions
In recent years, Reinforcement Learning (RL) methods have made substantial progress in solving real-world problems. The most successful applications, such as beating the world champion at Go, tackling robotic control problems, and achieving promising results in autonomous driving, involve multiple agents. However, although the Multi-Agent RL (MARL) setting is an important research area of practical interest, this framework is still poorly understood from a theoretical viewpoint. In general, the presence of many agents makes the learning problem more complex, and in many situations single-agent RL algorithms cannot be applied. In our research we take a step toward solving this problem by providing theoretically sound algorithms for this setting. We analyze the challenges and opportunities that a multi-agent environment creates in the RL framework, providing new approaches to three RL subproblems: Inverse RL (IRL), online learning in MARL, and optimization in MARL.

In this presentation we consider IRL about multiple intentions, i.e., the problem of estimating the unknown reward functions optimized by a group of experts that demonstrate optimal behaviors, and of clustering the experts by the recovered rewards. Most existing algorithms either require access to a model of the environment or need to repeatedly compute the optimal policies for the hypothesized rewards. However, these requirements are rarely met in real-world applications, in which interacting with the environment can be expensive or even dangerous. We address IRL about multiple intentions in a fully model-free and batch setting. We first cast the single-intention IRL problem as a constrained likelihood maximization, and then use this formulation to cluster agents based on the likelihood of the assignment. In this way we can efficiently solve both the IRL and the clustering problem without interacting with the environment. Finally, we evaluate the proposed methodology on simulated domains and on a real-world social network application.
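To make the "cluster experts by the likelihood of the assignment" idea concrete, here is a minimal, hedged sketch in NumPy. It is not the algorithm presented in the talk: it replaces the constrained likelihood maximization with a simpler stand-in in which each intention k is a reward-parameter vector theta_k, experts act with a myopic Boltzmann policy pi(a|s) ∝ exp(theta_k^T phi(s, a)) over linear reward features phi, and an EM-style loop soft-assigns each expert to the intention whose parameters best explain its batch of demonstrations. All names, shapes, and hyperparameters are illustrative assumptions.

```python
# Illustrative EM-style clustering of demonstrators by intention.
# Assumptions (not from the talk): myopic Boltzmann expert policies,
# linear reward features, toy synthetic demonstrations.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, N_FEATURES, N_CLUSTERS = 4, 6, 2

def log_policy(theta, phi_sa):
    # phi_sa: (T, n_actions, n_features) features for every action at each step.
    logits = phi_sa @ theta                                    # (T, n_actions)
    return logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)

def traj_loglik(theta, phi_sa, actions):
    # Log-likelihood of one expert's demonstrated actions under theta.
    lp = log_policy(theta, phi_sa)
    return lp[np.arange(len(actions)), actions].sum()

# Toy batch data: 10 experts drawn from two ground-truth intentions.
true_thetas = rng.normal(size=(N_CLUSTERS, N_FEATURES))
experts = []
for i in range(10):
    phi_sa = rng.normal(size=(30, N_ACTIONS, N_FEATURES))
    probs = np.exp(log_policy(true_thetas[i % N_CLUSTERS], phi_sa))
    actions = np.array([rng.choice(N_ACTIONS, p=p) for p in probs])
    experts.append((phi_sa, actions))

# EM loop: E-step soft-assigns experts to intentions by trajectory likelihood;
# M-step takes gradient-ascent steps on each cluster's weighted log-likelihood.
thetas = rng.normal(size=(N_CLUSTERS, N_FEATURES))
for _ in range(30):
    ll = np.array([[traj_loglik(th, phi, a) for th in thetas]
                   for phi, a in experts])                     # (n_experts, K)
    resp = np.exp(ll - np.logaddexp.reduce(ll, axis=1, keepdims=True))
    for k in range(N_CLUSTERS):
        grad = np.zeros(N_FEATURES)
        for (phi_sa, actions), w in zip(experts, resp[:, k]):
            probs = np.exp(log_policy(thetas[k], phi_sa))      # (T, n_actions)
            expected = (probs[..., None] * phi_sa).sum(axis=1)
            # Softmax log-likelihood gradient: phi(taken action) - E_pi[phi].
            grad += w * (phi_sa[np.arange(len(actions)), actions]
                         - expected).sum(axis=0)
        thetas[k] += 0.05 * grad

clusters = resp.argmax(axis=1)
print("recovered cluster assignments:", clusters)
```

Note that everything here runs on a fixed batch of demonstrations: no environment interaction is needed, which is the property the abstract emphasizes for the truly batch, model-free setting.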