Beaussant, Samuel; Lengagne, Sebastien; Thuilot, Benoit; Stasse, Olivier
Despite numerous improvements in the sample efficiency of Reinforcement Learning (RL) methods, learning from scratch still requires millions (even tens of millions) of interactions with the environment to converge to a high-reward policy. This is usually because the agent has no prior information about the task and its own physical embodim...
Wibowo, Ardianto; Santos, Paulo; Baghdadi, Amer; Stephenson, Matthew; Diguet, Jean-Philippe
Space exploration robots must operate autonomously due to the challenges posed by communication delays and power constraints, especially in dynamic and unpredictable extraterrestrial environments. Decentralized Multi-Agent Reinforcement Learning (MARL) offers a potential solution by enabling agents to operate without the need for continuous communi...
Chane-Sane, Elliot; Amigo, Joseph; Flayols, Thomas; Righetti, Ludovic; Mansard, Nicolas
Parkour poses a significant challenge for legged robots, requiring navigation through complex environments with agility and precision based on limited sensory inputs. In this work, we introduce a novel method for training end-to-end visual policies, from depth pixels to robot control commands, to achieve agile and safe quadruped locomotion. We form...
Stoyanova, Ivelina; Museux, Nicolas; Nguyen, Sao Mai; Filliat, David
This article presents Open the Chests, a novel benchmark environment designed for simulating and testing activity recognition and reactive decision-making algorithms. By leveraging temporal logic, Open the Chests offers a dynamic, event-driven simulation platform that illustrates the complexities of real-world systems. The environment contains mult...
Xu, Zhuofan; Bollig, Benedikt; Függer, Matthias; Nowak, Thomas
Permutation equivariance (PE) is a property widely present in mathematics and machine learning. Classic deep reinforcement learning (DRL) algorithms, such as Deep Q-Network (DQN), require thoroughly exploring the state space to achieve optimal performance. For a PE problem such as the Multi-Armed Bandit (MAB) problem, the PE property helps reduce t...
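The abstract above notes that the PE property can shrink the space a DRL agent must explore. A minimal illustration of one way this can be exploited (state canonicalisation by sorting; this is an illustrative sketch, not necessarily the paper's method) is:

```python
# Illustrative sketch: exploiting permutation equivariance by canonicalising
# states. For a bandit-like problem whose value is invariant under a
# permutation of the arms, mapping each state to its sorted form collapses
# the state space to one representative per permutation orbit.
from itertools import product

def canonical(state):
    """Return the sorted tuple of per-arm statistics (orbit representative)."""
    return tuple(sorted(state))

# All raw states with 3 arms and a per-arm statistic in {0, 1, 2}.
raw_states = set(product(range(3), repeat=3))          # 3**3 = 27 raw states
canonical_states = {canonical(s) for s in raw_states}  # distinct multisets

print(len(raw_states), len(canonical_states))  # 27 vs 10
```

A tabular agent (or a DQN with a symmetric architecture) then only needs to cover the 10 canonical states rather than all 27 raw ones, and the gap grows rapidly with the number of arms.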
Kohler, Hector; Delfosse, Quentin; Akrour, Riad; Kersting, Kristian; Preux, Philippe
Deep reinforcement learning agents are prone to goal misalignments. The black-box nature of their policies hinders the detection and correction of such misalignments, as well as the trust necessary for real-world deployment. So far, approaches that learn interpretable policies are either inefficient or require many human priors. We propose INTERPRETER, a fast disti...
Menezes Morato, Marcelo; Spinola Felix, Monica
Model Predictive Control (MPC) is an established control framework based on solving an optimisation problem at each discrete-time sample to determine the (optimal) control action. Accordingly, major theoretical results, such as closed-loop stability and recursive feasibility certificates, have been established in the literature for the most ...
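The receding-horizon scheme described above (solve a finite-horizon optimisation at every sample, apply only the first control) can be sketched as follows. This is a minimal generic example on a discrete double integrator with hypothetical quadratic weights, not the formulation studied in the paper:

```python
import numpy as np
from scipy.optimize import minimize

# Discrete double integrator with dt = 0.1: x = [position, velocity], u = acceleration.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.005, 0.1])

def horizon_cost(u_seq, x0):
    """Quadratic cost over the prediction horizon (illustrative weights)."""
    x, cost = x0.copy(), 0.0
    for u in u_seq:
        x = A @ x + B * u
        cost += x @ x + 0.01 * u ** 2  # penalise state deviation and control effort
    return cost

def mpc_step(x0, horizon=10):
    """Solve the finite-horizon problem; apply only the first control input."""
    res = minimize(horizon_cost, np.zeros(horizon), args=(x0,))
    return res.x[0]

# Receding-horizon loop: re-solve the optimisation at every discrete-time sample.
x = np.array([1.0, 0.0])
for _ in range(50):
    x = A @ x + B * mpc_step(x)
```

In practice the per-sample problem is usually posed as a structured quadratic program and solved with a dedicated QP solver rather than a general-purpose optimiser; the loop structure is the same.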
Terrier, Guillaume; Gueguen, Cédric; Hadjadj-Aoul, Yassine
With the increasing demands on 5G networks and limited radio resources, providing high spectral efficiency, low delay, low energy consumption, and other Key Performance Indicators (KPIs) is a challenging task. Extensive research has been conducted to propose efficient solutions for specific objectives and contexts. Although these solutions (oft...
Saulières, Léo; Cooper, Martin; Dupin de Saint-Cyr, Florence
History eXplanation based on Predicates (HXP) studies the behavior of a Reinforcement Learning (RL) agent in a sequence of the agent's interactions with the environment (a history), through the prism of an arbitrary predicate. To this end, an action importance score is computed for each action in the history. The explanation consists of displaying the...
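A toy illustration of the idea above, scoring each action in a history by how much it matters for a predicate holding at the end, might look like the following. The environment, the contrastive scoring rule, and all names here are hypothetical simplifications, not the paper's exact HXP computation:

```python
# Toy contrastive action-importance scoring for a deterministic 1-D chain:
# an action in the history scores high when the best alternative action at
# that step would have changed whether the predicate holds at the end.
ACTIONS = (-1, +1)  # step left or right

def rollout(start, actions):
    """Deterministic dynamics: the position moves by each action in turn."""
    pos = start
    for a in actions:
        pos += a
    return pos

def importance_scores(start, history, predicate):
    """Compare the predicate outcome of each taken action against the best
    alternative at the same step (hypothetical contrastive score)."""
    scores = []
    for t, taken in enumerate(history):
        outcome = {a: predicate(rollout(start, history[:t] + [a] + history[t + 1:]))
                   for a in ACTIONS}
        best_alternative = max(v for a, v in outcome.items() if a != taken)
        scores.append(int(outcome[taken]) - int(best_alternative))
    return scores

# Predicate: does the agent end at position >= 3?
history = [+1, +1, -1, +1, +1]  # ends at position 3 from start 0
scores = importance_scores(0, history, lambda p: p >= 3)
print(scores)  # [1, 1, 0, 1, 1]: every "right" step is pivotal, the "left" step is not
```

Displaying the history actions ranked by such a score is the shape of explanation the abstract describes; real HXP handles stochastic environments, which this deterministic toy omits.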
Jinschek, Richard Konstantin; Bahri, Mounib; Gianni, Mario; Shen, Yao-Chun; Browning, Nigel D.
Published in: BIO Web of Conferences