Novotný, Radek
This thesis analyzes the possibilities of using neural networks and reinforcement learning in competitive games. It also addresses the problems that arose during implementation and their solutions. The goal was to implement the game Bomberman and to use reinforcement learning to create an agent that can compete with players. The options examined for the game were Unreal Engine, Unity...
Tovey, Samuel; Lohrmann, Christoph; Holm, Christian
Published in
Machine Learning: Science and Technology
Reinforcement learning (RL) is a flexible and efficient method for programming micro-robots in complex environments. Here we investigate whether RL can provide insights into biological systems when trained to perform chemotaxis, namely whether we can learn how intelligent agents process given information in order to swim towards a target. We...
Wang, Guanlin
Published in
Frontiers in Neurorobotics
In swimming, the posture and technique of athletes are crucial for improving performance. However, traditional swimming coaches often struggle to capture and analyze athletes' movements in real time, which limits the effectiveness of coaching. Therefore, this paper proposes RL-CWtrans Net: a robot vision-driven multimodal swimming training system t...
Elsborg, Jonas; Bhowmik, Arghya
Published in
Machine Learning: Science and Technology
Finding low-energy atomic ordering in compositionally complex materials is one of the hardest problems in materials discovery, the solution of which can lead to breakthroughs in functional materials—from alloys to ceramics. In this work, we present the Artificial Structure Arranging Net (ArtiSAN)—a reinforcement learning agent utilizing graph repre...
Baldassarre, Gianluca; Duro, Richard; Cartoni, Emilio; Khamassi, Mehdi; Romero, Alejandro; Santucci, Vieri Giuliano
Autonomous open-ended learning (OEL) robots are able to cumulatively acquire new skills and knowledge through direct interaction with the environment, for example relying on the guidance of intrinsic motivations and self-generated goals. OEL robots have a high relevance for applications as they can use the autonomously acquired knowledge to accom...
Cao, Yajuan; Tao, Chenchen
Published in
Frontiers in Energy Research
Many infrastructure upgrades and algorithms have been developed for information-technology-driven smart grids over the past twenty years, especially with increasing interest in their system design and real-world implementation. Meanwhile, the study of detecting and preventing intruders in ubiquitous smart grid environments is spurred signifi...
Frikha, Noufel; Pham, Huyên; Song, Xuanye
We consider reinforcement learning (RL) methods for finding optimal policies in linear quadratic (LQ) mean field control (MFC) problems over an infinite horizon in continuous time, with common noise and entropy regularization. We study policy gradient (PG) learning and first demonstrate convergence in a model-based setting by establishing a suitabl...
Ikeuchi, Katsushi; Takamatsu, Jun; Sasabuchi, Kazuhiro; Wake, Naoki; Kanehira, Atsushi
Published in
Frontiers in Computer Science
Utilizing a robot in a new application requires the robot to be programmed each time. To reduce such programming efforts, we have been developing “Learning-from-observation (LfO)”, which automatically generates robot programs by observing human demonstrations. So far, our previous research has been in the industrial domain. From now on, we want t...
Fazzari, Edoardo; Loughlin, Hudson A; Stoughton, Chris
Published in
Machine Learning: Science and Technology
This study applies an effective methodology based on Reinforcement Learning to a control system. Using the Pound–Drever–Hall locking scheme, we match the wavelength of a controlled laser to the length of a Fabry-Pérot cavity such that the cavity length is an exact integer multiple of the laser wavelength. Typically, long-term drift of the cavity le...
Tambwekar, Pradyumna; Gombolay, Matthew
Published in
Frontiers in Robotics and AI
Safety-critical domains often employ autonomous agents which follow a sequential decision-making setup, whereby the agent follows a policy to dictate the appropriate action at each step. AI practitioners often employ reinforcement learning algorithms to allow an agent to find the best policy. However, sequential systems often lack clear and immedia...