Model-based versus model-free reinforcement learning in quantitative asset management
- Authors
- Publication Date
- Dec 07, 2022
- Source
- HAL-Descartes
- Keywords
- Language
- English
- License
- Unknown
Abstract
The promise of machine learning is to learn rules from raw data without any predefined programming rules. The machine thus learns and develops a form of intelligence by identifying these rules by itself, relying purely on data. The limits of this new paradigm are that the search may never terminate, since the space of candidate rules is infinite, and that the rules found may not be stable over time. This is particularly relevant in quantitative asset management, which aims at finding rules and patterns in financial markets that are well known to change behavior over time. In this thesis, we examine the central question of whether machine learning should be applied with or without models in the field of quantitative asset management. Rather than defending one thesis or the other, we examine the two approaches in turn. We first show that machine learning can provide guidance in model selection decisions and thereby increase overall performance. We then show that machine learning is able to learn rules directly from data using deep reinforcement learning. We prove that this approach generalizes traditional portfolio optimization methods, as it lifts the limits of convex optimization and allows for more informed decisions by directly linking actions to data and extending the agent's state beyond mean and variance. We then examine the similarities between supervised and reinforcement learning (RL) and demonstrate that the policy gradient method in RL can be presented as a supervised learning method in which the labels are the rewards and the loss function is the cross-entropy function. We conclude the thesis with a Bayesian analysis of the CMA-ES algorithm and with the use of Shapley values to better understand the decision process of machine learning models.
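To make the stated connection between the policy gradient method and supervised learning concrete, a minimal sketch is given below. It uses the standard REINFORCE formulation and one common way of phrasing the analogy, in which the sampled actions play the role of labels and the trajectory reward acts as a sample weight on the cross-entropy term; the exact formulation used in the thesis may differ.

% Illustrative sketch only, not the thesis' exact derivation.
\begin{align}
  \nabla_\theta J(\theta)
    &= \mathbb{E}_{\tau \sim \pi_\theta}\!\Big[ R(\tau) \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \Big], \\
  \widehat{\mathcal{L}}(\theta)
    &= -\frac{1}{N} \sum_{i=1}^{N} R\big(\tau^{(i)}\big) \sum_{t} \log \pi_\theta\!\big(a_t^{(i)} \mid s_t^{(i)}\big).
\end{align}

Here the trajectories $\tau^{(i)}$ are sampled from the current policy and then treated as fixed training data. The inner sum is the cross-entropy between the sampled action, viewed as a one-hot label, and the policy's action distribution, weighted by the reward $R(\tau^{(i)})$, so that $\nabla_\theta \widehat{\mathcal{L}}(\theta) \approx -\nabla_\theta J(\theta)$ and minimizing $\widehat{\mathcal{L}}$ with a standard supervised learning pipeline reproduces the policy gradient update.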