Introduction of musical knowledge and qualitative analysis in chord extraction and prediction tasks with machine learning: application to human-machine co-improvisation

  • Carsault, Tristan
Publication Date
Dec 17, 2020

This thesis investigates the impact of introducing musical properties in machine learning models for the extraction and inference of musical features. Furthermore, it discusses the use of musical knowledge to perform qualitative evaluations of the results. In this work, we focus on musical chords, since these mid-level features are frequently used to describe harmonic progressions in Western music. Hence, among the variety of tasks encountered in the field of Music Information Retrieval (MIR), the two main tasks that we address are Automatic Chord Extraction (ACE) and the inference of symbolic chord sequences. In the case of musical chords, there exist strong inherent hierarchical and functional relationships. Indeed, even if two chords do not belong to the same class, they can share the same harmonic function within a chord progression. Hence, we developed a specifically tailored analyzer that focuses on the functional relations between chords to distinguish strong and weak errors. We define a weak error as a misclassification that still preserves the harmonic function of the original chord. This reflects the fact that, in contrast to strict transcription tasks, the extraction of high-level musical features is a rather subjective task. Moreover, many creative applications would benefit from a higher level of harmonic understanding rather than an increased accuracy of label classification. For instance, one of our application cases is the development of software that interacts with a musician in real time by inferring expected chord progressions. In order to achieve this goal, we divided the project into two main tasks: a listening module and a symbolic generation module. The listening module extracts the musical structure played by the musician, whereas the generative module predicts musical sequences based on the extracted features.
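
The distinction between weak and strong errors can be illustrated with a minimal Python sketch. It is not the analyzer developed in the thesis: the function table `FUNCTION_IN_C_MAJOR`, the chord label spelling, and the helper names `classify_error` and `error_summary` are assumptions made for illustration, and the table only covers the diatonic triads of C major under one common textbook grouping.

```python
from collections import Counter

# Hypothetical, simplified table: harmonic function of the diatonic triads
# of C major (an assumption for illustration, not the thesis's analyzer).
FUNCTION_IN_C_MAJOR = {
    "C:maj": "tonic",
    "E:min": "tonic",
    "A:min": "tonic",
    "D:min": "subdominant",
    "F:maj": "subdominant",
    "G:maj": "dominant",
    "B:dim": "dominant",
}


def classify_error(reference, predicted):
    """Return 'correct', 'weak' or 'strong' for one chord prediction."""
    if reference == predicted:
        return "correct"
    ref_function = FUNCTION_IN_C_MAJOR.get(reference)
    pred_function = FUNCTION_IN_C_MAJOR.get(predicted)
    # A weak error is a wrong label that keeps the same harmonic function.
    if ref_function is not None and ref_function == pred_function:
        return "weak"
    return "strong"


def error_summary(references, predictions):
    """Count correct, weak and strong outcomes over two aligned sequences."""
    return Counter(
        classify_error(ref, pred) for ref, pred in zip(references, predictions)
    )


if __name__ == "__main__":
    refs = ["C:maj", "F:maj", "G:maj", "C:maj"]
    preds = ["C:maj", "D:min", "C:maj", "A:min"]
    print(error_summary(refs, preds))
```

In this toy setting, predicting `D:min` for `F:maj` counts as a weak error because both chords are grouped as subdominant, while predicting `C:maj` for `G:maj` counts as a strong error because the function changes.
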
In the first part of this thesis, we target the development of an ACE system that could emulate the process of musical structure discovery, as performed by musicians in improvisation contexts. Most ACE systems are built on the idea of extracting features from raw audio signals and then using these features to construct a chord classifier. This entails two major families of approaches: rule-based models and statistical models. In this work, we identify drawbacks in the use of statistical models for ACE tasks. Then, we propose to introduce prior musical knowledge in order to account for the inherent relationships between chords directly inside the loss function of learning methods. In the second part of this thesis, we focus on learning higher-level relationships inside sequences of extracted chords in order to develop models with the ability to generate potential continuations of chord sequences. In order to introduce musical knowledge in these models, we propose new architectures, multi-label training methods, and novel data representations.
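
One way to picture a loss function that accounts for relationships between chords is sketched below in plain Python. This is an assumption-laden illustration, not the loss used in the thesis: the four-chord vocabulary `CHORD_NOTES`, the Jaccard-style `chord_distance` over pitch classes, and the function `distance_weighted_loss` are all hypothetical. The idea shown is that probability mass placed on chords close to the target is penalized less than mass placed on distant chords.

```python
import math

# Hypothetical toy vocabulary: chord label -> set of pitch classes (C = 0).
CHORD_NOTES = {
    "C:maj": {0, 4, 7},
    "A:min": {9, 0, 4},
    "G:maj": {7, 11, 2},
    "F#:maj": {6, 10, 1},
}
CHORDS = list(CHORD_NOTES)


def chord_distance(a, b):
    """Jaccard distance between pitch-class sets (0 = same notes, 1 = disjoint)."""
    notes_a, notes_b = CHORD_NOTES[a], CHORD_NOTES[b]
    return 1.0 - len(notes_a & notes_b) / len(notes_a | notes_b)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distance_weighted_loss(logits, target):
    """Expected distance to the target chord under the predicted distribution."""
    probs = softmax(logits)
    return sum(p * chord_distance(target, c) for p, c in zip(probs, CHORDS))


if __name__ == "__main__":
    # Logits follow the order of CHORDS: C:maj, A:min, G:maj, F#:maj.
    near_miss = distance_weighted_loss([0.0, 5.0, 0.0, 0.0], "C:maj")
    far_miss = distance_weighted_loss([0.0, 0.0, 0.0, 5.0], "C:maj")
    print(near_miss < far_miss)
```

Under these assumptions, a model that confuses `C:maj` with `A:min` (two shared notes) receives a smaller penalty than one that confuses it with `F#:maj` (no shared notes), whereas a plain cross-entropy loss would treat both mistakes identically.
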
