Nouira, Asma
Genome-Wide Association Studies, or GWAS, aim at finding Single Nucleotide Polymorphisms (SNPs) that are associated with a phenotype of interest. GWAS are known to suffer from the large dimensionality of the data with respect to the number of available samples. Many challenges limiting the identification of causal SNPs such as dependency between SN...
Anani, Thibault; Delbot, Francois; Pradat-Peyre, Jean-François
It is common to use machine learning methods to develop accurate and reliable prognostic models that classify patients suffering from a given disease. Classifying patients makes it possible to group individuals according to their needs and thus to adapt the patient's treatment...
Nguyen, Trung Tin
In this thesis, we study the approximation capabilities, as well as the model estimation and selection properties, of a rich family of mixtures of experts (MoE) models in a high-dimensional setting, including MoE with Gaussian experts and softmax (SGaME) or Gaussian (GLoME) gating functions. Firstly, we improve upon universal approximation results in the context...
Feofanov, Vasilii
Learning with partially labeled data, known as semi-supervised learning, deals with problems where few training examples are labeled while available unlabeled data are abundant and valuable for training. In this thesis, we study this framework in the multi-class classification case with a focus on self-learning and feature selection. Self-learning ...
Lefort, Gaëlle
Among the many omics data that describe the biological functioning of an organism, the metabolome is attracting growing interest because it is closer to the phenotypes of interest and therefore has considerable potential for biomarker discovery. Nuclear magnetic resonance (NMR) spectrometry is a high-...
Genuer, Robin
Jammal, Mahdi
Two principles at the forefront of modern machine learning and statistics are sparse modeling and robustness. Sparse modeling enables the construction of simpler statistical models. At the same time, statistical models need to be robust: they should perform well when data are noisy, so that reliable decisions can be made. While sparsity and robustness ar...
Huynh, Bao Tuyen
This thesis deals with the modeling and estimation of high-dimensional MoE models, with the aim of effective density estimation, prediction and clustering of heterogeneous, high-dimensional data. We propose new strategies based on regularized maximum-likelihood estimation (MLE) of MoE models to overcome the limitations of standard method...
Soret, Perrine
In clinical studies, thanks to technological progress, the amount of information collected from a single patient keeps growing, leading to situations where the number of explanatory variables exceeds the number of individuals. The Lasso method has proven appropriate for the overfitting problems encountered in ...