Barbier, Jean Krzakala, Florent Macris, Nicolas Miolane, Léo Zdeborová, Lenka

We consider generalized linear models where an unknown $n$-dimensional signal vector is observed through the successive application of a random matrix and a non-linear (possibly probabilistic) componentwise function. We consider the models in the high-dimensional limit, where the observation consists of $m$ points, and $m/n \to \alpha$ where ${...

Tremblay, Nicolas Barthelmé, Simon Amblard, Pierre-Olivier

When one is faced with a dataset too large to be used all at once, an obvious solution is to retain only part of it. In practice this takes a wide variety of forms, but among them "coresets" are especially appealing. A coreset is a (small) weighted sample of the original data that comes with a guarantee: that a cost function can be eval...

Rossetti, Giulio Cazabet, Rémy

Networks built to model real-world phenomena are characterised by some properties that have attracted the attention of the scientific community: (i) they are organised according to community structure and (ii) their structure evolves with time. Many researchers have worked on methods that can efficiently unveil substructures in complex networks, g...

Lerasle, Matthieu Szabó, Zoltán Lecué, Guillaume Massiot, Gaspar Moulines, Eric

Mean embeddings provide an extremely flexible and powerful tool in machine learning and statistics to represent probability distributions and define a semi-metric (MMD, maximum mean discrepancy; also called N-distance or energy distance), with numerous successful applications. The representation is constructed as the expectation of the feature map...
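
As a rough illustration of the semi-metric mentioned above, the following is a minimal sketch of the standard plug-in (biased, non-robust) estimate of the squared MMD between two samples. It assumes a Gaussian kernel; the function names and the `bandwidth` parameter are illustrative choices, not taken from the paper.

```python
import math


def gaussian_kernel(u, v, bandwidth=1.0):
    # Gaussian (RBF) kernel between two points given as equal-length tuples.
    # The kernel choice and bandwidth are assumptions made for this sketch.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    # Average kernel value over all pairs (x, y); this equals the inner
    # product between the two empirical mean embeddings.
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    # Squared distance between the empirical mean embeddings of xs and ys,
    # expanded as <mx, mx> + <my, my> - 2 <mx, my>.
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


if __name__ == "__main__":
    sample_a = [(0.0,), (1.0,)]
    sample_b = [(5.0,), (6.0,)]
    print(mmd_squared(sample_a, sample_a))  # identical samples: close to zero
    print(mmd_squared(sample_a, sample_b))  # well-separated samples: clearly positive
```

The estimate averages kernel values over all pairs, so its cost grows with the product of the two sample sizes.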

Lasserre, Jean B. Pauwels, Edouard
Published in Advances in Computational Mathematics

We illustrate the potential applications in machine learning of the Christoffel function, or, more precisely, its empirical counterpart associated with a counting measure uniformly supported on a finite set of points. Firstly, we provide a thresholding scheme which allows approximating the support of a measure from a finite subset of its moments wi...

Guedj, Benjamin Li, Le

When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. Principal curves act as a nonlinear generalization of PCA and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our ...

Lathuilière, Stéphane Mesejo, Pablo Alameda-Pineda, Xavier Horaud, Radu

Deep learning revolutionized data science, and recently its popularity has grown exponentially, as has the number of papers employing deep networks. Vision tasks such as human pose estimation did not escape this methodological change. The large number of deep architectures has led to a plethora of methods that are evaluated under different experiment...

Rakotomamonjy, Alain Gasso, Gilles Salmon, Joseph

Soheily-Khah, Saeid Marteau, Pierre-François

Temporal data are naturally everywhere, especially in the digital era that sees the advent of big data and the Internet of Things. One major challenge that arises during temporal data analysis and mining is the comparison of time series or sequences, which requires determining a proper distance or (dis)similarity measure. In this context, the Dynamic ...

Lauer, Fabien

The paper deals with regression problems, in which the nonsmooth target is assumed to switch between different operating modes. Specifically, piecewise smooth (PWS) regression considers target functions switching deterministically via a partition of the input space, while switching regression considers arbitrary switching laws. The paper derives ge...