El Methni, Jonathan Girard, Stéphane

The Weissman extrapolation device for estimating extreme quantiles from heavy-tailed distributions is based on two estimators: an order statistic to estimate an intermediate quantile and an estimator of the tail index. The common practice is to select the same intermediate sequence for both estimators. In this work, we show how an adapted choice of two ...
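As a rough illustration of the two ingredients named above, the following sketch combines an intermediate order statistic with a tail-index estimate. It assumes the Hill estimator for the tail index (the abstract does not say which estimator is used), and the function names and the parameters `k_quantile` and `k_tail` are hypothetical. Letting the two values of k differ only mirrors the idea of using two intermediate sequences; it is not the authors' selection procedure.

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimate of the tail index from the k largest observations."""
    xs = sorted(sample)
    n = len(xs)
    threshold = xs[n - k - 1]  # order statistic X_{n-k,n}
    return sum(math.log(x / threshold) for x in xs[n - k:]) / k

def weissman_quantile(sample, k_quantile, k_tail, alpha):
    """Weissman-type estimate of q such that P(X > q) = alpha.

    k_quantile picks the intermediate order statistic; k_tail is the
    number of upper order statistics used for the tail index
    (assumption: Hill estimator).
    """
    xs = sorted(sample)
    n = len(xs)
    intermediate = xs[n - k_quantile - 1]  # X_{n-k,n}
    gamma = hill_estimator(sample, k_tail)
    return intermediate * (k_quantile / (n * alpha)) ** gamma

# Toy data: Pareto sample with tail index 0.5, so the true quantile
# at exceedance level alpha is alpha ** -0.5.
random.seed(0)
data = [(1.0 - random.random()) ** -0.5 for _ in range(5000)]
gamma_hat = hill_estimator(data, 400)
estimate = weissman_quantile(data, k_quantile=200, k_tail=400, alpha=1e-4)
```

On this toy sample the target quantile is 100, well beyond what 5000 observations reach empirically, which is the situation the extrapolation is meant for.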

Roche, Angelina

In a growing number of applications, a quantity of interest may depend on several covariates, at least one of which is infinite-dimensional (e.g. a curve). To select the relevant covariates in this context, we propose an adaptation of the Lasso method. The criterion is based on classical Lasso inference under group sparsity (Yuan and Lin, 2006; Lounici et al...

Simon, Pierre-Alexandre Stoica, Radu Sur, Frédéric

The huge amount of temporal data available nowadays in numerous scientific fields requires dedicated analysis and prediction methods. Stochastic temporal point processes are certainly one of the popular approaches available to model time series. While point processes have been successfully applied in many application domains, they need strong assum...

Bouaziz, Olivier

In the context of right-censored and interval-censored data we develop asymptotic formulas to compute pseudo-observations for the survival function and the Restricted Mean Survival Time (RMST). Those formulas are based on the original estimators and do not involve computation of the jackknife estimators. For right-censored data, Von Mises expansion...
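For context, the jackknife construction that such formulas are meant to avoid can be sketched as follows. This is the standard leave-one-out definition of pseudo-observations applied to a Kaplan-Meier estimate of the survival function, not the asymptotic formulas of the abstract; the function names and the toy data are hypothetical.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.

    events[i] is 1 for an observed event and 0 for a censored time.
    """
    survival = 1.0
    event_times = sorted({u for u, d in zip(times, events) if d == 1 and u <= t})
    for u in event_times:
        at_risk = sum(1 for v in times if v >= u)
        deaths = sum(1 for v, d in zip(times, events) if v == u and d == 1)
        survival *= 1.0 - deaths / at_risk
    return survival

def jackknife_pseudo_observations(times, events, t):
    """Pseudo-observations n * S_hat(t) - (n - 1) * S_hat_without_i(t)."""
    n = len(times)
    full = kaplan_meier(times, events, t)
    pseudo = []
    for i in range(n):
        # Recompute the estimator with observation i left out.
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        loo = kaplan_meier(loo_times, loo_events, t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo

# Toy example without censoring: each pseudo-observation reduces to
# the indicator that the subject survives past t.
times = [2.0, 3.0, 4.0, 5.0, 7.0, 8.0]
events = [1, 1, 1, 1, 1, 1]
pseudo = jackknife_pseudo_observations(times, events, t=4.5)
```

Each pseudo-observation requires refitting the estimator on n - 1 points, which is the cost that closed-form approximations remove.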

Martinet, Lison Sueur, Cédric Beltzung, Benjamin Pelé, Marie

We need specific and objective methods to analyse the temporal changes of drawing in children, especially those too young to communicate via verbalisations. We asked 134 children, ranging from three to ten years old, and 38 adults to draw on a tablet under two conditions: free drawing and self-portrait. We then used seven metrics from three categori...

Bénézet, Cyril Gobet, Emmanuel Targino, Rodrigo

In financial risk management, modelling dependency within a random vector X is crucial; a standard approach is the use of a copula model. Say the copula model can be sampled through realizations of Y having copula function C: had the marginals of Y been known, sampling X^(i) , the i-th component of X, would directly follow by composing Y^(i) with i...
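The known-marginals case can be illustrated with the usual probability integral transform. The sketch below is an assumption-laden toy: Y is a bivariate Gaussian vector (so its marginals are standard normal and its copula is Gaussian), the target marginals of X are exponential, and all function names are hypothetical. It may differ from the construction the truncated abstract goes on to describe.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_y(rho, rng):
    """Draw (Y1, Y2) with standard normal marginals and correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2

def exponential_inverse_cdf(u, rate):
    """Quantile function of the exponential distribution."""
    u = min(u, 1.0 - 2.0 ** -53)  # guard against u == 1.0
    return -math.log1p(-u) / rate

def sample_x(rho, rates, rng):
    """Map Y to X componentwise: uniformise with the known marginal
    CDF of Y, then apply the inverse CDF of the target marginal."""
    y = sample_y(rho, rng)
    uniforms = [STD_NORMAL.cdf(v) for v in y]
    return [exponential_inverse_cdf(u, r) for u, r in zip(uniforms, rates)]

rng = random.Random(1)
draws = [sample_x(0.8, (2.0, 0.5), rng) for _ in range(20000)]
mean_x1 = sum(d[0] for d in draws) / len(draws)
mean_x2 = sum(d[1] for d in draws) / len(draws)
```

The componentwise maps are increasing, so X inherits the copula of Y while taking the prescribed exponential marginals.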

Dieuleveut, Aymeric Fort, Gersende Moulines, Eric Robin, Geneviève

The Expectation Maximization (EM) algorithm is the default algorithm for inference in latent variable models. As in any other field of machine learning, applications of latent variable models to very large datasets make the use of advanced parallel and distributed architectures mandatory. This paper introduces FedEM, which is the first extension of...

Duchesnay, Edouard Lofstedt, Tommy Younes, Feki

This document describes statistics and machine learning in Python using:
• Scikit-learn for machine learning.
• PyTorch for deep learning.
• Statsmodels for statistics.

Lartigue, Thomas Bottani, Simona Baron, Stephanie Colliot, Olivier Durrleman, Stanley Allassonniere, Stephanie
Published in
IEEE Transactions on Pattern Analysis and Machine Intelligence

Gaussian graphical models (GGM) are often used to describe the conditional correlations between the components of a random vector. In this article, we compare two families of GGM inference methods: the nodewise approach and the penalised likelihood maximisation. We demonstrate on synthetic data that, when the sample size is small, the two methods p...

Fromont, Magalie Grela, Fabrice Le Guével, Ronan

Motivated by applications in cybersecurity and epidemiology, we consider the problem of detecting an abrupt change in the intensity of a Poisson process, characterised by a jump (non-transitory change) or a bump (transitory change) from a constant intensity. We propose a complete study from the nonasymptotic minimax testing point of view, when the constant bas...