Hubert, Thibault Vaillant, Ghislain Birot, Olivier Arias, Camila Neuraz, Antoine Coulet, Adrien
The task of Named Entity Recognition (NER) is central to leveraging the content of clinical texts in observational studies. Indeed, texts contain a large part of the information available in Electronic Health Records (EHRs). However, clinical texts are highly heterogeneous between healthcare services and institutions, between countries and languag...
Lopez-Labrador, F. Xavier; Huber, Michael; Sidorov, Igor A.; Brown, Julianne R.; Cuypers, Lize; Laenen, Lies; Vanmechelen, Bert; Maes, Piet; Fischer, Nicole; Pichler, Ian;
...
Metagenomics is gradually being implemented for diagnosing infectious diseases. However, in-depth protocol comparisons for viral detection have been limited to individual sets of experimental workflows and laboratories. In this study, we present a benchmark of metagenomics protocols used in clinical diagnostic laboratories initiated by the European...
Holzmüller, David Grinsztajn, Léo Steinwart, Ingo
For classification and regression on tabular data, the dominance of gradient-boosted decision trees (GBDTs) has recently been challenged by often much slower deep learning methods with extensive hyperparameter tuning. We address this discrepancy by introducing (a) RealMLP, an improved multilayer perceptron (MLP), and (b) improved default parameters...
Bouvard, Christophe Ciancone, Mathieu Gourru, Antoine Schaeffer, Marion
Large language models have recently been widely used in conversational agents, where injecting knowledge for specific application domains is a crucial issue. We compare two approaches: fine-tuning and retrieval-augmented generation. We evaluate these techniques for two different use cases...
Réda, Clémence Vie, Jill-Jênn Wolkenhauer, Olaf
Drug development is known to be a costly and time-consuming process, which is prone to high failure rates. Drug repurposing allows drug discovery to consider reusing approved compounds to mitigate those issues. The outcomes of past clinical trials can be used to predict novel drug-disease associations by leveraging drug- and disease-related similar...
Xu, Zhen Escalera, Sergio Guyon, Isabelle Pavão, Adrien Richard, Magali Tu, Wei-Wei Yao, Quanming Zhao, Huan
Obtaining standardized crowdsourced benchmarks of computational methods is a major issue in data science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here we introduce Codabench, an open-source, community-driven platform for benchmarking algorithms or software agents versus datasets o...
Guibert, Julien Tremblay-Franco, Marie Letertre, Marine P.M. Piou, Marine Dumez, Jean-Nicolas Giraudeau, Patrick Canlet, Cecile
NMR-based metabolomic studies are mostly performed with proton 1D liquid-state NMR, using well-established protocols for (bio)fluids or extracts. Proton 1D NMR is rapid and robust but may be limited by an extensive signal overlap, which could impair the accurate identification and quantification of biomarkers. To overcome this limitation, 2D NMR ex...
Lukeš, Michal
The aim of this thesis is to explore, implement, verify, and compare solvers for the Vehicle Routing Problem with Time Windows and the Nurse Rostering Problem using mathematical optimization tools such as Mixed Integer Linear Programming and Progr...
Thébault, Cyril
With the growing availability of hydro-meteorological data and the constant increase in computing resources, large-sample hydrology datasets are now widely used to evaluate hydrologic models. Large-sample hydrology datasets use a large set of catchments with variable climatic and geomorphological characteristics to derive robust conclusions. While ...
Doukhan, David Maertens, Christine Le Personnic, William Speroni, Ludovic Dehak, Reda
InaGVAD is an audio corpus collected from 10 French radio and 18 TV channels categorized into 4 groups: generalist radio, music radio, news TV, and generalist TV. It contains 277 1-minute-long annotated recordings aimed at representing the acoustic diversity of French audiovisual programs and was primarily designed to build systems able to monitor m...