Ulliana, Federico; Bisquert, Pierre; Charoensit, Akira; Colin, Renaud; Tornil, Florent; Yeche, Quentin
Conducting experimental analysis on rule reasoners is a mainstream task for validating novel algorithms and systems. Nevertheless, providing robust, verifiable, and reproducible experiments can still pose a significant challenge. We propose to demonstrate B-Runner, an open library for collaborative benchmarking focusing on the deployment of articulate ...
Aly, Adel; Pivert, Olivier; Thion, Virginie
This paper presents a survey of digital musical score databases, focusing on symbolic representation of the music content (as opposed to audio representation) in the context of Music Information Retrieval (MIR). We first provide a primer on Western classical music notation for unacquainted readers. Then, the core of our study categorizes and discus...
Nomburg, Jason; Doherty, Erin E; Price, Nathan; Bellieny-Rabelo, Daniel; Zhu, Yong K; Doudna, Jennifer A
The rapid evolution of viruses generates proteins that are essential for infectivity and replication but with unknown functions, due to extreme sequence divergence. Here, using a database of 67,715 newly predicted protein structures from 4,463 eukaryotic viral species, we found that 62% of viral proteins are structurally distinct and lack homologu...
Price, Morgan N; Arkin, Adam P
Automated annotations of protein functions are error-prone because of our lack of knowledge of protein functions. For example, it is often impossible to predict the correct substrate for an enzyme or a transporter. Furthermore, much of the knowledge that we do have about the functions of proteins is missing from the underlying databases. We discuss...
Vallat, Brinda; Webb, Benjamin; Westbrook, John; Goddard, Thomas; Hanke, Christian; Graziadei, Andrea; Peisach, Ezra; Zalevsky, Arthur; Sagendorf, Jared; Tangmunarunkit, Hongsuda
IHMCIF (github.com/ihmwg/IHMCIF) is a data information framework that supports archiving and disseminating macromolecular structures determined by integrative or hybrid modeling (IHM), and making them Findable, Accessible, Interoperable, and Reusable (FAIR). IHMCIF is an extension of the Protein Data Bank Exchange/macromolecular Crystallographic In...
Talwar, James V; Klie, Adam; Pagadala, Meghana S; Carter, Hannah
SUMMARY: Harmonizing variant indexing and allele assignments across datasets is crucial for data integrity in cross-dataset studies such as multi-cohort genome-wide association studies, meta-analyses, and the development, validation, and application of polygenic risk scores. Ensuring this indexing and allele consistency is a laborious, time-consuming...
Parsons, Michael T; de la Hoya, Miguel; Richardson, Marcy E; Tudini, Emma; Anderson, Michael; Berkofsky-Fessler, Windy; Caputo, Sandrine M; Chan, Raymond C; Cline, Melissa S; Feng, Bing-Jian
The ENIGMA research consortium develops and applies methods to determine clinical significance of variants in hereditary breast and ovarian cancer genes. An ENIGMA BRCA1/2 classification sub-group, formed in 2015 as a ClinGen external expert panel, evolved into a ClinGen internal Variant Curation Expert Panel (VCEP) to align with Food and Drug Admi...
Camargo, Antonio Pedro; Roux, Simon; Schulz, Frederik; Babinski, Michal; Xu, Yan; Hu, Bin; Chain, Patrick SG; Nayfach, Stephen; Kyrpides, Nikos C
Identifying and characterizing mobile genetic elements in sequencing data is essential for understanding their diversity, ecology, biotechnological applications and impact on public health. Here we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences...
EL Abed, Maha; Dauvignac, Jean-Yves; Lantéri, Jérôme; Migliaccio, Claire
Zhang, Oufan; Naik, Shubhankar; Liu, Zi; Forman-Kay, Julie; Head-Gordon, Teresa
MOTIVATION: Sidechain rotamer libraries of the common amino acids of a protein are useful for folded protein structure determination and for generating ensembles of intrinsically disordered proteins (IDPs). However, much of protein function is modulated beyond the translated sequence through the introduction of post-translational modifications (PTM...