Linderman, Michael D; Wallace, Jacob; van der Heyde, Alderik; Wieman, Eliza; Brey, Daniel; Shi, Yiran; Hansen, Peter; Shamsi, Zahra; Liu, Jeremiah; Gelb, Bruce D
...
Published in
Bioinformatics (Oxford, England)
Structural variants (SVs) play a causal role in numerous diseases but can be difficult to detect and accurately genotype (determine zygosity) with short-read genome sequencing data (SRS). Improving SV genotyping accuracy in SRS data, particularly for the many SVs first detected with long-read sequencing, will improve our understanding of genetic va...
Gisdon, Florian J; Zunker, Mariella; Wolf, Jan Niclas; Prüfer, Kai; Ackermann, Jörg; Welsch, Christoph; Koch, Ina
Published in
Bioinformatics (Oxford, England)
The functional complexity of biochemical processes is strongly related to the interplay of proteins and their assembly into protein complexes. In recent years, the discovery and characterization of protein complexes have substantially progressed through advances in cryo-electron microscopy, proteomics, and computational structure prediction. This d...
Groot Koerkamp, Ragnar; Ivanov, Pesho
Published in
Bioinformatics (Oxford, England)
Sequence alignment has been at the core of computational biology for half a century. Still, it is an open problem to design a practical algorithm for exact alignment of a pair of related sequences in linear-like time. We solve exact global pairwise alignment with respect to edit distance by using the A* shortest path algorithm. In order to efficien...
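To illustrate the general idea of computing edit distance as an A* shortest-path search over the alignment grid, here is a minimal sketch. It is not the authors' implementation: the function name `edit_distance_astar` is hypothetical, and the heuristic used (the difference in remaining sequence lengths) is a simple admissible bound chosen for illustration, since the abstract's own heuristic is cut off above.

```python
import heapq


def edit_distance_astar(a: str, b: str) -> int:
    """Unit-cost edit distance between a and b via A* on the alignment grid."""
    n, m = len(a), len(b)

    def h(i: int, j: int) -> int:
        # Assumed heuristic: the remaining suffixes differ in length by this
        # much, so at least this many insertions/deletions are still needed.
        return abs((n - i) - (m - j))

    best = {(0, 0): 0}            # cheapest known cost to reach each state
    heap = [(h(0, 0), 0, 0, 0)]   # entries are (f = g + h, g, i, j)
    while heap:
        _, g, i, j = heapq.heappop(heap)
        if (i, j) == (n, m):
            return g              # first time the goal is popped, g is optimal
        if g > best.get((i, j), float("inf")):
            continue              # stale queue entry, a cheaper path was found
        steps = []
        if i < n and j < m:       # match (cost 0) or substitution (cost 1)
            steps.append((i + 1, j + 1, 0 if a[i] == b[j] else 1))
        if i < n:                 # delete a[i]
            steps.append((i + 1, j, 1))
        if j < m:                 # insert b[j]
            steps.append((i, j + 1, 1))
        for ni, nj, cost in steps:
            ng = g + cost
            if ng < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = ng
                heapq.heappush(heap, (ng + h(ni, nj), ng, ni, nj))
    raise AssertionError("goal state is always reachable")
```

With this weak heuristic the search can still expand a large part of the grid; a stronger admissible heuristic would be needed to approach the near-linear behaviour the abstract aims for.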
Liu, Erhu; Lyu, Hongqiang; Liu, Yuan; Fu, Laiyi; Cheng, Xiaoliang; Yin, Xiaoran
Published in
Bioinformatics (Oxford, England)
Topologically associating domains (TADs) are fundamental building blocks of the 3D genome. TAD-like domains in single cells are regarded as the underlying genesis of TADs discovered in bulk cells. Understanding the organization of TAD-like domains helps to gain deeper insight into their regulatory functions. Unfortunately, it remains a challenge to ide...
Ji, Fahu; Zhou, Qian; Ruan, Jue; Zhu, Zexuan; Liu, Xianming
Published in
Bioinformatics (Oxford, England)
Seeding is a rate-limiting stage in sequence alignment for next-generation sequencing reads. The existing optimization algorithms typically utilize hardware and machine-learning techniques to accelerate seeding. However, an efficient solution provided by professional next-generation sequencing compressors has been largely overlooked so far. In addi...
Figueroa III, Jose L; Dhungel, Eliza; Bellanger, Madeline; Brouwer, Cory R; White III, Richard Allen
Published in
Bioinformatics (Oxford, England)
MetaCerberus is a massively parallel, fast, low-memory, scalable annotation tool for inferring gene function across genomes to metacommunities. MetaCerberus provides an elusive HMM/HMMER-based tool at a rapid scale with low memory. It offers scalable gene elucidation to major public databases, including KEGG (KO), COGs, CAZy, FOAM, and specific dat...
Cordes, Jonas; Enzlein, Thomas; Hopf, Carsten; Wolf, Ivo
Published in
Bioinformatics (Oxford, England)
Python is the most commonly used language for deep learning (DL). Existing Python packages for mass spectrometry imaging (MSI) data are not optimized for DL tasks. We, therefore, introduce pyM2aia, a Python package for MSI data analysis with a focus on memory-efficient handling, processing and convenient data-access for DL applications. pyM2aia pro...
Kasapi, Melpomeni; Xu, Kexin; Ebbels, Timothy M D; O'Regan, Declan P; Ware, James S; Posma, Joram M
Published in
Bioinformatics (Oxford, England)
Random forests (RFs) can deal with a large number of variables, achieve reasonable prediction scores, and yield highly interpretable feature importance values. As such, RFs are appropriate models for feature selection and further dimension reduction. However, RFs are often not appropriate for correlated datasets due to their mode of selecting indiv...
Kim, Jung; Naqvi, Ammar S; Corbett, Ryan J; Kaufman, Rebecca S; Vaksman, Zalman; Brown, Miguel A; Miller, Daniel P; Phul, Saksham; Geng, Zhuangzhuang; Storm, Phillip B
...
Published in
Bioinformatics (Oxford, England)
With the increasing rates of exome and whole genome sequencing, the ability to classify large sets of germline sequencing variants using up-to-date American College of Medical Genetics-Association for Molecular Pathology (ACMG-AMP) criteria is crucial. Here, we present Automated Germline Variant Pathogenicity (AutoGVP), a tool that integrates germl...
Chen, Yuting; Zhang, Haoling; Wang, Wen; Shen, Yue; Ping, Zhi
Published in
Bioinformatics (Oxford, England)
The advancement of structural biology has increased the requirements for researchers to quickly and efficiently visualize molecular structures in silico. Meanwhile, it is also time-consuming for structural biologists to create publication-standard figures, as no useful tools can directly generate figures from structure data. Although manual editing...