Graphical Workflow System for Modification Calling by Machine Learning of Reverse Transcription Signatures

  • Schmidt, Lukas¹
  • Werner, Stephan¹
  • Kemmer, Thomas²
  • Niebler, Stefan³
  • Kristen, Marco¹
  • Ayadi, Lilia⁴˒⁵
  • Johe, Patrick¹
  • Marchand, Virginie⁴
  • Schirmeister, Tanja¹
  • Motorin, Yuri⁴˒⁵
  • Hildebrandt, Andreas²
  • Schmidt, Bertil³
  • Helm, Mark¹
  • ¹ Institute of Pharmacy and Biochemistry, Johannes Gutenberg-University, Mainz (Germany)
  • ² Institute of Computer Science, Scientific Computing and Bioinformatics, Johannes Gutenberg-University, Mainz (Germany)
  • ³ Institute of Computer Science, High Performance Computing, Johannes Gutenberg-University, Mainz (Germany)
  • ⁴ Next-Generation Sequencing Core Facility UMS2008 IBSLor CNRS-UL-INSERM, Biopôle, University of Lorraine, Vandœuvre-lès-Nancy (France)
  • ⁵ IMoPA UMR7365 CNRS-UL, Biopôle, University of Lorraine, Vandœuvre-lès-Nancy (France)
Published Article
Frontiers in Genetics
Frontiers Media SA
Publication Date
Sep 25, 2019
DOI: 10.3389/fgene.2019.00876
PMID: 31608115
PMCID: PMC6774277
PubMed Central


Modification mapping from cDNA data has become an important approach in epitranscriptomics. So-called reverse transcription signatures in cDNA contain information on the position and nature of the RNA modifications that cause them. Data mining of high-throughput sequencing data, e.g., Illumina-based data, is therefore rapidly growing in importance, yet the field still lacks effective tools. Here we present a versatile, user-friendly graphical workflow system for modification calling based on machine learning. The workflow begins with a principal module for trimming, mapping, and postprocessing. The postprocessing step quantifies mismatch and arrest rates at single-nucleotide resolution across the mapped transcriptome. Downstream modules include tools for visualization, machine learning, and modification calling. The machine-learning module provides quality assessment parameters for gauging the suitability of the initial dataset for effective machine learning and modification calling. This output helps to improve the experimental parameters for library preparation and sequencing. In summary, automating the bioinformatics workflow allows a faster turnaround of the optimization cycles in modification calling.
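As an illustration of what per-position mismatch and arrest rates might look like, the following minimal Python sketch computes both from read counts at a single reference position. The names `PositionCounts`, `mismatch_rate`, and `arrest_rate` are hypothetical, and the definitions used here (mismatches over coverage; terminating reads over terminating plus read-through reads) are assumptions made for illustration, not necessarily the formulas implemented in the workflow.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PositionCounts:
    """Read counts at one reference position (hypothetical container)."""
    ref_base: str                 # reference nucleotide at this position
    base_counts: Dict[str, int]   # observed base -> number of covering reads
    arrests: int                  # reads whose cDNA terminates at this site (assumed definition)
    read_through: int             # reads that extend past this site


def mismatch_rate(pos: PositionCounts) -> float:
    """Fraction of covering reads whose base differs from the reference."""
    coverage = sum(pos.base_counts.values())
    if coverage == 0:
        return 0.0
    mismatches = coverage - pos.base_counts.get(pos.ref_base, 0)
    return mismatches / coverage


def arrest_rate(pos: PositionCounts) -> float:
    """Fraction of reads reaching this site that terminate here (assumed definition)."""
    total = pos.arrests + pos.read_through
    if total == 0:
        return 0.0
    return pos.arrests / total


# Made-up counts, purely for demonstration.
example = PositionCounts(
    ref_base="A",
    base_counts={"A": 60, "G": 30, "T": 10},
    arrests=25,
    read_through=75,
)
print(mismatch_rate(example), arrest_rate(example))
```

Applied to every position of a mapped transcriptome, values of this kind could form part of the feature vector passed to a classifier, although the actual feature set used by the workflow is not specified in this abstract.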
