FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures

Authors
  • Kim, Hyunbin (1)
  • Lee, Andy Jinseok (1)
  • Lee, Jongkeun (1)
  • Chun, Hyonho (2)
  • Ju, Young Seok (3)
  • Hong, Dongwan (1)
  • (1) Bioinformatics Analysis Team, National Cancer Center, 323 Ilsan-ro, Ilsandong-gu, Goyang-si, Gyeonggi-do, 10408, Republic of Korea
  • (2) Boston University, Boston, MA, 02215, USA
  • (3) Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
Type
Published Article
Journal
Genome Medicine
Publisher
Springer (Biomed Central Ltd.)
Publication Date
Dec 17, 2019
Volume
11
Issue
1
Identifiers
DOI: 10.1186/s13073-019-0695-x
Source
Springer Nature
License
Green

Abstract

Background
Accurate identification of real somatic variants is a primary part of cancer genome studies and precision oncology. However, artifacts introduced at various steps of sequencing undermine confidence in variant calling. Current computational approaches to variant filtering involve intensive interrogation of Binary Alignment Map (BAM) files and require massive computing power, data storage, and manual labor. Recently, mutational signatures associated with sequencing artifacts were extracted by the Pan-cancer Analysis of Whole Genomes (PCAWG) study. These spectra can be used to evaluate the refinement quality of a given set of somatic mutations.

Results
Here we introduce FIREVAT (FInding REliable Variants without ArTifacts), a novel variant refinement tool that uses known spectra of sequencing artifacts extracted from one of the largest publicly available catalogs of human tumor samples. FIREVAT performs quick and efficient variant refinement that accurately removes artifacts and greatly improves the precision and specificity of somatic calls. We validated FIREVAT refinement performance against ground truth using orthogonal sequencing datasets totaling 384 tumor samples. Our method achieved the highest level of performance compared with existing filtering approaches. Applying FIREVAT to an additional 308 samples from The Cancer Genome Atlas (TCGA) demonstrated that FIREVAT refinement leads to identification of more biologically and clinically relevant mutational signatures as well as enrichment of sequence contexts associated with experimental errors. FIREVAT requires only a Variant Call Format (VCF) file and generates a comprehensive report of the variant refinement processes and outcomes for the user.

Conclusions
In summary, FIREVAT facilitates a novel refinement strategy using mutational signatures to distinguish artifactual point mutations called in human cancer samples. We anticipate that FIREVAT results will further contribute to precision oncology efforts that rely on accurate identification of variants, especially in the context of analyzing mutational signatures that bear prognostic and therapeutic significance. FIREVAT is freely available at https://github.com/cgab-ncc/FIREVAT
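
The idea that an artifact spectrum can be used to judge refinement quality can be illustrated with a toy calculation. The Python sketch below is not FIREVAT code and does not reproduce its algorithm. It assumes a simplified six-class substitution spectrum (published signatures use 96 trinucleotide channels), a hypothetical artifact profile with made-up weights, and hypothetical call sets given as (REF, ALT) pairs. It only shows that a call set resembles the artifact profile less after artifact-like calls are removed.

```python
import math

# Six substitution classes in the pyrimidine-reference convention.
# Simplification: real mutational signatures use 96 trinucleotide channels.
CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Map a single-base change to its pyrimidine-reference class."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(variants):
    """Count variants per class and normalize the counts to proportions."""
    counts = dict.fromkeys(CLASSES, 0)
    for ref, alt in variants:
        counts[substitution_class(ref, alt)] += 1
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CLASSES]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical artifact profile dominated by C>A (illustrative values only).
artifact_profile = [0.85, 0.03, 0.03, 0.03, 0.03, 0.03]

# Hypothetical call sets as (REF, ALT) pairs, before and after some filter.
unrefined = [("G", "T")] * 6 + [("C", "T")] * 3 + [("A", "G")]
refined = [("G", "T")] + [("C", "T")] * 3 + [("A", "G")]

before = cosine_similarity(spectrum(unrefined), artifact_profile)
after = cosine_similarity(spectrum(refined), artifact_profile)
print(f"similarity to artifact profile: before={before:.2f}, after={after:.2f}")
```

A lower similarity after filtering suggests, under these assumptions, that the retained calls are less artifact-like. How FIREVAT itself scores and optimizes refinement is described in the full text, not here.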
