Database construction and peptide identification strategies for proteogenomic studies on sequenced genomes

Authors
Type
Published Article
Journal
Current Topics in Medicinal Chemistry
Publisher
Bentham Science
Publication Date
Jan 17, 2014
Volume
14
Issue
3
Identifiers
PMID: 24304320
Source
MyScienceWork
License
Green

Abstract

Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes are published has created new challenges, notably for genome annotation. Even though gene predictors and annotation software are becoming more and more efficient, the ultimate validation still lies in observing the predicted gene product(s). Mass spectrometry-based proteomics provides the high-throughput technology needed to show evidence of protein presence and, from the identified sequences, to confirm or invalidate predicted annotations. We review here the different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start with the strengths and weaknesses of the different database construction strategies, based on different types of genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data); the resulting databases are then used to match mass spectra to peptides and proteins. We also review the important points to consider for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back onto the original genomic information.
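
As an illustration of the whole-genome option among the database construction strategies listed above, the sketch below translates a nucleotide sequence in all six reading frames and keeps the stop-to-stop segments as candidate database entries. It is a minimal sketch and not the workflow of any tool covered by the review: the function names, the `min_len` cutoff and the use of the standard genetic code are assumptions made for this example.

```python
# Minimal six-frame translation sketch (assumes the standard genetic code
# and an unambiguous uppercase A/C/G/T input; all names are hypothetical).

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map each codon to its amino acid; '*' marks a stop codon.
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq: str) -> dict[str, str]:
    """Translate a sequence in the three forward and three reverse frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def stop_to_stop_segments(protein: str, min_len: int = 10) -> list[str]:
    """Split a translated frame at stop codons and drop short segments."""
    return [seg for seg in protein.split("*") if len(seg) >= min_len]
```

Calling `six_frame_translate` on a genome sequence and passing each frame through `stop_to_stop_segments` yields the entries of a searchable protein database. The length cutoff trades database size against sensitivity to short genes, which is one reason the statistical assessment of identifications matters for this strategy.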
