
iProClass: an integrated database of protein family, function and structure information

Nucleic Acids Research
Oxford University Press
  • Articles
  • Biology
  • Computer Science
  • Design


Published online 26 November 2007
Nucleic Acids Research, 2008, Vol. 36, Database issue D281–D288
doi:10.1093/nar/gkm960

The Pfam protein families database

Robert D. Finn1,*, John Tate1, Jaina Mistry1, Penny C. Coggill1, Stephen John Sammut1, Hans-Rudolf Hotz1, Goran Ceric2, Kristoffer Forslund3, Sean R. Eddy2, Erik L. L. Sonnhammer3 and Alex Bateman1

1Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton Hall, Hinxton, Cambridgeshire, CB10 1SA, UK
2Howard Hughes Medical Institute Janelia Farm Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
3Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-10691 Stockholm, Sweden

Received September 15, 2007; Revised October 10, 2007; Accepted October 16, 2007

ABSTRACT

Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK (http://, the USA (http://pfam.janelia.org/) and Sweden (, as well as from mirror sites in France (http://pfam.jouy. and South Korea (

INTRODUCTION

Pfam is designed to be a comprehensive and accurate collection of protein domains and families (1,2). Pfam families are divided into two categories, Pfam-A and Pfam-B. Each Pfam-A family consists of a curated seed alignment containing a small set of representative members of the family, profile hidden Markov models (profile HMMs) built from the seed alignment and an automatically generated full alignment which contains all detectable protein sequences belonging to the family, as defined by profile HMM searches of primary
