In this study we analyzed the bias in the Protein Data Bank (PDB) using a novel contrast classifier approach. We trained an ensemble of neural network classifiers, called a contrast classifier, to learn the distributional differences between non-redundant sequence subsets of PDB and SWISS-PROT. Assuming that SWISS-PROT is representative of the sequence diversity in nature while PDB is a biased sample, the output of the contrast classifier can be used to measure whether the properties of a given sequence or sequence region are underrepresented in PDB. We applied the contrast classifier to SWISS-PROT sequences to analyze the bias in PDB with respect to different functional protein properties. The results showed that transmembrane, signal, disordered, and low-complexity regions are significantly underrepresented in PDB, while disulfide bonds, metal-binding sites, and sites involved in enzyme activity are overrepresented. Additionally, hydroxylation and phosphorylation post-translational modification sites were found to be underrepresented, while acetylation sites were significantly overrepresented. These results suggest that contrast classifiers could be useful in selecting target proteins for structural characterization experiments.
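
The contrast classifier idea described above can be illustrated with a minimal sketch. It is not the implementation used in this study: logistic regression stands in for the neural networks, the single numeric feature and the Gaussian toy data are hypothetical, and all function names are illustrative. The sketch relies only on the general property that a probabilistic classifier trained on balanced samples from a biased set and a reference set approximates the posterior probability of the reference class, so outputs above 0.5 indicate inputs that are underrepresented in the biased set.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(model, x):
    """Probability that feature vector x comes from the reference set."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(xs, ys, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent.

    Assumption: this simple model replaces the neural networks
    mentioned in the text, purely to keep the sketch short.
    """
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(xs, ys):
            err = predict((w, b), x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def train_contrast_classifier(biased, reference, n_members=5, seed=0):
    """Train an ensemble, each member on a balanced bootstrap sample.

    Label 0 = biased sample (e.g. PDB), label 1 = reference sample
    (e.g. SWISS-PROT).
    """
    rng = random.Random(seed)
    size = min(len(biased), len(reference))
    ensemble = []
    for _ in range(n_members):
        neg = [rng.choice(biased) for _ in range(size)]
        pos = [rng.choice(reference) for _ in range(size)]
        ensemble.append(train_logistic(neg + pos, [0] * size + [1] * size))
    return ensemble


def contrast_score(ensemble, x):
    """Average the member outputs; > 0.5 suggests underrepresentation."""
    return sum(predict(m, x) for m in ensemble) / len(ensemble)


# Hypothetical toy data: one numeric feature per sequence window.
rng = random.Random(42)
biased_set = [[rng.gauss(-1.0, 1.0)] for _ in range(200)]
reference_set = [[rng.gauss(1.0, 1.0)] for _ in range(200)]

cc = train_contrast_classifier(biased_set, reference_set)
score_rare = contrast_score(cc, [2.0])     # typical of the reference set only
score_common = contrast_score(cc, [-2.0])  # typical of the biased set
print(score_rare, score_common)
```

In this toy setting, a window whose feature value is common in the reference set but rare in the biased set receives a score above 0.5, and the reverse case receives a score below 0.5. Averaging such scores over annotated regions would be one way to compare functional categories, again under the assumptions stated above.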