A simple technique to classify diffraction data from dynamic proteins according to individual polymorphs.

  • Nguyen, Thu (1)
  • Phan, Kim L (2)
  • Kozakov, Dima (3)
  • Gabelli, Sandra B (2)
  • Kreitler, Dale F (4)
  • Andrews, Lawrence C (5)
  • Jakoncic, Jean (4)
  • Sweet, Robert M (4)
  • Soares, Alexei S (4)
  • Bernstein, Herbert J (5)
  • 1 Department of Computer Science, Stony Brook University, Stony Brook, NY 11794-2424, USA.
  • 2 Department of Medicine, Oncology, Biophysics and Biophysical Chemistry, Johns Hopkins University, 725 North Wolfe Street, Baltimore, MD 21205, USA.
  • 3 Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794-3600, USA.
  • 4 National Synchrotron Light Source II, Building 745, Brookhaven National Laboratory, PO Box 5000, Upton, NY 11973-5000, USA.
  • 5 Ronin Institute for Independent Scholarship, c/o NSLS-II, Building 745, Brookhaven National Laboratory, PO Box 5000, Upton, NY 11973-5000, USA.
Published Article
Acta Crystallographica Section D, Structural Biology, Pt 3
Publication Date
Mar 01, 2022
DOI: 10.1107/S2059798321013425
PMID: 35234141


One often observes small but measurable differences in the diffraction data measured from different crystals of a single protein. These differences might reflect structural differences in the protein and may reveal the natural dynamism of the molecule in solution. Partitioning these mixed-state data into single-state clusters is a critical step that could extract information about the dynamic behavior of proteins from hundreds or thousands of single-crystal data sets. Mixed-state data can be obtained deliberately (through intentional perturbation) or inadvertently (while attempting to measure highly redundant single-crystal data). To the extent that different states adopt different molecular structures, one expects to observe differences in the crystals; each of the polystates will create a polymorph of the crystals. After mixed-state diffraction data have been measured, deliberately or inadvertently, the challenge is to sort the data into clusters that may represent relevant biological polystates. Here, this problem is addressed using a simple multi-factor clustering approach that classifies each data set using independent observables, thereby assigning each data set to the correct location in conformational space. This procedure is illustrated using two independent observables, unit-cell parameters and intensities, to cluster mixed-state data from chymotrypsinogen (ChTg) crystals. It is observed that the data populate an arc of the reaction trajectory as ChTg is converted into chymotrypsin.

Open access.
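The multi-factor idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean unit-cell distance, the `1 - Pearson correlation` intensity distance, the equal weighting, the average-linkage merging, and all names (`cell_distance`, `intensity_distance`, `cluster`, `toy`) are assumptions chosen for illustration, and the four data sets are synthetic toy values, not ChTg measurements.

```python
import math
from itertools import combinations


def cell_distance(c1, c2):
    """Euclidean distance between two cells given as (a, b, c, alpha, beta, gamma).
    A deliberately simple stand-in for a proper unit-cell metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))


def intensity_distance(i1, i2):
    """1 - Pearson correlation over the reflections present in both data sets.
    Each argument maps an (h, k, l) tuple to a measured intensity."""
    common = sorted(set(i1) & set(i2))
    if len(common) < 2:
        return 1.0  # too few shared reflections to correlate
    x = [i1[hkl] for hkl in common]
    y = [i2[hkl] for hkl in common]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0  # constant intensities carry no correlation signal
    return 1.0 - sxy / math.sqrt(sxx * syy)


def combined_distances(datasets, weight=0.5):
    """Pairwise distances mixing both observables, each scaled by its maximum."""
    pairs = list(combinations(range(len(datasets)), 2))
    dc = {p: cell_distance(datasets[p[0]]["cell"], datasets[p[1]]["cell"]) for p in pairs}
    di = {p: intensity_distance(datasets[p[0]]["I"], datasets[p[1]]["I"]) for p in pairs}
    max_c = max(dc.values()) or 1.0
    max_i = max(di.values()) or 1.0
    return {p: weight * dc[p] / max_c + (1 - weight) * di[p] / max_i for p in pairs}


def cluster(datasets, n_clusters, weight=0.5):
    """Average-linkage agglomerative clustering on the combined distance."""
    dist = combined_distances(datasets, weight)
    clusters = [[i] for i in range(len(datasets))]

    def linkage(a, b):
        total = sum(dist[(min(i, j), max(i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two closest clusters; p < q, so deleting q keeps p valid.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return sorted(sorted(c) for c in clusters)


# Synthetic toy input: two hypothetical polymorphs, two data sets each.
toy = [
    {"cell": (50.0, 60.0, 70.0, 90, 90, 90),
     "I": {(1, 0, 0): 100, (0, 1, 0): 50, (0, 0, 1): 10, (1, 1, 0): 80}},
    {"cell": (50.1, 60.0, 70.1, 90, 90, 90),
     "I": {(1, 0, 0): 98, (0, 1, 0): 52, (0, 0, 1): 12, (1, 1, 0): 79}},
    {"cell": (51.5, 60.8, 70.0, 90, 90, 90),
     "I": {(1, 0, 0): 20, (0, 1, 0): 90, (0, 0, 1): 60, (1, 1, 0): 15}},
    {"cell": (51.6, 60.8, 70.1, 90, 90, 90),
     "I": {(1, 0, 0): 22, (0, 1, 0): 88, (0, 0, 1): 61, (1, 1, 0): 14}},
]
result = cluster(toy, n_clusters=2)
print(result)
```

Scaling each observable by its maximum pairwise value before mixing keeps either one from dominating purely because of its units; the `weight` parameter is a hypothetical knob for shifting emphasis between unit-cell parameters and intensities.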
