Guidelines for standardizing the application of discriminant analysis of principal components to genotype data.

Authors
  • Thia, Joshua A. (1)
  • (1) Bio21 Institute, School of BioSciences, The University of Melbourne, Melbourne, Victoria, Australia.
Type
Published Article
Journal
Molecular Ecology Resources
Publisher
Wiley (Blackwell Publishing)
Publication Date
Apr 01, 2023
Volume
23
Issue
3
Pages
523–538
Identifiers
DOI: 10.1111/1755-0998.13706
PMID: 36039574
Source
Medline
Language
English
License
Unknown

Abstract

Despite the popularity of discriminant analysis of principal components (DAPC) for studying population structure, there has been little discussion of best practice for this method. In this work, I provide guidelines for standardizing the application of DAPC to genotype data sets. An often overlooked fact is that DAPC generates a model describing genetic differences among a set of populations defined by a researcher. Appropriate parameterization of this model is critical for obtaining biologically meaningful results. I show that the number of leading PC axes used as predictors of among-population differences, p_axes, should not exceed the k - 1 biologically informative PC axes that are expected for k effective populations in a genotype data set. This k - 1 criterion for specifying p_axes is more appropriate than the widely used proportional variance criterion, which often results in a choice of p_axes ≫ k - 1. DAPC parameterized with no more than the leading k - 1 PC axes: (i) is more parsimonious; (ii) captures maximal among-population variation on biologically relevant predictors; (iii) is less sensitive to unintended interpretations of population structure; and (iv) is more generally applicable to independent sample sets. Assessing model fit should be routine practice and aids interpretation of population structure. It is imperative that researchers articulate their study goals, that is, testing a priori expectations vs. studying de novo inferred populations, because this has implications for how their DAPC results should be interpreted. The discussion and practical recommendations in this work provide the molecular ecology community with a roadmap for using DAPC in population genetic investigations. © 2022 The Author. Molecular Ecology Resources published by John Wiley & Sons Ltd.
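To make the contrast between the two criteria concrete, the following is a minimal sketch, not taken from the article, of how the number of retained leading PC axes might differ under each rule. The eigenvalues, the choice of k = 4, the 80% variance threshold, and both function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only. The eigenvalues, k, and the 0.80 threshold are
# made-up assumptions, not values or code from the article.

def n_axes_k_minus_1(k):
    """Upper bound on retained PC axes under the k - 1 criterion."""
    if k < 2:
        raise ValueError("need at least two effective populations")
    return k - 1


def n_axes_proportional_variance(eigenvalues, threshold):
    """Smallest number of leading PC axes whose cumulative share of
    total variance reaches the given threshold (between 0 and 1)."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for n_axes, eigenvalue in enumerate(eigenvalues, start=1):
        cumulative += eigenvalue
        if cumulative / total >= threshold:
            return n_axes
    return len(eigenvalues)


# Hypothetical PCA eigenvalues, sorted in decreasing order: three large
# axes (as might be expected for k = 4 populations) and a long tail of
# small axes.
eigenvalues = [30.0, 20.0, 10.0] + [1.0] * 37
k = 4

p_k_minus_1 = n_axes_k_minus_1(k)
p_prop_var = n_axes_proportional_variance(eigenvalues, threshold=0.80)

print(p_k_minus_1)  # 3
print(p_prop_var)   # 21
```

With these invented numbers, the proportional variance rule retains 21 axes, 18 of which come from the small tail, whereas the k - 1 rule retains only the 3 leading axes. This mirrors the situation the abstract describes, where a variance threshold selects far more than k - 1 axes.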
