Background: The United States Department of Agriculture (USDA) National Plant Germplasm System (NPGS) sorghum core collection contains 3011 accessions randomly selected from 77 countries. Genomic and phenotypic characterization of this core collection is necessary to encourage and facilitate its utilization in breeding programs and to improve conservation efforts. In this study, we examined the genome sequences of 318 accessions belonging to the NPGS Sudan sorghum core set, and characterized their agronomic traits and anthracnose resistance response.

Results: Using the genotyping-by-sequencing (GBS) approach, we identified 183,144 single nucleotide polymorphisms (SNPs) located within or in proximity to 25,124 annotated genes. The core collection was genetically highly diverse, with an average pairwise genetic distance of 0.76 among accessions. Population structure and cluster analysis revealed five ancestral populations within the Sudan core set, with moderate to high levels of genetic differentiation. In total, 171 accessions (54%) were assigned to one of these populations, which covered 96% of the total genomic variation. A genome scan based on Tajima’s D values revealed two populations under balancing selection. Phenotypic analysis showed differences in agronomic traits among the populations, suggesting that these populations belong to different ecogeographical regions. A total of 55 accessions were resistant to anthracnose; these accessions could represent multiple resistance sources. A genome-wide association study based on the Fixed and random model Circulating Probability Unification (FarmCPU) method identified genomic regions associated with plant height, flowering time, panicle length and diameter, and anthracnose resistance response.
Integrated analysis of the Sudan core set and sorghum association panel indicated that a large portion of the genetic variation in the Sudan core set might be present in breeding programs but remains unexploited within some clusters of accessions.

Conclusions: The NPGS Sudan core collection comprises genetically and phenotypically diverse germplasm with multiple anthracnose resistance sources. Population genomic analysis could be used to improve screening efforts and identify the most valuable germplasm for breeding programs. The new GBS data set generated in this study represents a novel genomic resource for plant breeders interested in mining the genetic diversity of the NPGS sorghum collection.
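
For readers unfamiliar with the Tajima's D statistic used in the genome scan, the following is a minimal sketch of the standard formula, not the pipeline used in this study. It assumes phased haplotypes coded as strings of 0/1 with no missing data, whereas GBS data are typically unphased diploid genotypes with missing calls. The function name `tajimas_d` and the toy inputs are hypothetical. Positive values indicate an excess of intermediate-frequency alleles, the pattern expected under balancing selection; negative values indicate an excess of rare alleles.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for equal-length 0/1 haplotype strings (illustrative sketch)."""
    n = len(haplotypes)
    if n < 4:
        return math.nan  # the variance term is zero for n < 4
    n_sites = len(haplotypes[0])
    n_pairs = n * (n - 1) / 2

    seg_sites = 0  # S: number of segregating sites
    pi = 0.0       # mean number of pairwise differences
    for j in range(n_sites):
        k = sum(1 for h in haplotypes if h[j] == "1")
        if 0 < k < n:
            seg_sites += 1
            pi += k * (n - k) / n_pairs
    if seg_sites == 0:
        return math.nan  # D is undefined without variation

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return math.nan
    # Difference between the two estimators of theta, scaled by its SD
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy data: alleles at intermediate frequency vs. singletons only
balanced = ["11", "11", "00", "00"]
rare = ["1000", "0100", "0010", "0001"]
print(tajimas_d(balanced) > 0, tajimas_d(rare) < 0)
```

In a genome scan, a statistic like this would be computed per population in sliding windows along each chromosome; the window size and any significance thresholds are choices not specified here.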