Background
Plasmodium falciparum (Pf) whole-organism sporozoite vaccines have been shown to provide significant protection against controlled human malaria infection (CHMI) in clinical trials. Initial CHMI studies showed significantly higher durable protection against homologous than heterologous strains, suggesting the presence of strain-specific vaccine-induced protection. However, interpretation of these results and understanding of their relevance to vaccine efficacy have been hampered by limited knowledge of the genetic differences between vaccine and CHMI strains, and of how these strains are related to parasites in malaria-endemic regions.

Methods
Whole genome sequencing using long-read (Pacific Biosciences) and short-read (Illumina) sequencing platforms was conducted to generate de novo genome assemblies for the vaccine strain, NF54, and for strains used in heterologous CHMI (7G8 from Brazil, NF166.C8 from Guinea, and NF135.C10 from Cambodia). The assemblies were used to characterize sequences in each strain relative to the reference 3D7 (a clone of NF54) genome. Strains were compared to each other and to a collection of clinical isolates (sequenced as part of this study or from public repositories) from South America, sub-Saharan Africa, and Southeast Asia.

Results
While few variants were detected between 3D7 and NF54, we identified tens of thousands of variants between NF54 and the three heterologous strains. These variants include SNPs, indels, and small structural variants that fall in regulatory and immunologically important regions, including transcription factors (such as PfAP2-L and PfAP2-G) and pre-erythrocytic antigens that may be key for sporozoite vaccine-induced protection. Additionally, these variants directly contributed to diversity in immunologically important regions of the genomes as detected through in silico CD8+ T cell epitope predictions.
Of the three heterologous strains, NF135.C10 had the highest number of unique predicted epitope sequences when compared to NF54. Comparison to global clinical isolates revealed that these four strains are representative of their geographic origin despite long-term culture adaptation; of note, NF135.C10 is from an admixed population, and not part of recently formed subpopulations resistant to artemisinin-based therapies present in the Greater Mekong Sub-region.

Conclusions
These results will assist in the interpretation of the efficacy of whole-organism vaccines against homologous and heterologous CHMI.