Practical Considerations for the Application of Identity-By-Descent to the Inference of Plasmodium falciparum Demography
Abstract
Genomic surveillance combined with traditional epidemiological analyses is important for the identification of at-risk populations for targeted intervention to support malaria control/elimination efforts. Identity by descent (IBD), representing genomic segments shared by pairs of genomes and inherited from the same common ancestor without break via recombination, is a key population genetic metric used to study genetic relatedness, effective population size (N_e), migration, population structure, and positive selection in the malaria parasite Plasmodium falciparum (Pf) and other organisms. However, the application of IBD in Pf parasites may be unreliable or biased due to two issues: the low marker density per genetic unit resulting from a high recombination rate, and strong positive selection related to resistance to antimalarial drugs or other interventions. In this study, I used population genetic simulations, a genealogy-based true IBD inference algorithm, and empirical data sets from various malaria transmission settings to assess these issues and biases and identify mitigation strategies. I found that high recombination rates can dramatically reduce the density of genetic markers and affect the accuracy of detected IBD segments. I demonstrated that the accuracy of detected IBD segments and downstream demography estimates can be improved by optimizing IBD caller-specific parameters and prioritizing IBD callers for quality-sensitive downstream analysis. I showed that IBD estimated by most callers can capture known selection signals and population structure patterns after parameter optimization, but only the hidden Markov model-based caller hmmIBD can reliably infer N_e for Pf-like genomes.
Furthermore, I demonstrated that positive selection can distort IBD distributions, leading to underestimated effective population size and blurred population structure, which can be mitigated by removing IBD peak regions, with the efficacy contingent on the population’s background genetic relatedness and inbreeding. Therefore, I recommend optimizing parameters for IBD callers that were originally designed for species other than Pf and prioritizing hmmIBD for quality-sensitive analysis, such as estimation of N_e, when analyzing Pf data sets. Additionally, I suggest applying peak removal-based selection corrections before performing IBD-based inferences of demography and population structure in parasite populations under strong positive selection, particularly in high-transmission settings.
Description
University of Maryland, Baltimore, School of Medicine, Ph.D. 2024.
Keyword
identity-by-descent
Benchmarking
Malaria
Genetic Linkage
Genetics, Population
Plasmodium falciparum