• Deciphering relatedness and population demographics in diverse population structures by leveraging haplotype and rare variant sharing detected from whole genome sequencing

      Shetty, Amol Carl; O'Connor, Timothy D; 0000-0001-8790-7649 (2019)
      Genealogical analysis using genomic variants is essential for a variety of applications in human genetics, such as estimating population structure, migration events and evolutionary history. The 1000 Genomes Project is an example of a study of human genomic diversity and continental population structure. Multiple studies illustrate the utility of genetic variation for the reconstruction of human migratory patterns within and between continental populations and the demographic events influencing evolutionary history. Current methods for assessment of population structure and genetic relatedness use individual genetic loci and do not take full advantage of the large number of markers provided by whole genome sequencing techniques. More recently, estimates of haplotype sharing, or identity by descent (IBD), have been used as a promising method to elucidate demographic admixture and migratory events. This dissertation focuses on the application of IBD sharing to decipher genetic relatedness and demographic events that influence population substructure. Knowledge of genome-wide patterns of IBD sharing among individuals helped distinguish between ancient and recent demographic events and detect fine structure among the recently expanded and admixed New World populations from Peru. This work addresses the gap in knowledge regarding the population fine structure of indigenous and admixed communities from geographically distinct regions of Peru. IBD sharing, primarily utilized to study human demography, was also applied to study the fine structure and demography of haploid malarial parasite populations in Southeast Asia, which helped elucidate the migratory patterns of the parasite and guide the elimination strategies of the World Health Organization (WHO). Current IBD methods accurately detect long segments based on information from common variants. However, cohorts involving cryptic relatedness mostly share short IBD segments.
In light of this limitation, rare variants arising from recent, dramatic population expansions convey more information on short IBD segments than common variants do. IBD sharing informed by rare variants therefore influences the timescales at which familial relatedness and population structure can be assessed. In sum, this dissertation illustrates the utility of IBD segments of variable lengths, and of the accumulation of rare variants within these segments, to detect fine-scale population structure at different evolutionary timescales, and it fills gaps in knowledge in both human and non-human populations.