Prediction of biogeographical ancestry from genotype: a comparison of classifiers

Elaine CHEUNG, Michelle GAHAN, Dennis MCNEVIN

    Research output: Contribution to journal › Article › peer-review

    21 Citations (Scopus)

    Abstract

    DNA can provide forensic intelligence regarding a donor’s biogeographical ancestry (BGA) and other externally visible characteristics (EVCs). A number of algorithms have been proposed to assign individual human genotypes to a BGA using ancestry informative marker (AIM) panels. This study compares the BGA assignment accuracy of the population clustering program STRUCTURE and three generic classification approaches: a Bayesian algorithm, genetic distance, and multinomial logistic regression (MLR). A selection of 142 ancestry informative single nucleotide polymorphisms (SNPs) was chosen from existing marker panels (SNPforID 34-plex, Eurasiaplex, Seldin, and Kidd’s AIM panels) to assess BGA classification at the continental level for Africans, Europeans, East Asians, and Amerindians. A training set of 1093 individuals with self-declared BGA from the 1000 Genomes phase 1 database was used by each classifier to predict BGA in a test set of 516 individuals from the HGDP-CEPH (Stanford) cell line panel. Tests were repeated with 0, 10, 50, 70, and 90% of the genotypes missing. Comparison of the area under the receiver operating characteristic curves (AUROCs) showed high accuracy for STRUCTURE and the generic Bayesian approach. The latter algorithm offers a computationally simpler alternative to STRUCTURE with little loss in accuracy and, unlike STRUCTURE, is suitable for phenotype prediction.
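The abstract does not spell out how the generic Bayesian classifier works, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes independent SNPs, Hardy-Weinberg genotype proportions within each population, a uniform prior over populations, and that missing genotypes are simply skipped. The function names (`train_allele_freqs`, `posterior`) and the toy data are hypothetical.

```python
import math

def train_allele_freqs(genotypes, labels):
    """Estimate per-population allele frequencies from training genotypes.

    genotypes: list of individuals, each a list of 0/1/2 copies of the
               counted allele per SNP, or None where the genotype is missing.
    labels:    population label for each individual.
    """
    counts = {}
    for geno, pop in zip(genotypes, labels):
        # Pseudo-counts [allele copies, chromosomes] keep frequencies away from 0 and 1.
        per_pop = counts.setdefault(pop, [[0.5, 1.0] for _ in geno])
        for i, x in enumerate(geno):
            if x is not None:
                per_pop[i][0] += x
                per_pop[i][1] += 2
    return {pop: [a / n for a, n in snps] for pop, snps in counts.items()}

def genotype_log_likelihood(x, p):
    """Log probability of genotype x (0, 1 or 2) under Hardy-Weinberg (assumed)."""
    probs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    return math.log(probs[x])

def posterior(genotype, freqs):
    """Posterior probability of each population under a uniform prior (assumed)."""
    log_post = {}
    for pop, ps in freqs.items():
        total = 0.0
        for x, p in zip(genotype, ps):
            if x is not None:  # missing genotypes contribute no evidence
                total += genotype_log_likelihood(x, p)
        log_post[pop] = total
    # Normalise in log space to avoid underflow with many SNPs.
    peak = max(log_post.values())
    weights = {pop: math.exp(v - peak) for pop, v in log_post.items()}
    norm = sum(weights.values())
    return {pop: w / norm for pop, w in weights.items()}

# Toy data (invented for illustration only): two populations, three SNPs.
train = [[2, 2, 2], [2, 1, 2], [0, 0, 0], [0, 1, 0]]
labels = ["A", "A", "B", "B"]
freqs = train_allele_freqs(train, labels)
result = posterior([2, None, 2], freqs)
predicted = max(result, key=result.get)
```

The per-population posteriors returned by a classifier of this kind are the sort of scores from which ROC curves and AUROCs can be computed. Skipping missing genotypes means that an individual with no typed SNPs receives the prior unchanged.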
    Original language: English
    Pages (from-to): 901-912
    Number of pages: 12
    Journal: International Journal of Legal Medicine
    Volume: 131
    Issue number: 4
    Publication status: Published - Jul 2017
