Weight of the evidence of genetic investigations of ancestry informative markers

Authors Torben Tvedebrink, Poul Svante Eriksen, Helle Smidt Mogensen, Niels Morling
Journal/Conference Name Theoretical population biology
Paper Abstract Ancestry-informative markers (AIMs) are markers that give information about the ancestry of individuals. They are used in forensic genetics for predicting the geographic origin of the investigated individual in crime and identification cases. In the exploration of the genogeographic origin of an AIMs profile, the likelihoods of the AIMs profile in various populations may be calculated. However, there may not be an appropriate reference population in the database. The fact that the likelihood ratio (LR) of one population compared to that of another population is large does not imply that either of the populations is relevant. To handle this phenomenon, we derived a likelihood ratio test (LRT) that is a measure of absolute concordance between an AIMs profile and a population rather than a relative measure of the AIMs profile's likelihood in two populations. The LRT is similar to a Fisher's exact test. By aggregating over markers, the central limit theorem suggests that the resulting quantity is approximately normally distributed. If only a few markers are genotyped or if the majority of the markers are fixed in a given population, the approximation may fail. We overcome this using importance sampling and show how exponential tilting results in an efficient proposal distribution. By simulations and published AIMs profiles, we demonstrate the applicability of the derived methodology. For the genotyped AIMs, the LRT approach achieves the nominal levels of rejection when tested on data from five major continental regions.
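The importance-sampling idea mentioned in the abstract can be illustrated with a small Python sketch. It is not the authors' R code and does not reproduce their LRT statistic. As a simplified stand-in, it assumes biallelic markers in Hardy-Weinberg equilibrium with allele frequencies strictly between 0 and 1, and uses minus the log-probability of the profile as the test statistic. All function names, the allele frequencies, and the example profile are hypothetical. The tail probability of the statistic is estimated by sampling genotypes from an exponentially tilted proposal, with the tilting parameter chosen by bisection so that the proposal's mean score matches the observed score, and is compared with exact enumeration.

```python
import itertools
import math
import random


def genotype_probs(p):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 copies of an allele."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def profile_score(freqs, profile):
    """Stand-in statistic (an assumption): minus the log-probability of the profile."""
    return sum(-math.log(genotype_probs(p)[g]) for p, g in zip(freqs, profile))


def exact_tail(freqs, t):
    """P(score >= t) by enumerating all 3^L profiles (feasible for small L only)."""
    total = 0.0
    for profile in itertools.product(range(3), repeat=len(freqs)):
        s = profile_score(freqs, profile)
        if s >= t:
            total += math.exp(-s)  # profile probability, since s = -log(prob)
    return total


def tilt(freqs, theta):
    """Per-marker scores, unnormalised tilted weights p(g)*exp(theta*s(g)), and their sums."""
    scores, weights, norms = [], [], []
    for p in freqs:
        probs = genotype_probs(p)
        s = [-math.log(q) for q in probs]
        w = [q * math.exp(theta * si) for q, si in zip(probs, s)]
        scores.append(s)
        weights.append(w)
        norms.append(sum(w))
    return scores, weights, norms


def choose_theta(freqs, t, lo=0.0, hi=5.0, iters=60):
    """Bisection for the theta whose tilted mean score equals the threshold t."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        scores, weights, norms = tilt(freqs, mid)
        mean = sum(sum(wi * si for wi, si in zip(w, s)) / m
                   for s, w, m in zip(scores, weights, norms))
        if mean < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tilted_tail(freqs, t, n_draws, rng):
    """Importance-sampling estimate of P(score >= t) using the tilted proposal."""
    theta = choose_theta(freqs, t)
    scores, weights, norms = tilt(freqs, theta)
    log_norm = sum(math.log(m) for m in norms)
    totals = [0.0] * n_draws
    for s, w in zip(scores, weights):
        # Draw this marker's genotype for every replicate from the tilted weights.
        draws = rng.choices(range(3), weights=w, k=n_draws)
        for i, g in enumerate(draws):
            totals[i] += s[g]
    # A draw with total score S has importance weight exp(log_norm - theta * S).
    hits = (math.exp(log_norm - theta * S) for S in totals if S >= t)
    return sum(hits) / n_draws


if __name__ == "__main__":
    freqs = [0.05, 0.10, 0.02, 0.08, 0.03, 0.07]  # hypothetical allele frequencies
    profile = [1, 1, 1, 1, 1, 1]                  # hypothetical heterozygous profile
    t = profile_score(freqs, profile) - 1e-9      # small slack against rounding
    print("exact:", exact_tail(freqs, t))
    print("tilted IS:", tilted_tail(freqs, t, 20000, random.Random(1)))
```

With only six markers the exact tail probability can still be enumerated, which makes it possible to check the sampler. For realistic marker panels enumeration is infeasible, and plain Monte Carlo would rarely hit the tail, which is the situation tilting is meant to address.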
Date of publication 2018
Code Programming Language R