New Algorithm Allows Researchers to Utilize Shorter DNA Segments in Determining Associations Through Identity-by-Descent
Mountain View, Calif. – May 12, 2014 – 23andMe, the leading personal genetics company, said today it has published an analysis that improves the accuracy and efficiency of identity-by-descent (IBD) detection through a new, open-source algorithm called HaploScore.
The study, titled “Reducing pervasive false positive identical-by-descent segments detected by large-scale pedigree analysis,” was published on April 30, 2014 in Molecular Biology and Evolution. HaploScore provides a metric for ranking the likelihood that a stretch of DNA shared by two individuals is truly inherited IBD. Analysis of genomic segments shared IBD between individuals is fundamental to many genetic applications, from demographic inference to estimating the heritability of diseases and identifying distant relatives, but the accuracy of IBD detection in non-simulated data was previously largely unknown.
To determine the accuracy of existing IBD detection algorithms, researchers selected a cohort of 25,432 genotyped European individuals, including 2,952 father-mother-child trios, from the 23andMe, Inc. dataset. The team then used GERMLINE, a widely used IBD detection method, to detect IBD segments within this cohort and found a false positive rate of more than 67 percent for short (2 to 4 centiMorgan) segments. These false positives arose primarily because the detection allows for DNA phasing errors, a tolerance that is necessary for recovering long (> 6 centiMorgan) segments. The team then replicated the false IBD findings in an external dataset and introduced the HaploScore algorithm to improve the accuracy of short IBD segment detection while retaining long segments.
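For readers who want a concrete picture of how trios can expose false positives, the following Python sketch illustrates one plausible check. It is not the study's code. It assumes that a segment reported between a child and an unrelated individual is credible only if one of the child's parents shares an overlapping segment with that same individual. The function names, the record layout, and the 0.8 coverage threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A segment is (start_cM, end_cM) on a single chromosome (assumed layout).
Segment = Tuple[float, float]


def covered_length(segment: Segment, others: List[Segment]) -> float:
    """Length of `segment` covered by the union of the segments in `others`."""
    start, end = segment
    covered = 0.0
    cursor = start  # furthest point already counted as covered
    for o_start, o_end in sorted(others):
        lo = max(o_start, cursor)
        hi = min(o_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered


def is_supported(child_seg: Segment,
                 father_segs: List[Segment],
                 mother_segs: List[Segment],
                 min_fraction: float = 0.8) -> bool:
    """True if one parent covers at least `min_fraction` of the child's segment.

    The threshold is a hypothetical value, not one taken from the study.
    """
    length = child_seg[1] - child_seg[0]
    if length <= 0:
        raise ValueError("segment must have positive length")
    best = max(covered_length(child_seg, father_segs),
               covered_length(child_seg, mother_segs))
    return best / length >= min_fraction


def false_positive_rate(records, min_fraction: float = 0.8) -> float:
    """Fraction of child segments that neither parent supports.

    Each record is (child_seg, father_segs, mother_segs), where the parent
    lists hold segments shared with the same third individual.
    """
    if not records:
        raise ValueError("no segments to evaluate")
    unsupported = sum(
        1 for child_seg, father_segs, mother_segs in records
        if not is_supported(child_seg, father_segs, mother_segs, min_fraction)
    )
    return unsupported / len(records)


# Made-up example data: three short child segments and the parents' segments
# with the same third individual.
example_records = [
    ((10.0, 13.0), [(9.5, 14.0)], []),            # father covers it fully
    ((20.0, 22.5), [], []),                       # no parental support
    ((30.0, 34.0), [(30.0, 31.0)], [(32.0, 33.0)]),  # each parent covers only 25%
]
example_rate = false_positive_rate(example_records)
```

In this toy data, two of the three child segments lack parental support. A real analysis would also need to account for chromosome identity, genotyping error, and detection sensitivity in the parents, all of which this sketch ignores.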
Because the open-source HaploScore algorithm can be applied to existing IBD segments, it can help differentiate between true and false IBD segments reported by any detection method, improving accuracy. The use of IBD segments in genetic analyses will become increasingly common as the number of individuals with known genetic composition grows.
“Identifying these false positives and creating the HaploScore solution will allow us to improve IBD detection and DNA phasing,” said Cory McLean, Ph.D., study author and 23andMe computational biologist. “Improved IBD detection and DNA phasing will allow all researchers to more accurately identify genetic relationships between distantly-related individuals and allow for improved ancestry reports within 23andMe.”
Link to the Published Version of the Article: http://mbe.oxfordjournals.org/content/early/2014/04/30/molbev.msu151.full.pdf+html
23andMe, Inc. is the leading consumer genetics and research company. Founded in 2006, the company has a mission to help people access, understand, and benefit from the human genome. 23andMe has millions of customers worldwide, more than 80 percent of whom have consented to participate in research. 23andMe, Inc. is located in Sunnyvale, CA. More information is available at www.23andMe.com.