The best DNA testing kits are handy if you want to know more about yourself, your ancestors, and where you come from, but how do scientists take your genetic makeup and extract information? As it turns out, it's not an exact science.
DNA test database size and reference populations
When asked about how database size affects ancestry results, David Nicholson, co-founder of Living DNA, told us: “The tests absolutely rely on the reference database. If you have Polish ancestry but there are no people in the database who are Polish, then what the test will do is show what the next closest group is next to Polish, like German or Eastern European ancestry.”
Each ancestry DNA service has its own sample database and reference panel, built from the DNA samples collected from its users and from information gathered from sources like the 1000 Genomes Project. The database consists of all this information collectively. A reference panel is a smaller set of curated samples with known family history and roots in a specific place.
The services use insights gleaned from the reference panel to give you geographical ancestry results. In theory, a larger database leads to more information available to create a good reference panel, which then leads to better results for customers.
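The fallback Nicholson describes, where a missing population is replaced by its nearest neighbor, can be illustrated with a toy sketch. This is not any company's actual algorithm: the marker frequencies, the population list and the `closest_population` helper are all hypothetical, and real services compare far more markers with more sophisticated statistical models.

```python
from math import sqrt

# Hypothetical allele frequencies at four made-up markers; not real data.
REFERENCE_PANEL = {
    "German": [0.30, 0.60, 0.20, 0.70],
    "Eastern European": [0.35, 0.55, 0.30, 0.65],
    "Korean": [0.80, 0.10, 0.90, 0.20],
}

def closest_population(sample, panel):
    """Return the name of the panel population nearest to the sample."""
    def distance(a, b):
        # Plain Euclidean distance between two frequency lists.
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(panel, key=lambda name: distance(sample, panel[name]))

# A customer whose true background (say, Polish) is missing from the panel.
polish_like_sample = [0.36, 0.54, 0.32, 0.64]
best_match = closest_population(polish_like_sample, REFERENCE_PANEL)
print(best_match)

# Once a matching reference group is added, the same sample is labeled correctly.
panel_with_polish = dict(REFERENCE_PANEL, Polish=[0.36, 0.54, 0.32, 0.64])
better_match = closest_population(polish_like_sample, panel_with_polish)
print(better_match)
```

The customer's DNA is the same in both calls. Only the panel changes, which is why the makeup of the reference panel matters so much to the result.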
In testing, we found that many tests give much more specific and detailed results for European ancestry than for anywhere else. This has more to do with the diversity of the database than with its size.
For example, AncestryDNA has the largest database, with over 10 million samples, yet its results for Asian ancestry are markedly less specific than results from several companies with much smaller databases, including 23andMe and Living DNA.
Instead of pulling reference samples directly from the existing database, however, many companies seek out high-quality data with special research projects. 23andMe, for example, offers its Global Genetics project, which sends free kits to people with all four grandparents born in certain countries that are underrepresented in the database.
DNA traits: How accurate are DNA tests?
Our testers took multiple DNA ancestry tests, and the services returned slightly different results for each person. This doesn’t necessarily mean that any one company is more accurate than another.
Every DNA testing service uses its own algorithm and data set – different reference populations drawn from different databases.
Nacho Esteban of 24Genetics told us: “Ancestry is not an exact science. The top five companies in the world would show very similar results when talking about continents; the similarity is smaller when talking about countries. In regional ancestry, some border regions are difficult to identify and sometimes there may be discrepancies. So we cannot take the information as something 100% sure. But at the end, it gives a great picture of where our ancestors were from.”
In our tests, we did find consistency across our results on the continental level. For example, my ancestry is exclusively East Asian, but 23andMe breaks it down into 80% Korean, 10.5% Japanese and 0.8% Chinese, with the remaining 8.7% in broader categories. However, Ancestry reports my DNA as 98% Korean and Northern Chinese, with only 2% Japanese.
National Geographic places 85% of my ancestry from Northeastern Asia and 14% from the South China Sea region, with my DNA most closely matching the Korean and Japanese reference populations.
Types of DNA
Of the 23 pairs of chromosomes in the human genome, 22 are autosomes. Most direct-to-consumer DNA tests look primarily at your autosomal DNA to determine your geographic ancestry percentages.
This DNA is a mix of inherited DNA segments – half from each parent. Because everyone inherits an X chromosome from their mother, DNA tests often include the X chromosome in autosomal testing, even though the X chromosome is not an autosome.
The 23rd pair of chromosomes consists of the sex chromosomes – the X and Y chromosomes that determine whether you’re male (XY) or female (XX). Traits like red-green color blindness, male pattern baldness and hemophilia are specifically linked to X or Y chromosomes and are called sex-linked characteristics.
All of those examples, and most other sex-linked traits, are X-linked and more common in males, who only have one X chromosome. Many DNA tests isolate Y DNA in males to show consumers their paternal haplogroup. Since the Y chromosome is directly inherited from father to son, it is possible to trace direct paternal lineage for many generations.
Similarly, mitochondrial DNA, or mtDNA, is used by direct-to-consumer DNA tests to trace your direct maternal lineage and determine maternal haplogroups. While most DNA lives in your cells' nuclei, mtDNA lives in the mitochondria. Mitochondria are the cells' powerhouses – their 37 genes are necessary for cellular energy production and respiration.
Previous research suggested that mtDNA is inherited only from your mother, but a recent study found that biparental mtDNA, inherited from both parents, may be more common than previously thought. This discovery may affect maternal haplogroup testing in DNA tests in the future, but for now, it’s safe to assume your results are correct.
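To picture the two direct lines that Y DNA and mtDNA follow, here is a minimal sketch using a made-up family tree. The names, the `FAMILY` table and the `direct_line` helper are hypothetical, and the sketch assumes the standard inheritance pattern (Y DNA from father to son, mtDNA from the mother); the paternal line only applies if "you" are male.

```python
# Hypothetical family tree: each person maps to (father, mother).
FAMILY = {
    "you": ("dad", "mom"),
    "dad": ("dads_father", "dads_mother"),
    "mom": ("moms_father", "moms_mother"),
    "dads_father": ("paternal_great_grandfather", None),
    "moms_mother": (None, "maternal_great_grandmother"),
}

FATHER, MOTHER = 0, 1

def direct_line(person, parent_index, tree):
    """Follow only fathers (index 0) or only mothers (index 1) up the tree."""
    line = []
    while person in tree and tree[person][parent_index] is not None:
        person = tree[person][parent_index]
        line.append(person)
    return line

# Y DNA follows the father's father's father...
paternal_line = direct_line("you", FATHER, FAMILY)
# mtDNA follows the mother's mother's mother...
maternal_line = direct_line("you", MOTHER, FAMILY)

print(paternal_line)
print(maternal_line)
```

Everyone else in the tree, such as your father's mother or your mother's father, is invisible to these two tests. That is why haplogroup results describe only a thin slice of your ancestry, while autosomal DNA covers the rest.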
Genotyping and sequencing
Most of the services we tested use genotyping to read your DNA. Genotyping looks for specific markers in your genetic code. For something like ancestry testing, genotyping is effective because it identifies known variants in your DNA. Scientifically speaking, genotyping’s weakness is that it can only recognize previously identified markers.
This is one reason DNA tests’ accuracy relies so heavily on the DNA database size; there must be enough information available and identified genetic variants in the database to recognize new customers’ markers.
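A rough way to picture genotyping is as a lookup against a list of known markers. The sketch below is purely illustrative: the DNA strings, marker positions and variant names are invented, and real genotyping is done on lab chips rather than by reading text.

```python
# Hypothetical marker table: position in the DNA string -> {base: variant name}.
KNOWN_MARKERS = {
    2: {"A": "variant 1", "G": "variant 2"},
    7: {"C": "variant 3", "T": "variant 4"},
}

def genotype(dna, markers):
    """Check only the known marker positions and ignore everything else."""
    calls = {}
    for position, variants in markers.items():
        base = dna[position]
        # A base that is not in the table cannot be interpreted.
        calls[position] = variants.get(base, "unrecognized")
    return calls

known_sample = "ACGTTACTGA"    # carries documented variants at both markers
unusual_sample = "ACCTTACTGA"  # carries an undocumented base at position 2

known_calls = genotype(known_sample, KNOWN_MARKERS)
unusual_calls = genotype(unusual_sample, KNOWN_MARKERS)

print(known_calls)
print(unusual_calls)
```

The second sample shows the weakness described above: a variant nobody has catalogued yet comes back as unrecognized, however clearly it appears in the customer's DNA.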
A few of the kits we tested, including the National Geographic Geno 2.0, use genetic sequencing instead of genotyping. Sequencing is newer in the mainstream direct-to-consumer DNA testing market, as it used to cost more and take much longer to sequence a person’s DNA.
Sequencing identifies the exact makeup of a certain piece of DNA – be it a short segment or the whole genome. The Helix tests sequence the exome, which is the part of the genome responsible for protein production, plus several other regions of interest.
DNA sequencing gives more information overall and has more uses in medical testing than genotyping. In the future, more DNA kits may move from genotyping to DNA sequencing as the technology gets cheaper and faster, but for now, both are effective ways to look into your geographic ancestry.