
You Probably Already Have Cousins in a DNA Database


A DNA database led police to the Golden State Killer suspect through data his distant cousins had uploaded. Now, population genetics researchers have calculated the probability that your relatives have given their genetic information to a similar database.

According to their calculations, chances are most of us would have a handful of third cousins in a 1-million-person database, about a hundred if the database contains 5 million people, and over 200 in a 10-million-person database. At any of these sizes, chances are near 100 percent that the database would contain at least one person who is your fourth cousin or beyond.
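To get a feel for the arithmetic behind estimates like these, here is a minimal sketch under the simplest possible assumption: that a database is a random sample of the population. The function names, the population size, and the count of third cousins are all hypothetical placeholders chosen for illustration, not the researchers' inputs, so the output is not meant to reproduce their figures.

```python
def expected_relatives(n_relatives: int, db_size: int, population: int) -> float:
    """Expected number of your relatives in the database.

    Assumes each relative is in the database independently, with
    probability equal to the fraction of the population it covers.
    """
    fraction = db_size / population
    return n_relatives * fraction


def prob_at_least_one(n_relatives: int, db_size: int, population: int) -> float:
    """Chance that at least one of your relatives is in the database,
    under the same random-sampling assumption."""
    fraction = db_size / population
    return 1.0 - (1.0 - fraction) ** n_relatives


# Hypothetical inputs: illustrative guesses, not numbers from the researchers.
POPULATION = 250_000_000   # assumed size of the relevant population
THIRD_COUSINS = 800        # assumed number of third cousins per person

for db_size in (1_000_000, 5_000_000, 10_000_000):
    expected = expected_relatives(THIRD_COUSINS, db_size, POPULATION)
    chance = prob_at_least_one(THIRD_COUSINS, db_size, POPULATION)
    print(f"{db_size:>10,} records: expect {expected:.1f} third cousins, "
          f"P(at least one) = {chance:.3f}")
```

In this simple model the expected count grows in direct proportion to database size, while the chance of at least one match climbs toward 100 percent much faster. More distant cousins are far more numerous, which is why a match somewhere among them becomes close to certain.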

GEDmatch, the database that law enforcement used in the Golden State Killer case, currently holds around 650,000 records. AncestryDNA has around 5 million, and 23andMe has around 2 million. (These numbers were collected by genealogist Leah Larkin last year.)

The larger databases aren’t currently used by law enforcement for DNA searches, since they accept only spit samples, not uploaded data files. But interest in genealogy keeps growing, so it’s worth keeping an eye on these numbers.

Putting these calculations together, it’s pretty reasonable to expect that most of us have distant cousins in GEDmatch, and probably some closer cousins on AncestryDNA and 23andMe.

There are important limitations to these numbers. They assume that the population in these databases is a random sample of the population at large, which it’s probably not. (I’d bet money that it skews white, wealthy, and Mormon.) They also assume no inbreeding and that people select their partners totally at random. And finally, they’re averages; it’s possible that just by chance you don’t have any relatives interested in genealogy, or on the flip side that your mom and sister are working on a family tree and have convinced all your close relatives to participate.

But the bottom line, these scientists say, is that law enforcement finding a suspect’s family’s DNA in a public database was probably not a lucky find at all, but a totally expected one.

How lucky was the genetic investigation in the Golden State Killer case? | The Coop Lab