Why a Data Breach at a Genealogy Site Has Privacy Experts Worried

The peculiar matches began early on a Sunday morning. Across the world, genealogists found that they had numerous new relatives on GEDmatch, a website known for its role in helping crack the Golden State Killer case.

New relatives are typically cause for celebration among genealogists. But upon close inspection, experienced users noticed that some of the new relatives seemed to be the DNA equivalent of a Twitter bot or a Match.com scammer; the DNA did things that actual people’s DNA should not be able to do.

Others seemed to be suspected murderers and rapists, uploaded by genealogists working with law enforcement. Users knew that the police sometimes used the site to try to identify DNA found at crime scenes. But users found the new profiles strange because they also knew that profiles made for law enforcement purposes were supposed to be hidden to prevent tipping off or upsetting a suspect’s relatives amid an investigation. What really drew attention, however, was the fact that all one million or so users who had opted not to help law enforcement had been forced to opt in.

GEDmatch, a longstanding family history site containing around 1.4 million people’s genetic information, had experienced a data breach. The peculiar matches were not new uploads but rather the result of two back-to-back hacks, which overrode existing user settings, according to Brett Williams, the chief executive of Verogen, a forensic company that has owned GEDmatch since December.

Though the growth of genealogy sites has slowed slightly in recent years, their use by the police has increased. After the authorities in California used GEDmatch in 2018 to identify a suspect in the decades-long Golden State Killer case, police departments across the country began to dig through their cold case files in the hopes that this new technique could solve old crimes.

And GEDmatch was often their preferred site. Unlike the genealogy services Ancestry and 23andMe, which are marketed to people who are new to using DNA to learn about themselves, GEDmatch caters to more advanced researchers. The site appeals to the police because it allows DNA that has been processed elsewhere to be uploaded. Verogen has a long history of working with law enforcement, and the acquisition of GEDmatch further solidified this collaboration.

Scientists and genealogists say the GEDmatch breach — which exposed more than a million additional profiles to law enforcement officials — offers an important window into what can go wrong when those responsible for storing genetic information fail to take necessary precautions.

In an interview, Mr. Williams said that the first breach occurred early on July 19. After shutting down the site, his team “covered up the vulnerability,” he said, and brought it back online, but only briefly. “On Monday we took the site down again because it was clear the hackers were trying again,” he said.

This time the site remained down for nearly a week. “We’re taking an abundance of caution because we don’t want to end up in the same situation again,” Mr. Williams said.

Mr. Williams said he had hired an outside security team and contacted the F.B.I. to see if the agency would investigate. The F.B.I. did not respond to a request for comment.

All was far from resolved when the site’s settings were restored, said Debbie Kennett, a genealogist in Box, England, who wrote about the breach on her blog. We’re stuck with our DNA for life, she said. “Once it’s out there it’s not like an email address you can change,” she said in an interview. Because of its interconnected nature, she added, when any one person’s genetic information is exposed, the exposed DNA can potentially affect their family members too.

In a paper published last year, Michael Edge, a professor of biological sciences at the University of Southern California, and fellow researchers warned several genealogy websites that they were vulnerable to data breaches.

“Of course, hacks happen to lots of companies, even entities that take security very seriously,” he said. “At the same time, GEDmatch’s, and eventually Verogen’s, response to our paper didn’t inspire much confidence that they were taking it seriously.” Other genealogy websites, he added, seemed more open to the researchers’ recommendations for improving security.

For many, the presence of fake users in GEDmatch was as alarming as the breach itself. Genealogists know that they cannot trust names or emails. They also know that a user can easily upload someone else’s genetic profile. But the breach exposed that behind the scenes, hidden by privacy settings, were all kinds of profiles of people who were not even real.

The giveaway that the matches were not actual relatives was that their DNA was too good to be true, said Leah Larkin, a biologist who runs DNA Geek, a genealogical research company. People who managed profiles for many clients and relatives repeatedly found that these fake users somehow were displayed as close relatives across the unrelated profiles. Their visible ancestry information reinforced that the matches were impossible and suggested the fake profiles had been designed to trick the site’s search algorithm for some reason.

In Dr. Edge’s paper, he warned that it was possible to create fake profiles to identify people with genetic variants associated with Alzheimer’s and other diseases.

“If something is just a geeky genealogist messing around, there is no concern,” Dr. Larkin said. But it becomes a problem, she said, if users are trying to find people who all share a particular genetic mutation or trait, as Dr. Edge cautioned. Such information could be abused by insurance companies, pharmaceutical companies or others, she said.

The breach also reinforced something that genealogists have been saying for years: Mixing genealogy and law enforcement is messy, even when you try to draw clear lines. Until two years ago, the primary DNA databases that law enforcement used for investigations were maintained by the F.B.I. and the police. That changed with the Golden State Killer case in 2018.

As police departments rushed to reinvestigate cold cases, GEDmatch, which at the time was run by two family history hobbyists as a sort of passion project, tried to serve two audiences: genealogists who simply wanted to trace their family tree and law enforcement officials who wanted to know if a murderer or a rapist was hiding in one of its branches. Amid a backlash, GEDmatch changed its policy in May 2019 so that only users who explicitly opted to help law enforcement would show up in police searches. Still, there is little regulation around how the authorities can use GEDmatch and other genealogy databases, so it’s largely up to the companies and their users to police themselves.

And as the breach demonstrated, users’ wishes could be quickly overridden.

For some users, the reason for keeping their profiles private is philosophical. Even if helping law enforcement could mean helping catch a killer, they do not want their genetic information used to incriminate their relatives. Others, like Carolynn ni Lochlainn, a genealogist from Huntington, N.Y., keep their profiles private because they worry the data will be improperly used to arrest innocent people.

“I work with a lot of Black clients and cousins, and I was most angered by the inexcusable risk at which they were placed,” Ms. ni Lochlainn said.

Colleen Fitzpatrick, the founder of Identifinders International, which applies forensic genealogy techniques toward identifying unclaimed remains and suspects in crimes, oversees a team that relies heavily on GEDmatch.

Her team was affected differently than the genealogists’ clients. They had uploaded DNA from crime scenes and unidentified babies who had been abandoned by their mothers. Because they’d checked the law enforcement box, these profiles were not supposed to show up in their relatives’ searches. For a brief window in time, “the whole database, they could see us,” she said.

She said it was unlikely that anyone working with law enforcement had exploited the breach to obtain a match against a relative’s will, given the short amount of time involved. “It wasn’t this magnificent reveal that we’re going to cash in on,” she said.

Nonetheless, the breach undeniably undermined trust for all, she said. “I think Verogen needs to up its game,” she added.
