Just in time for HIPAA update, researchers ID 50 men in genomic database

It’s the kind of thing that data privacy advocates fear most: that the massive amounts of de-identified genomic data collected through diagnostic tests and exchanged by scientists for research will somehow be hacked and the true identities behind the data will be revealed. But that’s exactly what happened when a group of researchers discovered a loophole and were able to identify 50 men as part of a study published in the journal Science this month.

Yaniv Erlich, a human geneticist at the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, led the study, according to Nature. The researchers used “surname inference” to find patterns in the Y chromosome and match them against recreational genetic genealogy databases. A summary describing the study said:

“We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources.”

That’s not the sort of thing data-sharing champions like to see, and in a cautious move, a division of the NIH, the US National Institute of General Medical Sciences, removed some data from public view.


The study concludes that people who participate in genomic research need to be better informed of data security risks, according to a Fierce Health IT article.

It’s a worrying development, particularly since it comes just as the new rules for the 17-year-old Health Insurance Portability and Accountability Act were published this week (coming in at a weighty 532 pages). Although it may be a bit of a reach to connect HIPAA, which is more concerned with patient data in electronic medical records, with genomics, that’s the direction we’re moving in.

A Reuters article illustrating the difference between the old and new HIPAA rules on data breaches said:

Before the change, companies only had to report the breach if the disclosure of information presented a significant risk of financial, reputational or other harm to the patient. Now if there’s an unauthorized disclosure and the health information is likely compromised, the company has to notify the patients and the government regardless of the risk of harm. If the breach affects more than 500 people in a certain area, the company must also inform the local media.

It looks like this is the kind of breach that could result in some particularly onerous penalties, depending on whether the NIH would be considered a healthcare company or a “business associate” in the eyes of the law. It’s sure to cause much more debate between those urging a more cautious approach to how data is handled and those who want data to be more freely available for a variety of needs, from advancing personalized medicine to developing a better understanding of population health trends.