Health IT

Now that’s a security breach: England’s entire healthcare data set uploaded to Google servers

In England, consultants uploaded the country’s entire healthcare dataset – inpatient, outpatient and emergency room records – to Google servers outside Britain.


Who needs a national spy service to track your citizens when you’ve got a national health service? The American equivalent hasn’t been around long enough to replace the NSA, but that’s not the case in Britain with the respective agencies there.

The National Health Service’s plan to share de-identified data with pharma companies was bad and unpopular to begin with.

The UK government’s Health and Social Care Information Centre quietly announced plans to share all patient records held by the National Health Service with private companies, from insurers to pharmaceutical companies. The information sharing is on an opt-out basis, so if you don’t want your “clinical records, mental health consultations, drug addiction rehabilitation details, sexual health clinic attendance and abortion procedures” shared, along with your “GP records, NHS numbers, post-codes, gender, date of birth,” you need to contact your doctor and opt out of the process.

Just a few weeks ago, the NHS decided to delay the program for six months:

…privacy experts have warned there will be no way for the public to work out who has their medical records, or to what use their data will be put. There have been questions raised about commercial companies buying data.

Last week, the British government admitted to releasing the data to insurance companies:

… the Health and Social Care Information Centre admitted giving the insurance industry the coded hospital records of millions of patients, pseudonymised, but re-identifiable by anyone with malicious intent. These were crunched by actuaries into tables showing the likelihood of death depending on various features such as age or disease, to help inform insurance premiums.

Then today this happened:

A prominent Tory MP on the powerful health select committee has questioned how the entire NHS hospital patient database for England was handed over to management consultants who uploaded it to Google servers based outside the UK.

The patient information had been obtained by PA Consulting, which claimed to have secured the “entire start-to-finish HES dataset across all three areas of collection – inpatient, outpatient and A&E”.

The data set was so large it took up 27 DVDs and took a couple of weeks to upload. The management consultants said: “Within two weeks of starting to use the Google tools we were able to produce interactive maps directly from HES queries in seconds.”

The revelations alarmed campaigners and privacy experts, who queried how Google maps could have been used unless some location data had been provided in the patient information files.

And it looked like the obviously re-identifiable data had spread to such a mapping site:

A service offered by a data mapping website was closed down on Monday, as health authorities launched an investigation into the site amid concerns it had apparently acquired millions of identifiable patient records without regulatory scrutiny.

A Hertfordshire-based online mapping company, Earthware, which offers services including property data, claimed to allow users to locate areas in England where a single individual had gone for specialised treatment.

This whole story proves what The Guardian had pointed out last summer:

… there’s one rule of thumb that should be borne in mind whenever any data-protection proposals are on the table: Any time someone speaks of relaxing the rules on sharing data that has been “anonymised” (had identifying information removed) or “pseudonymised” (had identifiers replaced with pseudonyms), you should assume until proven otherwise that he or she is talking rubbish.

Anonymising data is a very, very difficult business. When it comes to anonymising, there are three high-profile failures that get widely cited: AOL’s 2006 release of anonymous search data; the State of Massachusetts’ Group Insurance Commission release of anonymised health records; and Netflix’s 2006 release of 100m video-rental records.

In each case, researchers showed how relatively simple techniques could be used to re-identify the data in these sets, usually picking out the elements of each record that made them unique.
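
To make that last point concrete, here is a minimal sketch of the "pick out what makes a record unique" idea. Everything in it is an assumption for illustration: the records are made up, the field names (`postcode`, `birth_year`, `sex`) are hypothetical and not the actual HES schema, and real re-identification work is more involved than this. It only counts how many "de-identified" rows are the sole match for their combination of ordinary attributes.

```python
from collections import Counter

# Made-up records with names and ID numbers already stripped out.
# Field names are hypothetical, not taken from any real dataset.
records = [
    {"postcode": "AB1", "birth_year": 1970, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "AB1", "birth_year": 1970, "sex": "F", "diagnosis": "diabetes"},
    {"postcode": "AB1", "birth_year": 1985, "sex": "M", "diagnosis": "fracture"},
    {"postcode": "CD2", "birth_year": 1970, "sex": "F", "diagnosis": "depression"},
]

# Ordinary attributes that an outsider could plausibly already know.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def key(record):
    """Return the combination of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def unique_records(rows):
    """Return the rows whose quasi-identifier combination appears only once."""
    counts = Counter(key(row) for row in rows)
    return [row for row in rows if counts[key(row)] == 1]


exposed = unique_records(records)
for row in exposed:
    # Anyone who knows these three facts about a person learns the diagnosis.
    print(key(row), "->", row["diagnosis"])
```

In this toy set, two of the four rows are the only ones with their combination of postcode, birth year and sex, so knowing those three facts about a person is enough to read off the diagnosis, even though no name appears anywhere.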

H/T to Fred Trotter for his tweet about this that caught my eye this morning.