
They have no idea. Even data from continuous glucose monitors (most commonly used in type 1 diabetes) are directly shared with insurance companies, where certain patients with diabetes are "flagged" as "problem patients": https://type1tennis.blogspot.com/2016/05/when-data-fluxes-co...

I have two rare diseases, one of which was discovered via NIH funds at the Mayo Clinic in the early 2000s. I also have type 1 diabetes, and I can attest to the veracity of the claims made in the blog post.

For somebody like me, the situation is unwinnable, if I want to live. HIPAA is a joke because it is perfectly legal to combine other data with the HIPAA anonymized source to identify the individual. Every day, leaving the US looks better.



You should move to Canada or Europe, they'll take care of your rare diseases.


Can you elaborate on what insurance companies in the US can do to someone known to have a rare disease? Can they deny coverage or price it higher than for healthy people?


How would you combine HIPAA data with another data source to identify the individual? I'm not suggesting it can't be done, just wondering how one might do it. Linking identifying data to a de-identified data set would only be possible if the original data was not properly de-identified, right?


There is no such thing as "proper de-identification" in general; it's all a matter of what other data sets the re-identifying party has at its disposal.

Consider the following de-identified data sets:

- [date, time, clinic, procedure or test being done, insurer] - as collected by the clinic chain so that it can get money from insurers

- [month, clinic, test name, test result] - for all tests made in the last year, collected for statistical purposes

- [date, time, latitude, longitude, phone number] - because AFAIR telcos sell this data

- [name, surname, phone number, ...] - some insurance company's list of customers

If you can get your hands on these datasets, you can trivially re-identify patients and even assign test results to them with high probability (how high depends on how many tests of a given type are made in any given clinic per the unit of time used to group the second data set).
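A minimal sketch of the linkage described above, in Python. Every record, name, phone number, and coordinate here is invented for illustration, and `CLINIC_COORDS`, `near`, and `reidentify` are hypothetical helpers. It assumes clinic locations are public and that timestamps match exactly; real data would need fuzzy time and distance windows.

```python
# All data below is made up; this only illustrates the join logic.
CLINIC_COORDS = {"clinic_a": (40.0, -75.0)}  # assumed to be public knowledge

billing = [("2016-05-03", "09:30", "clinic_a", "HbA1c", "insurer_x")]
results = [("2016-05", "clinic_a", "HbA1c", "9.1%")]
location = [
    ("2016-05-03", "09:30", 40.0, -75.0, "555-0100"),
    ("2016-05-03", "09:30", 41.2, -74.1, "555-0199"),
]
customers = [("Alice", "Example", "555-0100"), ("Bob", "Example", "555-0199")]


def near(lat, lon, clinic, tol=0.001):
    """True if a coordinate lies within `tol` degrees of the clinic."""
    clat, clon = CLINIC_COORDS[clinic]
    return abs(lat - clat) <= tol and abs(lon - clon) <= tol


def reidentify(billing, results, location, customers):
    # Phone number -> full name, from the customer list.
    names = {phone: f"{name} {surname}" for name, surname, phone in customers}

    # Group test results by (month, clinic, test name).
    by_key = {}
    for month, clinic, test, result in results:
        by_key.setdefault((month, clinic, test), []).append(result)

    matches = []
    for date, time, clinic, test, _insurer in billing:
        # Phones seen at the clinic at the time of the billed procedure.
        phones = sorted(
            p for d, t, lat, lon, p in location
            if d == date and t == time and near(lat, lon, clinic)
        )
        candidates = by_key.get((date[:7], clinic, test), [])
        # Fewer tests of this type per month means a more certain assignment.
        confidence = 1 / len(candidates) if candidates else 0.0
        for phone in phones:
            if phone in names:
                matches.append((names[phone], test, candidates, confidence))
    return matches


matches = reidentify(billing, results, location, customers)
```

With only one HbA1c result for that clinic in that month, the single candidate result is assigned with confidence 1.0; with ten such results, the same join would give 0.1 per result.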

Real-world data sets may be less clear-cut than this, but there is more of it, and you can apply statistical methods to find correlations. You don't need to be 100% sure customer X has diabetes for the information to be useful to you; 70% or 60% is useful too.


Section 164.514(b)

"The following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:

...

(B) All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of the ZIP code

(C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older

... "

This is the "Safe Harbor" method.
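As a rough illustration of what rules (B) and (C) quoted above do to a record, here is a small Python sketch. The function name, the field names, and the `restricted_zip3` parameter are hypothetical. The parameter reflects a condition in the full rule that the excerpt above elides: three-digit ZIP prefixes covering 20,000 or fewer people must be replaced with 000. This covers only two of the identifier categories, so it is not a complete de-identification routine.

```python
def safe_harbor_generalize(record, restricted_zip3=frozenset()):
    """Coarsen ZIP, date, and age fields roughly as rules (B) and (C) describe.

    `record` is a dict with hypothetical keys:
      'zip' (5-digit string), 'admission_date' ('YYYY-MM-DD'), 'age' (int).
    `restricted_zip3` is a caller-supplied set of low-population prefixes.
    """
    out = dict(record)

    # (B) Keep only the first three ZIP digits, or 000 for restricted prefixes.
    zip3 = record["zip"][:3]
    out["zip"] = "000" if zip3 in restricted_zip3 else zip3

    # (C) Keep only the year of dates tied to the individual.
    out["admission_date"] = record["admission_date"][:4]

    # (C) Ages over 89 are aggregated into a single "90 or older" category.
    out["age"] = "90+" if record["age"] >= 90 else record["age"]
    return out


example = safe_harbor_generalize(
    {"zip": "19104", "admission_date": "2016-05-03", "age": 93}
)
```

Note that a record carrying an exact date, time, and clinic location, like the first data set in the parent comment, would not survive this step.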

You could use the "Expert Determination" method. However, date + time + location attached to health information in your first data set definitely doesn't meet the criteria. I'll eat my hat if you find a supposed "non-PHI" data set with those.

In fact, the criterion for expert determination is literally that re-identification cannot be performed (without already having PHI-type information).


Yeah, this was my impression too. I've worked with HIPAA data, and usually I had to remove far more than just a "name" for it to be de-identified.



Genomic data is by definition identified.


> HIPAA is a joke because it is perfectly legal to combine other data with the HIPAA anonymized source to identify the individual.

HIPAA may be a joke, but not for this reason.

If information can be re-identified as PHI in any way (including matching phone numbers, birth date, IP addresses, patient account #s, etc.) it doesn't meet the de-identification standard.

Section 164.514(b)

You must remove the 18 types of identifiers, or receive a certification:

"A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:

Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information;"

Moreover, your information can only be used for research if you give written permission (Section 164.508). If you have given this permission, you may revoke it for the future.



