As a note in response to all of the other replies: the issue isn't actually anonymization, but the ability to correlate the anonymized data with other data.
That is, if health data were anonymized properly, and could not be correlated with any other data, that would likely be sufficient. It's when you start allowing it to be correlated with personally identifiable things that it ceases to be anonymous.
Sure, let's take a case where you have a super rare genetic disorder. That, combined with the time you broke your leg in 2005, is sufficient to distinguish you from every other person in the country. In short, you have a unique health profile.
So what? Unless there is further information, that can't be traced to you. As an example, it's when we start saying "Ah, and the person is receiving treatment at (facility)" that we now know where you live. It's when we start correlating it with usernames that we start getting an internet trail. It's when we start correlating those with forum profiles that we get a real name, and now we know who you are.
The only other way someone could match that profile to you is to have access to the profile and to know you personally. Otherwise it links nowhere.
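To make the linkage step concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the records, the field names, and the `link` helper are hypothetical, not any real dataset or API. It only shows that a unique profile stays unattached to a person until some other dataset shares an attribute with it.

```python
# Hypothetical "anonymized" health profiles: no names, only a pseudonymous id.
health_profiles = [
    {"pid": "a91f", "conditions": {"rare_disorder", "broken_leg_2005"}, "facility": "Clinic A"},
    {"pid": "c27d", "conditions": {"asthma"}, "facility": "Clinic B"},
]

# Hypothetical auxiliary data that leaked from somewhere else.
forum_posts = [
    {"username": "legbreaker05", "real_name": "Pat Example", "mentions_facility": "Clinic A"},
    {"username": "wheezy", "real_name": "Sam Sample", "mentions_facility": "Clinic C"},
]

def link(profiles, aux, profile_key, aux_key):
    """Join two datasets on one shared attribute (a basic linkage attack)."""
    matches = []
    for p in profiles:
        candidates = [a for a in aux if a[aux_key] == p[profile_key]]
        if len(candidates) == 1:  # exactly one candidate means re-identification
            matches.append((p["pid"], candidates[0]["real_name"]))
    return matches

# With a shared attribute, the profile gets a name.
linked = link(health_profiles, forum_posts, "facility", "mentions_facility")
print(linked)

# Strip the shared attribute and the same profiles link nowhere.
stripped = [{"pid": p["pid"], "conditions": p["conditions"], "facility": None}
            for p in health_profiles]
unlinked = link(stripped, forum_posts, "facility", "mentions_facility")
print(unlinked)
```

The profile with the rare disorder is just as unique in both runs; what changes is whether any field overlaps with data that carries a name.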
I agree the risk is huge; people don't do it right. But anonymous health profiles are -not- in and of themselves dangerous; it's when details linking them to further information leak out that it's a problem.
But, pragmatically, while yes, it would be incredibly hard... has anyone here read the rights they're signing away when they go to the doctor? Does everyone here trust every system a doctor uses, every system a health insurer uses, and every system used by the marketers and researchers that the feds -do- allow to have access to this data? The real risk with Google would be that they could correlate it with so many other things about you; but the health insurers still have your medical history combined with all your PII.
Although your health records may have some legal protections, health care is only one determinant of health.
Other determinants of health, like your gender, food choices, lifestyle, income, driving history, family history, physical environment, education, social network, etc. have all been heavily mined.
Much of the legally protected information can largely be inferred anyway. There aren't too many people without peanut allergies who haven't bought anything containing peanuts in the past 5 years.
There are only two changes that would make it very easy for me to accept revealing my medical data, for science, research or just about anything else.
1. No insurance companies involved as health care gatekeepers. At the moment, they are very much an adversary to me.
2. Strong, enforced laws against employers discriminating on the basis of health. I'm sure the letter of the law currently sounds strong, but I'm assuming you have to sue to right any wrongs. Advantage: employer.
Neither one of these will happen in my lifetime, because insurance companies make huge profits on throttling our healthcare, employers will always want the flexibility to do what they like with the law, and both camps fund Congress.
The anonymization rules for PHI are strict enough that they would make a lot of the interesting mining you could do difficult, if not impossible. Specifically, I mean the restrictions related to dates and locations.
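For a sense of what those date and location restrictions do to the data, here's a rough Python sketch. It assumes the Safe Harbor style of de-identification as I understand it (dates reduced to the year, ZIP codes cut to their first three digits). The `generalize` helper and the sample visits are made up, and the real rules have more conditions than this (low-population ZIP prefixes, for example).

```python
from collections import Counter
from datetime import date

# Made-up visit records; field names are hypothetical.
visits = [
    {"visit_date": date(2005, 1, 14), "zip": "94110", "dx": "flu"},
    {"visit_date": date(2005, 1, 29), "zip": "94117", "dx": "flu"},
    {"visit_date": date(2005, 7, 3), "zip": "94110", "dx": "broken leg"},
]

def generalize(record):
    """Coarsen date and location: keep only the year and a 3-digit ZIP prefix."""
    return {
        "visit_year": record["visit_date"].year,
        "zip3": record["zip"][:3],
        "dx": record["dx"],
    }

deidentified = [generalize(v) for v in visits]

# Before: a January flu cluster is visible by month.
flu_by_month = Counter(v["visit_date"].month for v in visits if v["dx"] == "flu")

# After: the only time resolution left is the year, so seasonality is gone.
flu_by_year = Counter(v["visit_year"] for v in deidentified if v["dx"] == "flu")

print(flu_by_month)
print(flu_by_year)
print({v["zip3"] for v in deidentified})
```

Anything that depends on month-level timing or neighborhood-level geography (outbreak tracking, seasonality, distance to a facility) can't be recovered from the coarsened records.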
It's a lot trickier than this. Suppose you have a rare genetic condition that affects 0.005% of the population. It takes very little additional information to single out a person when the first thing you do is rule out 99.995% of the possibilities.
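To put rough numbers on that, here's a back-of-the-envelope calculation in Python. The population figure (330 million) is an assumption picked for illustration, and treating the remaining attributes as independent bits of information is a simplification.

```python
import math

population = 330_000_000      # assumed country size, for illustration only
prevalence = 0.005 / 100      # the 0.005% figure from above

# People left in the candidate pool once the rare condition is known.
remaining = population * prevalence

# Information needed to single out one person, and how much the condition provides.
bits_needed = math.log2(population)
bits_from_condition = math.log2(1 / prevalence)
bits_left = bits_needed - bits_from_condition

print(f"candidates remaining: {remaining:,.0f}")
print(f"bits needed: {bits_needed:.1f}, "
      f"from condition: {bits_from_condition:.1f}, left: {bits_left:.1f}")
```

Under those assumptions roughly 16,500 candidates remain, which is about 14 bits short of a unique match. A year of birth, a sex, and a coarse location can plausibly cover most of that gap.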