This is like using a sledgehammer to put up pictures. You simply do not need the kind of parallel compute grunt something like Watson can provide to do correlation analysis as they describe. You could cobble this together in {dirty secret scripting language of choice} and run it on your laptop.
This is a PR piece - not the article, the activity - to bring forth the idea of computers making decisions about healthcare, based on metrics - not humans, based on metrics and compassion.
The cynic in me sees this as setting a disquieting precedent in the direction of healthcare being distributed according not only to patients' prospects and treatment needs, but also to other factors which a financially liable party, say an insurer, would be interested in - such as the earning potential of the patient.
I'm pretty sure there was a Star Trek episode about this.
I've worked in the area of clinical genomics using whole genome sequencing. Your statement is unfortunately untrue though it perhaps could be if tools were better written.
While it is easy enough to analyze a single genome on your laptop, most current popular analytical tools simply fall over when you start looking at hundreds of genomes, even on a large server. Even basic stuff like combining multiple genomes into one file with consistent naming of variants can take entirely ridiculous multi-terabyte amounts of RAM, because the tools to do so just weren't written with this scale in mind.
Most of these tools could (and should) be rewritten to do things without loading the whole data set in memory and to work natively on a cluster of commodity machines. There is some resistance to this, of course, because scientists prefer to use the published and established methods and often feel new methods need to be published and peer reviewed etc.
Until new tools are written and widely adopted, a large shared-memory machine is a band-aid many hospitals and researchers seem eager to adopt.
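To make the "without loading the whole data set in memory" point concrete, here is a minimal sketch of a streaming merge. It assumes a hypothetical, simplified record format of (chromosome, position, ref, alt) rather than real VCF, and it assumes every per-sample stream is already sorted in the same order (plain tuple ordering, so chromosome names compare as strings). The names `Variant`, `tag` and `merge_samples` are made up for illustration and do not come from any existing genomics tool.

```python
import heapq
from itertools import groupby
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical simplified record: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def tag(sample: str, variants: Iterable[Variant]) -> Iterator[Tuple[Variant, str]]:
    """Attach the sample name to each variant, lazily."""
    for variant in variants:
        yield variant, sample


def merge_samples(
    samples: Dict[str, Iterable[Variant]]
) -> Iterator[Tuple[Variant, List[str]]]:
    """Yield (variant, [samples carrying it]) in sorted order.

    Assumes each input stream is already sorted by plain tuple order.
    heapq.merge keeps only one pending record per stream in memory,
    so memory use grows with the number of samples, not their size.
    """
    tagged = [tag(name, variants) for name, variants in samples.items()]
    merged = heapq.merge(*tagged)
    # Identical variants from different samples arrive consecutively.
    for variant, group in groupby(merged, key=lambda pair: pair[0]):
        yield variant, [sample for _, sample in group]


if __name__ == "__main__":
    demo = {
        "s1": [("1", 100, "A", "G"), ("1", 200, "C", "T")],
        "s2": [("1", 100, "A", "G"), ("2", 50, "G", "A")],
    }
    for variant, carriers in merge_samples(demo):
        print(variant, carriers)
```

In practice the inputs would be generators reading files line by line rather than in-memory lists, and real tools also have to normalise variant representation before keys can be compared, which this sketch ignores.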
You are right to say this.
I treat breast cancer, and I'm doing a PhD on breast cancer genomics, and there is no evidence that high throughput data of any kind, whether it is genomics, transcriptomics, epigenomics, proteomics, metabolomics etc-omics actually helps patients. At the moment, a small panel of biomarkers using technology that is at least 20 years old is all we use to make treatment decisions. Is it adequate? Certainly not, but there is a HUGE amount of carefully collected data in many thousands of patients backing it up.
Not sure who is downvoting you, but they seem to have swallowed the hype wholesale. At the risk of sounding gratuitously negative, I find the discussion of medicine on HN to be of very poor quality, markedly below the general standard.
I think there is a distinction to be made here in questioning patient outcomes and questioning the relevance of genomic sequencing in treatment decisions.
Don't you think it is fair to say that high throughput data (whole genome sequencing with variant calling) is still in a state of being evaluated to measure its effectiveness in aiding the treatment decision process, but that early results seem to lean towards it becoming part of the standard diagnostic approach?
Genomic sequencing and patient outcomes is a thornier question. My non-practitioner take is that it is too early to tell scientifically, but that there will probably be some benefit to early identification of specific cancer types and choosing treatment. But I think many people would have made a similar statement about mammography and early detection, and absolute mortality appears to not be reduced by adding mammography to the diagnostic procedures, right?
The research value of genomic sequencing seems high enough to make it worthwhile. At least, when I sit in on molecular tumor board reviews (the oncologists at a table looking at called variant results for a specific patient), I hear them commenting about possibly new and unknown variants being of research value.
I am really looking forward to your reply - Internet message boards in general have to be almost the worst way to discuss medicine, but having participation from researchers and practitioners like you is tremendously illuminating!
I define genomics as the unbiased interrogation of the genome using high throughput technology. Sequencing one mutant locus using Sanger sequencing does not fall under this definition - I don't think IBM's business model is using Watson to interpret that. So when other people point out that HER2 is a useful genomic marker they are missing the point - HER2 can be determined with immunohistochemistry for example which has been around for 50 years.
I'm not sure what your question is... genomics has research value, for sure, it's great.
Is it worth trying to incorporate it into routine care? Yes, probably, if you have enough cash. Should a hospital pay for a black box machine learning algorithm to make recommendations from a highly polluted, often erroneous and hugely incomplete literature corpus? The alternative put forward by people actually doing the science is that we should try and develop large open source databases/repositories about the significance of genomic findings, and then collect the data about what happens to the patients.
Some information is hidden away in supplemental table 6, which points out candidate drugs to affect different biological pathways for different mutations.
would provide information on treatment decisions generally made by finding appropriate subtype classifications.
I think that it is pretty clear that genomic sequencing of patient normal and tumor tissue to find mutations is going to be standard-of-care sooner rather than later, but it is fair to point out that genomic sequencing is not currently standard-of-care. However, I know of studies currently underway that look at variant calls and the possibility of taking action on those calls in ways that involve specifically adding those results back into the patient medical record.
I am struggling a bit with how to phrase this, but I don't think you can argue against (1) different subtypes of breast cancer are separate diseases and can be classified by genomic sequencing and (2) treatments for these separate diseases are different and have different efficacies.
Herceptin. Trastuzumab inhibits the effects of overexpression of HER2. If the breast cancer doesn't overexpress HER2, trastuzumab will have no beneficial effect (and may cause harm).
The original studies of trastuzumab showed that it improved overall survival in late-stage (metastatic) breast cancer from 20.3 to 25.1 months.[1] In early-stage breast cancer, it reduces the risk of cancer returning after surgery by an absolute 9.5% and the risk of death by an absolute 3%; however, it increases the risk of serious heart problems by an absolute 2.1%, which may resolve if treatment is stopped.[2]
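One way to read absolute risk figures like these is to turn them into a "number needed to treat" (or "to harm"), which is simply the reciprocal of the absolute risk change. The sketch below only does that arithmetic on the percentages quoted in the comment above; the function name `number_needed` is made up for illustration, and the output is a back-of-the-envelope reading, not something taken from the cited studies.

```python
def number_needed(absolute_risk_change: float) -> float:
    """Reciprocal of an absolute risk change given as a fraction (0.03 for 3%)."""
    if absolute_risk_change <= 0:
        raise ValueError("absolute risk change must be positive")
    return 1.0 / absolute_risk_change


# Figures as quoted in the comment above (early-stage breast cancer).
nnt_recurrence = number_needed(0.095)  # treated per recurrence avoided
nnt_death = number_needed(0.03)        # treated per death avoided
nnh_cardiac = number_needed(0.021)     # treated per serious heart problem caused

if __name__ == "__main__":
    print(f"NNT (recurrence): {nnt_recurrence:.1f}")
    print(f"NNT (death):      {nnt_death:.1f}")
    print(f"NNH (cardiac):    {nnh_cardiac:.1f}")
```

On those quoted numbers, roughly 33 patients are treated per death avoided and roughly 48 per serious heart problem caused.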
If it was any other field in computer science people would be really critical of your methodology.
The size of the human genome is 21 MB.
If you are trying to find the co-ordinate of every cancer cell in a human body then sure, you need a lot of RAM.
But the output of the collective field of cancer research doesn't seem to be there yet. So why do you need so much RAM?
Usually, when your problem becomes NP-hard, you switch to simpler models. Have you checked the search space for all simpler models? Or are you sticking to complex models since it helps you publish papers?
You also need to understand that hardware only gets you so far; running a cluster has its own costs - network latency.
More often than not, better techniques are required, rather than saying the tremendous improvement in computational power is not good enough.
> In the real world, right off the genome sequencer: ~200 gigabytes
> As a variant file, with just the list of mutations: ~125 megabytes
> What this means is that we’d all better brace ourselves for a major flood of genomic data. The 1000 genomes project data, for example, is now available in the AWS cloud and consists of >200 terabytes for the 1700 participants. As the cost of whole genome sequencing continues to drop, bigger and bigger sequencing studies are being rolled out. Just think about the storage requirements of this 10K Autism Genome project, or the UK’s 100k Genome project….. or even.. gasp.. this Million Human Genomes project. The computational demands are staggering, and the big question is: Can data analysis keep up, and what will we learn from this flood of A’s, T’s, G’s and C’s….?
Also the world of genomics has done fantastic work on compression and if you can compress it further you will probably win a decent award with a ceremony and free booze.
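For a rough sense of scale, the figures quoted above can be turned into per-genome and projected totals. This is only back-of-the-envelope arithmetic under loud assumptions: decimal units (1 TB = 10^12 bytes), the ">200 terabytes" treated as exactly 200 (so a lower bound), and naive linear scaling with cohort size. The function names are made up for illustration.

```python
TB = 10**12  # assumption: decimal terabytes
GB = 10**9
PB = 10**15
MB = 10**6

# Figures quoted above: >200 TB for 1700 participants, ~125 MB per variant file.
TOTAL_BYTES = 200 * TB
PARTICIPANTS = 1700
VARIANT_FILE_BYTES = 125 * MB


def per_participant_gb() -> float:
    """Average stored data per participant, in GB (lower bound)."""
    return TOTAL_BYTES / PARTICIPANTS / GB


def projected_raw_pb(n_genomes: int) -> float:
    """Naive linear projection of stored data for n genomes, in PB."""
    return n_genomes * (TOTAL_BYTES / PARTICIPANTS) / PB


def projected_variants_tb(n_genomes: int) -> float:
    """Naive projection if only the variant files were kept, in TB."""
    return n_genomes * VARIANT_FILE_BYTES / TB


if __name__ == "__main__":
    print(f"per participant: {per_participant_gb():.1f} GB")
    for n in (10_000, 100_000, 1_000_000):
        print(f"{n:>9} genomes: {projected_raw_pb(n):.2f} PB stored, "
              f"{projected_variants_tb(n):.2f} TB as variant files")
```

On these assumptions the 100k-genome case comes out at roughly 12 PB of stored data versus about 12.5 TB of variant files, which is the gap between "needs a data centre" and "fits on a few disks".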
Scientific computing requires a lot of memory, and a lot of computer time. I think it's fair to say that the underlying libraries (LAPACK, ScaLAPACK, Intel's MKL) are the most intensively optimised code in the world. Most of the non-trivial algorithms are polynomial in both time and memory.
I suspect this press release is hinting at a next-generation (cheap, fast) DNA sequencing method. These are derived from shotgun sequencing methods, where hundreds of gigabytes of random base pair sequences are reassembled into a coherent genome. The next-generation methods realise cost savings by an even more lossy method of reading smaller fragments of the genome, with much greater computational demands to reassemble.
Of course it's PR. It's hard enough to accept that your doctor is a computer; how much harder if it's some JavaScript from healthcare.gov on a tab opposite cat pictures?
The fiction is that, yes, it's a computer, but it's the biggest, smartest computer ever! Look, it won Jeopardy!
We do exactly the same thing with people. The medical system does a great deal of theater to convince everyone that they are getting "the best care possible" even though a moment's thought is all it takes to realize that every facility and every doctor can't be even close to equal or "the best".
This is one step down the road to getting people to accept automated medical care. It will be lousy at first but it probably is the right path to take if everyone wants to get "the best" care possible everywhere in the future. You can replicate an awesome program for everyone everywhere but not an awesome doctor.
There's nothing magical about compassion. It's not like if the doctor just wishes you well hard enough, that'll make you get better. Should we stick to hand-crafted silicon chips, painstakingly etched by a former Swiss watchmaker full of compassion? Should compassionate telephone operators route your phonecall to its destination? If you want compassion, ask for a visit from the hospital chaplain.
Unless, of course, you are the one in need of compassion.
If it's all about metrics, it's easy to decide that a given treatment comes with an expenditure that is too great if only 20% of the patients are saved.
That would lead to 100% of the patients who might benefit from the aforementioned treatment being lost. But look at the bright side, you saved a bunch of money.
That sort of reasoning sounds very good if you're good-looking, charming, live in an affluent area, go to the same church as the doctor, etc. There's a very thin line between compassion and corruption, and neither should have any place in the formal decision-making of medicine. If everyone should be given the treatment, then make a rule saying everyone gets the treatment. Don't leave it up to doctors' compassion.
If the money is instead going to save 80% of the patients waiting for some other treatment, then good. If it's going to be used for other means, then your problem is not the machine decision, it's the person who allocated the money in that way.
The biology is a bit more complicated than that, though. Sure, if we had a big database of all the correlation factors in some easily computer-readable format, standardized testing procedures that lined up with these factors, and an army of experts keeping all of this up-to-date, it would be easy to write a script to make it all work. But we don't, and (in my off-the-cuff estimation) such a project could wind up being of Human Genome Project-esque scale.
Watson is giving us a way of doing this when the available data is in a much less accommodating format (namely, scientific literature.) Not, perhaps, the biggest achievement ever, but pretty nifty and useful.
Actually there are armies of experts keeping this all up to date, and some companies make a lot of money selling this as a service. The name of the role is "curator", and they basically map research to genetic variants, keeping a giant library up-to-date.
It is fairly impressive as a product/sales juggernaut, though. A lot of what Watson is doing is fairly standard expert systems + statistics, in a way very "classic AI" style, like the AI diagnosis systems of the '70s that achieved good results but never managed to surpass the deployment & political hurdles needed to get into hospitals. Besides benefiting from increased acceptance of computers in the hospital in general in the intervening years (iPads are everywhere now, and doctors look things up on the internet more than they'd like to admit), Watson has managed to put together the right combination of PR and politics to get it sold. Lots of managers, in both the medical field and elsewhere, seem willing to try out "deploying Watson" in a way they would never agree to "deploy an expert system".
It's hard to judge and yes it's of course a PR piece.
That said there's an interesting part to it.
Where it applies, I'd welcome computer-based decisions or control (e.g. surgical operations where precision is a must) over humans. Ultimately, I'd trust it with my life more than I would humans directly. We're just not as reliable as the tools we create.