This is like using a sledgehammer to put up pictures. You simply do not need the...

micro_cam · on May 6, 2015

I've worked in the area of clinical genomics using whole genome sequencing. Your statement is unfortunately untrue though it perhaps could be if tools were better written.

While it is easy enough to analyze a single genome on your laptop, most current popular analytical tools simply fall over when you start looking at hundreds of genomes, even on a large server. Even basic stuff like combining multiple genomes into one file with consistent naming of variants can take entirely ridiculous multi-terabyte amounts of RAM because the tools to do so just weren't written with this scale in mind.

Most of these tools could (and should) be rewritten to work without loading the whole data set into memory and to run natively on a cluster of commodity machines. There is some resistance to this, of course, because scientists prefer to use the published and established methods and often feel new methods need to be published and peer reviewed, etc.
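
To make the "without loading the whole data set in memory" point concrete, here is a minimal sketch of a streaming merge. It is not taken from any existing genomics tool; the record layout `(chromosome, position, sample_id, genotype)` and the function name `merge_sorted_variants` are assumptions made up for illustration, and it assumes every input stream is already sorted by `(chromosome, position)`.

```python
import heapq
from itertools import groupby
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record layout, for illustration only:
# (chromosome, position, sample_id, genotype)
Record = Tuple[str, int, str, str]
Site = Tuple[str, int]


def site_of(record: Record) -> Site:
    """Sort key: the genomic site a record refers to."""
    return (record[0], record[1])


def merge_sorted_variants(
    streams: Iterable[Iterable[Record]],
) -> Iterator[Tuple[Site, Dict[str, str]]]:
    """Lazily merge per-sample streams, each sorted by (chromosome, position).

    heapq.merge holds only one pending record per stream, so memory use
    grows with the number of samples, not with the size of the files.
    """
    merged = heapq.merge(*streams, key=site_of)
    for site, records in groupby(merged, key=site_of):
        # Collect every sample's genotype at this one site, then move on.
        yield site, {sample: genotype for _, _, sample, genotype in records}


# Tiny in-memory stand-ins for two per-sample files.
sample_a = [("chr1", 100, "A", "0/1"), ("chr1", 250, "A", "1/1")]
sample_b = [("chr1", 100, "B", "0/0"), ("chr2", 5, "B", "0/1")]

result = list(merge_sorted_variants([iter(sample_a), iter(sample_b)]))
```

In a real pipeline the lists would be replaced by generators reading files line by line, and the output would be written as it is produced rather than collected into a list.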

Until new tools are written and widely adopted, a large shared-memory machine is a band-aid many hospitals and researchers seem eager to adopt.

dunk010 · on May 7, 2015

Yes indeed. And new tools are being written - see the Adam project for an interesting example: https://github.com/bigdatagenomics/adam and the associated variant caller Avocado: https://github.com/bigdatagenomics/avocado. Others are also trying to get the old tools working on Hadoop, for instance Halvade: https://github.com/ddcap/halvade/wiki/Halvade-Manual, Hadoop-BAM https://github.com/HadoopGenomics/Hadoop-BAM, SeqPig: http://seqpig.sourceforge.net/, and the guys at BioBankCloud: https://github.com/biobankcloud. It's going to take quite a while for this stuff to get fleshed out, and for researchers to adopt it. But the sheer weight of data is going to force things in the Hadoop direction eventually. It is inevitable.

a8da6b0c91d · on May 6, 2015

What hard evidence is there that genomics is relevant to cancer treatment, as proven by survival rates? Color me skeptical.

Gatsky · on May 7, 2015

You are right to say this. I treat breast cancer, and I'm doing a PhD on breast cancer genomics, and there is no evidence that high throughput data of any kind, whether it is genomics, transcriptomics, epigenomics, proteomics, metabolomics etc-omics actually helps patients. At the moment, a small panel of biomarkers using technology that is at least 20 years old is all we use to make treatment decisions. Is it adequate? Certainly not, but there is a HUGE amount of carefully collected data in many thousands of patients backing it up.

Not sure who is downvoting you, but they seem to have swallowed the hype wholesale. At the risk of sounding gratuitously negative, I find the discussion of medicine on HN to be of very poor quality, markedly below the general standard.

tom_b · on May 7, 2015

I think there is a distinction to be made here in questioning patient outcomes and questioning the relevance of genomic sequencing in treatment decisions.

Don't you think it is fair to say that high throughput data (whole genome sequencing with variant calling) is still in a state of being evaluated to measure its effectiveness in aiding the treatment decision process, but that early results seem to lean towards it becoming part of the standard diagnostic approach?

Genomic sequencing and patient outcomes is a thornier question. My non-practitioner take is that it is too early to tell scientifically, but that there will probably be some benefit to early identification of specific cancer types and choosing treatment. But I think many people would have made a similar statement about mammography and early detection, and absolute mortality appears to not be reduced by adding mammography to the diagnostic procedures, right?

The research value of genomic sequencing seems high enough to make it worthwhile. At least, when I sit in on molecular tumor board reviews (the oncologists at a table looking at called variant results for a specific patient), I hear them commenting about possibly new and unknown variants being of research value.

I am really looking forward to your reply - Internet message boards in general have to be almost the worst way to discuss medicine, but having participation from researchers and practitioners like you is tremendously illuminating!

Gatsky · on May 8, 2015

I define genomics as the unbiased interrogation of the genome using high throughput technology. Sequencing one mutant locus using Sanger sequencing does not fall under this definition - I don't think IBM's business model is using Watson to interpret that. So when other people point out that HER2 is a useful genomic marker they are missing the point - HER2 can be determined with immunohistochemistry for example which has been around for 50 years.

I'm not sure what your question is... genomics has research value, for sure, it's great. Is it worth trying to incorporate it into routine care? Yes, probably, if you have enough cash. Should a hospital pay for a black box machine learning algorithm to make recommendations from a highly polluted, often erroneous and hugely incomplete literature corpus? The alternative put forward by people actually doing the science is that we should try and develop large open source databases/repositories about the significance of genomic findings, and then collect the data about what happens to the patients.

tom_b · on May 6, 2015

Hmmm? Genomic breast cancer subtypes that each respond to different chemotherapies?

http://www.nature.com/nature/journal/v490/n7418/full/nature1...

a8da6b0c91d · on May 6, 2015

That doesn't really say anything about proven treatment efficacy.

tom_b · on May 6, 2015

Some information is hidden away in supplemental table 6, which points out candidate drugs to affect different biological pathways for different mutations.

You could also skim

http://www.nature.com/nature/journal/v406/n6797/full/406747a...

for more information about genomic classification of breast cancer.

From a treatment perspective, I would say that just glancing at http://ww5.komen.org/BreastCancer/SubtypesofBreastCancer.htm...

would provide information on treatment decisions generally made by finding appropriate subtype classifications.

I think that it is pretty clear that genomic sequencing of patient normal and tumor tissue to find mutations is going to be standard-of-care sooner rather than later, but it is fair to point out that genomic sequencing is not currently standard-of-care. However, I know of studies currently underway that look at variant calls and the possibility of taking action on those calls in ways that involve specifically adding those results back into the patient medical record.

I am struggling a bit with how to phrase this, but I don't think you can argue against (1) different subtypes of breast cancer are separate diseases and can be classified by genomic sequencing and (2) treatments for these separate diseases are different and have different efficacies.

davecap1 · on May 6, 2015

Cancer is a disease of genetics. You need to know what kind of cancer someone has in order to choose the treatment.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4032883/

dudurocha · on May 6, 2015

Look for BRCA1 and BRCA2 genetic testing and therapy.

dekhn · on May 7, 2015

Herceptin. Trastuzumab inhibits the effects of overexpression of HER2. If the breast cancer doesn't overexpress HER2, trastuzumab will have no beneficial effect (and may cause harm).

The original studies of trastuzumab showed that it improved overall survival in late-stage (metastatic) breast cancer from 20.3 to 25.1 months.[1] In early stage breast cancer, it reduces the risk of cancer returning after surgery by an absolute risk of 9.5%, and the risk of death by an absolute risk of 3%; however, it increases serious heart problems by an absolute risk of 2.1%, which may resolve if treatment is stopped.[2]
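
One way to read those absolute risk figures is as a number needed to treat (the reciprocal of the absolute risk change). The sketch below only does that arithmetic on the percentages quoted above; the function name is made up for illustration, and it says nothing about how the underlying trials were designed.

```python
def number_needed_to_treat(absolute_risk_change: float) -> float:
    """Reciprocal of an absolute risk change given as a fraction (0.03 = 3%)."""
    if absolute_risk_change <= 0:
        raise ValueError("absolute risk change must be positive")
    return 1.0 / absolute_risk_change


# Figures quoted in the comment above, as fractions.
nnt_recurrence = number_needed_to_treat(0.095)  # patients treated per recurrence avoided
nnt_death = number_needed_to_treat(0.03)        # patients treated per death avoided
nnh_heart = number_needed_to_treat(0.021)       # patients treated per serious heart problem

print(f"recurrence: {nnt_recurrence:.1f}, death: {nnt_death:.1f}, heart: {nnh_heart:.1f}")
```

On those numbers, roughly 11 patients are treated per recurrence avoided, about 33 per death avoided, and about 48 per serious heart problem caused.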

nl · on May 6, 2015

Oh look! Watson can answer that question: https://youtu.be/UFF9bI6e29U?t=2670

(ok, it's not exactly that question, but you can see how it works)

1971genocide · on May 6, 2015

Multi Terabytes of RAM ?

If it was any other field in computer science people would be really critical of your methodology.

The size of the human genome is 21 MB.

If you are trying to find the co-ordinate of every cancer cell in a human body then sure, you need a lot of RAM.

But the output of the collective field of cancer research doesn't seem to be there yet. So why do you need so much RAM ?

Usually, when your problem becomes NP-hard, you switch to simpler models. Have you checked the search space for all simpler models ? Or are you sticking to complex models since it helps you publish papers ?

You also need to understand that hardware only gets you so far; running a cluster has its own costs, such as network latency. More often than not, better techniques are required, rather than saying that the tremendous improvement in computational power is not good enough.

toomuchtodo · on May 6, 2015

> The size of the human genome is 21 MB.

No.

> In the real world, right off the genome sequencer: ~200 gigabytes

> As a variant file, with just the list of mutations: ~125 megabytes

> What this means is that we’d all better brace ourselves for a major flood of genomic data. The 1000 genomes project data, for example, is now available in the AWS cloud and consists of >200 terabytes for the 1700 participants. As the cost of whole genome sequencing continues to drop, bigger and bigger sequencing studies are being rolled out. Just think about the storage requirements of this 10K Autism Genome project, or the UK’s 100k Genome project….. or even.. gasp.. this Million Human Genomes project. The computational demands are staggering, and the big question is: Can data analysis keep up, and what will we learn from this flood of A’s, T’s, G’s and C’s….?

https://medium.com/precision-medicine/how-big-is-the-human-g...
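
A back-of-the-envelope sketch of where sizes like these come from. The inputs are my own rough assumptions, not figures from the linked article: about 3.1 billion bases in a haploid human genome, 2 bits per base for a packed A/C/G/T encoding, and for raw sequencer output roughly 30x coverage with one byte per base call plus one byte per quality score, uncompressed and ignoring read headers.

```python
# All constants below are rough assumptions for illustration.
HAPLOID_BASES = 3_100_000_000   # approximate length of a haploid human genome
BITS_PER_BASE = 2               # A, C, G, T fit in 2 bits
COVERAGE = 30                   # assumed sequencing depth
BYTES_PER_SEQUENCED_BASE = 2    # 1 byte base call + 1 byte quality score


def packed_genome_bytes(n_bases: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Size of a single genome stored as tightly packed bases."""
    return n_bases * bits_per_base // 8


def raw_reads_bytes(n_bases: int, coverage: int, bytes_per_base: int) -> int:
    """Approximate uncompressed size of the reads coming off a sequencer."""
    return n_bases * coverage * bytes_per_base


packed = packed_genome_bytes(HAPLOID_BASES)
raw = raw_reads_bytes(HAPLOID_BASES, COVERAGE, BYTES_PER_SEQUENCED_BASE)

print(f"packed genome: {packed / 1e6:.0f} MB")
print(f"raw reads:     {raw / 1e9:.0f} GB")
```

Under these assumptions the packed genome comes to about 775 MB and the raw reads to about 186 GB, which is the same order of magnitude as the ~200 gigabytes quoted above.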

sgt101 · on May 6, 2015

Also the world of genomics has done fantastic work on compression and if you can compress it further you will probably win a decent award with a ceremony and free booze.

jarvist · on May 6, 2015

Scientific computing requires a lot of memory, and a lot of computer time. I think it's fair to say that the underlying libraries (LAPACK, ScaLAPACK, Intel's MKL) are the most intensively optimised code in the world. Most of the non-trivial algorithms are polynomial in both time and memory.

I suspect this Press Release is hinting at a next-generation (cheap, fast) DNA sequencing method. These are derived from Shotgun Sequencing methods, where hundreds of gigabytes of random base pair sequences are reassembled to a coherent genome. The next-generation methods realise cost savings by an even more lossy method of reading smaller fragments of the genome, with much greater computational demands to reassemble.

noonespecial · on May 6, 2015

Of course it's PR. It's hard enough to accept that your doctor is a computer, how much harder if it's a javascript from healthcare.gov on a tab opposite cat pictures?

The fiction is that, yes it's a computer, but it's the biggest smartest computer ever! Look it won Jeopardy!

We do exactly the same thing with people. The medical system does a great deal of theater to convince everyone that they are getting "the best care possible" even though a moment's thought is all it takes to realize that every facility and every doctor can't be even close to equal or "the best".

This is one step down the road to getting people to accept automated medical care. It will be lousy at first but it probably is the right path to take if everyone wants to get "the best" care possible everywhere in the future. You can replicate an awesome program for everyone everywhere but not an awesome doctor.

throwawayaway · on May 6, 2015

this one goes in your ass, this one goes in your mouth, this one goes under your arm. no, wait.

we'll see if jeopardy staffers frequent these watson hospitals, eating their own dogfood so to speak.

> we'll see if jeopardy staffers frequent these watson hospitals, eating their own dogfood so to speak.

this should have said:

we'll see if ibm watson staffers frequent these watson hospitals, eating their own dogfood so to speak.

Crito · on May 6, 2015

What the fuck would a camera man know about any of this, one way or the other? Why would his opinion be interesting at all?

throwawayaway · on May 6, 2015

ah i see you've spotted my typo. i fixed it now - thank you so much.

nonetheless, no need to be so down on camera men, some of my best friends own cameras i'll have you know.

xamuel · on May 6, 2015

>not humans, based on metrics and compassion

There's nothing magical about compassion. It's not like if the doctor just wishes you well hard enough, that'll make you get better. Should we stick to hand-crafted silicon chips, painstakingly etched by a former Swiss watchmaker full of compassion? Should compassionate telephone operators route your phonecall to its destination? If you want compassion, ask for a visit from the hospital chaplain.

LordKano · on May 6, 2015

> There's nothing magical about compassion.

Unless, of course, you are the one in need of compassion.

If it's all about metrics, it's easy to decide that a given treatment comes with an expenditure that is too great if only 20% of the patients are saved.

That would lead to 100% of the patients who might benefit from the aforementioned treatment being lost. But look at the bright side, you saved a bunch of money.

xamuel · on May 6, 2015

That sort of reasoning sounds very good if you're good-looking, charming, live in an affluent area, go to the same church as the doctor, etc. There's a very thin line between compassion and corruption, and neither should have any place in the formal decision-making of medicine. If everyone should be given the treatment, then make a rule saying everyone gets the treatment. Don't leave it up to doctors' compassion.

icebraining · on May 6, 2015

If the money is instead going to save 80% of the patients waiting for some other treatment, then good. If it's going to be used for other means, then your problem is not the machine decision, it's the person who allocated the money in that way.

maxander · on May 6, 2015

The biology is a bit more complicated than that, though. Sure, if we had a big database of all the correlation factors in some easily computer-readable format, standardized testing procedures that lined up with these factors, and an army of experts keeping all of this up-to-date, it would be easy to write a script to make it all work. But we don't, and (in my off-the-cuff estimation) such a project could wind up being of Human Genome Project-esque scale.

Watson is giving us a way of doing this when the available data is in a much less accommodating format (namely, scientific literature.) Not, perhaps, the biggest achievement ever, but pretty nifty and useful.

dunk010 · on May 7, 2015

Actually there are armies of experts keeping this all up to date, and some companies make a lot of money selling this as a service. The name of the role is "curator", and they basically map research to genetic variants, keeping a giant library up-to-date.

_delirium · on May 6, 2015

It is fairly impressive as a product/sales juggernaut, though. A lot of what Watson is doing is fairly standard expert systems + statistics, in a way very "classic AI" style, like the AI diagnosis systems of the '70s that achieved good results but never managed to surpass the deployment & political hurdles needed to get into hospitals. Besides benefiting from increased acceptance of computers in the hospital in general in the intervening years (iPads are everywhere now, and doctors look things up on the internet more than they'd like to admit), Watson has managed to put together the right combination of PR and politics to get it sold. Lots of managers, in both the medical field and elsewhere, seem willing to try out "deploying Watson" in a way they would never agree to "deploy an expert system".

gamegoblin · on May 6, 2015

I suspect the Star Trek (Voyager) episode you're referring to is:

http://en.memory-alpha.org/wiki/Critical_Care_%28episode%29

zobzu · on May 6, 2015

It's hard to judge and yes it's of course a PR piece. That said there's an interesting part to it.

Where it applies, I'd welcome computer based decision or control (for ex. surgical operations where precision is a must) over humans. I'd trust it with my life more than the humans directly, ultimately. We're just not as reliable as the tools we create.

nanl2053 · on May 6, 2015

It says in the article that each patient's data is 100GB. Definitely a PR piece, but you aren't going to do this work on a laptop.

FeepingCreature · on May 7, 2015

Well, the question is: does "compassion" lead to better treatment results on average?

Because if not, bring on the robot healthcare.

Karawebnetwork · on May 6, 2015

Watson is a new employee. Of course he isn't going to get all the responsibility yet. People don't trust him.