Everyone here has got the wrong idea about what Watson is doing here.
It may be doing sequence analysis, but I've never seen anything about that before.
The real strength is the journal article reading mentioned in passing. What it does is read studies on new treatment options as they are released, and then when a person presents symptoms it will do a diagnosis and suggest treatment options along with supporting evidence.
The natural language understanding needed to do the journal understanding is the real unique offering here.
There are videos on YouTube showing the system. I'm on mobile so I'm not looking them up now.
> Everyone here has got the wrong idea about what Watson is doing here.
That would be because the article is very misleading, and without some background in the field, can you blame people for thinking that?
Take for instance this paragraph:
> However the process is very time-consuming - a single patient's genome represents more than 100 gigabytes of data - and this needs to be combined with other medical records, journal studies and information about clinical trials.
> What would take a clinician weeks to analyse can be completed by Watson in only a few minutes.
The first section seems to imply that the sequencing data and the analysis data are the same thing, which they aren't.
As for the second section, it is simply not true. Clinicians only intervene at certain points in the process, and their work for a single patient certainly does not take "weeks", although the whole process does. But Watson would not cover the whole process, despite the impression being given.
It is just shoddy reporting, taking shortcuts in the wrong places. What saddens me is that the general public will take the article at face value and expect their caretakers to live up to this fiction.
Have a look at https://youtu.be/UFF9bI6e29U?t=2539 (should link to the 42 minute mark), where they go through slides of the Watson Healthcare UI, and it shows how the Watson reasoning system does diagnosis and suggests treatments.
This is like using a sledgehammer to put up pictures. You simply do not need the kind of parallel compute grunt something like Watson can provide to do correlation analysis as they describe. You could cobble this together in {dirty secret scripting language of choice} and run it on your laptop.
This is a PR piece - not the article, the activity - to bring forth the idea of computers making decisions about healthcare, based on metrics - not humans, based on metrics and compassion.
The cynic in me sees this as setting a disquieting precedent in the direction of healthcare being distributed according not only to patients' prospects and treatment needs, but other factors which a financially liable party, say an insurer, would be interested in - such as the earning potential of the patient.
I'm pretty sure there was a Star Trek episode about this.
I've worked in the area of clinical genomics using whole genome sequencing. Your statement is unfortunately untrue, though it perhaps could be true if tools were better written.
While it is easy enough to analyze a single genome on your laptop, most current popular analytical tools simply fall over when you start looking at hundreds of genomes, even on a large server. Even basic stuff like combining multiple genomes into one file with consistent naming of variants can take entirely ridiculous multi-terabyte amounts of RAM, because the tools to do so just weren't written with this scale in mind.
Most of these tools could (and should) be rewritten to do things without loading the whole data set in memory and to work natively on a cluster of commodity machines. There is some resistance to this, of course, because scientists prefer to use the published and established methods and often feel new methods need to be published and peer reviewed etc.
Until new tools are written and widely adopted, a large shared-memory machine is a band-aid that many hospitals and researchers seem eager to adopt.
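To make the "without loading the whole data set in memory" point concrete, here is a minimal Python sketch of a streaming merge. It is not how any existing genomics tool works; the `Variant` tuple layout and the function name are made up for illustration, and it assumes each per-sample stream is already sorted by the same key.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical minimal record: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def merge_sorted_variants(*streams: Iterable[Variant]) -> Iterator[Variant]:
    """Lazily merge per-sample variant streams that are each already sorted.

    heapq.merge keeps only one pending record per input stream, so memory
    use grows with the number of samples, not the number of variants.
    """
    previous = None
    for variant in heapq.merge(*streams):
        if variant != previous:  # skip a variant already emitted for another sample
            yield variant
        previous = variant


# Tiny made-up inputs; in practice these would be generators reading files.
sample_a = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")]
sample_b = [("chr1", 100, "A", "G"), ("chr1", 180, "G", "A")]

merged = list(merge_sorted_variants(sample_a, sample_b))
print(merged)
```

Because the merge is lazy, the inputs can be file-backed generators and the output can be written out record by record. The catch is the sort-order assumption: plain tuple comparison orders "chr10" before "chr2", so every input has to be sorted with that same rule.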
You are right to say this.
I treat breast cancer, and I'm doing a PhD on breast cancer genomics, and there is no evidence that high throughput data of any kind, whether it is genomics, transcriptomics, epigenomics, proteomics, metabolomics etc-omics actually helps patients. At the moment, a small panel of biomarkers using technology that is at least 20 years old is all we use to make treatment decisions. Is it adequate? Certainly not, but there is a HUGE amount of carefully collected data in many thousands of patients backing it up.
Not sure who is downvoting you, but they seem to have swallowed the hype wholesale. At the risk of sounding gratuitously negative, I find the discussion of medicine on HN to be of very poor quality, markedly below the general standard.
I think there is a distinction to be made here in questioning patient outcomes and questioning the relevance of genomic sequencing in treatment decisions.
Don't you think it is fair to say that high throughput data (whole genome sequencing with variant calling) is still in a state of being evaluated to measure its effectiveness in aiding the treatment decision process but that early results seem to lean towards it becoming part of the standard diagnostic approach?
Genomic sequencing and patient outcomes is a thornier question. My non-practitioner take is that it is too early to tell scientifically, but that there will probably be some benefit to early identification of specific cancer types and choosing treatment. But I think many people would have made a similar statement about mammography and early detection, and absolute mortality appears to not be reduced by adding mammography to the diagnostic procedures, right?
The research value of genomic sequencing seems high enough to make it worthwhile. At least, when I sit in on molecular tumor board reviews (the oncologists at a table looking at called variant results for a specific patient), I hear them commenting about possibly new and unknown variants being of research value.
I am really looking forward to your reply - Internet message boards in general have to be almost the worst way to discuss medicine, but having participation from researchers and practitioners like you is tremendously illuminating!
I define genomics as the unbiased interrogation of the genome using high throughput technology. Sequencing one mutant locus using Sanger sequencing does not fall under this definition - I don't think IBM's business model is using Watson to interpret that. So when other people point out that HER2 is a useful genomic marker they are missing the point - HER2 can be determined with immunohistochemistry for example which has been around for 50 years.
I'm not sure what your question is... genomics has research value, for sure, it's great.
Is it worth trying to incorporate it into routine care? Yes, probably, if you have enough cash. Should a hospital pay for a black box machine learning algorithm to make recommendations from a highly polluted, often erroneous and hugely incomplete literature corpus? The alternative put forward by people actually doing the science is that we should try and develop large open source databases/repositories about the significance of genomic findings, and then collect the data about what happens to the patients.
Some information is hidden away in supplemental table 6, which points out candidate drugs to affect different biological pathways for different mutations.
would provide information on treatment decisions generally made by finding appropriate subtype classifications.
I think that it is pretty clear that genomic sequencing of patient normal and tumor tissue to find mutations is going to be standard-of-care sooner rather than later, but it is fair to point out that genomic sequencing is not currently standard-of-care. However, I know of studies currently underway that look at variant calls and the possibility of taking action on those calls in ways that involve specifically adding those results back into the patient medical record.
I am struggling a bit with how to phrase this, but I don't think you can argue against (1) different subtypes of breast cancer are separate diseases and can be classified by genomic sequencing and (2) treatments for these separate diseases are different and have different efficacies.
Herceptin. Trastuzumab inhibits the effects of overexpression of HER2. If the breast cancer doesn't overexpress HER2, trastuzumab will have no beneficial effect (and may cause harm).
The original studies of trastuzumab showed that it improved overall survival in late-stage (metastatic) breast cancer from 20.3 to 25.1 months.[1] In early-stage breast cancer, it reduces the risk of cancer returning after surgery by an absolute risk of 9.5%, and the risk of death by an absolute risk of 3%; however, it increases serious heart problems by an absolute risk of 2.1%, which may resolve if treatment is stopped.[2]
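One way to read those absolute-risk figures is as "number needed to treat" (NNT = 1 / absolute risk change). The Python sketch below only re-expresses the three percentages quoted above; the variable names are illustrative, and it assumes the percentages can be taken at face value as absolute differences.

```python
# Absolute risk changes quoted above, as fractions.
recurrence_reduction = 0.095    # cancer returning after surgery
death_reduction = 0.03          # risk of death
heart_problem_increase = 0.021  # serious heart problems


def number_needed(absolute_risk_change: float) -> float:
    """Patients treated per one additional event prevented (or caused)."""
    return 1.0 / absolute_risk_change


nnt_recurrence = number_needed(recurrence_reduction)
nnt_death = number_needed(death_reduction)
nnh_heart = number_needed(heart_problem_increase)

print(f"Treat ~{nnt_recurrence:.1f} patients to prevent one recurrence")
print(f"Treat ~{nnt_death:.1f} patients to prevent one death")
print(f"Treat ~{nnh_heart:.1f} patients to cause one serious heart problem")
```

This is arithmetic on the quoted numbers only, not a statement about how any trial reported its results.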
If it was any other field in computer science people would be really critical of your methodology.
The size of the human genome is 21 MB.
If you are trying to find the co-ordinate of every cancer cell in a human body then sure, you need a lot of RAM.
But the output of the collective field of cancer research doesn't seem to be there yet. So why do you need so much RAM?
Usually, when your problem becomes NP-hard, you switch to simpler models. Have you checked the search space for all simpler models? Or are you sticking to complex models since it helps you publish papers?
You also need to understand that hardware only gets you so far; running a cluster has its own costs, such as network latency.
More often than not, better techniques are required, rather than saying that the tremendous improvement in computational power is not good enough.
> In the real world, right off the genome sequencer: ~200 gigabytes
> As a variant file, with just the list of mutations: ~125 megabytes
> What this means is that we’d all better brace ourselves for a major flood of genomic data. The 1000 genomes project data, for example, is now available in the AWS cloud and consists of >200 terabytes for the 1700 participants. As the cost of whole genome sequencing continues to drop, bigger and bigger sequencing studies are being rolled out. Just think about the storage requirements of this 10K Autism Genome project, or the UK’s 100k Genome project….. or even.. gasp.. this Million Human Genomes project. The computational demands are staggering, and the big question is: Can data analysis keep up, and what will we learn from this flood of A’s, T’s, G’s and C’s….?
Also the world of genomics has done fantastic work on compression and if you can compress it further you will probably win a decent award with a ceremony and free booze.
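A back-of-the-envelope Python sketch of where sizes like these come from. The inputs are assumptions for illustration: roughly 3.1 billion bases in one copy of the genome, 2 bits per base for a packed sequence, and an uncompressed raw read set at 30x coverage with one byte for each base plus one byte for its quality score.

```python
# Assumed inputs (illustrative, not taken from the quoted article).
GENOME_BASES = 3_100_000_000   # approximate length of one copy of the genome
BITS_PER_PACKED_BASE = 2       # A, C, G, T fit in two bits
COVERAGE = 30                  # assumed sequencing depth
BYTES_PER_RAW_BASE = 2         # one byte for the base call, one for its quality

# The bare sequence, tightly packed.
packed_bytes = GENOME_BASES * BITS_PER_PACKED_BASE // 8

# Raw reads off the sequencer, uncompressed, under the assumptions above.
raw_bytes = GENOME_BASES * COVERAGE * BYTES_PER_RAW_BASE

print(f"packed sequence: {packed_bytes / 1e6:.0f} MB")
print(f"raw reads:       {raw_bytes / 1e9:.0f} GB")
```

Under these assumptions the raw reads land in the same ballpark as the ~200 gigabyte figure quoted above, while the bare sequence is under a gigabyte. The gap between the two is the redundancy that alignment, variant calling and compression remove.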
Scientific computing requires a lot of memory, and a lot of computer time. I think it's fair to say that the underlying libraries (LAPACK, ScaLAPACK, Intel's MKL) are the most intensively optimised code in the world. Most of the non-trivial algorithms are polynomial in both time and memory.
I suspect this Press Release is hinting at a next-generation (cheap, fast) DNA sequencing method. These are derived from Shotgun Sequencing methods, where hundreds of gigabytes of random base pair sequences are reassembled to a coherent genome. The next-generation methods realise cost savings by an even more lossy method of reading smaller fragments of the genome, with much greater computational demands to reassemble.
Of course it's PR. It's hard enough to accept that your doctor is a computer, how much harder if it's a javascript from healthcare.gov on a tab opposite cat pictures?
The fiction is that, yes it's a computer, but it's the biggest smartest computer ever! Look it won Jeopardy!
We do exactly the same thing with people. The medical system does a great deal of theater to convince everyone that they are getting "the best care possible" even though a moment's thought is all it takes to realize that every facility and every doctor can't be even close to equal or "the best".
This is one step down the road to getting people to accept automated medical care. It will be lousy at first but it probably is the right path to take if everyone wants to get "the best" care possible everywhere in the future. You can replicate an awesome program for everyone everywhere but not an awesome doctor.
There's nothing magical about compassion. It's not like if the doctor just wishes you well hard enough, that'll make you get better. Should we stick to hand-crafted silicon chips, painstakingly etched by a former Swiss watchmaker full of compassion? Should compassionate telephone operators route your phonecall to its destination? If you want compassion, ask for a visit from the hospital chaplain.
Unless, of course, you are the one in need of compassion.
If it's all about metrics, it's easy to decide that a given treatment comes with an expenditure that is too great if only 20% of the patients are saved.
That would lead to 100% of the patients who might benefit from the aforementioned treatment being lost. But look at the bright side, you saved a bunch of money.
That sort of reasoning sounds very good if you're good-looking, charming, live in an affluent area, go to the same church as the doctor, etc. There's a very thin line between compassion and corruption, and neither should have any place in the formal decision-making of medicine. If everyone should be given the treatment, then make a rule saying everyone gets the treatment. Don't leave it up to doctors' compassion.
If the money is instead going to save 80% of the patients waiting for some other treatment, then good. If it's going to be used for other means, then your problem is not the machine decision, it's the person who allocated the money in that way.
The biology is a bit more complicated than that, though. Sure, if we had a big database of all the correlation factors in some easily computer-readable format, standardized testing procedures that lined up with these factors, and an army of experts keeping all of this up-to-date, it would be easy to write a script to make it all work. But we don't, and (in my off-the-cuff estimation) such a project could wind up being of Human Genome Project-esque scale.
Watson is giving us a way of doing this when the available data is in a much less accommodating format (namely, scientific literature.) Not, perhaps, the biggest achievement ever, but pretty nifty and useful.
Actually there are armies of experts keeping this all up to date, and some companies make a lot of money selling this as a service. The name of the role is "curator", and they basically map research to genetic variants, keeping a giant library up-to-date.
It is fairly impressive as a product/sales juggernaut, though. A lot of what Watson is doing is fairly standard expert systems + statistics, in a way very "classic AI" style, like the AI diagnosis systems of the '70s that achieved good results but never managed to surpass the deployment & political hurdles needed to get into hospitals. Besides benefiting from increased acceptance of computers in the hospital in general in the intervening years (iPads are everywhere now, and doctors look things up on the internet more than they'd like to admit), Watson has managed to put together the right combination of PR and politics to get it sold. Lots of managers, in both the medical field and elsewhere, seem willing to try out "deploying Watson" in a way they would never agree to "deploy an expert system".
It's hard to judge and yes it's of course a PR piece.
That said there's an interesting part to it.
Where it applies, I'd welcome computer based decision or control (for ex. surgical operations where precision is a must) over humans. I'd trust it with my life more than the humans directly, ultimately. We're just not as reliable as the tools we create.
I have a daughter with Nephroblastomatosis, a super rare kind of kidney cancer that often lasts for years before just naturally going away (no one knows why), and dealing with the treatment options is agonizing.
For example, she's been on 3 different kinds of chemotherapy for 12+ months. 2 of the chemotherapy drugs don't seem to do anything. The 3rd kind seemed to result in an immediate reduction of the nephroblastomatosis, but unfortunately this chemo drug is very harmful to your body and there is a limit to how much of the drug you can receive. Now, she is not on chemo anymore.
Every 1.5 months we go in for a checkup, alternating between an MRI and sonogram.
2 times they have found "spots" on her kidneys. But, what to do about them? Additional chemo? Surgery? Wait and see? This is what I want a computer to tell me. What are the risks of surgery? What are the risks of losing more of your kidney? What are the risks of additional chemo? What are the risks of doing nothing? I would rather a computer had this information and gave its recommendation, rather than a group of doctors and us, the parents. We would be using the same logic in our decision, so it seems like the computer could do it more accurately and with zero bias.
I welcome Watson into our health care system. Anything that takes some of the agony out of very very difficult treatment decisions will be a good thing.
I wonder if the Jeopardy appearances helped Watson (by getting the name out) or hindered it (by making people think it's just a parlor trick). Either way, this is seriously fascinating: I'm excited to see what other applications it has.
People aren't stupid, and if it was a parlour trick it was an exceptionally well-executed one. People, even those who don't have a grasp on the computational complexity of language and reasoning, seem legitimately impressed by the performance.
Yes, and it was a real research collaboration - a brilliant insight - to see that the Jeopardy language was constrained and tractable to analysis that would allow a mapping to a solver.
I have not been involved in these efforts or the community closely enough to say, but I think that things like controlled english
The (very rough) process with next-gen sequencing is as follows:
1. one or more samples are taken from the patient (tumor/normal, peripheral blood, bone marrow, etc.)
2. the samples are sequenced, this can take different forms, involve different processes, but basically, specific sections of the genome are extracted, amplified, then turned into digital data
3. the data is aligned to a reference genome
4. the aligned data is scanned for variants (mutations), often in specific genes, and even specific sites of certain genes. This again involves different methods often used in combination
5. the variants are partially filtered out automatically (poor quality, bad reads, spurious, synonymous, etc.)
6. the remaining variants are then reviewed manually, relying on academic papers and historical data
7. what remains is reviewed again, taking the patient's history into account, and potential targeted treatments are inferred
8. the patient and his/her doctor make the final decision
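Step 5 is the most mechanical part of the list above, so here is a rough Python sketch of what an automatic filter could look like. Everything in it is hypothetical: the `Variant` fields, the threshold values and the gene names are placeholders, not values from any real pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    position: int
    quality: float     # caller-reported quality score
    read_depth: int    # number of reads covering the site
    consequence: str   # e.g. "missense", "synonymous"


# Hypothetical cut-offs; real pipelines tune these per assay.
MIN_QUALITY = 30.0
MIN_DEPTH = 10


def passes_auto_filter(v: Variant) -> bool:
    """Keep only calls that are worth a human reviewer's time (step 6)."""
    if v.quality < MIN_QUALITY:
        return False                    # poor-quality call
    if v.read_depth < MIN_DEPTH:
        return False                    # too few reads to trust
    if v.consequence == "synonymous":
        return False                    # no change to the protein
    return True


# Made-up calls for illustration.
calls = [
    Variant("GENE_A", 1200, 55.0, 40, "missense"),
    Variant("GENE_A", 1350, 12.0, 35, "missense"),    # fails quality
    Variant("GENE_B", 880, 60.0, 4, "nonsense"),      # fails depth
    Variant("GENE_C", 410, 70.0, 50, "synonymous"),   # synonymous
    Variant("GENE_D", 97, 48.0, 22, "frameshift"),
]

for_review = [v for v in calls if passes_auto_filter(v)]
print([(v.gene, v.position) for v in for_review])
```

What comes out of a filter like this is the input to the manual review in steps 6 and 7.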
So from what I understand, even in an optimal scenario, Watson would be taking care of steps 6 and 7, which represent maybe 15% of the overall time. I believe the article presented here is rather misleading in this regard, as it seems to suggest that Watson would be taking care of the actual sequencing too.
Additionally, I imagine that in most cases some humans would still need to review Watson's data.
What the article doesn't mention either is access to the actual patient data, which in itself is a political and operational problem, not so much a technical one.
Finally, I imagine Watson would infer the individual variant diagnoses from papers, which again, is far easier said than done (though I do believe that would be the point where Watson shines most). Parsing papers reliably for usable data is really, really hard, in no small part because the academic system is antiquated, as are paper formats. Even if you manage that, you still have to figure out whether the papers in question, or their authors, are actually reliable sources. And even then, you will likely end up with conflicting interpretations which will need to be sorted out.
So while a nice fluff piece for IBM, I have serious doubts about the actually meaningful benefits Watson supposedly brings to the table.
To be clear, I do believe that AI/ML can and do really help, and contrary to what the article might want you to believe, people are already working with/on it. However making Watson look like it will solve all the current problems with cancer care is disingenuous at best.
I agree with your guess that Watson covers 6/7. From the clinical geneticists I've talked to, variants of unknown significance (where a person differs from the reference, but the impact of the difference is unknown) are investigated manually through literature searches, often just basic text matches on the variant name.
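For a sense of what "basic text matches on the variant name" amounts to, here is a minimal Python sketch. The abstracts are invented for illustration and the function names are hypothetical; the point is only to show how literal matching behaves.

```python
import re
from typing import Dict, List


def mentions(text: str, term: str) -> bool:
    """True if `term` appears in `text` as a whole token, ignoring case."""
    return re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE) is not None


def find_mentions(abstracts: Dict[str, str], gene: str, variant: str) -> List[str]:
    """Return ids of abstracts that literally name both the gene and the variant."""
    return [
        paper_id
        for paper_id, text in abstracts.items()
        if mentions(text, gene) and mentions(text, variant)
    ]


# Invented abstracts, for illustration only.
abstracts = {
    "paper-1": "We report BRAF V600E in a cohort of patients.",
    "paper-2": "The BRAF p.Val600Glu substitution was observed in several samples.",
    "paper-3": "KRAS G12D was the most common alteration.",
}

hits = find_mentions(abstracts, "BRAF", "V600E")
print(hits)
```

The second abstract describes the same substitution in three-letter notation and is missed, which is exactly the kind of gap a literal text match leaves.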
Regarding paper parsing: there are companies that employ hundreds of PhDs that read papers all day long and enter the data into a database based on ontology (Ingenuity). IBM would license that database and use it as a prior.
I imagine there are some renowned physicians in Europe and Asia who also don't know that Toronto is a Canadian city. Or other arbitrary pieces of trivia for that matter. It doesn't stop them from being successful at their job.
Conversely, knowing which city has airports named after a WWII lieutenant and a WWII battle doesn't protect from malpractice.
At the moment, it's just diagnosing. Hopefully a) Watson will provide the reasoning, enabling debugging and obvious errors to be spotted, and b) the doctors will continue to filter Watson's answers for some time, shielding it from liability.
"in the future, every decision mankind makes, every decision, is going to be informed by a cognitive system like Watson and, as a result, our lives in this world are going to be better for it."
Well, if by "better" you mean "more machine-like and regulated like animals in a zoo".
For certain tasks, AI has long been more accurate than doctors. But it isn't used because doctors and patients don't trust it. So perhaps PR is exactly the missing piece of the puzzle.
I don't understand this mindset of not using technology due to a lack of trust. I can easily understand not blindly following an AI's findings, however they should probably be item #1 for the attending doctor to investigate.
The worst case scenario is that the doctor disagrees with the findings and continues investigation. The best case scenario is that you hit the nail on the head much more quickly than if it was an entirely manual process.
One answer is very simple: $$$. Study 12 years of your life to be a specialist only to be ultimately replaced by some installed machine at a Walmart Pharmacy.
> I don't understand this mindset of not using technology due to a lack of trust.
The more complex a tool is, the more likely it is to have flaws. Is it any surprise that the medical industry is slow to trust tools where the feared negatives outweigh the few clear positives?
I would rather trust a robot to operate on me rather than diagnose me. The human will be able to adapt and communicate to me while doing so.
Well in that case, why bother going to the hospital at all!
Doctors are a social conduit for understanding symptoms, explaining the reasoning behind eventual diagnosis, and being able to perform it themselves. A computer would not be, and certainly not in a way a distraught patient could communicate with easily. I also don't envision computers being tied to writing prescriptions any time soon, which is the other reason I would use a doctor.
Statistical analysis is not always correct and it's hardly new. I believe it was the primary argument behind germ theory—germ theory wasn't exactly new, but it was hard to refute when looking at how sanitation correlated with better recovery rate.
Using terms like "AI" is exactly the kind of behavior that leads to doctors not trusting it. We should be teaching doctors why stats are a valuable tool, not to trust a black box because of "AI". How do you know when to question it?
It isn't used, not because of patient lack of trust, but because of resistance by the medical community, some warranted, some unwarranted, and most of this resistance isn't because of trust issues. And the last thing that will change the minds of good doctors is a PR piece.
The paper "Algorithm Aversion: People Erroneously Avoid Algorithms After Seeing Them Err" [1] has references to examples.
> Dawes subsequently gathered a large body of evidence showing that human experts did not perform as well as simple linear models at clinical diagnosis, forecasting graduate students’ success, and other prediction tasks (Dawes, 1979; Dawes, Faust, & Meehl, 1989).
Given that a majority of people (and doctors) fail even the simplest probabilistic reasoning tests, this is not particularly surprising. [2]
> We know from several studies that physicians, college students (Eddy, 1982), and staff at Harvard Medical School (Casscells, Schoenberger, & Grayboys, 1978) all have equally great difficulties with this and similar medical disease problems. For instance, Eddy (1982) reported that 95 out of 100 physicians estimated the posterior probability p(cancer|positive) to be between 70% and 80%, rather than 7.8%
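The 7.8% figure falls out of Bayes' rule. The quote above doesn't give the inputs, so the Python sketch below assumes the numbers from the usual statement of this mammography problem: 1% prevalence, an 80% chance of a positive test given cancer, and a 9.6% chance of a positive test without cancer.

```python
# Assumed inputs from the usual statement of the problem (not in the quote above).
prevalence = 0.01            # p(cancer)
sensitivity = 0.80           # p(positive | cancer)
false_positive_rate = 0.096  # p(positive | no cancer)

# Total probability of a positive test, over both groups.
p_positive = (
    sensitivity * prevalence
    + false_positive_rate * (1.0 - prevalence)
)

# Bayes' rule: p(cancer | positive).
posterior = sensitivity * prevalence / p_positive

print(f"p(cancer | positive) = {posterior:.1%}")
```

The intuition most people miss is the base rate: with only 1 in 100 people having the disease, the false positives from the other 99 swamp the true positives.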
The first documented success was the system MYCIN in the mid-1970s, which beat expert human performance on diagnosing blood infections: http://en.wikipedia.org/wiki/Mycin
I can't seem to find the paper I'm thinking of in some quick Google Scholar searching, but I believe there was a follow-up article that looked into why such a fairly simple system was able to beat humans, when it clearly lacked the range of expertise of the human experts (and was entirely missing information on some real conditions). The paper, if I'm remembering correctly, concluded that the win was almost entirely due to one specific failing displayed by the humans (even expert specialists) but not shared by computers: very bad intuition for conditional probabilities.
A doctor (internist in primary care) friend told me once how he was glad he wasn't a heart surgeon (of some specialty that I don't recall). He said that a surgeon in that specialty simply does one type of surgery all the time, as ordered by some other doctors. As far as he could tell, that surgeon did not have to do much mental work, but was rather like a technician.
I wonder when the work of a doctor (diagnosing a disease) will be handled by a computer. I'd say in 100 years?
We can do diagnosis now. All that is needed is a medical nurse to take symptoms and run them through Watson to get a diagnosis. The idea is that people in poor/remote regions could do this whole process themselves, with the help of a mobile phone and some medical sensors. There are already regions where it would be more beneficial to run with the current medical AI rather than leave things as they are.
Wouldn't work here in the UK.
Watson: "I see you have cancer x - there are three effective drugs for that, but NICE has decided not to pay for them"
Patient: "Fucking NHS bean counting cunts"
Watson: "Indeed. Now, I have calculated it will cost the NHS a few pennies more between now and your death, what with the doctor, hospital, and terminal care, not to mention the endless drugs and things we will spend to slowly let you die."
Patient: "I see, it's like the NHS doesn't have a Scooby Doo what they are actually doing, how things are spent."
Watson: "Indeed".