Long article, but one I found particularly interesting as I have a medical background.
The article presents it as if it was some big revelation in 2005, but I'm not sure it was as big as implied. I mean, some of the specifics were, but I did my residency training between 2000 and 2003, and we were very much trained to be skeptical of studies and to question results. Evidence-based medicine was in strong force.
His model predicted, in different fields of medical research, rates of wrongness roughly corresponding to the observed rates at which findings were later convincingly refuted: 80 percent of non-randomized studies (by far the most common type) turn out to be wrong, as do 25 percent of supposedly gold-standard randomized trials, and as much as 10 percent of the platinum-standard large randomized trials.
Non-randomized studies are the most common, of course. They are the cheapest to perform by far. But whenever I read a non-randomized study, I think "interesting" but realize it doesn't mean anything on its own. Correlation does not equal causation. These non-randomized studies, though, are the ones that raise questions and possibilities that later fund/justify more costly randomized controlled studies.
And something to be aware of, which is perhaps part of the purpose of this article, is that there is a reporting/publishing bias. Negative and neutral results simply don't get reported in journals. You form a hypothesis, perform a study, and get negative results -- well, you're not going to try to publish it. Statistically speaking, there's going to be some bell-curve distribution around the actual result. So let's say the actual result is "0" (no effect) for a drug or treatment. If you do enough studies, you'll get a few that fall on the positive side, which I presume is what accounts for some of the false positives.
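To make that bell-curve point concrete, here's a toy simulation (all numbers are invented for illustration, not taken from any real trial): the drug's true effect is zero, each study only sees a noisy estimate, and a handful of studies still land far enough on the positive side to look like a benefit.

    import random

    random.seed(42)

    TRUE_EFFECT = 0.0   # the drug actually does nothing
    STUDY_NOISE = 1.0   # standard error of each study's estimate (assumed)
    N_STUDIES = 100     # hypothetical number of independent studies

    # Each study's measured effect is the true effect plus sampling noise,
    # so the estimates form a bell curve centered on zero.
    measured = [random.gauss(TRUE_EFFECT, STUDY_NOISE) for _ in range(N_STUDIES)]

    # Call a study "positive" if its estimate sits ~2 standard errors above
    # zero, roughly the usual significance cutoff.
    positives = [m for m in measured if m > 1.96 * STUDY_NOISE]

    print(f"{len(positives)} of {N_STUDIES} studies look positive even though "
          f"the true effect is zero")

If only the "positive" studies make it into journals, those few chance results are the only ones anyone ever reads about.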
"You form a hypothesis, perform a study, and get negative results -- well, you're not going to try to publish it."
This seems fundamentally incorrect to me. Wouldn't research all around just fail if no one ever reported their negative results, thus dooming many other researchers to performing the same fruitless experiments?
It's actually worse than that, if a lot of studies are being done:
Let's say that you only publish when you discover an effect with a p-value better than 0.05 -- that is, when the probability of observing an effect at least as extreme as the one you got, assuming the effect isn't real, is less than 5%. This is pretty typical.
Let's also say that you and 19 other groups are studying an effect that isn't real: the hypothesis that meditating on pink unicorns will get rid of skin cancer.
By (perfectly reasonable) chance, 19 of the 20 groups find no significant support for the Pink Unicorn Hypothesis, and one does -- i.e., one group gets a result that should happen 1/20 times or less if there is no Pink Unicorn effect.
Since the other 19 groups are silent, and only one group publishes, the only thing we see is the exciting announcement of a possible new skin cancer cure, with no hope for a meta-study that notices this is actually the expected result given the null hypothesis.
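A rough sketch of that scenario, just to show the arithmetic (the trial size and the crude z-test are my own assumptions, not anything from the article): 20 groups each run a null trial of the unicorn "treatment", and only the ones that happen to land under p < 0.05 ever publish.

    import math
    import random
    import statistics

    random.seed(0)

    N_GROUPS = 20      # groups all studying the same non-existent effect
    N_PATIENTS = 30    # patients per arm in each hypothetical trial
    ALPHA = 0.05       # the usual publication threshold

    def fake_trial():
        """One null trial: treatment and control outcomes come from the same
        distribution, compared with a crude two-sided z-test on the means."""
        treated = [random.gauss(0, 1) for _ in range(N_PATIENTS)]
        control = [random.gauss(0, 1) for _ in range(N_PATIENTS)]
        diff = statistics.mean(treated) - statistics.mean(control)
        se = math.sqrt(statistics.variance(treated) / N_PATIENTS +
                       statistics.variance(control) / N_PATIENTS)
        z = diff / se
        # Two-sided p-value from the normal approximation.
        return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

    p_values = [fake_trial() for _ in range(N_GROUPS)]
    published = [p for p in p_values if p < ALPHA]

    print(f"{len(published)} of {N_GROUPS} groups got p < {ALPHA} and would "
          f"publish a 'cure' that isn't there (about 1 in 20 expected)")

Run it a few times with different seeds and you'll see roughly one spurious "publishable" result per 20 null studies, which is exactly what the threshold promises.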
No, they usually don't, because a "didn't work for us" result is usually not conclusive proof of the contrary.
It does (rarely) happen in Physics where everything is expected to be repeatable, and results from one experiment carry over to similar experiments. It almost never happens in medicine, where the bar of acceptance of a hypothesis is already ridiculously low.
You can publish negative results, but the bar is usually higher. It's easiest if you find some new "positive" reason for the negative result, so you can have a narrative along the lines of: you might think X would work, and here are all the reasons it's plausible, which we used to believe too, but it turns out it doesn't, because of Y.
If you don't have a reason for the failure, just "hmm, didn't seem to work", you can still publish, but it's harder. The next-best case is if you have a large-scale study failing to find a result for something that many other people have claimed should exist, e.g. power-line cancer studies. But if it isn't in that category, it's harder. The fundamental problem is that nobody wants thousands of papers saying "X doesn't cure cancer. X2 also doesn't. X3, once again, does not cure cancer", because the vast majority of candidate Xs don't do anything.
I guess that's the motivation behind the journal mentioned in the article.
He chose to publish one paper, fittingly, in the online journal PLoS Medicine, which is committed to running any methodologically sound article without regard to how “interesting” the results may be.
edit: though looking at the site, it doesn't seem to present itself with that motivation. It also costs $2900 to publish an article, so there's some financial hurdle to publishing.