A new replication crisis: Research that is less likely to be true is cited more (ucsd.edu)
602 points by hhs on May 22, 2021 | 319 comments



I will never forget the day a postdoc in my lab told me not to continue wasting time trying (and failing) to reproduce [Top Institution]’s “Best Paper Award” results from the year prior. He had been there when the work was done and said they manipulated the dataset until they got the numbers they wanted. The primary author is now a hot shot professor.

My whole perception of academia and peer review changed that day.

Edit to elaborate: like many of our institutions, peer review is an effective system in many ways but was designed assuming good faith. Reviewers accept the author’s results on faith and largely just check to make sure you didn’t forget any obvious angles to cover and that the import of the work is worth flagging for the whole community to read. Since there’s no actual verification of results, it’s vulnerable to attack by dishonesty.


When I was in school pre-university, this type of "crap, we can't get what we wanted to happen, so let's just fiddle around with it until it seems about right" was very common. I was convinced this was how children learned, so that as adults they wouldn't have to do things that way.

When I got into university and started alternating studying and work, I realised just how incredibly clueless even adults are. The "let's just try something and hope nothing bad happens" attitude permeates everything.

It's really a miracle that civilisation works as well as it does.

The upshot is that if something seems stupid, it probably is and can be improved.


In the lyceum where I studied, there was one Physics lab where the book that accompanied the lab was deliberately wrong. We were told to perform an experiment that "should" support a certain conclusion, but in fact neither the "correct" conclusion nor its opposite could be drawn, because the flawed setup measured something slightly different. A lot of students (in some groups, all students) fell into this trap and submitted paperwork with the "correct" conclusion according to the book.


A CS-specific analogy might be to give the students a compiler that has a bug in it, such that the students' code is deliberately mis-compiled. The standard of evidence to believe that the compiler is buggy is much higher than the standard to believe that my code is buggy.

A lab exercise like that could really just be selecting for chutzpah (feeling charitable) or arrogance (less charitable).


Well, that's more evil than my lab. A more direct equivalent in CS would be an algorithm description in the booklet with one subtly wrong (e.g. proven using some well-hidden circular reasoning) and uncorrectable step. The expectation would be that a good student finds the mistake instead of submitting an implementation of the flawed algorithm, or, to match my case even better, proves that the supposed output cannot be obtained from the inputs at all.


I had an appendectomy just before the final first-year Modern Physics lab and had to come back in to do a make-up lab. Sure enough it was the slightly-messed-up lab where the results should in theory look exponential but come out linear. I, naturally, drew an exponential curve through the points. Lab instructor decided to grade it right there before I left and tore a strip off me.

Very valuable lesson, although it sure did suck at the time.


What does "tore a strip off me" mean?



And for those ALSO thinking "what does THAT mean":

He got criticized for it.


I've come to think things work as well as they do largely because a whole lot of what people do has no effect either way. I see so much stupidity where the only saving grace is that it is directed into pointless efforts that won't be allowed to do any real damage.


When you start talking millions of people, the damage gets subtle.

Robocall scams are very high on the profit:human-misery scale, but they're hardly going to end civilization. Pollution, corruption, theft, etc. all make things worse, but we never see the better world without such things, so it all feels very abstract. Of course you need to lock your doors, etc.; that's just the way things are.


A bio professor of mine said something that stuck with me: “life doesn’t work perfectly, it just works.”

It has to work well enough to… work… and reproduce. That’s it. It’s not “survival of the fittest.” It’s “survival of a randomized subset of the fit.”

There’s even a set of thermodynamic arguments to the effect that systems are unlikely to exceed such minimum requirements for a given threshold. For example, if we are visited by interstellar travelers they are likely to be the absolute dumbest and most dysfunctional possible examples of beings capable of interstellar travel since anything more is a less likely thermodynamic state.

So much for Star Trek toga wearing utopian aliens.


> if we are visited by interstellar travelers they are likely to be the absolute dumbest and most dysfunctional possible examples of beings capable of interstellar travel

Otoh, they would be aware of that, and they might have spent some time improving how genes (or whatever they have) and evolutionary selection work for them, so that, say, their species over time becomes brighter and brighter than what's actually needed. If they wanted to do that.


How do you improve your genes? Removing obvious deleterious disease mutations is easy, but as soon as you try to “go where there are no roads” you hit the same combinatorial challenge as evolution.

Also more intelligence does not equal better ideas. The world is full of crazy or amoral people with apparently very high IQs. Your average flat Earther probably has an above average IQ.

Improvement is a war against entropy and n^n^n^… combinatorics any way you slice it.


> How do you improve your genes?

Slowly across hundreds and thousands of generations.

By adding evolutionary pressure, for what they want -- it'd be up to those space traveling aliens to decide -- they can change their species, generations into the future.

> Improvement is a war against entropy ...

Reasoning that way, humans would never have gotten brighter than chimpanzees. There's been evolutionary pressure for humans to get brighter, and it would be possible for you (I mean humans), or the space travelers, to add artificial evolutionary pressure.

Anyway never mind all this, maybe talking about space travelers and the humans and their genes isn't the best way to spend the day. Have a nice day btw


The problem is the incentives. To do well, you must publish. To publish, you must have a good story, and ‘we tried this and it didn’t work’ is not one.

So after a certain time spent, you are left with a choice of ‘massaging’ the data to get some results, or not, and getting left behind by those who do or who were luckier in their research.


"We tried this and it didn't work, and here's why we think it didn't" should be among the bests stories to publish. Looking back I learned more from stuff that didn't work, or rather figuring out why it didn't, than from success.


> or rather figuring out why it didn't

That can end up being just as time-consuming as doing the research to begin with. Often there is no time and no money to go back and do that. If your 'budget' is 6 months, you're going to spend 6 months trying to get your experiment to work. You're not going to 'give up' after 4 months and spend 2 months putting together a "why we failed" paper.


However, the advantage is, if it is published, it can decrease the likelihood of multiple other attempts to try the same “unique” (and wrong) approach.


Even if something did not work, you still need a story for it to be readable.

For example, I imagine that archeological work is extremely high impact if excavation efforts lead to the discovery of an ancient city.

An archeology paper would probably be less interesting if it said “we dug this area, found nothing”.

If one were to judge those two papers, obviously the discovery paper is higher impact than the negative result.


"We chose this area because we believed it should be archeologically interesting based on XYZ. However, we searched through ABC methods and found nothing there" would be valuable for the future. Maybe XYZ isn't as good as we though, maybe ABC couldn't find it. Maybe now some other sod in the future won't try that location.

Not as valuable as a discovery, but very far off zero value. Yet the reward in academia would be near-zero.


Tell that to the people who pay for research and the metrics they use to continue feeding those who perform research.

Ultimately, this is the problem.


Distracting from the main point, "let's just try something and hope nothing bad happens" (trial and error) is precisely the reason civilization made it this far :)


And in fact evolution. The thing to remember is, in many cases where something bad did happen the evidence got buried or eaten.


> It's really a miracle that civilisation works as well as it does.

I think this all the time.


I'm just done with a 3-hour reading session of an evolutionary psychology book by one of the leading scientists in the field. The book is extremely competently written, and is awash with statistics on almost every page, "70% of men this; 30% of women that ... on and on". And much to my solace, the scientist was super-careful to distinguish studies that were replicable, and those that were not.

Still, reading your comment makes me despair. It plants a nagging doubt in my mind: "how many of these zillion studies cited are actually replicable?" This doubt remains despite knowing that the scientist is one of the leading experts in the field, and very down-to-earth.

What are the solutions here? A big incentive-shift to reward replication more? Public shaming of misleading studies? Influential conferences giving more air-time for talks about "studies that did not replicate"? I know some of these happen at a smaller-scale[1], but I wonder about the "scaling" aspect (to use a very HN-esque term).

PS: Since I read Behave by Sapolsky — where he says "your prefrontal cortex [which plays critical role in cognition, emotional regulation, and control of impulsive behavior] doesn't come online until you are 24" — I tend to take all studies done on university campuses with students younger than 24 with a good spoon of salt. ;-)

[1] https://replicationindex.com/about/


Evo psych is questionable to me for more basic reasons. It seems full of untestable just so stories to explain apparent biases that are themselves hard to pin down or prove are a result of nature not nurture.

It’s probably not all bullshit but I would bet a double digit percentage of it is.


I'm conscious that this is a flame-bait topic. That said, no, dismissing the whole field as "questionable" is callous. Yes, there are many open questions, loaded landmines, and ethical concerns in evolutionary psychology research. But there's also copious evidence in its favour. (Reference: David Buss et al.)

Many people might spare themselves at least some misery by educating themselves about evolutionary psychology, including the landmines and open questions.


Psych is questionable for basic reasons. It is a humanities science. Its purpose is not to figure out the world but to change it. Figure out how to end poverty, for example.

Therefore it is not well suited to figure out the world.

You should treat all of it with extreme helpings of salt.


Can't this be applied to wide swaths of hard sciences as well? Lots of scientific work overlaps heavily with engineering, which is all about changing the world.

Also, I don't think ending poverty is a major stated goal of psychology research. . .


> “how many of these zillion studies cited are actually replicable?” This doubt remains despite knowing that the scientist is one of the leading experts in the field, and very down-to-earth.

I think the problem is much bigger than a simple binary of replicable or not. It’s extremely easy to find papers by “leading experts” that have valid data with replicable results where the conclusions have been generalized beyond the experiments. The media does this more or less by default when reporting on scientific results, but researchers do it themselves to a huge degree, using very specific conditions and results to jump to a wider conclusion that is not actually supported by the results.

A high-profile example of this is the “Dunning Kruger” effect; the data in the paper did not show what the flowery narrative in the paper claimed to show, but there’s no reason to think they falsified the results. Some researchers have reproduced the results, as long as the conditions were very similar. Other researchers have tried to reproduce the results under different conditions that should have worked according to the paper’s narrative and conclusions, but found that they could not, because there were specific factors in the original experiment that were not discussed in the original paper’s conclusions -- in other words, Dunning and Kruger overstated what they measured such that the conclusion was not true. They both enjoyed successful academic careers and some degree of academic fame as a result of this paper that is technically reproducible but not generally true.

To make matters worse, the public has generally misinterpreted and misunderstood even the incorrect conclusions the authors stated, and turned it into something else. Almost never in discussions where the DK effect is invoked do people talk about the context or methodology of the experiments, or the people who participated in them.

This human tendency to tell a story and lose the context and details and specificity of the original evidence, the tendency to declare that one piece of evidence means there is a general truth, that is scarier to me than whether papers are replicable or not, because it casts doubt on all the replicable papers too.


I fully agree. Thanks for the excellent articulation of the layered complexity involved here, including a chilling example.


> The book is extremely competently written, and is awash with statistics on almost every page, "70% of men this; 30% of women that ... on and on". And much to my solace, the scientist was super-careful to distinguish studies that were replicable, and those that were not.

Out of curiosity, what's the title of the book?


Evolutionary Psychology by David Buss[1].

[1] https://www.routledge.com/Evolutionary-Psychology-The-New-Sc...


One approach that can be adopted on a personal level is simply changing the way one thinks. For example, switch from a binary (true/false) method of epistemology to a trinary one (true/false/unknown), defaulting to unknown, and consciously insisting on a high level of certainty before reclassifying an idea.

There's obviously more complexity than this, but I believe that if even a relatively small percentage of the population started thinking like this (particularly, influential people) it could make a very big difference.

Unfortunately, this seems to be extremely counter to human nature and desires - people seem compelled to form conclusions, even when it is not necessary ("Do people have ideas, or do ideas have people?").
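To make the idea concrete, here is a minimal sketch (my own illustration in Python; not any established framework, and the names and the 0.95 threshold are arbitrary):

    # Claims default to UNKNOWN and are reclassified only past a high bar.
    from enum import Enum

    class Belief(Enum):
        TRUE = "true"
        FALSE = "false"
        UNKNOWN = "unknown"

    def classify(evidence_for: float, evidence_against: float,
                 threshold: float = 0.95) -> Belief:
        """Return TRUE or FALSE only when certainty clears the threshold."""
        if evidence_for >= threshold:
            return Belief.TRUE
        if evidence_against >= threshold:
            return Belief.FALSE
        return Belief.UNKNOWN

    print(classify(0.7, 0.1))  # Belief.UNKNOWN: suggestive, but not enough

The point isn't the code, it's the default: absent strong evidence either way, the honest label stays "unknown".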


Honest question: could you go ahead and publish an article titled "Failure to replicate 'Top Institution's Best Paper Award'"?


Yes. A famous recent example was stress-induced stem cells:

https://www.nature.com/news/failed-replications-put-stap-ste...


Yes, but you have to convince your readers that you did a more careful and meticulous job than 'Top Institution's Best Paper Award' did. After all, a failure to replicate only means that one of you is wrong, but it doesn't give any hint as to who.


At least one of you. If disproving the result turns out not to be that simple, you might fall into the same trap.


Just don't forget that the guy who wrote the Best Paper will probably review your articles in future.


Maybe all papers should have a replication score?


> a postdoc in my lab told me not to continue wasting time trying (and failing) to reproduce [Top Institution]’s “Best Paper Award” results from the year prior. He had been there when the work was done and said they manipulated the dataset until they got the numbers they wanted.

Isn't that the moment where you try even harder to falsify the claims in that paper? You already know that you'll succeed so it wouldn't be a waste of time in your effort.


The problem with experimental results is that they are difficult to replicate. In software you can "git clone x.git && cd x && make" and replicate the correct or incorrect results. In hardware, it's more difficult.

The main problem is that even if you reproduce their experiment, they can claim that you did some step wrong: perhaps you are mixing it too fast or too slow, or the temperature is not correctly controlled, or one of your reagents has a contamination that destroys the effect, or they magically realize that some detail of their reagents is what actually matters.

It's very difficult to publish papers with negative results. So there is a high chance it will not count toward your total number of publications. Also, expect a low number of citations, so it's not useful for other metrics like citation count or h-index.

For the same reason, you will not see publications of exact replications. A good paper X will be followed by almost-replications by other teams, like "we changed this and got X with a 10% improvement" or "we mixed the methods of X and Y and unsurprisingly^W got X+Y". This is somewhat good because it shows that the initial result is robust enough to survive small modifications.


It's harder to publish negative results.


Even good negative results? If it's a problem to publish a negative result debunking an award-winning paper, then that is a problem.


Yes, and in most cases, no one will cite negative results. The positive results continue to be cited even long after being debunked.

This is an example which did get cites:

https://journals.sagepub.com/doi/full/10.1111/j.1539-6053.20...

But despite the high visibility, you can see the large number of papers published based on the original myth.

And this refutation doesn't have great methodology (but other ones do). It's mostly cited due to the strong language used.


Hence the reproducibility crisis.


It is not possible (in principle) for peer review to protect against fraud, and it was never intended to. And this is ok. Usually, if a result is very important and forged, other groups try to replicate it and fail; after some time the original dataset (which needs to be kept for 10 years, I think) will be requested, and then things go downhill from there.

Not assuming good faith in peer review would make academia more interesting; the only way would probably be for the peer reviewer to go to the lab and have the measurements demonstrated live. Then check the equipment...


I wonder if it's a better system to just hire smart professors and give them tenure immediately. The lazy ones in it just for the status won't do any work, but the good ones will. Sure, there will be dead weight that gets salaries for life, but I feel like that's a lesser problem than incentivizing bad research.


The problem isn't just the scientists, it goes all the way up. Let's say we implement your system. Who decides how many 'smart professors' the Type Theory group gets to hire? What if the Type Theory and Machine Learning groups both want to hire a new 'smart professor' but the Computer Science department only has money to hire one more person?

One reasonable approach might be to look at which group has produced the 'best' research over the past few years. But how do you judge that in a way that seems fair? Once you have criteria to judge that, people will start to game those criteria.

Or, taking a step up: the university needs to save money. How do you judge whether the Chemistry department or the Computer Science department should have its funding cut?

No matter how you slice it at some point you're going to need a way for someone to judge which of two departments is producing the 'best' research and thus deserves more money, and that will incentivize people to game that metric.


There is no shortage of resources to provide for every person who wants to devote their life to discovering something valuable for all of humankind.

We aren't short on food, shelter, clothes, tech, etc - those are all solved problems.

The problem that isn't solved is stupid people sitting in charge of decisions they don't have the brain make-up to comprehend or manage, pretending they know what they're doing, holding people far superior to them hostage.


Being smart isn't the biggest criterion for success as a professor. The PhD degree is a good filter because it trains and tests research aptitude, work ethic, the ability to collaborate, the ability to focus on a single problem for a long period of time, and others.

One problem is PhD degrees are too costly to those who don't get academic or industrial success from them. But as long as talented people are willing to try to become a professor I don't see the system changing.


Who is to judge the merit of their talent? Shouldn't their results speak for themselves? And pray tell, what are the results of academia in the age of the digital revolution, where there is no obligation to complete a university education with knowledge of its mathematical and scientific foundation?

I think many more are drawn to professorship for a sense of status, i.e. prestige. It shows in their overwhelming mediocrity, e.g. the failure of economics to progress to a biologically scientific paradigm.


Who is going to decide whether a professor is smart?


The other smart professors. Which is exactly how it worked in the now distant past.


Which is exactly how it still works in many places. You have to be co-opted and have the vote of your peers. This doesn't do anything to ensure those elected are able. It ensures they are politically desirable.


Won't those professors then just hire people that agree with their pet theories?


Those professors in the distant past flourished in a more spiritual age. They did not treat themselves as professionals nor had a sense of “career”.


So what you are saying is, peer review.


Honest question: how do we fix this? The obvious solution, prosecuting academics, has an awful precedent attached to it.


Not my turf but I'll chime in.

In the past, people who did science could do so with less personally on the line. In the early days you had men of letters like Cavendish, who didn't really need to care whether anyone liked what he wrote; he'd be fine without any grants. That obviously doesn't work for everyone, but then the tenure system developed for a similar reason: you have to be able to follow an unproductive path sometimes without starving. And that can mean unproductive in that you don't find anything, or in that your peers don't rate your work. There'd be a gap between being a young researcher and tenured, sure.

Nowadays there's an army of precariously employed phds and postdocs. Publish or perish is a trope. People get really quite old while still being juniors in some sense, and during that time everyone is thinking "I have to not jeopardise my career".

When you have a system where all the agents are under huge pressure, they adapt in certain ways: take safer bets, write more papers from each experiment, cooperate with others for mutual gain, congregate around previous winners, generally more risk reducing behaviour.

Perhaps the thing to do is make a hard barrier: everyone who wants to be a researcher needs to get tenure after undergrad, or not at all. (Or after masters or whatever, I wouldn't know.) Those people then get a grant for life. It will be hard to get one of these, but it will be clear if you have to give up. Lab assistants and other untenured staff know what they are negotiating for. Tenured young people can start a family and not have the rug pulled out when they write something interesting.


I agree with your diagnosis of the problem, but don't think your solution is a good way forward - immediately after undergrad is way too early to be evaluating research potential and would just shift the hyper competitiveness earlier.

A better solution would be to stop overproducing PhDs. We could reduce funding for PhD students and re-direct that towards more postdoctoral positions - perhaps even make research scientist a viable career choice?


Overproducing PhDs seems to be a necessary aspect of how research is conducted in the current university. Most serious lines of work are pursued by a PhD student or Postdoc and advised by a Professor. They need a critical mass of PhD students which is definitely a much larger number than 1 per professorship. This is especially true in fields where industry jobs aren't readily available.


I think that's a huge part of the problem though - we've made it so the only way we can get research done is by training a new researcher - even though there's already plenty of trained researchers who are struggling to find a decent job.

I'm suggesting that we re-direct some of the funding for training PhD students into funding for postdoctoral positions (via either fellowships or research grants). Professors would still get their research team, but rather than consisting mostly of untrained PhD students, they'd have a smaller, but more effective team of trained researchers.


Isn't that the case simply because professors are expected to be highly productive, to the extent where it is not possible to meet the bar without offloading the work to students and switching to a full-time manager?


> I agree with your diagnosis of the problem, but don't think your solution is a good way forward - immediately after undergrad is way too early to be evaluating research potential and would just shift the hyper competitiveness earlier.

Immediately after undergrad is how it used to work in the golden days of science, more or less.

If the competitiveness is the problem maybe tenure should be a lottery that you enter once at a fixed stage, preferably before you're expected to start publishing in journals.


I think we had a far smaller number of people going to university back in the "golden days of science" - not sure you can really compare.

A tenure lottery seems like an extreme option - there has to be a middle ground between what we have now and something entirely random.


The system that produces PhDs isn’t that bad. It is a good way to create a research portfolio useful for employment in the private sector. We need to pay less attention to the title, though - this is not a distinguishing achievement for life.


Correct, it's not a laurel to rest on.

The act of producing a doctoral dissertation usually leaves something of a mark on one's outlook, skills, etc. I claim it is a _distinguishable_ achievement for life.


Yet the principle of pursuing knowledge is not for pecuniary interests. So your judgment demonstrates the temporal shift of the Western University towards rubber stamping people’s vocational aptitude. This leads to corruption, of course.


This is one of the many reasons I like Universal Basic Income. Having UBI would let researchers take risks and have something to fall back on if needed, and could reduce some of the pressure.


I don't think UBI works well here because in most fields the level of success that the precarious group experiences in industry is substantially higher than a guaranteed minimum. A lot of people have identity aspects tied to their university affiliation and don't want to stop working with the university in part for that reason.


No matter what level we put UBI at, it will almost certainly be less than a third of what a researcher salary would be. Also it's not just about the money. Losing your job means losing access to a lab, access to data, access to grant money and basically everything you need to actually do research.


The solution is to publish data, not "papers", first, and assign it a replication score: how many times it was verified by independent research. The paper can follow with the explanation, but citations will no longer be important; what will matter is the contribution to the replication score (which will also work as an incentive to confirm others' results).
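As a toy illustration of the bookkeeping (my own sketch in Python; the record format, field names, and identifier are made up, not an existing system):

    # A dataset's replication score = independent confirmations minus contradictions.
    from dataclasses import dataclass, field

    @dataclass
    class DatasetRecord:
        dataset_id: str
        confirmed_by: set = field(default_factory=set)
        contradicted_by: set = field(default_factory=set)

        def record_attempt(self, team: str, confirmed: bool) -> None:
            (self.confirmed_by if confirmed else self.contradicted_by).add(team)

        @property
        def replication_score(self) -> int:
            return len(self.confirmed_by) - len(self.contradicted_by)

    record = DatasetRecord("dataset-123")           # hypothetical identifier
    record.record_attempt("lab-A", confirmed=True)
    record.record_attempt("lab-B", confirmed=False)
    print(record.replication_score)                 # 0: one confirmation, one contradiction

The hard part, of course, is deciding what counts as "independent", which the replies below get into.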


I think it would be gamed just like the current system. Instead of citation rings you just get replication rings.


If someone gets a contradicting result, the replication score of the entire ring can be nullified, or, in the case of intentional data manipulation, negated.


But you have the same basic problem as now - you’d need some sort of science police to control it, which goes against the scientific process. Essentially it’s a problem of establishing trust in an untrusted system. Putting it that way actually makes it sound like a blockchain problem. Maybe there could be some incentive system to replicate work based on smart contracts, but I don’t know how you could ensure the replicating parties to be independent.


Scientific progress today depends heavily on the financial support of society, so as a whole it cannot be completely decentralized and independent. People want to know how their money is spent and want guarantees that science will not create something awful. This means that policing of science is an inevitable and important part of the system. It is not a question of whether we need science “police”; it is a question of what it should look like. Today it is decentralized: someone maintains the list of venues publication in which counts toward the citation index, there are ethical committees and scientific boards, lawmakers regularly say what can be done and what should not, etc. How this will change if a new system of incentives is put in place, we can only imagine: it could be a good or a bad thing, but as long as the system remains democratic, all problems should be easy to fix.


This seems like the right answer.

Don’t (credible) journalists have an honour system of getting at least three sources for a story?

Can’t we make researchers get at least two more confirmations from separate teams for something far more important?


A key function of scientific publication is to inform other researchers in the field about potentially interesting things as quickly as is reasonable. Getting "two more confirmations from separate teams" is a very high bar, as it's not just asking a source: it's asking someone else to do all the same work again. Not only do we not require it before publication, we don't expect it to ever happen for the vast majority of publications. Important studies get replicated, but most never get repeated. A partial explanation of the original article's observation is the (very many!) papers that don't have many citations and don't fail to replicate because nobody cared enough to put in the work to try.

If publication required two more confirmations from separate teams, that would mean (a) doing the work in triplicate, so you get three times fewer results for the same effort; (b) the process would take twice as long, because I spend a year doing the experiment, then someone else can start and spend a year doing the same experiment, and only then does it get published; (c) there's a funding issue - I have somehow got funding to spend many months of multiple people's time on this, but who's paying the other independent teams to do the same?; (d) it's not a given that there are two other teams capable of doing the exact same research, e.g. if you want to publish a study on the results of an innovative surgical procedure, it's plausible that there aren't (yet!) any other surgeons worldwide who are ready to perform that operation; that will come some time after the publication; (e) many types of science really can't get a separate confirmation - for example, we have only one Large Hadron Collider, you can't re-do archeological digs, event-specific on-site sociological data gathering can't really be repeated, etc.; so you have to take the data at face value.


What you describe is absolutely right; it is important to have this kind of communication. If publications were only a means to communicate, that would serve the purpose and wouldn't be a problem. The problem is that they are considered to have a second purpose - to create scientific reputation, based on which society allocates funds and prioritizes research. The original article illustrates how wrong this approach can be, substituting good storytelling for the ability to produce scientific facts.


Maybe research that cannot be replicated ought not be pursued? Aren’t there better directions for a society’s calorie outputs?


There aren't many credible journalists left. Maybe like 5.


Somewhere in there I see a blockchain pitch.


Having a scientific blockchain looks like a decent idea... but of course it will not suffice and will be gamed. The real causes of the mess are the complexity of the world compared with our minds and tools, and the lack of epistemological understanding as a society, in institutions, and in culture. Science can't be more than a nice and useful collection of heuristics. Otherwise it's just the Scientism religion lurking around and pretending to read God's mind. Metarationality concepts could offer an exit from the inevitable mess.


What do you mean by 'metarationality'? It's a term I've never seen before and I'm curious about it.


Game theory might be better suited.


I personally love this comment as a quintessence of startup mentality.


Ok, I got 12, 18, 45. Does anyone want to verify my results? If so, I'll write up a paper describing what they mean...

Hopefully it is clear that that data is useless without some written text explaining what it means. Given that for hundreds of years the accepted way of presenting that explanatory text has been to write papers, I don't see any reason to abandon that. Tweaking our strategies for replication (after a description of the experiment has been published!) and reputation doesn't seem to contradict that.


I'm not sure prosecuting academics is particularly obvious: you'd need to prove malicious intent (rather than ignorance), which is always difficult.

For me a better solution would be to properly incentivise replication work and solid scientific principles. If repeating an experiment and getting a contradictory result carried the same kudos as running the original experiment, then I think we'd be in a healthier place. Similarly for doing the 'scientific grind work' of working out mistakes in experimental practice that can affect results and, ultimately, our understanding of the universe around us.

I think an analogy with software development works pretty well: often the incentives point towards adding new features above all else. Rarely is sitting down and grinding through the litany of small bugs prioritised, but as any dev will tell you, doing that grind work is just as important; otherwise you'll run into a wall of technical debt and the whole thing will come tumbling down.


Open source and Free Software is (despite it being a cliche for programmers to over apply it) a good model to compare with.

You have big companies making billions with the work of relatively poorly paid nerds. But as soon as you make it possible for the nerds to claim all the profits of the work, you have a whole class of people whose job is to insert themselves as middlemen and ruin it for everyone, both customers and developers.

So basically the aim is to limit the degree to which you can privately profit from science, and expand the amount of science you can easily build on. You still get enough incentives for progress, the benefits accrue to society as a whole, and competition and change is enabled without powerful gatekeepers controlling too much in their own interests.


I really don’t know.

One perspective is that, “knowledge generation wise,” the current system really does work from a long term perspective. Evolutionary pressure keeps the good work alive while bad work dies. Like that [Top Institution] paper: if nobody else could reproduce it, then the ideas within it die because nobody can extend the work.

But that comes at the heavy short term cost of good researchers getting duped into wasting time and bad researchers seeing incentives in lying. Which will make academia less attractive to the kind of people that ought to be there, dragging down the whole community.


Here are a recent HN thread and post you might find interesting:

https://nintil.com/newton-hypothesis

https://news.ycombinator.com/item?id=25787745

Due to career and other reasons, there is a publish or perish crisis today.

Maybe we can do better by accepting that not everyone can publish groundbreaking results, and that's okay.

There are lots of incompetent people in academia who later move up to senior positions and decide your promotions by citation counts and how many papers you published. I have no realistic ideas for how to counter this.


> Honest question: how do we fix this?

We need to create a new social institution of Anti-Science, which would run on different incentives, correlated with the number of refuted articles. No tenure, no long-term contracts. If an anti-scientist wished to have an income, they would need to refute science articles.

Create a platform for holding scientific debates between scientists and anti-scientists, so that a scientist has the ability to defend his or her research.

There is no need to do anything special to prosecute, because science is very competitive, and the availability of refutations would inevitably be used to stop the career progression of authors of refuted articles.


This seems like a pragmatic and workable idea. We could even have the same type of thing for journalism and "facts" in general, it would be a step up from the current tribal meme/propaganda war approach we rely upon.


Data and code archives, along with better methods training.

Data manipulation generally doesn't happen by changing values in a data frame. It's done by running and rerunning similar models with slightly different specifications to get a P value under .05, or by applying various "manipulations" to variables or the models themselves for the same effect. It's much easier to identify this when you have the code that was used to recreate whatever was eventually published.
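To make the "rerunning with slightly different specifications" point concrete, here's the kind of simulation I'd sketch it with (purely illustrative; numpy/scipy, made-up variable names): on pure noise, trying enough control-variable combinations will often turn up a p-value under .05 somewhere.

    import itertools
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 100
    treatment = rng.normal(size=n)            # the "effect" we hope to find
    outcome = rng.normal(size=n)              # pure noise: no real effect exists
    covariates = rng.normal(size=(n, 5))      # candidate control variables

    def p_value(y, x, controls):
        # OLS via least squares; two-sided p-value on the treatment coefficient.
        X = np.column_stack([np.ones(n), x] + [covariates[:, j] for j in controls])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        dof = n - X.shape[1]
        cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
        t = beta[1] / np.sqrt(cov[1, 1])
        return 2 * stats.t.sf(abs(t), dof)

    # "Specification search": try every subset of controls, keep the best p-value.
    ps = [p_value(outcome, treatment, c)
          for k in range(6)
          for c in itertools.combinations(range(5), k)]
    print(f"{len(ps)} specifications tried, min p = {min(ps):.3f}")

Each individual regression is "valid"; the dishonesty only becomes visible when you can see that 32 specifications were run and one was reported. That's exactly what archived code exposes.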


Registering the methods/details before performing the experiments is another technique that is used.


Sure, but often there are perfectly valid reasons to change your methodology halfway through a project, when you know a lot more about the thing you are trying to do than you did before you started.


I don't think prosecution is the right tool, but if we were going down that road, only material misrepresentations would fit with the anti-fraud standard for companies. Just drawing dumb, unpopular, or 'biased' conclusions shouldn't be a crime, but data tampering would fall into scope. Not a great idea, as it would add a chilling effect, lawyer friction, and expenses, and still be hard to enforce for little direct gain.

I personally favor requirements which call for bundling raw datasets with the "papers". Data storage and transmission are very cheap now, so there isn't a need to restrict ourselves to just text. We should be able to check all of the thrown-out "outliers" from the datasets. An aim should be to make the tricks for massaging data nonviable. Even if you found your first data set was full of embarrassing screw-ups due to doing it hungover and mixing up the step order, it could be helpful to get a collection of "known errors" to analyze. Optimistically, it could also uncover phenomena scientists thought were them screwing up, like the cosmic background radiation being taken as just noise and not really there.

Paper reviewing is already a problem but adding some transparency should help.


Leveraging prestigious papers to win grant proposals is where you need to get them. Citations aren't what gets you a job or tenure at an R1 research school; it's the grants that the high-impact papers help you win.

You don't have to convict people of full-on fraud. If you are caught using an obvious mistake in your favor or using a weak statistical approach, the punishment can be that you are only allowed to apply for grants with a supervisor/co-PI/etc. whose role is to prevent you from following that "dumb" process in the future.


We could use public funding to do the work OP tried to do.

Something like a well funded ten year campaign to do peer review, retrying experiments and publishing papers on why results are wrong.

I have a co-worker who had a job that involved publishing research papers. Based on his horror stories, it seems like the most effective course of action is to attack the credibility of those who fudge results.


With added bounty for discovering bad faith.


The single biggest impediment to "fixing this" is that you haven't identified what "this" is or in what manner it is broken.

There will always be cases of fraud if someone digs deeply enough into large institutions. That doesn't actually indicate that there is a problem.

Launching in to change complex systems like the research community based on a couple of anecdotes and just-so stories is a great way to not actually achieve anything meaningful. There needs to be a very thorough, emotionally and technically correct enumeration of what the actual problem(s) are.


A couple of anecdotes is a very disingenuous way to frame the replication crisis. Heavily cited fraudulent research impacts public policy, medicine, and technology development. This means it's everyone's business.


The problem you're describing there is a public policy one, not something to do with the scientific community. Public policy should be implemented with a trial at the start and a "check for effectiveness" step at the end because there is no way to guarantee the research it is being based on is accurate. Statistically, we expect a big chunk of research to be wrong no matter what level of integrity the scientists have.


"Statistically, we expect a big chunk of research to be wrong no matter what level of integrity the scientists have" - that's the actual problem under discussion here.

Research is heavily funded because people believe it's something more than a random claim making machine. You say governments should assume research is wrong and then try to replicate any claim before acting on it. But you end up in a catch 22: if the research community is constantly producing wrong claims there's no reason to believe your replication attempt is correct, as it will presumably be done by researchers or people who are closely aligned.

Additionally inability to replicate is only one of many possible problems with a paper. Many badly designed studies that cannot tell you anything will easily replicate. A lot of papers are of the form "Wet pavements cause umbrella usage". That'll replicate every single time, but it's not telling you anything useful about the world. Merely trying to fix things with lots of replication studies thus won't really solve the problem.


Research is far better than a random claim making machine even if some of it has errors that have caused the replication crisis. It's easy to overstate the level of the problem even though it's fairly severe at this point.

"Wet pavements cause umbrella usage" is something where I'd want to see your specific examples because it's easy to get a correlational study of that nature but very hard to design a causal one. The correlational studies are usually accurate and often useful for other research.


I would argue the whole framing of the "replication crisis" is another example of the problem of "overselling" research results. Yes, there is a problem with some research in some areas of science not being replicable. However, the vast majority of research in many fields does not have this problem. Framing this as a "crisis" overstates the problem and gives the impression that the majority of research can't be replicated.


By waiting until scientists address this? Note that the 'replication crisis' is something that originated inside science itself, so, despite there being problems, science has not lost its self-correcting abilities. The scientists themselves can do something by insisting on reliable and correct methods and pointing it out wherever such methods are not in use. It is also not as if there are no gains in doing this. Brian Nosek became rather famous.


The replication crisis is not being addressed. It's being discussed occasionally within the academy, but a cynic might wonder if that's because writing about the prevalence of bad papers is a way to write an interesting paper (and who is checking if papers about replication themselves replicate?). It's been discussed far longer and more extensively by the general public but those discussions aren't taken seriously by the establishment, being as they are often phrased in street terms like "you can find an expert to tell you anything" or "according to scientists everything causes cancer so what do they know?". And of course the higher quality criticism gets blown off as mere "skepticism" or "conspiracy theories" and anyone who tries to research that is labelled as toxic.

So a lot of people only notice this in the rare cases when someone within the academy decides to write about it. This can make it seem like science is self correcting, but it appears in reality it's not. When measured quantitatively there is no real improvement over time. Alvaro de Menard has written extensively on this topic and presented data on the evolution of P values over the last decade:

https://fantasticanachronism.com/2020/09/11/whats-wrong-with...

Additionally as he observes at the end of his essay, the problems are due to bad incentives, so the only true changes can come from changes to incentives. However those incentives are set by the government. Individual scientists cannot themselves change the incentives. The granting agencies are entirely oblivious to the problems and the scale of their ambition is in no way equal to the scale of their problem:

"If you look at the NSF's 2019 Performance Highlights, you'll find items such as "Foster a culture of inclusion through change management efforts" (Status: "Achieved") and "Inform applicants whether their proposals have been declined or recommended for funding in a timely manner" (Status: "Not Achieved") .... We're talking about an organization with an 8 billion dollar budget that is responsible for a huge part of social science funding, and they can't manage to inform people that their grant was declined! These are the people we must depend on to fix everything."


Scientists with a proven track record should have life-long funding of their laboratory, no questions asked. So they can act as they want without fear of social repercussions. Of course some money will be wasted, and the question of determining whether a track record is proven is still open, but I think that's the only way for things to work (except when the scientist himself has enough money to fund his own work).


I think this would be a positive step, but to play devil's advocate, what happens when this superstar scientist retires? If I'm a researcher in his lab, does my job just disappear? If so, I'm still going to feel pressure to exaggerate the impact of my research.


I've been spending a lot of time on 'bad science' as a topic lately (check my comment history or blog for some examples). I think what you're proposing is the opposite of what's required.

Firstly, the problem here is not an epidemic of scientists who feel too financially insecure to do good work. Many of the worst papers are being written by people with decades-long careers and who lead large labs. Their funding is very secure. They are doing bad work anyway for other reasons, sometimes political or ideological, more often because doing bad work results in attention, praise and power. Or sometimes because they don't know how to explain their chosen question, but don't want to admit that scientifically they failed and don't know where to go next.

Secondly, as you already realized your proposal relies on identifying which scientists have a proven track record, but the whole problem is that science is flooded with fraudulent/garbage claims which are highly cited ("proven") and which were written by large teams of supposedly respectable scientists at supposedly respectable institutions. Any metric you can invent to decide who or what has a proven track record is going to be circular in this regard. To Rumsfeld the problem, we are surrounded by "unknown knowns". You say this is an open question but to me that's a fatal flaw.

So the problem is actually the inverse. You say at the end, well, scientists who can fund their own work are an exception. Obviously in most cases scientists don't need to do this, they can also be funded by companies. Most computer science research works this way. Better CPUs and hardware is done almost entirely by companies. AI research has been driven by corporate scientists, and so on. In contrast academic funding comes primarily from government agencies that distribute money according to the desires of academics. This means a tiny number of people control large sums of money, and they are accountable to nobody except themselves. There are no systems or controls on academic behavior except peer review, which is largely useless because the peers are doing the same bad things as everyone else.

Viewed from an economic perspective academia is a planned reputation economy. The state is the source of all resource allocation decisions (academics being effectively state employees in most fields). There's also a deeply embedded Marxist worldview: universities have no working mechanisms to detect fraud, because of an implicit assumption that deep down when market forces are gone everyone is automatically honest and good. The hierarchy is stagnant; the same institutions remain at the top for centuries. A good reputation lets them select the people with the reputation for being smart (e.g. by school grade), so that reputation accrues to the institutions, which lets them keep selecting intake by reputation and so on. Supposedly Oxford and Cambridge are the best UK universities, they always have been, and they always will be. In a competitive, free market economy they would face competition and other institutions would seek to figure out what their secret is and copy it, like how so many companies try to copy the Toyota Way. In science this doesn't happen because there's nothing to copy: these institutions aren't actually different.

This implies a simple solution, just privatize it all. It would be wrenching, just like it was when the USSR transitioned to a market economy, just like it was when China (sort of) did the same. But one thing the 20th century teaches us is that you can't really fix the problems of a planned economy by tinkering with small reforms at the edges. The Soviets weren't able to fix their culture with glasnost and perestroika. They eventually had to give up on the whole thing. Replacing the current reputation economy with a real economy, with all the mechanisms that economic system has evolved (markets, prices, regulators, court cases, fraud laws etc), seems like a more direct and obvious approach to making things better, even if it may sound extreme.


Oh hey, Mike Hearn! I've long been a fan of yours in Bitcoin. It's good to see you're interested in 'bad science' lately as well -- this is a topic I've also been working on for the last N years along with Bitcoin. I hope we get to interact more in the future. :)

My envisioned solution is similar to yours, here. But rather than "privatize science", which I think most people will interpret as "move to industrial research", my rallying cry is a little more like "hey scientists, stop depending on public funding, let's find creative ways to get the science done."

I also like to point out that money is often not the missing factor as much as community. This has always been true. Mendel discovered genetics by experimenting on beanstalks in his garden at his monastery. It cost him very little to do it, and he only stopped the research when his community told him to stop wasting time on beans and get back to the important accounting work that impacted the church's politics at the time.

You might think that maybe science was cheap in the past, but that today you need lots of money, to get the lab equipment, etc. However, science always has a cutting edge of cheaply evaluable questions. We recently hosted a DIY Synthetic Biologist (currently on the homepage of https://invisible.college) who showed the actual costs of his work, and his laboratory equipment was far, far, cheaper than the "cost" of his time. We can get far more science done with "amateur scientists" (remember that "ama" means love, and an amateur scientist is one doing science for love) by creating a scientific community outside the institutions for interested parties to work together, pool their brainpower and resources, and come up with great novel work.

And if anyone else agrees with me on this, please let me know so we can join forces. I'm toomim@gmail.com, and am doing work on invisible.college.


Hello! Absolutely, drop me an email any time you like.

I absolutely agree that a lot of science can be done very cheaply. Some of the most impactful papers were done by people who weren't in an institutional framework, even in the modern era (Satoshi being an obvious example). Additionally it seems most of the really problematic fields are ones where the budget gets dispersed over large number of people writing very cheap low budget papers, hence millions of social science papers with tiny sample sizes.

I'm a big supporter of industrial research though. Many great papers come out of industrial labs. Modern computing is practically defined by such research. The big advances all seem to come from big corporate labs (Xerox PARC, Bell Labs, Google, DeepMind, IBM, Sun, Microsoft, etc). The research is powerful because it's funded by people who expect some sort of meaningful results and supervise the work to ensure it doesn't go completely off the rails. Academic institutions have developed this totally hands off attitude that makes research more or less unaccountable to any standard beyond "will it get published", which in turn can be rephrased as "are the claims interesting".


Great! Thank you for the invitation! :)

> The big advances all seem to come from big corporate labs

That's an interesting claim, and I'd encourage you to find some statistics to verify this hypothesis, because in my experience, that doesn't ring true.

From my subjective perspective, it seems that academic and industrial research labs innovate at roughly the same rate per-capita. I was a PhD student when Microsoft was dominant, hiring the best faculty from all top-4 CS schools (CMU, Berkeley, MIT, Stanford), and they certainly produced a lot of papers, and did seem to dominate conferences, but the actual innovation in computing came from Apple and startups, which did not have "research labs". Microsoft, including its giant industrial research lab, certainly was not the driver of innovation in computing!

And here are some numbers to back that up: Microsoft's R&D budget in 2011 was 10x the budget of the entire NSF -- for all sciences. Yet, Microsoft was clearly not producing more than 10x the scientific output of all NSF-funded academic science.

So it would help to have some statistics for the claim that industrial research innovates more than academic research. They certainly pay more, and often hire more people, but per-capita they don't seem any more productive or healthier than academics.


Ah, right. That gets us into the definitions of innovation and research.

Apple does very little research in the conventional scientific sense we're discussing here; I think that's pretty uncontroversial. They produce few if any papers. They are (or were, under Jobs) very good at coming up with new ideas that strongly appeal to the buyer and which got them a reputation for innovation, but which probably wouldn't be considered clever enough to be research papers. At least not top-tier papers.

For example, Exposé is a widely imitated feature and was considered very innovative at the time, but it wouldn't be seen as serious computer science. The iPhone is/was widely considered innovative but had basically no new research tech in it, given that capacitive touch screens weren't developed by Apple. It was just a really nicely implemented mobile computer. Actually, the innovations in the iPhone are nearly all packagings of tech developed by third-party firms that Apple then buys or buys exclusivity rights to. At least, that's true in my view.

Microsoft's R&D budget I think is also a victim of definitions. Software firms normally report all product development as R&D, right? I think these days they may even report datacenter builds as R&D. We can see this on Microsoft's investor website:

"In addition to our main research and development operations, we also operate Microsoft Research. Microsoft Research is one of the world's largest computer science research organizations"

i.e. the kind of university type "scientific" research we're discussing here is only a sideshow in Microsoft's R&D budget.

You're right to call me out though; I don't have any stats to prove that industrial research does more than academic research. It's not a statistical argument to begin with, just my own perception ("all seem to"). I read a lot of CS papers and the best ones have corporate email addresses at the top; the second best, a mix of corporate and university addresses; the third best, only university addresses. If you asked the man on the street to name the biggest innovations in computing in the past 20 years they'd probably say things like, uh, smartphones, YouTube, AI, blockchain, etc. All things that have little connection to universities, with AI being the closest, but it was Google that revived that whole field and has been pushing it forward ever since. Neural nets weren't receiving much investment from the academic community before that.

Anyway, that's CS. CS really isn't the problem here. The pseudo-science is elsewhere.


at least in some parts of computer science the solution is easy: do not ever publish results without the public source code of all the experiments.


I did peer review for a number of scientific papers that included code. Almost every time, I was the only reviewer who even looked at the code.

In most cases, peer reviewers will just assume that authors claiming the "code is available" means that a) it is reproducible and b) it is actually there.

As a counter example, this recent splashy paper

https://www.nature.com/articles/s41587-021-00907-6

claims the code is available on github, but the github version ( https://github.com/jameswweis/delphi ) contains the actual model only as a Pickle file, and contains no data or featurization.

So clearly, the peer reviewers didn't look at it.


Exactly that. The main task of the reviewers should be to re-run all the experiments on their own computers and check the results.


Re-running is definitely too much work for most scientific papers, at least in ML and the computational sciences, where experiments might take 1000s of core-hours or GPU-hours, but that's usually not necessary. In addition, just running the code can spot really bad problems (it doesn't work) but easily miss subtle ones (it works, but only for very specific cases).

I think it's more important for reviewers to read the source, the same way one would read an experimental protocol and supplementary information, mainly checking for discrepancies between what the paper claims is happening and what is actually being done. In the above example, a reviewer reading the code would have spotted that the model isn't there at all, even though it runs fine.
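
To make that concrete, here's a rough sketch (my own, with made-up file patterns, not any venue's official tool) of the kind of five-minute completeness check a reviewer could run on a submitted repo. It only flags whether the pieces needed for replication exist at all, and says nothing about whether they're correct, but it would immediately catch a pickle-only repo like the one above.

    # Hypothetical reviewer sanity check: does the repo even contain the
    # artifacts needed to replicate the paper? File patterns are assumptions.
    from pathlib import Path

    EXPECTED = {
        "training code": ["train*.py", "*.ipynb"],
        "featurization / preprocessing": ["*featur*.py", "*preprocess*.py"],
        "data or a download script": ["data/*", "download*.py", "download*.sh"],
        "environment spec": ["requirements.txt", "environment.yml", "setup.py"],
    }

    repo = Path(".")  # path to the cloned repository
    for label, patterns in EXPECTED.items():
        # rglob searches the whole tree for each pattern
        found = any(list(repo.rglob(p)) for p in patterns)
        print(f"{'ok     ' if found else 'MISSING'}  {label}")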


Providing source code is a good thing, but a lot of people confuse re-running experiments with replicating them. If you take the authors' source code and re-run it, then any bugs are going to invalidate your results too. The only way to actually have confidence in the paper's results is to rewrite the software from scratch.

In fact, I'd actually go further and question what kinds of errors could possibly be caught by running the same software that the authors did? Any accidental bugs will remain, and any malicious tampering with the experiment data is exceedingly unlikely to be caught even with a careful audit of the code.


That isn't possible if you're using commercially licensed source from other people, drivers for scientific instruments, lacking copyright assignment for some of it, etc. Same reason many commercial projects can't be open sourced even if the company wanted to.


So people who used proprietary software will not be able to publish. Sounds like a win-win to me!


Your definition of free software is more restrictive than the FSF’s.


Of course I was simplifying... but it seems obvious to me that enforcing automatic reproducibility in peer-reviewed publications can only be a good thing in the long run.


My personal opinion is this problem fixes itself over time.

When I was in graduate school, papers from one lab at Harvard were known to be "best case scenario". Other labs had a rock solid reputation - if they said you could do X with their procedure, you could bet on it.

So basically we treated every claim as potential BS unless it came from a reputable lab or we or others had replicated it.


One approach is to include a replication package with the paper, including the dataset... This should be regarded as standard practice today, as sharing has never been easier. However, adding a replication package is still done by only a minority of researchers...
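
For what it's worth, the mechanics are trivial. Here's a minimal sketch (my own, with hypothetical paths, not any journal's required format) of bundling code, data and an environment spec into an archive with a checksum manifest, so readers can verify they're re-running exactly what was published.

    # Sketch of building a replication package; the paths are placeholders.
    import hashlib, json, tarfile
    from pathlib import Path

    INCLUDE = ["src", "data", "environment.yml", "README.md"]  # hypothetical

    def sha256(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    manifest = {}
    with tarfile.open("replication_package.tar.gz", "w:gz") as tar:
        for item in INCLUDE:
            root = Path(item)
            files = sorted(root.rglob("*")) if root.is_dir() else [root]
            for f in files:
                if f.is_file():
                    manifest[str(f)] = sha256(f)  # record a checksum per file
                    tar.add(f)
    Path("MANIFEST.json").write_text(json.dumps(manifest, indent=2))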


You can't, except by trying to fix human nature.


Instead of adding a punishment, maybe we should remove the reward. How, that I don't know.


More transparency in some form, requiring researchers to publish code and data openly for instance.


I can understand why journals don’t publish studies which don’t find anything. But they really should publish studies that are unable to replicate previous findings. If the original finding was a big deal, its potential nullification should be equally noteworthy.


While I would have agreed with that when I was younger, I've learned there are a lot of possible reasons why PhD students (the guys who do the studies) fail to replicate something (and I am talking about fundamentally solid engineering).


This was exactly my experience, and I remember the paper that finally convinced me. It turns out the author had intentionally omitted a key step that made it impossible to reproduce the results, and only extremely careful reading and some clever guessing found the right step.

There are several levels of peer review. I've definitely been a reviewer on papers where the reviewers requested everything required and reproduced the experiment. That's extremely rare.


Why are you so afraid to reveal the name and institution?


Their username is publicly linked to their real-life identity. Revealing the name and institution has a reasonable chance of provoking a potentially messy dispute in real life. Maybe eob has justice on their side, but picking fights has a lot of downsides, especially if your evidence is secondhand.


From what I have read, peer review was a system that worked when academia and the scientific world were much smaller and much more like "a small town." It seems to me like growth has caused sheer numbers to make that system game-able and no longer reliable in the way it once was.


Why not just name the paper :)


May I ask what field of knowledge the manipulated paper was from? Your page lists CS/NLP, so the field may also be linguistics or neurology (linguistics would be easier for me to swallow) https://scholar.google.com/citations?user=FMScFbwAAAAJ&hl=en

Some wider questions would be: Are there similar problems in Mathematics/physics versus the life sciences/other social sciences? Are there the same kind of problems across different fields of study?

Also I wonder if replication issues would be less severe if there was a requirement to publish the software and raw data that any study is based on as open source / open data. It is possible that a change in this direction would make it more difficult to manipulate the results (after all, it's the public who paid for the research, in most cases).


I worked at a prestigious physics lab working for the top researcher in a field. It absolutely happens there and probably everywhere.

The only way to fix replication issues is to give financial and career incentives for doing replication work. Right now there are few carrots and many sticks.


Thanks! So all this is probably happening across the board, amazing.


Frankly, sir, it is the reason you wish your anecdote to remain anonymous that such perfidy survives. If these traitors to human reason and the public’s faith in their interests serving the general welfare - after all who is the one feeding them? - became more public, perhaps there would be less fraudulence? But I suppose you have too much to lose? If so, why do you surround yourself in the company of bad men?


The issue is that the authors of bad papers still participate in the peer-review process. If they are the only expert reviewers and you do not pay proper respect to their work, they will squash your submission. To avoid this, papers can propagate mistakes for a long time.

Personally, I'm always very careful to cite and praise work by "competing" researchers even when that work has well-known errors, because I know that those researchers will review my paper and if there aren't other experts on the review committee the paper won't make it. I wish I didn't have to, but my supervisor wants to get tenured and I want to finish grad school, and for that we need to publish papers.

Lots of science is completely inaccessible for non-experts as a result of this sort of politics. There is no guarantee that the work you hear praised/cited in papers is actually any good; it may have been inserted just to appease someone.

I thought that this was something specific to my field, but apparently not. Leaves me very jaded about the scientific community.


What is it that makes you have a nice career in research? Is it a robust pile of publishing or is it a star finding? Can you get far on just pure volume?

I want to answer the question "if I were a researcher and were willing to cheat to get ahead, what should be the objective of my cheating?"


I suppose it depends on how you define nice? If you cheat at some point people will catch on, even if you don't face any real consequences. So if you want prestige within your community then cheating isn't the way to go.

If you want to look impressive to non-experts and get lots of grant money/opportunities, I'd go for lots of straightforward publications in top-tier venues. Star findings will come under greater scrutiny.


Not outright cheating, but cooking results to seem better/surprising and publishing lots of those shitty papers is the optimal way to build a career in many fields. In medicine, for example.


Cooking results seems like outright cheating to me.


It's more complicated than that. It can look something like this: https://xkcd.com/882/
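
As a toy illustration of that comic (my own sketch, not from the linked source or any real study): test 20 "subgroups" where the true effect is zero and, at the usual p < 0.05 threshold, you can expect roughly one of them to come out "significant" by chance alone - a publishable-looking result cooked from pure noise.

    # Simulate the xkcd-882 situation: many tests, no real effect anywhere.
    # Assumes numpy and scipy are available; numbers are illustrative only.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    hits = 0
    for subgroup in range(20):
        treated = rng.normal(0.0, 1.0, size=30)  # both groups drawn from the
        control = rng.normal(0.0, 1.0, size=30)  # same distribution: no effect
        _, p = stats.ttest_ind(treated, control)
        if p < 0.05:
            hits += 1
            print(f"subgroup {subgroup}: 'significant', p = {p:.3f}")
    print(f"{hits} spurious 'finding(s)' out of 20 tests of pure noise")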


For grants and tenure, 100 tiny increments over 10 years are much better for your research career than one major paper in 5 years that is better than all of them put together.

If you want to write a pop book, go on TV, and sell classes, you need one interesting bit of pseudoscience and a dozen followup papers using the same bad methodology.


This sounds inseparable from the replication crisis. The incentives are clearly broken: they are not structured in a manner that achieves the goal of research, which is to expand the scope and quality of human knowledge. To solve the crisis, we must change the incentives.

Does anyone have ideas on how that may be achieved - what a correct incentive structure for research might look like?


Ex-biochemist here, turned political technologist (who's spent a few years engaged in electoral reform and governance convos)

> the goal of research, which is to expand the scope and quality of human knowledge.

But are we so certain this is ever what drove science? Before we dive into twiddling knobs with a presumption of understanding some foundational motivation, it's worth asking. Sometimes the stories we tell are not the stories that drive the underlying machinery.

For example, we have a lot of wishy-washy "folk theories" of how democracy works, but actual political scientists know that most of the ones people "think" drive democracy are actually just a bullshit story. According to some, it's even possible that the function of these common-belief fabrications is that their falsely simple narrative stabilizes democracy itself in the mind of the everyman, due to the trustworthiness of seemingly simple things. So it's an important falsehood to have in the meme pool. But the real forces that make democracy work are either (a) quite complex and obscure, or even (b) as-of-yet inconclusive. [1]

I wonder if science has some similar vibes: folk theory vs what actually drives it. Maybe the folk theory is "expand human knowledge", but the true machinery is and always has been a complex concoction of human ego, corruption and the fancies of the wealthy, topped with an icing of natural human curiosity.

[1]" https://www.amazon.ca/Democracy-Realists-Elections-Responsiv...


> I wonder if science has some similar vibes: folk theory vs what actually drives it. Maybe the folk theory is "expand human knowledge", but the true machinery is and always has been a complex concoction of human ego, corruption and the fancies of the wealthy, topped with an icing of natural human curiosity.

The Structure of Scientific Revolutions by Thomas Kuhn is an excellent read on this topic - dense but considered one of the most important works in the philosophy of science. It popularized Planck's Principle paraphrased as "Science progresses one funeral at a time." As you note, the true machinery is a very complicated mix of human factors and actual science.



Modern real science is driven by engineering that is driven by an industry that is driven by profit and nature. If you are reading a paper that isn't driven by that chain of incentives, then the bullshit probability shoots way up. If someone somewhere isn't reading your paper to make a widget that is sold to someone to do something useful, then you can say whatever you want.


I've thought about it a lot and I don't think it can be achieved.

The trouble is that for the evaluators (all the institutions that can be sources of an incentive structure) it's impossible to distinguish an unpublished 90%-ready Nobel prize from unpublished 90%-ready bullshit. So if you've been working for 4 years on minor, incremental work and published a bunch of papers it's clear that you've done something useful, not extraordinary, but not bad; but if you've been working on a breakthrough and haven't achieved it, then there's simply no data to judge. Are you one step from major success? Or is that one step impossible and will never be achieved? Perhaps all of it is a dead end? Perhaps you're just slacking off on a direction that you know is a dead end, but it's the one thing you can do which brings you some money, so meh? Perhaps you're just crazy and it was definitely a worthless dead end? Perhaps everyone in the field thought that you're just crazy and this direction is worthless but they're actually wrong?

Peter Higgs was a relevant case - IIRC he said in one interview that for quite some time "they" didn't know what to do with him as he wasn't producing anything much, and the things he had done earlier were either useless or Nobel prize worthy, but it was impossible to tell for many years after the fact. How the heck can an objective incentive structure take that into account? It's a minefield.

IMHO any effective solution has to scale back on accountability and measurability, and to some extent just give some funding to some people/teams with great potential and see what they do - with the expectation that it's OK if it doesn't turn out, since otherwise they're forced to pick only safe topics that are certain to succeed and also certain not to achieve a breakthrough. I believe the European Research Foundation had a grant policy with similar principles, and I think that DARPA, at least originally, was like that.

But there's a strong entirely opposite pressure from key stakeholders holding the (usually government) purses, their interests are more towards avoiding bad PR for any project with seemingly wasted money, and that results in a push towards these broken incentive structures and mediocrity.


I would go a step further and say that the value of specific scientific discoveries (even if no bullshit is involved) can often not be evaluated until decades later. Moreover, I would argue that trying to measure scientific value is in fact an effort to try to quantify something unquantifiable.

At the same time, academics have increasingly been evaluated by metrics to show value for money. This has led to some schizophrenic incentive structures. Most professor-level academics are spending probably around 30% of their time on writing grants, evaluating grants and reporting on grants. Moreover, the evaluation criteria also often demand that work should be innovative, "high risk/high reward" and "breakthrough science", but at the same time feasible (and often you should show preliminary work), which I would argue is a contradiction. This naturally leads to academics overselling their results. Even more so because you are also supposed to show impact.

The main reason for all this IMO is the reduced funding for academic research, particularly considering the number of academics that are around. So everyone is competing for a small pot, which makes those who play to the (broken) incentives the most successful.


Well, perhaps we can learn from how the startup ecosystem works?

For commercial ventures, you also have the same issue of incremental progress vs big breakthroughs that don't look like much until they are ready.

As far as I can tell, in the startup ecosystem the whole thing works by different investors (various angels and VCs and public markets etc), all having their own process to (attempt to) solve this tension.

There's beauty in competition. And no taxpayer money is wasted here. (Yes, there are government grants for startups in many parts of the world, but that's a different issue from angels evaluating would-be companies.)


Startups are at an entirely different phase and have something research does not - feedback via market success. The USSR already demonstrated what happens when you try to run a process that depends on price signals without them: dead-end economic-theory attempts to calculate a globally fair price.

"You get what you measure" applies here. Now if we had some Objective Useful Research Quality Score, it could replace the price signals. But then we wouldn't have the problem in the first place; just promote based on OURQS.


Let people promote with their own money based on whatever subjective useful research quality score they feel like.


Startups have misaligned incentives in a monopoly ruled world? Build a thousand messenger variations to get acquired by Facebook, comes to mind. So economic thinking might be harmful here?


Your comments are mostly dead. I didn't see anything wrong with them in a cursory glance.


Why? If that's what society values, that's what society gets. Who are we to judge?

A 0.1% chance to build an app that's gonna be useful to hundreds of millions of people is better than what most career scientists manage.


> Does anyone have ideas on how that may be achieved - what a correct incentive structure for research might look like?

Perhaps start with removing tax payer money from the system.

Stop throwing good money after bad.


You don't make a nice career in a vacuum. With very few exceptions, you don't get star findings in a social desert. You get star findings by being liked by influential supervisors who are liked by even more influential supervisors.


There's a book called Science Fictions that pretty much goes over the standard packages of bullshit in modern science.


> I want to answer the question "if I were a researcher and were willing to cheat to get ahead, what should be the objective of my cheating?"

"Academic politics is the most vicious and bitter form of politics, because the stakes are so low."

https://en.wikipedia.org/wiki/Sayre%27s_law


>Lots of science is completely inaccessible for non-experts as a result of this sort of politics

As a non-expert, this is not the type of inaccessibility that is relevant to my interests.

"Unfortunately, alumni do not have access to our online journal subscriptions and databases because of licensing restrictions. We usually advise alumni to request items through interlibrary loan at their home institution/public library. In addition, under normal circumstances, you would be able to come in to the library and access the article."

This may not be technically completely inaccessible. But it is a significant "chilling effect" for someone who wants to read on a subject.


If your main interest is reading papers and not being political about it, just use sci-hub to read the papers.


Having skimmed the Wikipedia page on it, I'm unsure about the legalities and potential consequences.


Some journals allow you to specify reviewers to exclude. True that there is no guarantee about published work being good, but that is likely more about the fact that it takes time to sort out the truth than about nefarious cabals of bad scientists.

I think the inaccessibility is for different reasons, most of which revolve around the use of jargon.

In my experience, the situation is not so bad. It is obvious who the good scientists are, and you can almost always be sure that if they wrote it, it's good.


In many journals it's abuse of process to exclude reviewers you don't like. Much of the time this is supposed to be used to declare conflicts of interest based on relationships you have in the field.


Why do people need to publish? The whole point of publishing was content discovery. Now that you can just push it to a preprint or to your blog what’s the point? I’ve written papers that weren’t published but still got cited.


I need money to do research, available grants require achieving specific measurable results during the grant (mostly publications fitting specific criteria e.g. "journal that's rated above 50% of average citation rating in your subfield" or "peer reviewed publication that's indexed in SCOPUS or WebOfScience", definitely not a preprint or blog), and getting one is also conditional on earlier publications like that.

In essence, the evaluators (non-scientific organizations who fund scientific organizations) need some metric to compare and distinguish decent research from weak research, one that's (a) comparable across fields of science; (b) verifiable by people outside that field (so you can compare across subfields); (c) not trivially changeable by the funded institutions themselves; (d) describable in an objective manner so that you can write up the exact criteria/metrics in a legal act or contract. There are NO reasonable metrics that fit these criteria; international peer-reviewed publications fitting certain criteria are bad, but perhaps the least bad of the (even worse) alternatives like direct evaluation by government committees.


Simple cetacean count of the paper itself is probably a better metric than journal, though it’s certainly not perfect either.

(I am leaving cetacean cunt in because it’s a funny autocorrect.)

(And now I’m leaving the above in, because it’s even funnier. Both genuine.)


When you are looking for a job, are up for promotion/tenure, or applying for grants, a long publication record in prestigious journals is helpful.


Metrics. You can’t manage what you can’t measure!


"When a measure becomes a target, it ceases to be a good measure." - Marilyn Strathern


At some point, there's not going to be enough budget for both the football coach and the Latin philology professor. We should hire another three layers of housing diversity deans just to be safe.


What’s crazy to me is that nothing should stop an intelligent person from submitting papers, doing research, etc., even outside the confines of academia and without a PhD. But in practice you will never get anywhere without such things, because of the politics involved and the incestuous relationship between the journals and their monetarily uncompensated yet prestige-hungry army of researchers enthralled to the existing system.


If you add 'self funded' to this hypothetical person, then it would not matter if they play any games. Getting published is really not that hard if your work is good. And if it is good it will get noticed (hopefully during the hypothetical person's lifetime). Conferences have less of these games in my experience and would help.

Also, I know of no researchers personally who are enthralled by the existing system.


Can you name a single person with a high school or BS degree published in nature or other high impact journals? If not, why is this the case?


I think one of the most famous examples is that of Gosset, who published his work on statistical significance under the pen name "Student." [0] I wish I could give you a more recent example, but I don't pay attention to authors' degrees much, unless a paper is suspicious and from a journal I am unfamiliar with.

If I am reading between the lines correctly, you are implying there are few undergrads publishing in high caliber journals because of gatekeeping. As a reviewer, I often don't even know the authors' names, let alone their degrees and affiliations. It is theoretically possible that editors would desk reject undergrads' papers, but: a) I personally don't think a PhD is required to do quality research, especially in CS, and I know I am not the only person thinking that; b) In some fields like psychology and, perhaps, physics many junior PhD students only have BS degrees, which doesn't stop them from publishing.

I think that single-authored research papers by people without a PhD are relatively uncommon because getting a PhD is a very popular way of leveling up to the required expertise threshold, and getting research funding without one is very difficult. I don't suspect folks without a PhD are systematically discriminated against by editors and reviewers, but, of course, I can't guarantee that this is universally true across all research communities.

0. https://en.wikipedia.org/wiki/William_Sealy_Gosset


I believe that “good” research, i.e. that which would be referenced by other “good” researchers, useful in obtaining government grants, reported in the press, and so on, is indeed gatekept. Some subjects such as mathematics and computer science have had much progress in preprints, and anyone can publish anonymously and make a mark. But the majority of subjects are closed to all but those already connected, especially soft sciences like sociology, psychology, and economics.

I think the entire academic enterprise needs to be burnt down and rebuilt. It’s rotten to the core and the people who are providing the most value - the scholars - are simultaneously underpaid and beholden to a deranged publishing process that is a rat race that accomplishes little and hurts society. Not just in our checkbook but also in the wasted talent.


The status quo isn't perfect, but I think you are severely exaggerating how bad things are. The fact that nearly all scientific publishing is done by people who are paid to do research (grad students, research scientists, professors, etc.) isn't evidence of gatekeeping. It just means that most people aren't able/willing to work for free.

It also isn't any sort of conspiracy that government grants are given out to people with a proven history of doing good research, as evaluated by their peers.


I personally, as a BS holder only, along with a (at the time) high school senior, published a paper in a top-6 NLP conference. I had no help or assistance from any PhD or institution.

Maybe not quite as prestigious as nature, but NLP is pretty huge and the conference I got into has average h index of I think 60+

Proof: https://www.aclweb.org/anthology/2020.argmining-1.1/


In the same vein though, can you think of any person who wants to publish research and is actively being denied the ability to publish?

The people you mention are probably making YouTube videos and writing blog posts about their findings and are reaching a broader audience..


I know a person who got published in high school. They did so by working closely with multiple professors on various projects. You don't have to do a PhD to do this especially if you're a talented and motivated youngster.


I have personally recommended for publication papers written by people who do not have a master's degree. In most cases I did not know that at the time of review, but it did not occur to me to care about it when I did.


The Myers-Briggs test is an example: a pseudoscientific test with questionable origins.

"Neither Myers nor Briggs was formally educated in the discipline of psychology, and both were self-taught in the field of psychometric testing."


I can name several high school students who conducted studies and led first-author papers at leading HCI venues. They were supervised by academics though. Would that suffice?


This has nothing to do with gatekeeping. I agree that the current publication and incentive system is broken, but it's completely unrelated to the question of whether outsiders get published. The reason you see very little work from outsiders is that research is difficult. It typically requires years of full-time dedicated work; you can't just do it on the side. Moreover, you need to first study and understand the field to identify the gaps. If you try to identify gaps on your own, you are highly likely to go off in a direction that is completely irrelevant.

BTW I can tell you that the vast majority of researchers are not "enthralled" by the system, but highly critical. They simply don't have a choice but to work with it.


I think this is a bit naive. One thing that stops a smart person doing research without a PhD is that it takes a long time to learn enough to be at the scientific frontier where new research can be done. About a PhD’s length of time, in fact. So, many people without a PhD who try to do scientific research are cranks. I don’t say all.


Some quality journals and conferences have double blind reviews now. So the work is reviewed without knowing who the work belongs to. It's not so much the politics of the system as the skills required to write a research paper being hard to learn outside of a PhD. You need to understand how to identify a line of work in a very narrow field so that you can cite prior work and demonstrate a proper understanding of how your work compares and contrasts to other closely related work. That's an important part of demonstrating your work is novel and it's hard to do (especially for the first time) without expert guidance. Most students trying this for the first time cite far too broadly (getting work that's somewhat related but not related to the core of their ideas) and miss important closely related work.


It's time to start over with some competing science establishments!


There's lots of good science done in the commercial sector.

(There's lots of crowding out happening, of course, from the government subsidized science. But that can't be helped at the moment.)


Sure, but I'm dreaming of a whole parallel reformed "New-niversity" system that replaces outdated and wasteful practices with systems that are more productive.

It will probably have to be started by some civic minded billionaires. I don't think the established system can reform itself.


Thanks for sharing.

What you've described sounds like something that is not, in any sense, science.

From your perspective, what can be done to return the scientific method to the forefront of these proceedings?


You're like that Chinese sports boss who was arrested for corruption and complained that it would be impossible to do his job without participating in bribery. Just because you stand to personally gain from your corrupt practices doesn't excuse them. If anything, it makes them morally worse!


I don't tell lies about bad papers, only give some perfunctory praise so that reviewers don't have ammunition to kill my submission. E.g. if a paper makes a false contribution X and a true contribution Y, I only mention Y. If I were to say "So-and-so claimed X but actually that's false" I would have to prove it, and unless it's a big enough issue to warrant its own paper, I don't want to prove it. Anyway, without having the raw data, source code, etc. for the experiments, there is no way for me to prove that X is false (I'm not a mathematician). Then the reviewers will ask why I believe X is not true when peer review accepted it. Suddenly all of my contributions are out the window, and all anybody cares about is X.

The situation is even worse when the paper claiming X underwent artifact review, where reviewers actually DID look at the raw data and source code but simply lacked the attention or expertise to recognize errors.

I'm not taking bribes, I'm paying a toll.


I think you're artificially increasing the citation count for those papers, which puts all of their competitors at an unfair relative disadvantage.


I don’t really buy the comparison entirely. Presumably the sports boss is doing something patently illegal, and obviously there are alternative career paths. OP is working in academia, which is socially acceptable, and feels that this is what is normal in their academic field, necessary for their goals, and isn’t actively harmful.

I wouldn’t necessarily condone the behavior, but what would you do in the situation? To always whistleblow whenever something doesn’t feel right and risk the politics? To quit working in the field if your concerns aren’t heard? To never cite papers that have absolutely any errors? I think it’s a tough situation and not productive to say OP isn’t behaving morally.


More specifically, this paper is focused on the social sciences. That's not to say that the problem isn't present in the basic sciences too.

But one other thing to note here is that these headlines about a "replication crisis" seem to imply that this is a new phenomenon. Let's not forget the history of the electron charge. As Feynman said:

"We have learned a lot from experience about how to handle some of the ways we fool ourselves. One example: Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It's a little bit off because he had the incorrect value for the viscosity of air. It's interesting to look at the history of measurements of the charge of an electron, after Millikan. If you plot them as a function of time, you find that one is a little bit bigger than Millikan's, and the next one's a little bit bigger than that, and the next one's a little bit bigger than that, until finally they settle down to a number which is higher. Why didn't they discover the new number was higher right away? It's a thing that scientists are ashamed of—this history—because it's apparent that people did things like this: When they got a number that was too high above Millikan's, they thought something must be wrong—and they would look for and find a reason why something might be wrong. When they got a number close to Millikan's value they didn't look so hard. And so they eliminated the numbers that were too far off, and did other things like that ..."

https://en.wikipedia.org/wiki/Oil_drop_experiment#Millikan.2...


Something that I think the physical sciences benefit from is the ability to look at a problem from more than one angle. For instance, the stuff that we think is the most important, such as the most general laws, is supported by many different kinds of measurements, plus the parallel investigations of theoreticians. A few scattered experiments could bite the dust, like unplugging one node in a mesh network, and it could either be ignored or repaired.

The social sciences face the problem of not having so many different possible angles, such as quantitative theories or even a clear idea of what is being tested. Much of the research is engaged in the collection of isolated factoids. Hopefully something like a quantitative theory will emerge, that allows these results to be connected together like a mesh network, but no new science gets there right away.

The other thing is, to be fair, social sciences have to deal with noisy data, and with ethics. There were things I could do to atoms in my experiments, such as deprive them of air and smash them to bits, that would not pass ethical review if performed on humans. ;-)


Your example of looking at a problem from more than one angle made me think of the problem of finding the Hubble constant that describes the rate of expansion of the universe. There are two recent methods which have different estimates for this rate of expansion.

PBS Space time has an excellent video on the topic: https://www.youtube.com/watch?v=72cM_E6bsOs


Indeed, one of the things that's possible in physics is to nail down the experimental results to the point where, if two results disagree, you know that they really disagree, and that it's not just a statistical fluke. Then it gets interesting.

In physics, when a result raises more questions than it answers, we call it "job security." ;-)


> More specifically, this paper is focused on the social sciences.

No, it isn't. It looked at a few different fields, and found that the problem was actually worse for general science papers published in Nature/Science, where non-reproducible papers were cited 300 times more often than reproducible ones.


I think you might be mistaken. The study of Nature/Science papers was "Evaluating replicability of social science experiments published in Nature and Science between 2010 and 2015"


Feynman's example is of people being more critical about certain issues. A better example is the case of "radiation" that could only be seen in a dark room in the corner of your eye, which turned out to be a human visual artifact and wishful thinking.


I assume you're thinking of N-Rays and Blondlot, but there is another phenomenon that more or less fits your description.

https://en.wikipedia.org/wiki/Cosmic_ray_visual_phenomena

It's interesting that according to the Wikipedia article it's not entirely certain whether the radiation is producing actual light or just the sensation of light.


I worked in the academic world for two years. What I saw was that lots of people are under a constant pressure to publish, and quantity is often put above quality.

I've seen papers with no value or reason to exist brute-forced through review just so that some useless junk data wouldn't go to waste, all to add a line to someone's CV.

That's without mentioning that some unis are packed with totally incompetent people who only got to advance their careers by always finding a way to piggyback on someone else's paper.

The worst thing I've seen is that reviewing papers is also often offloaded to newly graduated fellows, who are often instructed to be lenient when reviewing papers coming from "friendly universities".

The level of most papers I have had the disgrace to read is so bad it made me want to quit that world as soon as I could.

I came to the conclusion that the whole system is basically a complex game of politics and strategy, fed by a loop in which bad research gets published in mediocre outlets, which then get a financial return from publishing it. This bad published research is then used to justify further money being spent on low-quality rubbish work, and the cycle continues.

Sometimes you get to review papers that are so comically bad and low effort they almost feel insulting on a personal level.

For instance, I had to reject multiple papers not only due to their complete lack of content, but also because their English was so horrendous they were basically unintelligible.


Quality is definitely valued above quantity in academia in almost all disciplines. The issue is that citation count is used as a proxy for quality, and it's a poor one in many respects.


>English was so horrendous they were basically unintelligible.

Maybe this is how the GPT "AI" can generate such similar results. Lol.


Until this is fixed, people need to stop saying "listen to The Science" in an attempt to convince others of a given viewpoint. Skeptical people already distrust our modern scientific institutions - not completely, obviously, but definitely when they're cited as a cudgel. Continued articles like this should make everyone stop and wonder just how firm the supposed facts are behind their own favoured opinions. We need a little more humility about which scientific facts are truly beyond reproach.


We also need to listen to the science on things that are clearly established. The replication crisis is not something that affects almost anything in public debate. Evolution is well established science. Large parts of Climate Change are well established science. Etc.


Yea, evolution and climate change are pretty solid. But the projected models do no favor to climate science. Those are bogus. They've mostly been wrong in the past and will be mostly wrong again in the future. There are way too many variables and unknowns, and those accumulate the further out the model reaches.


And your evidence for that is what exactly? Since this replication problem is already known to appear in multiple disciplines, it's quite likely that the same misconduct is happening in other areas too. I think you're being a little too quick to hope that it doesn't affect those areas where you have a vested interest.


There is a small but non-zero chance that any given paper on climate change is wrong or fraudulent, but it is absurd to go from that to claiming that we should disregard literally all scientific research that has ever been done. At a certain point you have to accept basic truths like that 2+2=4, or that the earth revolves around the sun, or that the composition of the atmosphere impacts our climate.


"Believe science" is incredibly destructive to the entire field. It is quite literally turning science into a religion. Replacing the scientific method with "just believe what we say, how dare you question the orthodoxy". We're back to church and priests in all but name.


In the main, people don't literally mean it that way. They're expressing belief in the validity of the scientific method, but the more they explain and justify the scientific method, the more time-consuming it is. When dealing with stupid people or obdurate trolls, the sincere adherent of the scientific method can be tricked into wasting a great deal of time by being peppered with questions that are foolish or posed in bad faith.


For social sciences, I generally disregard those published papers - most are just confirmation biased.

.... but each field is different. For those that are more quantitative, it's harder to deviate your conclusion from the data.

Bias is not binary, so it's a sliding scale between the hard sciences and the squishy ones.


No thank you. This "argument" is precisely where anti-science people come from.

You have to listen to the science, and also use the common sense that "this is as far as we know" and that knowledge today may change tomorrow.

Two comments below you use this "argument" to ask "evidence" for evolution and climate change. Big red flag.


You're free to treat it as unquestionable, but the fact remains there is ample evidence of our scientific process being deeply broken with a really bad incentive structure if nothing else.

If you think there is no incorrect science at all in regards to evolution and climate change you're no better than the zealots of any religion.


Perhaps the government should have a team of people who randomly try to replicate science papers that are funded by the government.

The government can then reduce funding to institutions that have too high a percentage of research that failed to be replicated.

From that point the situation should resolve itself, as institutions wouldn't want to lose funding - so they'd either have an internal group replicate before publishing or coordinate with other institutions pre-publication.

Anything I’m missing?


This sounds like doubling down on the approach that was causing the problems.

The desire to control researchers and incentivize them to compete against each other in order to justify their salary is understandable, but it looks like it has been blown so out of proportion lately that it's doing active harm. Most researchers start their career pretty self-motivated to do good research.

Installing another system to double-check every contribution will just increase the pressure to game the system in addition to doing research. And replicating a paper may sometimes cost as much as the original research, and it's not clear when to stop trying. How much collaboration with the original authors are you supposed to do, if you fail to replicate? If you are making decisions about their career, you will need some system to ensure it's not arbitrary, etc.


While I agree that "most" researchers start out with good intentions, I'm afraid I've directly and indirectly witnessed so many examples of fraud, data manipulation, wilful misrepresentation and outright incompetence, that I think we need some proper checks and balances put in place.

When people deliberately fake lab data to further their career, and that fake data is used to perform clinical trials on actual people, that's not just fraudulent, it's morally destitute. Yet this has happened.

People deliberately use improper statistics all the time to make their data "significant". It's outright fraud.

I've seen people doing sloppy work in the lab, and when questioning them, was told "no one cares so long as it's publishable". Coming from industry, where quality, accuracy and precision are paramount, I found the attitude shocking and repugnant. People should take pride and care in their work. If they can't do that, they shouldn't be working in the field.

PIs don't care so long as things are publishable. They live in wilful ignorance. Unless they are forced to investigate, it's easiest not to ask any questions and risk getting unpleasant answers back. Many of them would be shocked if they saw the quality of work done by their underlings, but they live in an office and rarely get directly involved.

I've since gone back to industry. Academia is fundamentally broken.

When you say "double-checking" won't solve anything, I'd like to propose a different way of thinking about this:

* lab notebooks are supposed to be kept as a permanent record, checked and signed off. This rarely happens. It should be the responsibility of a manager to check and sign off every page, and question any changes or discrepancies.

* lab work needs independent validation, and lab workers should be able to prove their competence to perform tasks accurately and reproducibly; in industry labs do things like sending samples to reference labs, and receiving unknown samples to test, and these are used to calculate any deviation from the real value both between the reference lab and others in the same industry. They get ranked based upon their real-world performance.

* random external audits to check everything, record keeping, facilities, materials, data, working practices, with penalties for noncompliance.

Now, academic research is not the same as industry, but the point I'm making here is that what's largely missing here is oversight. By and large, there isn't any. But putting it in place would fix most of the problems, because most of the problems only exist because they are permitted to flourish in the absence of oversight. That's a failure of management in academia, globally. PIs aren't good managers. PIs see management in terms of academic prestige, and expanding their research group empires, but they are incompetent at it. They have zero training, little desire to do it, and it could be made a separate position in a department. Stop PIs managing, let them focus on science, and have a professional do it. And have compliance with oversight and work quality part of staff performance metrics, above publication quantity.


This is what industry does though, at least in the less theoretical fields. If you actually want to make something that works, then you need to base your science on provable fact. Produce oil, build a cool structure, generate electricity. Based on amazing and complex science, but it has to work. The conclusion is that the science that gets done needs to be provable, but that means practical. Which is unfortunate. Because what about all that science that may be, or one day may be, practical?


The trouble is, industrial researchers usually don't publish negative results or failures to reproduce. So it takes a long time to correct the published scientific record even if privately some people know it's wrong.


Indeed nobody tends to publish negative results or failures to reproduce unless there is already a monetary or social bounty on the question, it seems


This is like the xkcd test for weird science: Is some big boring company making billions with it? If so (quantum physics) then it’s legit. If not (healing crystals, orgone energy, essential oils...) it probably doesn’t work.


>Is some big boring company making billions with it? If so (quantum physics) then it’s legit. If not (healing crystals, orgone energy, essential oils...) it probably doesn’t work.

That's no guarantee; in 2020 the US essential oils industry was worth $18.62 billion: https://www.grandviewresearch.com/industry-analysis/essentia... . Which is bigger than the US music recording industry (https://www.musicbusinessworldwide.com/the-us-recorded-music...).


Research is non-linear, and criteria-based evaluation is lacking in perspective. You might throw away the baby with the bathwater. Advancement of science follows a deceptive path. Remember how the inventor of the mRNA method was shunned at her university just a few years ago? Because of things like that millions might die, but we can't tell beforehand which scientist is a visionary and who's a crackpot. If you close funding to seemingly useless research you might cut off the next breakthrough.


I don't really see why "being shunned" or "being a visionary" has anything to do with this, to be honest. If you set up a simple rule: "the results have to be reproducable", then surely it shouldn't matter whether or not the theory is considered "crackpot" or "brilliant"?


Reproducibility is 10x more expensive: you have to make sure you can replicate all the conditions exactly, and there are no thanks at the end for all that effort. The incentive is to publish instead of finding the truth.


I don't really like the idea of 'replication police', I think it would increase pressure on researchers who are doing their job of pushing the boundaries of science.

However, I think there is potential in taking the 'funded by the government' idea in a different direction. Having a publication house that was considered a public service, with scientists (and others) employed by the government and working to review and publish research without commercial pressures could be a way to redirect the incentives in science.

Of course this would be expensive and probably difficult to justify politically, but a country/bloc that succeeded in such long term support for science might end up with a very healthy scientific sector.


A few thoughts playing the devil's advocate:

- You would need some sort of barrier preventing movement of researchers between these audit teams and the institutions they are supposed to audit otherwise there would be a perverse incentive for a researcher to provide favorable treatment to certain institutions in exchange for a guaranteed position at said institutions later on. You could have an internal audit team audit the audit team, but you quickly run into an infinitely recursive structure and we'd have to question whether there would even be sufficient resources to support anything more than the initial team to begin with.

- From my admittedly limited experience as an economics research assistant in undergrad, I understood replication studies to be considered low-value projects that are barely worth listing on a CV for a tenure-track academic. That in conjunction with the aforementioned movement barrier would make such an auditing researcher position a career dead-end, which would then raise the question of which researchers would be willing to take on this role (though to be fair there would still be someone given the insane ratio of candidates in academia to available positions). The uncomfortable truth is that most researchers would likely jump at other opportunities if they are able to and this position would be a last resort for those who aren't able to land a gig elsewhere. I wouldn't doubt the ability of this pool of candidates to still perform quality work, but if some of them have an axe to grind (e.g. denied tenure, criticized in a peer review) that is another source of bias to be wary of as they are effectively being granted the leverage to cut off the lifeline for their rivals.

- You could implement a sort of academic jury duty to randomly select the members of this team to address the issues in the last point, which might be an interesting structure to consider further. I could still see conflict-of-interest issues being present especially if the panel members are actively involved in the field of research (and from what I've seen of academia, it's a bunch of high-intellect individuals playing by high school social rules lol) but it would at least address the incentive issue of self-selection. Perhaps some sort of election structure like this (https://en.wikipedia.org/wiki/Doge_of_Venice#:~:text=Thirty%....) could be used to filter out conflict of interest, but it would make selecting the panel a much more involved and time-consuming process.


The "Jury Duty" could easily be implemented in the existing grant structure - condition some new research grant on also doing an audit of some previous grant in your field (and fund it as part of the grant).


Depending how big the stick is and how it's implemented, this might push people away from novel exploratory research that has a lower chance of replicating despite best efforts.


Cost.


Huh, so untrue (or grossly exaggerated) results are more interesting, and that matters more for getting talked about than truth.

Thank goodness our newsmedia business doesn't work that way, or we would be poorly-informed in multiple ways.


Yep, basically. Very unexpected results are interesting and newsworthy and more likely wrong.


OR if someone is going to fabricate results, they fabricate them in a direction that they expect others will find interesting.


Pulling up the actual paper, there is an added part the article doesn't mention.

> Prediction markets, in which experts in the field bet on the replication results before the replication studies, showed that experts could predict well which findings would replicate (11).

So the paper is even stating that this isn't completely innocent: given different incentives, most reviewers can identify a suspicious study, but under current incentives it seems that letting it through due to the novelty is somehow warranted.


This is almost a tautology. Unlikely/unexpected findings are more noteworthy, so they're more likely to be both cited and false, perhaps based on small sample sizes or p-hacking.

People love this stuff. Malcolm Gladwell's made a career on it: half of the stuff he writes about is disproven before he publishes. It's very interesting that facial microexpression analysis can predict relationship outcomes with 90% certainty. Except it's just an overfit model, it can't, and he's no longer my favorite author. [0]

Similarly, Thomas Erikson's "Surrounded by Idiots" also lacks validation. [1]

Both authors have been making top 10 lists for years, and Audible's top selling list just reminded me of them.

Similarly, shocking publications in Nature or Science are to be viewed with skepticism.

I don't know what I can read anymore. It's the same with politics. The truth is morally ambiguous, time consuming, complicated, and doesn't sell. I feel powerless against market forces.

[0] https://en.wikipedia.org/wiki/John_Gottman#Critiques

[1] https://webcache.googleusercontent.com/search?q=cache:5Z7JiC...


One of my pet peeves is when the local NPR station advocates some position or policy based on a recent small study (usually by/at some state school). Sometimes they'll couch it by saying it's not peer reviewed, it's preliminary, or something, but it's too late: they've already planted the seed and had their talking point - all with a study to back up their position - and listeners just go along with it.


Startup idea - run hundreds of "preliminary studies" with small sample sizes, then sell any of the p-hacked results to marketing or news groups. Want a story about how milk boosts intelligence? Want a story about how a standing desk leads to promotions? We have it all.
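
For illustration, a rough sketch of why this "business model" works statistically (my own toy numbers, nobody's actual pipeline): run many tiny two-group studies on pure noise and keep whatever clears p < 0.05.

    # Simulate many small null studies and count how many come out "significant".
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_studies, n_per_group = 200, 15              # hypothetical study count and sample size

    significant = []
    for i in range(n_studies):
        treatment = rng.normal(size=n_per_group)  # e.g. "milk drinkers", pure noise
        control = rng.normal(size=n_per_group)    # controls, same distribution
        res = stats.ttest_ind(treatment, control)
        if res.pvalue < 0.05:
            significant.append((i, res.pvalue))

    # At a 5% false-positive rate, roughly 10 of 200 null studies will "find" an effect.
    print(f"{len(significant)} of {n_studies} null studies came out 'significant'")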


That market segment is already completely satisfied with existing options. The "hundreds of studies" are done for free by academics without meta-epistemic-scruples, and the selling to marketing groups is done for free by marketers without epistemic scruples.


That already exists and is called a "think tank".


Terrible start-up idea that has a better revenue model. Sell the experiment design to politically motivated organizations that want certain outcomes in the public sphere. Hack your way to results. Give to media for free.


But why use your company when they could just... make it up? :P


That's status quo

It's called the "research grant system"


I will assume best intentions that this is a sarcastic suggestion, because it is pretty evil and socially irresponsible otherwise :)


It is but it might finally pop the bubble around "study says X" in the public's mind


Then people turn away and stop believing in global warming and vaccines.


Or they develop the critical thinking skills to realize that science is 95% failing and 5% "we don't reject the alternative". It is not absolute, and certainly the scientists themselves are deeply flawed being that they're human.

But here's the thing, people don't have time for this. They have work, bills, home and car maintenance, groceries, kids, friends, a slew of media to consume, recreation on top of it - and they're all dying. So it doesn't matter who says what, they're going to pick the dilute politicized version of the results that their team supports and run with it regardless of what the nigh-unreadable highly specialized papers say. Orwell said it well "I believe that this instinct to perpetuate useless work is, at bottom, simply fear of the mob. The mob (the thought runs) are such low animals that they would be dangerous if they had leisure; it is safer to keep them too busy to think."

And those who do elect the burden of extracurricular mental activity aren't given much in the way of options in any case. What are they to do, disseminate the material to their friends, co-workers, children - quite probably the very same population as mentioned above, weighted with the ceaseless demands of reality? To what end? Chinese whispers? It's better to have them say, "I don't know, I'm not convinced either way." A construal which is developed from adequately exercised critical skills. But that's another discussion about perverse social conditioning no doubt evolved from the deployment of poorly understood technique compounded by its acceptance as custom in education - I'm speaking of course about grading and student assessment. Nobody wants to be stupid at the very least, and professing one's ignorance is construed as an admission of guilt.


so basically this: https://xkcd.com/882/


That’s why this whole journal peer review thing is bullshit. There’s a better solution: read preprints and let people rank/discuss them as organic peer review.


“We did it Reddit” stands starkly counter to your recommendation.

I don’t know what the answer is, and I’ve been worried for a while that we are putting blind faith in “science” which just lines up with our preferred worldview. Maybe the answer is simply to use science, however it is performed, to inform, not guide, policy, and always keep in mind that what science believes has a non-zero chance of being politically-driven itself.



This is interesting. My first reaction is that the upvotes there are not a measure of quality because, I assume, they are based mainly on titles and abstracts. But I will try to use that site a bit more, so thanks for sharing.


So this is why my papers are cited so little!


This is most likely due to a selection process that favors either newsworthiness or trustworthiness. It’s a statistical artifact, see e.g. https://twitter.com/rlmcelreath/status/1396040993175126018?s...
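
For anyone who doesn't want to click through, here is a minimal simulation of that selection artifact as I understand the argument (the threshold and distributions are made up for illustration): newsworthiness and trustworthiness are independent overall, but papers only get published when their sum clears a bar, so among published papers the two end up negatively correlated.

    import numpy as np

    rng = np.random.default_rng(1)
    news = rng.normal(size=100_000)
    trust = rng.normal(size=100_000)
    published = (news + trust) > 2.0          # hypothetical selection rule

    r_all = np.corrcoef(news, trust)[0, 1]
    r_pub = np.corrcoef(news[published], trust[published])[0, 1]
    print(f"correlation overall: {r_all:+.2f}, among published: {r_pub:+.2f}")
    # Overall it's ~0; among published papers it comes out clearly negative, so the
    # most newsworthy published findings tend to be the least trustworthy ones.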


I mean, both strongly correlate with "surprising finding" so it's no surprise they correlate with each other too.


"We failed to reproduce the results published in [1]" is a citation.

"Our findings directly contradict [1]" is a citation.

Without context, number of citations doesn't tell you anything.


Negative citations are extremely rare.

https://fantasticanachronism.com/2020/09/11/whats-wrong-with...

"You might hypothesize that the citations of non-replicating papers are negative, but negative citations are extremely rare.5 One study puts the rate at 2.4%. Astonishingly, even after retraction the vast majority of citations are positive, and those positive citations continue for decades after retraction.6"


They state that the cause is that the "findings" in the papers "are interesting".

Is this really the case? And is this actually a "new" phenomenon?

It seems like it could be a disguised version of the Availability Cascade. [1] In other words, when we encounter a simple-to-understand explanation of something complex, the explanation ends up catching on.

Then, because the explanation is simple, its popularity snowballs. The idea cascades like a waterfall throughout the public. Soon it becomes common sense—not because of sense, but because of common.

[1]: https://en.wikipedia.org/wiki/Availability_cascade


I don’t mean to pick on one field in particular, but last year I made the throwaway comment to a FB post “arXiv is the new businesswire”.

The number of academic “big shots” (friends of the poster, not of me) who “liked” the comment was a bit alarming.

There’s too much incentive for fudging things (depending on your field, either grants or company funding).

The degree of fraud in Chinese journals is high and well discussed (as it should be). But apart from a small amount of hand-wringing over “the replication crisis” there is no similar condemnation of the work in the rest of the world.


There was a comment I read somewhere (not sure where exactly) stating that the modern peer review process would never let someone like Einstein, a patent clerk at the time, get the limelight.


The paper implies that less reproducible papers have a greater influence on science because they are more highly cited. But an alternate explanation suggests the opposite -- less reproducible papers are more highly cited because people are publishing papers pointing out the results are false.

It is also quite telling that the biggest differences in citation counts are for papers published in Nature and Science. But in discipline-specific journals (Figs. 1B,C), the effect is very modest. Practicing scientists know that Science and Nature publish the least reproducible results, in part because they like "sexy" (surprising, less likely to be correct) science, and in part because they provide almost no detail on how experiments were performed (no Materials and Methods).

The implication of the paper is that less reproducible science has more impact than reproducible science. But we know this is wrong -- reproducible results persist, while incorrect results do not (we haven't heard much more about the organisms that use arsenic rather than phosphorus in their DNA -- https://science.sciencemag.org/content/332/6034/1163 )


When I started a PhD in a biomedical field at a top institution, we were told that our results are not that important; what's important is the ability to "sell them". This focus on presentation over content permeated the whole establishment. I remember sitting with a professor trying to come up with plausible buzzwords for a multi-million grant application.

The phenomenon described in the article sounds like a natural consequence of this attitude.


I'm not really surprised about the results related to Nature (and Science to a lesser degree). I have seen it multiple times that Nature editors (who are not experts in the field) have overruled reviewer recommendations to reject, because the results were exciting.

The incentives for Nature are not to produce great science, but to sell journals, and that requires them to give the impression of being at the forefront of "scientific discovery". I've in fact been told by an editor, "our job is to make money, not great science".

The irony is that their incentives also make them very risk averse, so they will not publish results which don't have a buzz around them. I know of several papers that created new "fields" and yet were rejected by the editors. The incentive also results in highly cited authors having an easier time getting published in Nature.

I should say that this is typically much better in the journals run by expert editors, published by the technical societies like e.g. IEEE.


It’s ironic, but 153 times is a crazily high figure. I’m suspicious.


yeah - I'd be surprised to learn that the average paper in their study had 153 citations regardless of other things. Perhaps they only looked at highly cited papers, but that induces its own issues.


> “Given this prediction, we ask ‘why are non-replicable papers accepted for publication in the first place?’”

Obviously, because journals don't attempt replication, nor should they.

A study will only have replication by another researcher/team after it's been published. Or not, in which case that's publishable as well.

This is how science works and is supposed to work.

The problem should rather be: why are journals accepting papers with citations to debunked papers that don't also cite the debunking papers?

I've had plenty of friends have a paper get sent back for revisions because it was missing a newer citation. This should be a major responsibility of reviewers. So why aren't peer reviewers staying on top of the literature?


> This should be a major responsibility of reviewers. So why aren't peer reviewers staying on top of the literature?

Perhaps because reviewing is unpaid work, both in the financial and academic sense?


I definitely don't disagree with that!


There is a really good reason for this. Everyone wants the shock value for the easy publication but no one wants to do the work to validate the original study because that can imply possibly being wrong.

This is a major issue with the current model.


Almost certainly this has to do with popular or cherished narratives about reality increasingly diverging from reality. I have seen it here on HN over the past several years. HN used to be a place of thoughtful discussion. Now it is becoming increasingly politicised and redditified.

To me it seems all our social structures are decaying. Even look at the language and how everything is faux hyperbole these days. People are forgetting how to think, how to write, how to reason all for the sake of the woke cultural revolution. The political correctness of thought is valued over the ontological correctness.


Here is an idea: Replication based grants.

Have some portion of academic funding distributed by assigning a pool of money to grantors and having them bet portions of that money on papers they consider impactful AND likely to replicate. Allow the authors to use the money on any research they want.

This lets authors get some research money without writing costly and wasteful grant proposals. It takes advantage of the fact that experts in the field can generally tell which studies are likely to replicate (I'm assuming granting agencies can find experts in the field to do this).


It is not just research: https://www.gwern.net/Littlewood


> Littlewood's law states that a person can expect to experience events with odds of one in a million (defined by the law as a "miracle") at the rate of about one per month. - wiki

> At a global scale, anything that can happen will happen a small but nonzero times: this has been epitomized as “Littlewood’s Law: in the course of any normal person’s life, miracles happen at a rate of roughly one per month.” This must now be extended to a global scale for a hyper-networked global media covering anomalies from 8 billion people—all coincidences, hoaxes, mental illnesses, psychological oddities, extremes of continuums, mistakes, misunderstandings, terrorism, unexplained phenomena etc. Hence, there will be enough ‘miracles’ that all media coverage of events can potentially be composed of nothing but extreme outliers, even though it would seem like an ‘extraordinary’ claim to say that all media-reported events may be flukes.
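
The usual back-of-envelope behind the "one per month" figure (roughly Littlewood's own assumptions): one perceivable event per second, during about eight alert hours a day.

    events_per_day = 8 * 60 * 60              # 28,800 events per day
    days_to_a_million = 1_000_000 / events_per_day
    print(round(days_to_a_million, 1))        # ~34.7 days, i.e. about a month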


For this to be true a person has to experience a large number of events. For any meaningful definition of event that's false. So clearly this law requires an event to be routine in some extreme way in order for enough of them to occur that these outliers are hit. So maybe my miracle for this month is I take a breath that is 1/1,000,000 in how unusually large it is.


It is about media - the internet and other media are gathering all the outliers and reporting on them - it is not that a person now experiences more events than he did before the internet.


Keep in mind that highly cited papers are usually first to their field on a particular subject, and it makes sense that they will usually be less "accurate" simply because they don't have the luxury of building off of a direct line of prior research.

The first Tesla Model S isn't anywhere as good as the current version, but it was still groundbreaking. Academic publications work the same way.


I think the issue with this analogy is: the new and improved Tesla Model S is hugely popular and profitable, while the academic papers that later revisit and improve upon those initial (high impact but less accurate) findings receive little attention or grant support, and consequently the "inaccuracies" persist.


This seems really obvious. People demand citations for bold claims, precisely because those claims are hard to believe.

The problem is that some fields don’t do replication studies. In such fields these highly cited papers with dubious findings are never debunked, because, you know, economists (to pick on one specific group) don’t do replication studies.


Proposal:

1/ For best journals: a non-replication betting market for peer-reviewed and published papers. Grants to replicate papers that are highest in the betting pool.

2/ Citation index where citing and publishing a paper that does not replicate lowers the score (rough sketch after this list).

(This should be first used in machine learning research)
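
A toy sketch of idea 2/ above (my own scoring rule, not an existing index): citing replicated work raises an author's score, citing work that failed to replicate lowers it, and untested work is neutral. Names and weights are made up.

    from typing import Dict, List, Optional

    def citation_score(citations: List[str],
                       replication_status: Dict[str, bool],   # True/False per tested paper
                       reward: float = 1.0,
                       penalty: float = 2.0) -> float:
        score = 0.0
        for paper in citations:
            status: Optional[bool] = replication_status.get(paper)
            if status is True:
                score += reward
            elif status is False:
                score -= penalty      # citing known non-replicating work costs you
        return score

    print(citation_score(["A", "B", "C"], {"A": True, "B": False}))   # 1.0 - 2.0 = -1.0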


That would work well if you could connect payouts to teams showing fail/pass on replication; i.e., so there was funding available for replication.


This is psychology. Psychology has lots of problems and those are not going away anytime soon. I think deeper biological monitoring technology like fMRI etc. will pave the way to more robust studies; other than that, I don't have much hope.


fMRI has been at the avant-garde of p-hacking since its inception. I'm the co-author of several fMRI papers, but wasn't responsible for the actual analysis. That was done by a PhD student under the supervision of the head of the institute, who is fairly influential in his field. The first experiment didn't yield "significant" results, not even after a few months of trying all the possible ways to manipulate the data, but the values were close to acceptable. Since there was no budget to run another experiment, it was decided to add 8 more subjects. And lo and behold, after a few more months of data wrangling, there were significant blobs, and the article was published.

To add insult to injury: when subsequent experiments didn't give any interesting results, and the PhD student didn't have enough material for graduation, they decided to split the data they already had by gene variants, because technology had just reached the point where that kind of analysis could be done by a sufficiently rich lab. And there was a new result! The original blob was now seen in two areas instead of one, depending on the variant. Highly publishable. That this meant the original finding was now invalidated wasn't even considered.

edit: this took place between 2000 and 2010.
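
The "no effect yet, so add 8 more subjects and test again" move can be quantified with a generic optional-stopping simulation (hypothetical group sizes, not the actual study's numbers): even with no effect at all, peeking twice pushes the false-positive rate above the nominal 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    sims, n1, n_extra = 20_000, 16, 8

    hits = 0
    for _ in range(sims):
        a = rng.normal(size=n1 + n_extra)       # pure noise, so any "hit" is a false positive
        b = rng.normal(size=n1 + n_extra)
        if stats.ttest_ind(a[:n1], b[:n1]).pvalue < 0.05:   # first look at n1 per group
            hits += 1
        elif stats.ttest_ind(a, b).pvalue < 0.05:           # second look after adding subjects
            hits += 1

    # Comes out noticeably above the nominal 0.05.
    print(f"false-positive rate with two looks: {hits / sims:.3f}")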


Agreed, the sausage-making in science is ugly, but my point is that if technology is more accessible, there will be some reliable data points to go by, rather than kludging together some experimental premise and then making profound pronouncements. Those findings are then parroted by everyone.

Take for example a study done in about 2005 that said "willpower is an exhaustible resource". They gave some participants some problems to solve - they were baking cookies in the back so the participants could smell them. Later, half of the participants were given cucumbers and the other half cookies, and the researchers observed that people given cookies persisted for longer in solving the problems. This proved "willpower is an exhaustible resource".

Is willpower exhaustible? What if the participants, as they were giving up early, were given an incentive of 500 USD to continue working on the same problem - would they work for much longer than the other group? Why? If willpower is an exhaustible resource, how can you draw from something that no longer exists? Would you then conclude that willpower can be depleted in humans, but humans have the ability to synthesize it from currency? Of course that's ridiculous.

Willpower and expended effort are context-based, and each person can set their own threshold of when it's appropriate to give up - it's partly a conscious decision. A few years later the authors of this same study walked back their claims that "willpower is an exhaustible resource" (which does not stop people from touting it around as reasoning for whatever point they are trying to make).

My point is that bio-monitoring technology, probably combined with VR, will at least give us some verifiable data points.


Psychology has been using "bio-monitoring" since Helmholtz, although they think of it as involuntary or unconscious (modification of) response, e.g. reaction time, pupil dilation, heart rate. It all has the same problem: the distance between the hypothesis, the experimental manipulation, and the measurement is large and badly understood. Differences between conditions are then ascribed to noise, which enables "creative" interpretations of the results.

As an example: in the field in which I worked, it was commonly assumed that the processing mechanism had to interpret a part of the stimulus as belonging to one thing or the other. Then they would manipulate the context to see how they could influence that. There were many results and experimental refinements, leading to an endless stream of contradictory articles, all claiming victory for their theory. But there are models in which this assumption doesn't make sense. Instead, it could belong to both, and the ambiguity would be resolved later. That makes the whole research line I described nonsense. The data is valid (if well documented), but the analyses and conclusions are not.

A genetics statistician explained to me that all early genetic research (and that means up until 2000 at least) is highly unreliable, because they used Neyman-Fisher significance testing with high p values, like 0.01. The lack of understanding was (and probably still is) so great, that 0.01 was simply not enough to exclude random results. Nearly all those papers are irreproducible. Nowadays, 5 sigma, which is less than 0.000001, is considered safe, although he himself was convinced that Bayesian is the only way to go.
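
For reference on the 5 sigma figure, a quick check of the normal tail probability (my own scipy one-liner, not from the conversation above):

    from scipy.stats import norm
    print(norm.sf(5))        # one-sided tail: ~2.9e-07
    print(2 * norm.sf(5))    # two-sided: ~5.7e-07 -- both below 0.000001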

The understanding of the subject in psychology is considerably worse than in biology. No amount of bio-monitoring is going to overcome that.


I know; it would have been fine if it were just another discipline of science or research that was hard to study.

The amount of interest from regular people in the subject, and the demand for research, is so great that people want to hang on to twigs. Most people want to improve themselves, to be able to think and do better, and they take at least a fleeting interest in psychology - evidenced by the rise of Jordan Peterson.

I recently found Dr. Andrew Huberman to be a voice of reason (check out his YouTube podcast; it's very information-dense and pretty good) - he approaches things from a neurobiology perspective. As a way around having to build a deeper understanding yourself, I think a chemical / hormone / neurotransmitter based discussion at least grounds your feet in real things, from which you can probably explore things for yourself. It also eliminates a lot of the cruft from the discussion.


A lifetime ago I worked on this project:

https://cfr.pub/forthcoming/chang-li-2018.pdf

The experience completely unraveled my faith in academia.


This is a special case of a more general problem. Inferior technologies very often win in our industry because they are better marketed as well. Marketing is a completely distinct and unrelated skill from doing.


What is the likelihood of this study being true? Will it be cited a lot now?


So, clickbait science? :)


By definition a paper which is wrong is going to be cited a whole bunch as people disprove it though.

The real issue is that citation count is a terrible metric and always has been (when I was doing research the magic you wanted was the low but not no citation count paper which went into a huge amount of detail on exactly what they did).


My most cited paper has the head of an institution down as an author. Most, but far from all, of the citations are from students there. A previous paper introduced the concept and tools to apply it, while the cited paper just adds an example application and a host of authors.


Oddly, I think this kinda is, especially the 150x difference they're claiming.

Replication failures disproportionately hit high-profile journals. The journals' editorial processes favor "surprising", counter-intuitive claims: life based on arsenic, stem cells created by treating skin cells with a mild acid, cold fusion, etc. The prior probability these are true is lower (hence the surprise). At the same time, this also makes it more likely that someone will attempt to replicate the results--and that there will be interest in a negative paper if they can't.
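
The usual back-of-envelope for this is the Ioannidis-style positive predictive value; with made-up priors and a conventional 80% power / 5% alpha, the lower the prior that a claim is true, the smaller the share of "significant" findings that are real.

    # Share of statistically significant findings that are actually true, as a
    # function of the prior probability of the hypothesis (numbers are illustrative).
    def ppv(prior, power=0.8, alpha=0.05):
        true_pos = power * prior
        false_pos = alpha * (1 - prior)
        return true_pos / (true_pos + false_pos)

    for prior in (0.5, 0.1, 0.01):    # mundane claim vs. surprising vs. arsenic-life-grade
        print(f"prior {prior:>4}: PPV = {ppv(prior):.2f}")
    # prior  0.5: PPV = 0.94
    # prior  0.1: PPV = 0.64
    # prior 0.01: PPV = 0.14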


This may be an existential property of the Scientific Community: the more novel, the more interest, the more popularity ... but also the more likely to be untrue.


Not surprised that this is in psychology and economics.


In Robert Heinlein's novel Starship Troopers there is a scene or two where the human forces randomly have psychics on their team and it's largely unexplored in the novel. I always thought of it as Heinlein projecting physics to say "In the future we'll have spaceships and robot suits" and projecting psychology out to "Psychics and remote sensors".

Maybe that was reasonable given how science was advancing at the time. If so, what powers and advances have poor research systems denied us?


> what powers and advances have poor research systems denied us

Alternatively, it may be that all the low hanging fruit in the known orchards are picked. But there are still workers reaching for the remaining higher, rarer fruit. Only a few are needed for what fruit remains. But there is instead a glut of workers. A few are up to the task. The rest pick up leaves and make convoluted arguments about their potential value. "These leaves are edible!"

Maybe we can find new orchards. Or maybe the unpicked orchards are too far away to ever reach. Or maybe they don't exist at all.


I think it’s also true about the most upvoted comments. Any article you read on any forum the top comments always seem to be mistaken.


If only we had a magical web of knowledge where updates could instantly be seen, and even propagate as we learn more about the world!


If we embrace meta rationality, we could build tools and wisdom in our culture to sort out this problem.


Citations of replicated papers should be multiplied by a high factor to counterbalance this effect.


Holds for social media too. Posts less likely to be true are liked/reposted more.


This is the big problem with the 'marketplace of ideas' and the notion that better education and dialog will fix everything. In a small marketplace, like a farmer's market, that's true. Also, markets based on a fungible commodity work OK as long as there's standardization of the product and conditions of perfect competition or something close to them obtain.

But in most marketplaces advertising is a dominant force, and advertising power is not a function of product quality; you can market a bad product successfully. There are standards of truth in advertising, but they're loose and poorly enforced, and the burden of evaluation falls on the consumer. Additionally, tribalist behavior builds up around products to a certain extent, as buyers of product X who dislike hearing it's good or crap push back on such claims, for varying reasons. Where these differences are purely aesthetic that doesn't matter much, but where they're functional an objectively better product can lose out against one from a dishonest competitor or one with an irrationally loyal following. A problem for both producers and consumers is that it's more expensive to refute a false claim than to make it, so bad actors are incentivized to lie and lock in the advantage; it's arguably cheaper to apologize if caught out than to forgo the profitable behavior, as exemplified in the aphorism 'it's easier to ask forgiveness than permission.'

We see the results with problems like the OP and also in things like debates about public health, vaccine safety, climate change, and many other political issues. It's profitable to lie, a significant number of people have no problem doing it, the techniques of doing so have been repeatedly refined and weaponized, and those who rely on or simply prefer truth end up at a significant economic disadvantage. Don't make the mistake of thinking this is a problem confined to the niche world of academic journals.


But can THIS study be replicated?


So when will journals start requiring a separate team has replicated the results before publication?


Are papers ever cited if the results are disproven? Would papers have cited the notorious Wakefield paper on vaccines and autism if they were writing about how their results do not match that paper? Does that count as a cite?


Yes, citations aren't endorsements. Some citations (rarely) are negative/explicitly critical, but this kind of difference isn't tracked. A citation is a citation.


They do look at the quality of the citations; even after failing to replicate, most citations did not reference that failure to replicate.


I think this is difficult to check convincingly.

References are often ambiguous. A mention like “Although some have argued X[1]” could be skeptical, but not explicit, about Reference #1’s quality. I could mean that there’s convincing data on both sides, or I could mean that Ref #1 is hot garbage (but don’t want a fight).

In some cases, there might be legitimately useful information in a retracted paper. If a paper describes an experiment, but then shows faked results, I'm not sure it's wrong to cite it if you use a similar setup; that is where the idea came from, after all.

Most critically, there's an unpredictable and often long lag between reading a paper, citing it in one's own work, and that work being published. I've had things sit at a journal for a year before publication, and it never occurred to me to "re-verify" the citations I included; indeed, I've never heard of anyone doing that.


Yes you cite the paper that you disagree with and yes by typical bibliometric measures that is counted as a citation.


Yes. That's the core problem. If you cite a paper to refute it, or to refer to its refutation, that just bumps its citation count.

We haven't scaled the practice of science for the 21st century.


don't worry all the COVID related research is spot-on


It would be super interesting to compare the quality of equivalent types of papers coming from academia vs. corporate sources.



