Bad scientific code beats code following "best practices" (2014) (yosefk.com)
230 points by luu 10 months ago | 324 comments



Scientist and programmer here, and my experiences are the opposite. I value keeping things "boringly simple", but I desperately wish there was any kind of engineering discipline.

First is the reproducibility issue. I think I've spent about as much time simply _trying_ to get the dependencies of research code to run as I have spent writing code or doing research during my PhD. The simple thing is to write a requirements.txt file! (For Python, at least.)
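
For what it's worth, a minimal sketch of what that looks like in practice; the package names and version pins below are illustrative, not from any particular project:

    # requirements.txt -- pin the exact versions the results were produced with
    numpy==1.26.4
    pandas==2.2.2
    scipy==1.13.0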

Second, two anecdotes where not following best practices ruined the correctness of research code:

- Years ago, I was working on research code which simulated a power grid. We needed to generate randomized load profiles. I noticed that each time it ran, we got the same results. As a software engineer, I figured I had to re-set the `random` seed, but that didn't work. I dug into the code, talked to the researcher, and found the load-profile algorithm: it was not randomly generated at all, but a hand-coded string of "1"s and "0"s.
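
(For illustration, a minimal sketch of what a randomized but reproducible on/off load profile could have looked like; the function and numbers are hypothetical, not from that codebase:)

    import numpy as np

    def random_load_profile(n_steps, p_on, seed=None):
        """Random on/off load profile; pass a seed only when a run must be reproducible."""
        rng = np.random.default_rng(seed)
        return (rng.random(n_steps) < p_on).astype(int)

    print(random_load_profile(24, 0.6))           # different every run
    print(random_load_profile(24, 0.6, seed=42))  # reproducible when explicitly seeded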

- I later had the pleasure of adapting someone's research code. They had essentially hand-engineered IPC. It worked by calling a bash script from Python, which would open other Python processes and generate a random TCP/IP socket, the value of which was saved to an ENV variable. Assuming the socket was open, the Python scripts would then share the socket names, via other filenames, for the other processes to read and open. To prevent concurrency issues, sleep calls were used throughout the Python and bash scripts. This was four Python scripts and two shell scripts, and to this day I do not understand why this wasn't just one Python script.
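
For contrast, a minimal sketch of what the same fan-out could look like in one Python script using only the standard library's multiprocessing, with no shell scripts, env vars, or sleeps. The worker function here is a hypothetical placeholder:

    from multiprocessing import Pool

    def analyse_chunk(chunk):
        # stand-in for whatever each of those worker processes actually computed
        return sum(chunk)

    if __name__ == "__main__":
        chunks = [range(0, 10), range(10, 20), range(20, 30), range(30, 40)]
        with Pool(processes=4) as pool:
            results = pool.map(analyse_chunk, chunks)  # IPC handled by the library
        print(results)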


My problem with this discussion is that a lot of people just say "I'm a scientist (or I'm working with scientists) and I'm observing X so I can say 'scientists blahblahblah'".

Different scientific research fields use widely different software environments and have their own habits and traditions. The way a biologist uses programming has no reason to be similar to the way an astrophysicist does: they have not experienced the same software environment at all. It may even be useless to talk about "scientists" within a single field, as two labs working on the same topic may have very different approaches (though diverging is harder when there are shared frameworks).

So, I'm not at all surprised that you observed the opposite experience. The same way, I'm not surprised to see someone say they had the opposite experience when someone else claims "European people use a lot of 'g' and 'k' in their words" just because they observed what happened in Germany.


I don't think there is much variance in quality of software among (radically different) fields of science.

One of the most poorly engineered products I work with was created by a few academic CS guys. The core algorithms are sophisticated and ostensibly implemented well, but the overall product is a horrible mess.

The incentives of academia make this obvious. You need to write some code that plausibly works just enough to get a manuscript out of it, but not much else. Reproducibility is not taken that seriously, and "productization"/portability/hardening is out of the question.


That's a strange comment. It looks like the goal of this software is not "productization" at all, and that even if the authors wanted to, there would be no point in porting or hardening it.

It feels like software developers are "brainwashed": to them, good software is software that is good for what software developers need to do. But good software is software that is good for what the people who need it need to do. If the academic people don't need "productization" and still code for productization, then they are doing a bad job.

Conversely, software made by software developers may be really bad in the academic sector, which is the subject of the article: they over-engineer when it's not needed, they complicate things just for portability or for future uses that will never happen, ...

It's a bit like if someone says "these cooks who are preparing omelettes are bad cooks: they are mixing white and yolk together, while real cooks who do meringue always remove the yolk. So the proper way to cook eggs is by removing the yolk".

(Another aspect I've noticed: software developers talk about unit tests and code review and are offended if they see scientists not using them properly, but they don't even realise that the goal of these is fulfilled, in an arguably more efficient way, by other processes that exist in academia. For example, there is sometimes no unit test, but the creator of the algorithm is also its main user, and they will notice directly if a new change has broken something. Or, as another example, scientific collaboration often implies several teams working independently and writing their own implementations from scratch, so if team A and team B get different results from the same inputs, they will investigate and find the problem in the code.)


> It's a bit like if someone says "these cooks who are preparing omelettes are bad cooks...

No, it's like serving the food on the floor without dishes. It doesn't matter whether the food is fancy or simple.

We need not defend the poor practices induced by the corrupt incentives of academia (publish or perish). Ideally, scientific methods and products of research should be highly reproducible (which implies a certain level of quality, in practice).


The reason scientists' code is the way it is is not the "corrupt incentives of academia"; it is just that the goals and the context are different.

First, scientific code is exploratory: you start building it without knowing which way will work. For example, you end up with 10 different fit functions, all using different approaches (multi-gaussian, kde of the control region, a home-made shape based on a sum of Gumbel and Gaussian fcts, ...), and during your study you run your tests to conclude which one is the best. Because of that, "building the multi-gaussian fit fct for the future" is just plain stupid: you are spending time on something that has a high probability (>75%) of being totally dropped next week. You also need to build for huge flexibility. For example, having some "god object" allows you to pivot very quickly when you suddenly discover a new parameter to add to the equation, rather than refactoring every function one by one to add a new parameter of a very specific type to all of them. In this context, the person who codes with strong static typing is not being very smart: in research, their code will change very often, and they are just building a tool that will waste their time.
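
To make that concrete, a minimal sketch (not from any real analysis; the names and shapes are made up) of the kind of loose parameter passing being defended: every candidate fit function takes one shared params dict, so discovering a new parameter next week means adding one key, not changing ten signatures:

    import numpy as np

    def gaussian_fit(x, params):
        mu, sigma = params["mu"], params["sigma"]
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

    def gumbel_plus_gaussian_fit(x, params):
        # reads the extra key it needs; the other fit functions simply ignore it
        beta = params.get("gumbel_beta", 1.0)
        z = (x - params["mu"]) / beta
        return np.exp(-(z + np.exp(-z))) + gaussian_fit(x, params)

    params = {"mu": 0.0, "sigma": 1.0}   # next week: params["gumbel_beta"] = 2.0
    x = np.linspace(-3, 3, 7)
    for fit in (gaussian_fit, gumbel_plus_gaussian_fit):
        print(fit.__name__, fit(x, params))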

Second, the code is the tool, not the product. The scientific conclusion is the product. What needs to be reproducible is the conclusion. You apparently don't understand the goal of reproducibility. Reproducibility is about proving that two people who redo the work from scratch obtain the same results, even if they use different tools. By saying that it is important that scientist B can just blindly run the software of scientist A, you throw away the reproducibility advantage: if the software has a bug, scientist B will say "I get the same result, so it is confirmed" while the result is incorrect. At CERN, people from the CMS experiment and people from the ATLAS experiment are FORBIDDEN to share their code for this reason. Even reading the code can be bad, because it can bias the reader. Then, of course, there is a balance, and I recommend sharing the code with the publication, but the reader should be educated enough to know that the code is the last resort to look at if they have a question when trying to reproduce the result.

Third, the way scientists work and collaborate creates, to some extent, "intrinsic" code review and unit testing. Several teams build software in parallel to check the same thing, and will investigate by comparing their results. And scientists are the first users of their software, intensively: they will never build a feature that they will not themselves use in practice later. So, some of the good practices of software dev are just not well designed for these situations.

In the end, you are right: serving the food on the floor is bad. The software developers' "good practices" are serving the food on the floor because they assumed the dishes were somewhere they are not.

Finally, one thing to keep in mind is that some software dev people are just full of themselves with big ego. They love to think they are smarter than everyone else. If they are applying good practices themselves, they love feeling superior by thinking the ones not applying them are inferior and that they are so smart. It's a way of rationalising their traditions, and it's then easy to see plenty of ways that confirm this belief.

Don't get me wrong, plenty of good practices are really useful, and plenty of scientific code is just badly done. But you need to have a balance: instead of blindly applying tradition just because it flatters their ego, people should take a step back, think about why X or Y is better in some context, and see how it translates to another context.


> Finally, one thing to keep in mind is that some software dev people are just full of themselves with big ego.

> instead of blindly applying tradition just because it flatters their ego

I understand this is really the core of your complaint; we agree that reproducibility is important (for more than one reason), but you don't understand software engineering as a practice, and judge it based on strawmen and perhaps bad experiences with poor, ego-driven developers. Every field has big egos, especially highly competitive ones like finance and academia.

Good engineering is absolutely not about "blindly applying tradition".


> I understand this is really the core of your complaint

If you think that is the core of my complaint, then you did not understand at all.

This element is just there to say that it's easy not to understand the situation of the scientists and to conclude that it's just because they are stupid or not doing things right.

You were the one talking about "poor practices induced by the corrupt incentives of academia", but this is just incompatible with reality: in regions where "publish or perish" has no impact (and these regions exist; only people who know academia superficially don't realize that), we still see the same software development "malpractice". This is why I'm talking about ego: it is so easy for you to just rationalize different practices as "being somehow forced to do a bad job".

> we agree that reproducibility is important

I don't think you understand what I call "reproducibility". In science, reproducibility means FROM SCRATCH! You know the result, and you rebuild your own experiment. If you copy the experiment of the first author, you will copy the experimental error too.

Reproducibility is one of the reasons some of the things you present as good practice are in fact not ideal in the context of science (as I've said, I myself push for sharing clean code; it is just that some of the important goals of usual software development don't exist in the same way in the scientific development context).

> but you don't understand software engineering as a practice

I understand very well why good practices are indeed very important in the context of usual software development. Again, you are just rationalising: you see me not agreeing with you, so you invent the idea that I don't understand why good practices are used. It's again an easy way to avoid challenging what you consider obvious.

> Good engineering is absolutely not about "blindly applying tradition".

Exactly. People who jump to the conclusion "if the scientists are not using these practices, then they are doing a bad job" are bad engineers.

You keep talking about good engineering practices as if these practices are always the correct way to go. Except that is not the case. As in the examples given, wasting time on a "clean" architecture for something the author is smart enough to know has a high probability of disappearing in a week is not a "good practice". Yet you are still talking to me as if this practice were "obviously good". It is very good for usual software development, when you have a large set of users and you want to maintain the software for years. But it makes no sense when the software is for the author alone, and when everyone agrees it would be better to write a proper implementation of the final conclusion rather than mix the goals and build a robust tool before exploring.

So, yes, you are blindly following the tradition: you just refuse to even consider that maybe what you have adopted in your mind as "good" may not be the best in a different context.


Some of your points stand. Though, in the iron triangle of speed, quality, scope, it turns out that quality and speed are linked. Quality is a requirement for speed.

So, it seems more akin to making meringue with yolks. Eventually, maybe it will work, but if you knew what you were doing and cared, it would be done better and faster.

The criticism though of losing sight of the goal is valid. That happens.


"Quality" for scientists is not the same specificities as "quality" for a software developer.

The goal of the software of the scientist is different, so the definition of what is "quality" is different.

The article illustrates that: a lot of "software done following good software developer practice" ends up being of bad quality for the job and ends up wasting a lot of time.

Another aspect is the context: the iron triangle is also something built for a specific context. Of course code that contains very, very flexible functions will have problems after 5 years of development and usage, which will lead to a drastic decrease in speed. But scientific code should not be used 5 years later (scientific code exists to test hypotheses; once the article is published, you should not use this code, because by construction it contains plenty of hypothesis-testing that has been shown not to be useful). So the reason "speed" is related to "quality" is different in science.


> "Quality" for scientists is not the same specificities as "quality" for a software developer.

I disagree here. I believe quality is intrinsic to the product. There may be different attributes of the "quality" of the thing that one person values more than another, but the intrinsic quality is the same. An over-engineered solution is rarely good. Don't use a power-boat to go across a small swimming pool, and don't use a shoddy makeshift raft to cross a giant lake.

> The goal of the software of the scientist is different, so the definition of what is "quality" is different.

I disagree here, but I think I see your point. The general goal for everyone (the goal of software) is to accomplish some task. Software is fundamentally just a tool. Software for the sake of software is bad.

This makes me think of an analogy where a person is trying to redo the plumbing on their kitchen sink. Compare that to a firm that will do wacky, crazy things, leave the situation worse than when they started, and walk away with their bills paid and the job half done. Compared to this, certainly a competent novice is better. Though sometimes it is really important to know about certain O-rings: not only would the "true" professional do the job slightly faster and more methodically than the novice, but they'll know about that O-ring that would only become a problem in the winter. At the same time, sometimes there are no such O-rings, and a simple job is just ultimately a relatively simple job.

To that last extent, hiring software engineers can sometimes be scary. When you get very intelligent people and pay them to "solve complex problems," they do tend to build things that are very flexible, FAANG-scalable, on the latest industry trends, AI powered, for the sake of it lasting 5 years; rather than starting simple - Gall's law: “A complex system that works is invariably found to have evolved from a simple system that worked." Or, those engineers may start there because they know those are the "best practices" without yet having experienced why they are best practices, and when to apply those practices and when not to.

[1] http://principles-wiki.net/principles:gall_s_law


> I believe quality is intrinsic to the product.

You are switching definitions: in this discussion, people have defined "quality" in one way, which is not relevant for the academic sector. Then you arrive with a different definition of quality to claim that what they say is correct.

If you define "quality" as something intrinsic to the product, then the "good practice to do quality work" are not good to do quality work, because these practice leas to shit software (as illustrated again and again by plenty of people working in the academic sector or working close to them, like the author of the article here).

That's the problem in this discussion: "quality is good", "this practice is good for quality because in context X, the important attributes of quality are A and B", "therefore this practice is good in context Y even if the important attributes of quality are C and D".

I'm saying "what you call quality is A and B, and in science, quality is not A and B".

> This makes me think of an analogy ...

I agree with your analogy, but the person who knows about the O-ring is THE SCIENTIST. The software developers, and the "good practice rules" they are applying, don't know anything about O-rings; they don't know how to install a kitchen sink. Software developers don't know how to build scientific software; they don't even understand that scientific software has different needs and needs different "good practices" than the ones they have learned for their specific job, which is a strongly different context.

It's like saying: a good practice in veterinary medicine is to give product X to a sick dog, so we should follow the veterinarians' good practices when we practice medicine on humans, because veterinarians make products that help living things feel better, so it's the same, right?

Your example with Gall's law is pretty clear: YOU DON'T WANT SOFTWARE THAT LASTS 5 YEARS IN RESEARCH. You NEED, really need, software that explores something that has a very big chance of leading to "no, it was incorrect", and that, if it leads to "yes, it's correct", will be destroyed after the paper is published (because the publication contains the logic, so anyone can rebuild the implementation in a way that satisfies Gall's law if they want). Gall's law is totally correct, but it is not talking about "research software"; it is talking about a different object, and scientists are not building such an object.


Absolutely my experience as well. Scientists write code that works, but is a pain to reproduce in any sort of scalable way. However it’s been getting better over time as programming is becoming a less niche skill.


The problem I've run into over and over with research code is fragility. We ran it on test A, but when we try test B nothing works and we have no idea why because god forbid there is any error handling, validation, or even just comprehensible function names.


This is partly because, in my opinion, some "best practices" are superstitions.

Some practice was best because of some issue with 80s-era computing, but is now completely obsolete; the problem has been solved in better ways or has completely disappeared thanks, e.g., to better tooling or better, well, practices - e.g. Hungarian notation. Yet it is still passed down as a best practice and followed blindly because that's what they teach in schools. But nobody can tell why it is "good", because it is actually not relevant anymore.

Scientific code has no superstitions (as expected, I would say), but not for the best reasons; its authors didn't learn the still-relevant good practices either.


I wish we communicated the intent of the “best practice” instead of the practice itself.


Actually, when I’ve followed those guidelines, it’s because the tech lead graduated in the 1980s, almost certainly learned it all on the job, but has always done it that way. Others just do what they’ve done before. School talked about those things, but not in a “this is the right way” sort of thing.


There is no best practice. It is good to know the tools. In a dojo, do that crazy design pattern shit and do the crazy one long function. Do some C#, Java, JS, Go, TypeScript, Haskell, Ruby, Rust (not necessarily those, but a big variety). I want the next person to understand my code - this is very important, probably more important than time spent or performance. If spending another 10% refactoring makes it easier to understand, even if that's just adding good comments, it is well worth it. Make illegal state impossible if you can (e.g. don't store the calculated value, and if you do, then design it so it can't be wrong!). Make it robust. Pretend it'll page you at 2am if it breaks!
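
On the "don't store the calculated value" point, a minimal Python sketch (the class and fields are hypothetical) of computing a derived value on demand so it can never get out of sync with its inputs:

    from dataclasses import dataclass

    @dataclass
    class Order:
        unit_price: float
        quantity: int

        @property
        def total(self) -> float:
            # derived on demand, so it can never disagree with unit_price * quantity
            return self.unit_price * self.quantity

    order = Order(unit_price=2.5, quantity=4)
    order.quantity = 5          # no stale, separately stored total to forget to update
    print(order.total)          # 12.5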


Such as what? I don't really know of any such superstitions that are based on nothing.

I see a lot of opinion/taste presented as something more, but I really can't think of superstitions.


OOP madness? XML? Web scale databases?

Perhaps not superstition but certainly fundamentalist/hype-based thinking.


Chasing hyped up fads seems like the opposite of superstitions from the 80s, no?


I saw an example the other day that annoyingly escapes my mind now, as it has been sort of overwritten by the "why the heck do some people name Makefiles with a capital M!?" pet peeve.

But I'd say a bit of everything listed in TFA. For instance, global variables are the type of thing which makes a little voice say "if you do that, something bad will eventually happen". The voice of experience sometimes says things like that, though.


I don't know if I'd call it a superstition exactly, but there's a subset of people who are fine with foo1.plus(foo2) and bar1.plus(bar2) where foo and bar are different types, but for some reason, "foo1 + foo2" and "bar1 + bar2" is "confusing" or somehow evil. It feels a bit like they're superstitious about it. I get a similar vibe from people who have an aversion to static type inference.
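
In Python terms, the two spellings are literally the same call; a minimal sketch with a made-up Vector type:

    class Vector:
        def __init__(self, x, y):
            self.x, self.y = x, y

        def __add__(self, other):
            # "v1 + v2" is just sugar for "v1.__add__(v2)"
            return Vector(self.x + other.x, self.y + other.y)

        def __repr__(self):
            return f"Vector({self.x}, {self.y})"

    v1, v2 = Vector(1, 2), Vector(3, 4)
    print(v1 + v2)            # Vector(4, 6)
    print(v1.__add__(v2))     # same thing, spelled as a method call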


It is important to have popular and powerful tools that can reduce the amount of code for things like caching and building.

For example, Snakemake (OS-independent make) with data version control based on torrent (removing the complication of having to pay for AWS, etc.) for the caching of build steps would be a HUGE win in the field. *No one has done it yet* (some have danced around the idea), but if done well and correctly, it could reduce the amount of code and the pain of reproducing work by thousands of lines of code in some projects.

It's important for the default of a data version control system to be either IPFS or torrent, because it's prohibitive to make everyone set up all these accounts and pay these storage companies to run some package. IPFS, torrent, or some other decentralized solution is the only real solution.


Today's "best practice" is tomorrow's worst practice.


[flagged]


Business epistemology is not about knowing Truth, it is about knowing currently useful information and practices, and it is expensive to validate or generate this knowledge.

Hence we get the same thing over and over again until someone convinces people there is a better way, or simply does something different and makes more money.


Now don't go blaming capitalism. It's software engineers' tendency to always be on the lookout for the silver bullet that will fix one of the currently broken things, so it doesn't hurt so much, that drives this endless performance theater.

In this environment there is an endless flood of salesmen - some benevolent, some charlatans - trying to sell the next fix.

The problem, as pointed out by Alan Kay among others, is that software engineering as a discipline seems to be a pop culture that is always uncritically on the lookout for the next methodology promoted by its inventor, without a hint of criticism or self-reflection.


You’d have as much credibility if you blame it on fluoridation of water as blame it on capitalism.


What makes you say that the current culture and fads in some software companies have anything to do with capitalism?


It was the same in communist countries, even more so. If anything, we’re having less of this now that we switched to capitalism than before.


You'll probably get downvoted for mentioning this, but capitalism is a big factor here.

It's not so much about selling books and farming engagement. It's about commodification of labor. The "best practices" tend to cater to the lowest common denominator to get at least something done. This lowers bargaining power of labor, drives down wages and increases capital's share of the pie.

This is inevitable in capitalism. Happens to all crafts. This is why we can have cheap crap but not nice things.


Two more to the scientists' tab:

1. No tests of any kind. "I know what the output should look like." Over time people who know what it should look like leave, and then it's untouchable.

2. No regard for the physical limits of hardware. "We can always get more RAM on everyone's laptops, right?" (You wouldn't need to if you just processed the JSONs one at a time, instead of first loading all of them into memory and then processing them one at a time.)
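
A minimal sketch of the one-at-a-time version (the paths and the processing step are hypothetical); peak memory is one file's worth, not the whole directory's:

    import json
    from pathlib import Path

    def process(record):
        # stand-in for whatever is done to each parsed JSON document
        return len(record)

    results = []
    for path in Path("data").glob("*.json"):     # hypothetical input directory
        with open(path) as f:
            record = json.load(f)                # only one file in memory at a time
        results.append(process(record))
    print(results)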

Also the engineers' tab has a strong smell of junior in it. When you have spent some time maintaining such code, you'll learn not to make that same mess yourself. (You'll overcorrect and make another, novel kind of mess; some iterations are required to get it right.)


Yes, the claim that the scientists' hacked-together code is well tested and even uses valgrind gave me pause. It's more likely there are no tests at all. They made a change, they saw that a linear graph became exponential, and they went bug hunting. But there's no way they have spotted every regression caused by every change.


Agree with those two problems on the scientist side. I would also add that they often don't use version control.

I think a single semester of learning the basics of software development best practices would save a lot of time and effort in the long term if it was included in physics/maths university courses.


> I would also add that they often don't use version control.

Working for corporate R&D, I once received a repo on a flash drive. The team would merge changes manually by copy-pasting.

I should've just turned around and left.


1 and 2 are features. Re 1, if someone doesn’t know what the output should look like they shouldn’t be reusing the code. Re 2, just think a bit more about it and you’ll realize fretting over ram that isn’t needed until it’s needed is actually just premature optimization.


Sounds like the non-programmers are good at what they are supposed to be good at (solving the actual problem, if perhaps not always in the most elegant manner) while the programmers should be producing a highly maintainable, understandable, testable and reliable code base (and potentially have problems with advanced algorithms that rely on complicated theorems), but they are not. The OP has a case of bad programmers - the techniques listed as bad can be awesome if used with prudence.

A good programmer has a very deep knowledge of the various techniques they can use and the wisdom to actually choose the right ones in a given situation.

The bad programmers learn a few techniques and apply them everywhere, no matter what they're working on or whom they are working with. Good programmers learn from their mistakes and adapt; bad programmers blame others.

I've worked with my share of bad programmers and they really suck. A good programmer's code is a joy to work with.


Right and I think "scientists" simply are more intelligent than average Joe Coder. Intelligent people produce better software.

It is easy to learn some coding, not so easy to become a scientist.

To become a scientist you must write and get your PhD thesis approved, which must already be about scientific discoveries you made while doing that thesis. Only people with above-average IQ can accomplish something like that, I think.


Being intelligent in one domain doesn’t automatically make you good in any others. Exceptional biologists can be astoundingly bad at maths, and the other way around. Like most skills, being good at writing software requires not only intelligence, but lots of experience too. Maybe smarter people will pick it up faster, but they aren’t intrinsically better.

It’s a bit surprising you’d have to explain such a basic conclusion here.


In my experience getting a PhD doesn't require above average intelligence, it does require a lot of perseverance and a good amount of organisation though.

I honestly think most skilled tradespeople are more intelligent than me and my PhD holding colleagues.


> Right and I think "scientists" simply are more intelligent than average Joe Coder. Intelligent people produce better software.

The vast majority of papers I read on topics I know are complete bullshit. Maybe getting a PhD was more elitist before, but now it surely isn't.

If we define "scientist" as anyone who publishes papers, then they have the same problem as software engineering: it's mostly made by juniors.


I agree with the feelings of the author, most software is overengineered (including most of my software).

That being said, most scientific code I've encountered doesn't compile/run. It ran once at some point, it produced results, it worked for the authors, and a paper got published. The goal for that code was satisfied, and then that code somehow rusted out (doesn't work with other compilers, never properly documented how it gets built, unclear what dependencies were used, dependencies were preprocessed at some point and you can't find the preprocessed versions anywhere to reproduce the build, has hardcoded data files which are not in the published repos, etc.). I wouldn't use THAT as my compass on how to write higher quality code.


Yeah somehow I suspect this author hadn't yet had to deal with colab notebooks.


Yeah, well, GNOME 2 also doesn't compile or run on my machine. It ran once at some point, but only one of the two is considered a "worse" class of software.


I'm a scientist-programmer working in a field made up of biologists and computer scientists, and what I've experienced is almost exactly the opposite of the author.

I've found the problems that biologists cause are mostly:

* Not understanding dependencies, public/private, SCM or versioning, making their own code uninstallable after a few months

* Writing completely unreadable code, even to themselves, making it impossible to maintain. This means they always restart from zero, and projects grow into folders of a hundred individual scripts with no order, depending on files that no longer exist

* Foregoing any kind of testing or quality control, making real and nasty bugs rampant.

IMO the main issue with the software people in our field (of which I am one, even though I'm formally trained in biology) is that they are less interested in biology than in programming, so they are bad at choosing which scientific problems to solve. They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.


>They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.

Ultimately I’d say the core issue here is that research is complex and those environments are often resource strapped relative to other environments. As such this idea of “getting shit done” takes priority over everything. To some degree it’s not that much different than startup business environments that favor shipping features over writing maintainable and well (or even partially) documented code.

The difference in research that many fail to grasp is that the code is often as ephemeral as the specific exploratory path of research it’s tied to. Sometimes software in research is more general purpose but more often it’s tightly coupled to a new idea deep seated in some theory in some fashion. Just as exploration paths into the unknown are rapidly explored and often discarded, much of the work around them is as well, including software.

When you combine that understanding with an already resource-strapped environment, it shouldn't be surprising at all that much of the work done around the science, be it some physical apparatus or something virtual like code, is duct-taped together and barely functional. To some degree that's by design: it's choosing where you focus your limited resources, which is on exploring and testing an idea.

Software very rarely is the end goal, just like in business. The exception with business is that if the software is viewed as a long-term asset, more time is spent trying to reduce long-term costs. In research and science, if something is very successful and becomes mature enough that it's expected to remain around for a while, more mature code bases often emerge. Even then there's not a lot of money out there to create that stuff, but it does happen, though only after it's proven to be worth the time investment.


>Ultimately I’d say the core issue here is that research is complex and those environments are often resource strapped relative to other environments. As such this idea of “getting shit done” takes priority over everything.

That conforms to my experience


maintainable prototypes are overengineered


The rule-of-thumb of factoring out only when you've written the same code three times rarely gets a chance here, because as soon as you notice a regularity, and you think critically about it, your next experiment breaks that regularity.

It's tempting to create reusable modules, but for one-off exploratory code, for testing hypotheses, it's far more efficient to just write it.


Indeed, and whatever code is used to publish a paper is a prototype, and unlikely to be reused, ever. Sometimes it is, but rarely.


Are there any metrics which prove that writing maintainable code is slower? Because in my experience there is no difference.


I have tons of examples of code where I did the simplest thing to solve the problem. Then later I needed a change. I could refactor the entire thing to add this change or just hack in the change. Refactoring the entire thing takes more work than the hack, so hack it is, unless I foresee this is going to matter later. Usually it doesn't.


That’s just anecdote, just like mine. Even simple lack of experience or lack of skills can cause that (which were definitely in my case). Also, I’m quite sure that a terrific coder can create maintainable code faster than an average one bad code. That’s why I asked some statistical data about that.


>I've found the problems that biologists cause are mostly 1. Not understanding dependencies, public/private, SCM or versioning, making their own code uninstallable after a few months

That's not on them though. That's on the state of the tooling in the industry.

Most of the time, dependencies could just be a folder you delete, and that's that (node_modules isn't very far from that). Instead it's a nightmare - and not for any good reason, except historical baggage.

The biologists writing scientific programs don't want "shared libraries" and other such BS. But the tooling often doesn't give them the option.

And the higher level abstractions like conda and pip and poetry and whatever, are just patches on top of a broken low level model.

None of those should be needed for isolated environments, only for dependency installation and update. Isolated environments should just come for free based on lower level implementation.


While I agree tooling could be better, while in grad school I found that a lot of academics / grad students don't know that any of the tooling even exists and never bothered to learn whether any such tooling existed that could improve their lives. Ditto with updating their language runtimes. It really seemed like they viewed code as a necessary evil they had to write to achieve their research goal.


I was going to write a response but you've put what I would have said perfectly. The problem, at least in academia, is the pressure to publish. There is very little incentive to write maintainable code and finalise a project into something accessible to an end user. The goal is to come up with something new, publish, and move on or develop the idea further. This alone is not enough reason not to partake in practices such as unit tests, containerisation, and versatile code, but most academic code is written by temporary "employees". PhDs are in a department for 3-4 years; postdocs are there about the same amount of time.

For someone to shake these bad practices, they need to fight an uphill battle and ultimately sacrifice their research time so that others will have an easier time understanding and using their code. Another battle that people trying to write "good" code would need to fight is that a lot of academics aren't interested in programming and see coding simply as a means to an end to solve a specific problem.

Also, a few more bad practices to add to the list:

* Not writing documentation.

* Copying, cutting, pasting and commenting out lines of code in lieu of version control.

* Not understanding the programming language they're using and spending time solving problems that the language has a built-in solution for.

This is at least based on my own experience as a PhD student in numerical methods working with Engineers, Physicists, Biologists and Mathematicians.


Sometimes I don’t blame people for committing the ‘sin’ of leaving commented code; unless you know that code used to exist in a previous version, it may well have never existed.


It can be very warranted. For a client I'm working with now I'll routinely comment out big swaths of code as they change their mind back and forth every month or so on certain things. They won't even remember it used to exist.


These patterns appear in many fields. I take it as a sign that the tooling in the field is underdeveloped.

This leads to a split between domain problem solvers, who are driven to solve the field's actual problems at all costs (including unreliable code that produces false results) and software engineers, who keep things tidy but are too risk-averse to attempt any real problems.

I encourage folks with interests in both software and an area of application to look at what Hadley Wickham did for tabular data analysis and think about what it would look like to do that for your field.


Unreliable code that produces false results does not solve the field's actual problems, and is likely to contribute to the reproducibility problem. It might solve the author's immediate problem of needing to publish something.

Update: I guess I misinterpreted OP's intent here, with "unreliable code that produces false results" being part of the field's actual problems rather than one of the costs to be borne.


I meant that the drive to solve problems at all costs can be self-defeating if you overextend yourself by making unreliable code that produces false results.


Maybe it's biology (or really, maybe not), but honestly it's just the nature of the beast. Fortran is literally the oldest language; it's just that the attitude and spirit are different from those of software development.


journals, research universities/institutions, and grant orgs have the resources and gatekeeping role to encourage and enforce standards, train and support investigators in conducting real science not just pseudoscience, but these entities are actively disowning their responsibility in the name of empty "empowerment" (of course because rationally no one has a real chance of successfully pushing through a reform, so the smart choice is to just not rock the boat)


Can you elaborate on your thoughts regarding Wickham?


Not the person you are replying to, but here are my thoughts:

He wrote the tidyverse package/group of packages, which includes/is tightly associated with ggplot. It is an extensive set of tools for analyzing and plotting data. None of it is anything that can't be done in base R or with existing packages, but it streamlined the process. It is an especially big improvement when doing grouped/apply functions, which, in my experience, are a huge part of scientific data analysis.

For many R users (especially those trained in the past 5 years or so) tidyverse and ggplot are barely distinguishable as libraries as opposed to core R features. I personally don't like ggplot for plotting and do all my figures in base R graphics, but the rest of tidyverse has dramatically improved my workflow. Thanks to tidyverse, while my code is by no means perfect (I agree with all the aforementioned criticisms of academic coding, especially in biology/ecology), it is cleaner, more legible, and more reproducible in large part thanks to tidyverse.


My interpretation:

Good APIs, preferably declarative, allow the scientist to write concise code.

Win.


I work in an R&D environment with a lot of people from scientific backgrounds who have picked up some programming but aren't software people at heart. I couldn't agree more with your assessment, and I say that without any disrespect to their competence. (Though, perhaps with some frustration for having to deal with bad code!)

As ever, the best work comes when you're able to have a tight collaboration between a domain expert and a maintainability-minded person. This requires humility from both: the expert must see that writing good software is valuable and not an afterthought, and the developer must appreciate that the expert knows more about what's relevant or important than them.


> As ever, the best work comes when you're able to have a tight collaboration between a domain expert and a maintainability-minded person. This requires humility from both: the expert must see that writing good software is valuable and not an afterthought, and the developer must appreciate that the expert knows more about what's relevant or important than them.

I do work in such an environment (though in some industry, and not in academia).

An important problem in my opinion is that many "software-minded people" have a very different way of using a computer than typical users, and are always learning/thinking about new things, while the typical user is much less willing to be permanently learning (both in their subject matter area and about computers).

So, the differences in mindset and computer usage are in my opinion much larger than your post suggests. What you list are, in my experience, differences that are much easier to resolve and - if both sides are open - not really a problem in practice.


> They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.

You can't solve the first 3 issues without having people who care about software quality. People not caring about the quality of the software is what caused those initial 3 problems in the first place.


And you can't fix any of this as long as "software quality" (the "best practices") means byzantine enterprise architecture mammoths that don't even actually fix any of the quality issues.


There are crazy over-engineered solutions with strict requirements and insane dependency management, with terrible trade-offs and compromises. I've worked in the aerospace field before, so I've seen how terrible this can be. But it's also possible to have unit tests, a design, and documentation without all of the above, and that would go a long way toward solving the original 3 issues.


Yeah, if only scientists would put the same care into the quality of their science...


> Yeah, if only scientists would put the same care into the quality of their science...

I guess we see survivorship bias here: the people who deeply care about the quality of their science instead of bulk producing papers are weeded out from their scientific jobs ... :-( Publish or perish.


I only worked briefly in software for research, and what you described matched my experience, but with a couple of caveats.

Firstly, a lot of the programs people were writing were messy, but didn't need to last longer than their current research project. They didn't necessarily need to be maintained long-term, and therefore the mess was often a reasonable trade-off for speed.

Secondly, almost none of the software people had any experience writing code in any industry outside of research. Many of them were quite good programmers, and there were a lot of "hacker" types who would fiddle with stuff in their spare time, but in terms of actual engineering, they had almost no experience. There were a lot of people who were just reciting the best practice rules they'd learned from blog posts, without really having the experience to know where the advice was coming from, or how best to apply it.

The result was often too much focus on easy-to-fix, visible, but ultimately low-impact changes, and a lot of difficulty in looking at the bigger picture issues.


> There were a lot of people who were just reciting the best practice rules they'd learned from blog posts, without really having the experience to know where the advice was coming from, or how best to apply it

This is exactly my experience too. Also, the problem with learning things from youtube and blogs is that whatever the author decides to cover is what we end up knowing, but they never intended to give a comprehensive lecture about these topics. The result is people who dogmatically apply some principles and entirely ignore others - neither of those really work. (I'm also guilty of this in ML topics.)


> Not understanding dependencies, public/private, SCM or versioning, making their own code uninstallable after a few months

I'm not sure what "uninstallable" code is, but why does it matter? Do scientists really need to know about dependencies when they need the same 3 libraries over and over? Pandas, numpy, Apache arrow, maybe OpenCV. Install them and keep them updated. Maybe let the IT guys worry about dependencies if it needs more complexity than that.

> Writing completely unreadable code, even to themselves, making it impossible to maintain. This means they always restart from zero, and projects grow into folders of a hundred individual scripts with no order, depending on files that no longer exist

This is actually kind of a benefit. Instead of following sunk cost and trying to address tech debt on years-old code, you can just toss a 200-liner script out of the window along with its tech debt, presumably because the research it was written for is already complete.

> Foregoing any kind of testing or quality control, making real and nasty bugs rampant.

Scientific code only needs to transform data. If it's written in a way that does that (e.g. uses the right function calls and returns a sensible data array) then it succeeded in its goal.

> They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.

Sooo...another argument in favor of the way scientists write code then? Isn't "getting shit done" kind of the point?


Yeah, these problems with "engineer code" the author describes are real, but it's a well known thing in software engineering. It's exactly what you can expect from junior developers trying to do their best. More experienced programmers have gone through the suffering of having to work on such code, like the author himself, and don't make these mistakes. Meanwhile, experienced scientists still write terrible code...


I'm a software engineer working with scientist-turned-programmers, and what I've experienced is also exactly the opposite of the author. The code written by the physicists, geoscientists and data scientists I work with often suffers from the following issues:

* "Big ball of mud" design [0]: No thought given to how the software should be architected or what the entities that comprise the design space of the problem are and how they fit together. The symptoms of this lack of thinking are obvious: multi-thousand-line swiss-army-knife functions, blocks of code repeated in dozens of places with minor variations, and a total lack of composability of any components. This kind of software design (or lack of design, really) ends up causing a serious hit to productivity because it's often useless outside of the narrow problem it was written to solve and because it's exceedingly hard to maintain or add new features to.

* Lack of tests: some of this is that the scientist-turned-programmer doesn't want to "waste time" writing tests, but more often it's that they don't know _how_ to write good tests (a minimal example follows this list). Or they have designed the code in such a way (see above) that it's really hard to test. In any case--unsurprisingly--their code tends to be buggy.

* Lack of familiarity with common data structures and algorithms: this often results in overly-complicated brute-force solutions to problems being used when they needn't have and in sub-par performance.
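
On the testing point, a minimal sketch of the kind of test that becomes cheap to write once code is factored into small pure functions; the function and values here are hypothetical, not from any codebase being described:

    # test_normalise.py -- run with `pytest`
    import numpy as np

    def normalise(values):
        """Scale values to the [0, 1] range."""
        v = np.asarray(values, dtype=float)
        return (v - v.min()) / (v.max() - v.min())

    def test_normalise_bounds_and_order():
        out = normalise([3.0, 1.0, 2.0])
        assert out.min() == 0.0 and out.max() == 1.0
        assert list(out) == [1.0, 0.0, 0.5]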

This quote from the author stood out to me:

> I claim to have repented, mostly. I try rather hard to keep things boringly simple.

...because it's really odd to me. Writing code that is as simple as it can be is precisely what good programmers do! But in order to get to the simplest possible solution to a non-trivial problem you need to think hard about the design of the code and ensure that the abstractions you implement are the right ones for the problem space. Following the "unix philosophy" of building small, simple components that each do one thing well but are highly composable is undoubtedly the more "boringly simple" approach in terms of the final result, but it's harder to do (in the sense that it may take more thought and more experience) than diving into the problem without thinking and cranking out a big ball of mud. Similarly, reaching for the correct data structure or algorithm often results in a massively simpler solution to your problem, but you have to know about it or be willing to research the problem a bit to find it.

The author did at least try to support his thesis with examples of "bad things software engineers do", but a lot of them seem like things that--in almost every organization I've worked at in the last ten years--would definitely be looked down on/would not pass code review. Or are things ("A forest of near-identical names along the lines of DriverController, ControllerManager, DriverManager, ManagerController, controlDriver") that are narrowly tailored to a specific language at a specific window in time.

> they care too much about the quality of their work and not enough about getting shit done.

I think the appearance of "I'm just getting shit done" is often a superficial one, because it doesn't factor in the real costs: other scientists and engineers can't use their solutions because they're not designed in a way that makes them work in any other setting than the narrow one they were solving for. Or other scientists and engineers have trouble using the person's solutions because they are hard to understand and badly-documented. Or other scientists and engineers spend time going back and fixing the person's solutions later because they are buggy or slow. The mindset of "let's just get shit done and crank this out as fast as we can" might be fine in a research setting where, once you've solved the problem, you can abandon it and move on to the next thing. But in a commercial setting (i.e. at a company that builds and maintains software critical for the organization to function) this mindset often starts to impose greater and greater maintenance costs over time.

[0] https://en.wikipedia.org/wiki/Anti-pattern#Big_ball_of_mud


> Lack of familiarity with common data structures and algorithms

This part I 100% agree with. I adapt a lot of scientific code as my day-to-day and most of the issues in them tend to be making things 100x slower than they need to be and then even implementing insane approximations to "fix" the speed issue instead of actually fixing it
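
A minimal illustration (hypothetical, not from any code I've adapted) of the kind of fix that removes the need for "insane approximations": swapping a list membership test for a set turns an O(n*m) loop into roughly O(n+m):

    # Slow: list membership is a linear scan, done once per query.
    def matches_slow(queries, reference_ids):
        ref = list(reference_ids)
        return [q for q in queries if q in ref]

    # Fast: build a set once; each membership test is O(1) on average.
    def matches_fast(queries, reference_ids):
        ref = set(reference_ids)
        return [q for q in queries if q in ref]

    queries = list(range(0, 2000, 2))
    reference_ids = list(range(1000, 3000))
    assert matches_slow(queries, reference_ids) == matches_fast(queries, reference_ids)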

>"Big ball of mud" design

Funny enough, this was explicitly how my PI at my current job wants to implement software. In his opinion, the biggest roadblock in scientific software is actually convincing scientists to use the software. And what scientists want is a big ball of mud which they can iterate on easily and which basically requires no installation. In his opinion, a giant Python file with a requirements.txt file and a Python version is all you need. I find the attitude interesting. For the record, he is a software engineer turned scientist, not the other way around, but our mutual hatred for Conda makes me wonder if he is onto something ...

>I think the appearance of "I'm just getting shit done" is often a superficial one, because it doesn't factor in the real costs: other scientists and engineers can't use their solutions because they're not designed in a way that makes them work in any other setting than the narrow one they were solving for.

For the record my experience is the exact opposite. The crazy trash software probably written in Python that is produced by scientists are often the ones more easily iterated on and used by other scientists. The software scientists and researchers can't use are the over-engineered stuff written in a language they don't know (e.g. Scala or Rust) that requires them to install a hundred things before they are able to use it.


> The mindset … might be fine in a research setting

A vast amount of software is written for research papers that would be useful to people other than the paper’s authors. A lot of software that is in common use by commercial teams started off in academia.

One of the major issues I see is the lack of maintenance of this software, especially given all the problems written in your post and the one above. If the software is a big ball of mud, good luck to anyone trying to come in and make a modification for their similar research paper, or commercial application.

I don’t know the answer to this, but I think additional funding to biology labs to have something like a software developer who is devoted to making sure their lab’s software follows reasonably close to software development best practices would be a great start. If it’s a full time position where they’d likely stick around for many years, some of the maintenance issues would resolve themselves, too. This software-minded person at a lab would still be there even after the biology researchers have moved on elsewhere, and this software developer could answer questions from other people interested about code written years ago.


This is the goal of the RSE field, but it's often still quite rare :(

https://us-rse.org/


That's fantastic, I haven't heard of this group before! I wish there was a lot more effort spent here.

This seems like a much better way to spend one's software development time and experience than, say, ad-tech... at least in my humble opinion :)


This was my exact experience working in biomedical hpc.


    >    * Not understanding dependencies, public/private, SCM or versioning, making their own code uninstallable after a few months
This is definitely true, but I've searched *far and wide* , and unfortunately it's not a simple task to get this right.

Ultimately, if there were a simple way to get data into the correct state in an OS-independent, machine-independent (from a Raspberry Pi to HPC, the code should always work), concise, and idempotent way, people would use it. There isn't. But there certainly could be.

The solution we desperately need is a basically a pull request to a simple build tool (make, Snakemake, just, task, etc) that makes this idempotent and os-independent setup simple. Snakemake works on windows and Unix, so that's a decent start.

One big point is matching data outputs to source code and input state. *Allowing ipfs or torrent backends to Snakemake can solve this problem.*

The idea would be to simply wrap `input/output: "/my/file/here"` in `ipfs()`. This would silently check whether the file is locally cached and return it; if not, it would go to IPFS as a secondary location to check for the file; and if the file isn't in either place, it would calculate it with the run command specified in Snakemake. It's useful to have this type of decentralized cache because it's extremely common to run commands that may take several months on a supercomputer and produce files that may only be a few MBs (exchange-correlation functional) or a few GBs (NN weights), so downloading the file is *immensely* cheaper than re-running the code - and the output is specified by the input source code (hence a git commit hash maps to a data hash).
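
A minimal Python sketch of what that `ipfs()` helper could look like. To be clear, this is hypothetical (the comment is proposing it, it doesn't exist as a Snakemake feature), the hash table is a made-up stand-in for a committed commit-hash-to-data-hash map, and the actual fetch is delegated to the real IPFS CLI's `ipfs get`:

    import os
    import subprocess

    # hypothetical mapping from project-relative output paths to IPFS content IDs,
    # which would be committed alongside the source code
    IPFS_HASHES = {"results/weights.h5": "QmExampleHashGoesHere"}

    def ipfs(path):
        """Return a local path, fetching it from IPFS first if it is not cached.

        If the file is neither cached nor known to IPFS, just return the path and
        let the build tool (e.g. a Snakemake rule) compute it as usual.
        """
        if os.path.exists(path):
            return path                      # already cached locally
        cid = IPFS_HASHES.get(path)
        if cid is not None:
            os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
            # fetch the content-addressed file with the IPFS command-line tool
            subprocess.run(["ipfs", "get", cid, "-o", path], check=True)
        return path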

The reason IPFS or torrent is the answer here comes down to a few things:

1) The data location is specified by the hash of the content, which can be used to build a map from git commit hashes of the source-code state to data outputs (the code uniquely specifies the data in almost all cases, and input data can be included for the rare cases where it doesn't).

2) The availability and speed of download scale with popularity. Right now, we're at the mercy of centralized storage systems, where the download rate can be as low as the provider wants it to be. LLM weights on IPFS, by contrast, can be downloaded very fast when millions of people *and* many centralized storage providers host the file.

3) The data is far more robust to disappearing. Almost all scientific data output links point to nothing (MAG, sra/geomdb - the examples are endless), for many reasons: academics move and the storage location is no longer funded, accounts get closed, or they simply don't have enough storage space for emails on their personal Google Drive and delete the database files from their research. Yet these files are often downloaded many times by others in the field - the data exists somewhere - so it just needs to stay accessible, by decentralizing the storage and letting the community download the file from everyone who already has it.
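The commit-hash-to-data-hash map itself can be as dumb as a JSON file checked into the repo - a rough sketch, with made-up file names:

    import json
    import subprocess

    def output_cid(manifest_path="data_manifest.json"):
        """Look up the content hash recorded for the current git commit.
        Returns None if this commit's outputs were never published."""
        commit = subprocess.run(["git", "rev-parse", "HEAD"],
                                capture_output=True, text=True,
                                check=True).stdout.strip()
        with open(manifest_path) as f:
            manifest = json.load(f)   # e.g. {"<commit sha>": "<ipfs cid>", ...}
        return manifest.get(commit)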

One of the important aspects to include in this build tool would be to ensure that, every time someone downloads a certain file (specified by the git-commit-hash-to-data-hash map) or uploads a file after computing it, they host the file as well. This way the community grows automatically, with a very low-resource, extremely secure IPFS daemon hosting all of the important data files for different projects.
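Re-hosting is the cheap part, if I remember the CLI right: `ipfs add` pins what it adds on the local daemon, and `ipfs pin add` does the same for content you just fetched - roughly:

    import subprocess

    def publish(path):
        """Add a freshly computed output to the local IPFS node (pinned by
        default), and return the CID to record in the manifest."""
        out = subprocess.run(["ipfs", "add", "-Q", path],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()

    def host_after_download(cid):
        """After fetching someone else's output, pin it so this machine
        serves it to the rest of the community too."""
        subprocess.run(["ipfs", "pin", "add", cid], check=True)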

Having this all achieved by the addition of just 6 characters in a Snakemake file might actually solve this problem for the scientific / data science community, as it would be the standard and hard to mess up.

The next issue to solve would be to popularize a standard way to get a package to use all available cores/GPUs/resources, from a Raspberry Pi to HPC, without any changes or special considerations. PySpark almost does this, but there's still more config than desirable for the community, and the requirement to install OS-level dependencies (Java stuff) to make it work from Python can halt its use completely (if the package using PySpark is a dependency of a dependency of a dependency, wet-lab biologists [the real target users] *will not* figure out how to fix that problem if it doesn't "just work"[TM]).
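For the cores part, the standard library already gets you most of the way on a single machine (it obviously won't span nodes the way Spark does) - a trivial sketch:

    import os
    from multiprocessing import Pool

    def run_on_all_cores(work, items):
        """Use every core the machine has, whether it's a Raspberry Pi or an
        HPC node, with no OS-level dependencies beyond Python itself."""
        with Pool(processes=os.cpu_count()) as pool:
            return pool.map(work, items)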


What you’re describing sounds like DVC (at a higher-ish, 80%-solution level, although my brain switched off at the mention of IPFS).

https://dvc.org/

See pachyderm too.


Of course, it's absolutely DVC. The problem is that I've never seen a DVC solution that makes the hosting decentralized, so all of the huge problems I listed still exist even with these DVC packages. What's more, on top of the cost of the hosting itself, some of the DVC packages cost money as well. So when a researcher deletes a file to make room for others on their storage provider, or moves institutions and their account gets deleted, the data is gone. The only way around this is to use torrent or IPFS.

Also, I'm not sure what your issue with IPFS is; if it's 'I saw something something crypto one time', that's a really poor argument. IPFS works completely independently of any crypto - it really has nothing to do with it. The solution could also be torrent - I don't care too much - it's just that IPFS can probably run with less resource usage on lower-power hardware because it's more modern (likely better algorithms in the protocol, better handling of modern filesystems, better performance, hopefully better security) and it's likely easier to integrate. But it doesn't matter if it's torrent, because it would work essentially the same way.


> making the hosting decentralized

> I'm not sure what your issue [is] with ipfs

IPFS is cool. I played around with it a few years ago. I even had a similar idea during my masters (and then discovered IPFS).

Decentralization is my issue here -- it's not necessary and it would be more of a blocker than a solution.


I'm not sure I understand?

If you want to host the data on Dropbox, Dropbox becomes part of the network, and the data is hosted by the community plus Dropbox or whatever service you like. The problem is that it is very prohibitive to use services like AWS. For instance, downloading all of Arxiv.org from AWS was around $600 last time I checked. Not every student trying to run an experiment has $600 lying around just to run the experiment. But with torrent/IPFS it would be free to download.

It only adds. I don't understand how it subtracts.


I just handed in my PhD in computer science. Our department teaches "best practices" but adherence to them is hardly possible in research:

1) Requirements change constantly, since... it's research. We don't know where exactly we're going and what problems we encounter.

2) Buying faster hardware is usually an option.

3) Time spent on documentation, optimization or anything else that does not directly lead to results is directly detrimental to your progress. The published paper counts, nothing else. If a reviewer asks about reproducibility, just add a git repository link.

4) Most PhD students never worked in industry, and directly come from the Master's to the PhD. Hence there is no place where they'd encounter the need to create scalable systems.

I guess Nr. 3 has the worst impact. I would love to improve my project w.r.t. stability and reusability, but I would be shooting myself in the foot: it's not publishable, I can't mention it much in my thesis, and the professorship doesn't check.


Putting some effort into (3) can increase your citations (h-index). If people can’t use your software then they will just find some other method to benchmark against or build on.

Here you are not improving your time to get out an article, but reducing it for others - which will make your work more influential.


> 3) Time spent on documentation, optimization or anything else that does not directly lead to results is directly detrimental to your progress.

Here is where I disagree. It's detrimental in the short term, but to ensure reproducibility and development speed in the future, you need to follow best practices. Good science requires good engineering practices.


The point is, it's not prioritized since it's not rewarded. Grad students are incentivized to get their publications in and move on, not generate long-term stable engineering platforms for future generations.


An experimental research system does not have to be a complete practical system, it can focus on a few things to prove a point, support a scientific claim.


Indeed. It doesn't have to consistently work, be easy to modify, be efficient, be well documented, etc., and in general usually won't be since there is no reward for any of these. It just has to "prove a point" (read: provide sufficient support for the next published paper, with paper reviewers caring far more about the paper's text than any associated code or documentation).

Anyone who spends lots of time trying to make research-relevant code projects with solid architecture / a well designed API / tests / good documentation / etc. is doing it as a labor of love, with the extra work as volunteer effort. Very occasionally a particularly enlightened research group will devote grant money to directly funding this kind of work, but unfortunately academia by and large hasn't found a well organized way to support these (extremely valuable) contributions, and lots of these projects languish, or are never started, due to lack of support.


Universities are full of smart people, who know what works best for them. I doubt they would ignore extremely valuable work.


What's your point? In practice, people doing work on solid research infrastructure code don't get social or financial support, don't get tenure, often can't keep academic jobs, and typically end up giving up and switching to (highly paid and better respected) industry work. Sometimes that code ends up supported as someone's part-time hobby project. If you hunt around you can find this discussed repeatedly, sometimes bitterly. Some of the most important infrastructure projects end up abandoned, with no maintainers.

In practice, most research code (including supposedly reusable components) ends up getting written in a slipshod ad-hoc way by grad students with high turnover. It typically has poor choice of basic abstractions, poor documentation, limited testing, regressions from version to version, etc. Researchers make do with what they can, and mainly focus on their written (journal paper) output rather than the quality or project health of the code.


Never had a paper rejected for lack of reproducibility though. And as long as I am working for the PhD and not the long term career, it's still better to focus on the short term. I don't like it, but I feel that's where I ended up :(


I agree. I've been doing devops recently, but I'm back to some coding at work, and I wrote the function as simply as I could, adding complexity only as needed.

So it started as an MVC controller function that was as long as your arm. Then it got split up into separate functions, and eventually I moved those functions to another file.

I had some genuine need for async, so added some stuff to deal with that, timeouts, error handling etc.

But I hopefully created code that is easy to understand, easy to debug/change.

I think years ago I would have used a design pattern. Definitely a bridge - because that would impress Kent Beck or Martin Fowler! But now I just want to get the job done, and the code to tell a story.

I think I pretend I am a Go programmer even if I am not using Go!


Congrats, you used design patterns.


Still, I have the self-made pat on my back.


Yeah nah.

There was the flawed model out of Imperial College (IIRC) during the early covid days that showed just how wrong this attitude is.

It was so poorly written that the results were effectively useless and non-deterministic. When this news came out, the scientists involved doubled down: instead of admitting that coding might be hard, and that getting in a few experts to help out might be useful, they blamed software engineers for how hard C++ is to use.


In other words, programmers tend to over-engineer, and non-programmers tend to under-engineer. Despite all the arguments here about who’s making the biggest messes, that part is not surprising at all.

Both are real problems. Over-abstraction and over-engineering can be very expensive up front and along the way, and we do a lot of it, right? Under-engineering is cheaper up front but can cause emergencies or cost a lot later. Just-right engineering is really hard to do and rarely ever happens because we never know in advance exactly what our requirements and data really are.

The big question I have about scientific environments is why there isn’t more pair-programming between a scientist and a programmer? Wouldn’t having both types of expertise vetting every line of code be better than having each person over/under separately? Ultimately software is written by teams, and it’s not fair to point fingers at individuals for doing the wrong amount of engineering, it’s up to the entire team to have a process that catches the wrong abstraction level before it goes too far.


it's exclusively because engineers are more expensive than grad students


Can you elaborate? What is answered by engs vs grad students? What grad students are we talking about?


Programmers want to embed domain terms everywhere. They look at scientific code and expect to see variables names containing "gravity," "velocity," etc.

Scientists need code to conform to the way they examine, solve, and communicate problems. I asked for an explanation of a particular function and was sent a PDF and was told to look at a certain page, where I found a sequence of formulas. All of the notation matched up, with the exception that superscripts and subscripts could not be distinguished in the code. To a programmer, the code looked like gibberish. To the scientists working on the code, it looked like a standard solution to a problem, or at least the best approximation that could be given in code.

You see the inverse problem when it comes to structuring code and projects: programmers see standard structures, expected and therefore transparent; scientists see gibberish. Scientists look at a directory called "tests" and think of a variety of possible meanings of the word, none of them what the programmer intended.


The programmer's naming approach has the virtue of being self-explanatory, and thus more maintainable. Scientists don't care about maintainability. Their bar is reproducibility, and even for that they don't expect it to be as painless as an automated test.


Even the variable names used by programmers are abbreviations for a longer description: longer than one letter, but still shorter than a sentence.


Unless they're old skool enterprise Java programmers!


While I think there are a couple of valid points, in general my feeling is that the author is setting up a straw man to attack.

Most of the “programmer sins” are of the type that more seasoned engineers will easily avoid, especially those with experience working with scientific code. Most of these mistakes are traps I see junior developers falling into because of inexperience.


I think we have a case of survivorship bias.

A considerable majority of the science-non-SWE crowd are de facto incapable of writing more than 100 lines of runs-in-my-notebook code.

Hence, if a change/bug is necessary, it is much likelier to fall under a SWE jurisdiction, and hence is much more likely to be industrial code.

Add to that a further confounder (tiptoeing a "no true Scotsman" here): academia is not a first choice of workplace for strong SWEs.


Read this on mobile and the identifier longWindedNameThatYouCantReallyReadBTWProgrammersDoThatALotToo overflowed into the margins - I regard this not as a bug but a feature which helped make the author’s point :-)


That’s why I fell in love with Objective C. The libraries used a lot of those expressive descriptions for attributes and methods.

I never understood, nor do I understand, people who nest their inner loops in an entangled mess of hardly distinguishable digits, which is error-prone.

Same for method names.

I try speaking out loud to some of my methods: "What do you do?" If the answer is getValue, I believe it needs renaming.


We use this technique as a guide in our company. If someone (knowledgable) would ask "What does this method call do?" and the method name does not answer that, your PR doesn't go in the master.

E.g. getString(path) becomes loadConnectionStringFromDisk(configFilePath), and tryConnect(30) becomes testSqlConnection(timeoutInSec); now even a new reader knows what happens here and what input is expected.
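In Python the same test would look something like this (names invented, bodies elided):

    # Fails the "what does this method call do?" test:
    def get_string(path): ...
    def try_connect(n): ...

    # Passes it:
    def load_connection_string_from_disk(config_file_path): ...
    def test_sql_connection(timeout_in_sec): ...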


> Invariably, the biggest messes are made by the minority of people who do define themselves as programmers.

After 15 years of writing JavaScript professionally I know that is a lie. The biggest messes are made by the majority of people hired that cannot really program.


I guess this could be an economics thing. Stereotypically, maintenance of scientific codebases is not very lucrative, and the mental kick (IMHO you need to enjoy high-performance numerical computing to be truly good at it) can be had for much better compensation doing stuff like CAD or game engines. So I would imagine that if the author has lots of experience with "professional programmers" maintaining their scientific codebase, the talent pool they are sampled from is not necessarily optimal for high-output individual contributors.

My intent is not to put down maintainers of scientific software! It's super cool and super important.

I see the damage a person with decades in an industry can do when they cluelessly and energetically start to test and implement a shiny new thing on an industrial codebase.

When the product brings in hundreds of millions a year, there is an incentive to patch up the damage so you can have future releases and continue the business. I'm not sure how many resources the maintenance of a scientific codebase could muster just to patch up a mountain of architectural and runtime damage.


The goal of 95% of JavaScript in the wild is as mild as responding to user interactions and putting text on screen. It's beyond trivially simple, but almost nobody is well trained in either the language or the browser. As a result, most people come in with assumptions about how things should work, shaped by their education or experience in unrelated languages, and boy are most of those assumptions wildly incorrect. On top of that, most JS developers skew extremely young and are wildly insecure about complex data structures.

The result is a complete inability to program. Most people need really large tools to do more than 80% of the heavy lifting and they just write a few instructions on top of it. The perspective then becomes you need more advanced technologies to do cool things, because everything is too scary or mysterious otherwise.


Want to give any examples or reasoning rather than state pure opinion? I’m not a fan of using “lie” when you believe something isn’t true. Lie implies intentional dishonesty, and there’s absolutely no reason to suspect the author doesn’t believe what they said. Their experience certainly could have involved larger messes made by programmers than scientists. Just say you think it’s not true, and why, even if lie seems funny or you don’t mean to imply dishonesty.

It appears that you are not even talking about the same problem as the author. You seem to be talking about people who all define themselves as programmers, some of whom have more experience than others. The author wasn’t talking about new-hire programmers, they were talking about experienced physicists, chemists, biologists, etc., who have been doing some programming, possibly for a long time.

Either way, most of my experience is with all-programmer teams, and I have to say I’ve seen the experienced programmers make far bigger and costlier messes. The people who can’t really program might always make a lot of messes, but they make very small messes, and nobody puts them in charge of teams or lets them do much process-critical work without oversight or someone re-writing it. I’ve watched very good, very experienced programmers make enormous mistakes, such as engaging in system-wide rewrites that turned everything into a mess, cost many millions of dollars, took years longer than estimated, and ended with them admitting it was a mistake. There was also the time a senior programmer tried to get really clever with his matrix copy constructor and caused an intermittent crash bug, only in release builds, that triggered team-wide overtime right before a deadline. He was incredulous at first when we started to suspect his code, and I had to write a small ad-hoc debugger just to catch it. I calculated the dollar cost of his one line of cleverness at several tens of thousands of dollars.


Most people who write JS professionally cannot program, or at least cannot program in JavaScript (though "cannot program at all" is more generally true). More than 90% of people doing this work, for work, are fully reliant upon multiple artificial layers of abstraction. For example, if you take away a developer's favorite framework they suddenly become hopelessly irredeemable. Even with their favorite framework, if you ask most developers to write original functionality beyond merely putting text on screen, such as a common CRUD app, they are hopelessly lost.

This becomes immediately clear when you confront developers about it. Most of their answers will be irrational qualifiers which might make sense to them, but from a perspective of objectivity and product delivery it's really mind-blowing. In most cases the insanity stems from poor preparation, followed by what then become unrealistic expectations.

Just as a real-world experiment, ask a front-end developer to write to the DOM directly. The DOM is the compile target of the browser, accessed via a standard API that can be mastered in less than 4 hours of practice. Despite that, prepare to be underwhelmed and dazzled by the equivocations, unfounded assumptions, red herrings, and so forth. The DOM is just an in-memory data structure with a standard API, but it seems large data structures scare people.

---

All a person really needs to know to be good at this language:

* Functions are first-class citizens. This means a function can be expressed or referenced anywhere a primitive can be used. This is incredibly expressive.

* Lexical scope is native. This means lexical scope is always universally on, not hidden behind syntax, and can never be turned off. This is also incredibly expressive.

* OOP is optional. The language never forces OOP conventions upon the developer, which is great because the concept of polyinstantiation, on which OOP is based, greatly increases complexity.

* The language is multi-callstack. This is commonly referred to as the event loop, and allows executing externalized instructions without locking up the language.

* A casual understanding of navigating data structures.

That being said anybody can build large, fast, robust applications in JavaScript using only functions, statements/expressions, events, and data structures. TypeScript interfaces help tremendously as well. Despite this most developers need all kinds of vanity to make sense of the most simple tasks and anything original is like asking people to crawl across the Sahara.

> Want to give any examples or reasoning rather than state pure opinion?

It's based upon 15 years of doing that work professionally for multiple employers. By far the biggest messes in this language come from the absence of confidence in the developers writing it. I imagine scientists (non-professional programmers) writing messy software are at least passionate enough about their subject matter to do it well enough the first time that they aren't spending the rest of their existence fixing bugs, regressions, and performance traps of their own creation.

Perhaps the word lie was incorrect and something like wrong in practice would have worked better.


This is so true I don't think I ever read something so true.

It's not even scientists vs software developers. It's people who are really into software development and clean code.

They say the program needs a total rewrite and proceed to add 20 layers of inheritance and spread every function out over 8 files.

Ever since, I make sure to repeat my mantra to developers every week:

How maintainable code is, is measured by how many files you have to edit to add one feature.


>They say the program needs a total rewrite and proceed to add 20 layers of inheritance and spreading out every function over 8 files.

Anyone who in 2023 still thinks inheritance is a good idea for anything other than a few very specialised use-cases is not somebody who seriously cares about the craft of software development, nor somebody who's put any effort into studying programming theory and moving beyond destructive 1990s enterprise Java practices. Widespread usage of inheritance inevitably makes code harder to reason about and refactor, as anyone who's compared code in Java to code for similar functionality in Rust or Go would see (both Rust and Go deliberately eschew support for inheritance due to the nightmares it can cause).


(Ab)use of any paradigm (I'll need a shower for using that word) can result in nightmares. Inheritance has its place and it is definitely useful in more than a "few specialised cases". It can get out of hand and it can become a nightmare. Composition has its place, and it is definitely not better than inheritance except in a "few specialised cases". It can also result in nightmares; just wait till adoption of Rust and Go is at the level of Java and C++ in enterprise environments and you will see.

Writing clean and maintainable code should be the best practice, and writing obfuscated code for performance and security should be reserved for a "few specialised cases", but most developers and languages prefer the short and obfuscated to the clear and (slightly) longer. Rust and Go are perfect examples of why software development is an immature engineering discipline that favors "cool" and "terse" over clear and expressive... and no, C and C++ are not "the good old times", they are old and slightly worse - or, I should clarify, Go and Rust are not much better, because they still do not allow the user (programmer) to express intent clearly and instead force the reader of the code to sound like a person with a severe speech impediment.


>Writing clean and maintainable code should be the best practice and writing obfuscated code for performance and security should be reserved for "few specialised cases"

Except, we can (fairly objectively) reason about performance and security, while 'clean code' and 'maintainability' are arbitrary, with vague guidelines at best.

Throwing out those first characteristics in name of the latter ones is just irrational.

(Not to even mention that performant and safe code still can be 'clean')


There's probably a Someoneorother's Law or Something Fallacy about this, because it's a common problem, especially among people who fancy themselves More Rational (and thus More Intelligent) than others:

You are assuming that the only things that matter are those that can be objectively measured (and measured simply and straightforwardly, with well-known metrics today).

Developer frustration, which will increase when having to deal with messy, unmaintainable code, is a real thing, even if it's harder to measure than performance and security. Not only does it create real stress and thus harm to the developers, it also slows development in ways that are going to be much less consistent and predictable than what's needed to write clean, maintainable code in the first place.

(Also, of course, there are at least some fairly well-accepted standards of clean, maintainable code, even if some aspects of those aren't entirely agreed on by everyone, and painting them as completely arbitrary, subjective things is just wrong.)


> There's probably a Someoneorother's Law or Something Fallacy about this

I once spent a couple of hours researching the origin of that famous phrase "you can't improve what you don't measure", so I could blame it correctly.

The idea is quite old, of course, and was popular with the 19th-century rationalists. But the version people keep teaching around today seems to be a strawman created by Deming, in the 80s, in a speech about how stupid that idea is.

Anyway, I guess we need some Othersomeoneorother's Law about how you just can't make a good point against an idea without someone taking your point, preaching it unironically, and making a movement in support of the idea.


>You are assuming that the only things that matter are those that can be objectively measured.

No. Feelings do matter. But the problem is, what do You do when You have 2 people with conflicting feelings?

>Also, of course, there are at least some fairly well-accepted standards of clean code.

Are there though?

>even if some aspects of those aren't entirely agreed on by everyone, and painting them as completely arbitrary, subjective things is just wrong

Even if I grant You that there are some guidelines that are respected by an overwhelming majority, that still doesn't prevent them from being arbitrary.


> But the problem is, what do You do when You have 2 people with conficliting feelings?

Hopefully, you try to work it out like adults, rather than just declaring that your way is the only rational way, and anyone else's feelings need to pound sand.

Furthermore, this isn't primarily about "feelings" in the sense of "this hurt my feelings;" this is primarily about adding unnecessary stress to developers' lives. Stress is something that is scientifically proven to increase susceptibility to diseases and cancers, and reduce lifespans, so it seems to me that this should be enough objective and rational evidence that we should be genuinely trying to reduce it.

> Are there though?

Well, I think most people would agree that putting an entire C file on one line is a pain to work with, even if skipping the "unnecessary" whitespace does save a little space.

And naming your variables alphabetically based on the order you use them in (eg, `int alpha`, `char bravo`, `std::string charlie`) makes the code hard to maintain.

"Well, but that's just obvious stuff! No one would ever do that!"

I guarantee you someone would do just about any boneheaded thing you can imagine in programming unless told not to, either out of spite or because their brain really just works that way.

Just because you've made a bunch of assumptions about how people would or should code doesn't mean that those assumptions are any less arbitrary than anything else.

> that still doesn't prevent them from being arbitrary.

...But that's the thing. They're not. Just because they're not deeply well-researched to ensure that this particular set of coding standards measurably increases performance and decreases stress while maintaining code doesn't mean that maintainability is an arbitrary thing. It just means that it hasn't been adequately studied yet.

...or maybe it just means you haven't[0] looked[1] enough[2] yet, and the research that's out there hasn't yet had time to coalesce into any kind of industry-wide action.

Furthermore, it sounds very much like you're saying that coding standards like K&R, or C++ Core Guidelines, or PSR-2, are entirely arbitrary. They're clearly specified, they're written down and easy to reference, they codify plenty of aspects of coding style—but are all coding standards, no matter how well-respected, completely arbitrary?

[0] https://www.researchgate.net/publication/299412540_Code_Read...

[1] https://www.hindawi.com/journals/sp/2020/8840389/

[2] https://www.researchgate.net/publication/303870101_Software_...


> Except, we can (fairly objectively) reason about performance and security, while 'clean code' and 'maintainability' are arbitrary, with vague guidelines at best.

Ok, how does one best reason about performance and security with messy unmaintainable code?

You barely need to try even shallow reasoning about a code base at all before its clean-vs-messy and maintainable-vs-unmaintainable status will feel very objective and pertinent.


>Ok, how does one best reason about performance and security with messy unmaintainable code?

The same way one does it with 'clean' code: using profiling tools. Security is a bit less straightforward, but still.

>will feel very objective.

Keyword: feel. And while most people can probably agree that terrible code is terrible, the 'less terrible' the code is, the more this argument becomes a feeling. And then we hit a point where it's no longer possible to discuss things using objective arguments - how will a senior Java developer, who is used to heavy OOP-style coding, reason with a senior C developer for whom such an OOP-heavy style is the opposite of 'clean'?

Of course, the example is (too) simplistic, but even in this thread You have people arguing about "big functions" vs splitting things up. And unlike performance, where You can always just point to the raw numbers, here You can't rely on any sort of 'objectivity'.


Certainly you can point to numbers re performance. But reasoning about performance is another thing if the code is a mess.

Messiness and unmaintainability do not scale. You can argue it's not a completely objective thing to measure. That is true. And yet, objectively, it really matters.

Not everything that is important comes with a clear number. Which is why experience and good judgement matter too. Two more things without numbers.


Again though: What do You do when someone thinks Your code is a mess, but You think it's good?


What kind of question is that?

It is impossible to answer without all the details of the situation. Different situations will have completely different answers. Who, what, where, how, when, why, ...?


Mmm. Maybe let's stop and address the question of "what are the key properties of inheritance?". Because I haven't seen a single use for it in maybe a decade or so and I'm not sure what it is you think you're defending.

Usually what people want is an interface; ie, a somewhat generic way of saying "this thing knows how to draw itself", "this thing supports printing" or "this thing can fizzle wuzzles like all the other wuzzle fizzlers". That is essential to coding.

But Java style inheritance carries a lot of baggage in excess of that; and some of it is just bad news. In practice it is a brittle assumption that a Foo is also and always a precise superset of Bar. And usually when that is true the relationship is so shallow having a dedicated concept of inheritance is wasteful, it may as well be an interface and a shared file of code.

TLDR; Inheritance is too many ideas mixed together; most useful and a couple bad. It is a better idea to present the different facets of inheritance to be selected a la carte. Mumble mumble Rich Hickey talks.


Yep. Inheritance is 3 or 4 different features in a trenchcoat, and most of them are bad.

Interfaces are good.

Method overriding for specialization or for creating mini-DSLs (Template Method pattern) is often problematic and is better replaced by composition, or by having the "overridden" methods in a separate class.

Implementation Inheritance is certainly the worst form of "code reuse", and there's a reason people recommend composition over it since the 90s.

Using it for hierarchies (Dog inherits from Mammal, Mammal inherits from Animal) is just terrible and a joke at this point.
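A rough Python sketch of what "interfaces plus composition" looks like instead (typing.Protocol as the interface, names invented):

    from typing import Protocol

    class Speaks(Protocol):              # an interface: just a capability
        def speak(self) -> str: ...

    class Barking:
        def speak(self) -> str:
            return "woof"

    class Dog:
        """Composition: a Dog *has* a voice; no Mammal -> Animal hierarchy."""
        def __init__(self, voice: Speaks) -> None:
            self.voice = voice

        def greet(self) -> str:
            return self.voice.speak()

    print(Dog(Barking()).greet())        # woof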


The problem with inheritance is that no one seems to know where to use it best. Everyone just goes by feel and feelings frequently turn out to be wrong.

Composition is nice because it’s very simple and we can understand it mathematically. If you’re trying to understand inheritance mathematically then you’re basically left with using it only for algebraic structures (groups and rings and fields and vector spaces). But then you don’t really need inheritance there if you just have plain types and operator overloading.


Operator overloading leads to less readable, maintainable code IMHO, because you have to go off and figure out if the operator means a special thing in a given context.

Certainly adds to complexity in C++.


It works well when you stick to math and follow mathematical laws. Overloading the addition operator to allow you to add two vectors is great, as long as you make sure you don’t break the laws of vector addition in your vector space.

Overloading addition to mean something else entirely? That’s a problem!

It would be great if type systems could allow us to set up these laws and enforce them at compile time, but then you go down the whole rabbit hole of automated theorem proving.
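For what it's worth, even without a fancy type system the disciplined version stays small - a toy Python sketch of "addition means component-wise addition and nothing else":

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Vec2:
        x: float
        y: float

        def __add__(self, other):
            # component-wise addition: exactly what '+' means for vectors,
            # so commutativity and associativity come along for free
            return Vec2(self.x + other.x, self.y + other.y)

    assert Vec2(1, 2) + Vec2(3, 4) == Vec2(4, 6)
    assert Vec2(1, 2) + Vec2(3, 4) == Vec2(3, 4) + Vec2(1, 2)   # commutative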


Rust and Go both provide features to implement code in an OO-ish way: traits and interfaces. I just code in Rust as a hobby but I code in go professionally. The go codebases I work on at work are bloated messes of abstractions and duck typing, often meant to enforce some absurd standard of unit-testing. They can easily be as bad as any "enterprise Java" stereotype you wish to invoke.


I hate to say this but this attitude of "it's 2023, inheritance is so 2008" is telling. How do you know the attitudes regarding best practices today are not going to be as bad as 20 layers of inheritance?


Wrong.

How maintainable code is is measured in how well you know where to change something, and how certain you are that it did the right thing without side effects.

The fatal error of the linked article is that bad scientific code often suffers from correctness problems - not just theoretical concerns, but the "negates the main point of this paper" kind of thing.


What if a new person takes your place and they do not know it and are not certain in anything?


When you hire a new person, there is usually some transfer of knowledge. Based on the above definition, such transfer should be quick, assuming the person knows the platform and the dependencies. A long time spent explaining the code is an indicator of needlessly complicated code.


That's how you measure job security, which is a slightly different concept than maintainability.


To be honest it's mostly not their fault. Most people want to do the right thing and that's what they're taught. Doing things differently is frowned upon, and most people don't want to stick their neck out and say the emperor is naked.

Recently at work some people argued that "things" (methods, classes, even files) should have a limit on their size. I think that's valid thinking, because you want to strive for smaller components that you can reuse and compose, if you are able to do that. But what happened is that people started creating dozens of little files containing one function each and then importing those. To me it's obvious that this is now a lot worse, because the complexity is still the same, just spread out across dozens of files. But most people were somehow convinced that they were "refactoring" and following the best practice of keeping things small.


Yes, the main problem I've seen with focusing too much on length (of functions, files, or whatever else) is that people start spending tons of time rearranging the big messy drawer into n smaller messy drawers while totally avoiding the difficult work that needs to be done to actually organize the drawer(s).

All other things being equal, smaller functions and smaller files are a little bit better, but what really matters is architectural and conceptual complexity. Keeping those in check is all about using the right data structures (and doing painful refactors when you realize you've got the wrong ones). It has almost nothing to do with how files and functions are organized.


> They

Who are these people, really? I mean, I was that for a rather brief period of like a year, somewhere 3-4 years into programming IIRC. I met others like it. But in the end people seem to learn and grow out of it, and rather quickly so, because it becomes clear what the issues are quickly. Really good learning by mistake though, wouldn't have wanted to miss out on it.

> mantra

That's imo just another mistake to make: wrapping a rather strict and narrowly scoped principle in a paradigm-like-must-be-followed mantra hurts the right-tool-for-the-job idea, which feels vastly superior.

Firstly the idea that unmaintainable code is necessarily an issue is already wrong to start with, in my book. Obviously it's not ideal and where appropriate - meaning nearly always - should be avoided at pretty much all cost, but I have enough examples where it does not matter at all. As in: code which hasn't been touched in 20 years and probably won't ever be touched. Does it look like a nightmare? Yes. Does it work correctly? Yes. Does it need changing? No. So, is it an issue where spending time (or having spent time) on it would make anything other than the programmer's peace of mind (well, or ego perhaps) better? Clear no.

Secondly: of course I get where you're going with such definitions, but it again lacks the very much needed nuance. I can write unmaintainable code in one file where you still need to change 20 different locations. You could then claim that your mantra still applies because the code should have been split over 20 files, but yeah, that's what you get with mantras :) Likewise, depending on the feature it's perfectly possible that many files have to be changed, but that doesn't necessarily mean the code is hard to maintain. You could try to claim that it wasn't very well architected to start with, maybe, but welcome to the real world, where not everything can be thought of from the very start except in small toy applications.


Then a single file project is the most maintainable software project?


> How maintainable code is is measured in how many files you have to edit to add one feature.

Trivially false: if all of the source code is kept in one file then you only ever have to edit one file. No matter how many lines of code are in that file, any such system would be maximally maintainable by your definition.

Maintainability is not so clear-cut. It's currently an imprecise measure of organizational clarity that makes it straightforward to extend a system in ways that are needed or will be needed.


Am I missing something? Opening files is not my most intensive work as a developer.


It's a decent measure of complexity: it's not that "opening files" itself is work-intensive, but having a lot of files smells of over-engineered code. One long yet simple function has less cognitive overhead than spreading the logic across multiple classes or functions or call hierarchies (themselves spread over multiple files).


> One long, yet simple function has less cognitive overhead than spreading the function across multiple classes or functions or call hierarchies

Not if you are encapsulating and naming effectively...

Why read 100 lines when you can read 20 and find concerns in one routine you are concerned with?

Function calls can be expensive. However, optimization can come whenever you need it, and if what you need is one call vs 5, it is trivial to move that code back into a single routine.


> Not if you are encapsulating and naming effectively...

No, and this is one of the reasons inheritance has lost popularity. Splitting some functionality across many files adds significantly to the cognitive load of figuring out what code is actually even running. After you trace that information out, you need to keep it all straight in your head while debugging whatever you’re working on. That’s even more problematic when you’re debugging, which implies you already don’t really understand what the program is doing.

And that’s in the case where things are named well. When they’re inevitably accidentally named in confusing or incorrect ways that can contribute to the bug itself and cause the code to be even more confusing.

Extreme levels of encapsulation have their own issues when, actually, the original author was wrong and you really do need public access to some member. No one writing code is clairvoyant, so excessive encapsulation is common.


> Splitting some functionality across many files adds significantly to the cognitive load of figuring out what code is actually even running.

This is the crux: if your goal is to figure out what code is running, if you can keep the program in your head, if you have small simple programs, then splitting things up is harmful.

But there is this murky line, different for everyone, and even different for the same person from day to day, where even with the best intent, no matter how good you are, you can't keep the program in your head.

At that point, you need to give up the idea that you can. Then you change perspective and see things in chunks, split up, divide and conquer, treat portions as black boxes. Trust the documentation's pre- and post-conditions. Debugging becomes verifying those inputs and returns, only diving into the code at the next level down when those expectations are violated.


But at some point you HAVE to be able to look at the program from above. If you abandon the hope of understanding the code in the bigger scope, how can you ever meaningfully modify it? (Ie add a big feature and not just tweak some small parameters)


The rather unsatisfying answer, is it depends.

It depends on the change. It depends on the code organizational structures. It depends on the consistency of the code. It depends on the testing setup. It depends on the experience of the person changing it. It depends on the sensitivity of the functionality. It depends on the team structures.


There is however one reason that trumps them all: the actual reason the code was split.

Separating the code of your SQL server, HTTP server, Crypto Library, Framework, Standard Library, from your CRUD code is perfectly fine, and people understand this concept well, and even the most fervent anti-Clean-Code person won't complain about this separation existing.

But there is a good reason we separate those things from our CRUD codebase: it's because they can function separately fine, they're reusable, they're easy to isolate/reproduce problems, and they're at a totally different abstraction level.

The problem is separating code from the same level of abstraction, such as breaking a business logic class into many for mainly aesthetic reasons, such as method/class/module length, or to avoid having comments in the code (again as recommended by Clean Code), things that people are mentioning here in this thread.

EDIT: As someone said above, "20 files with 20 functions each does not cause high cognitive load, if the scope of each file and each function makes sense". In the end it's not the length or the number of methods/classes that matter, but how well separated they are. Having hard rules does not automatically make for good code, and it's often quite the opposite.


One last thing to consider: if you are writing a little CRUD app, it can be very simple, and you can keep it in your head.

However, can you?

You are using black-box code from a web server, a SQL database, the operating system, crypto libraries, and a ton more; you don't dive into that source code except in extraordinary circumstances, if you even can. In a large program, you end up treating code owned by you or your company the same way.

In this scenario you are still making large meaningful changes by focusing on the level of abstraction you are at.


I like long simple functions because it makes them easy to reason about when debugging.

Rarely does having more functions solve “does this do what I expect.”


Maybe it boils down to how well you are able to navigate a code base.

With a full-featured language specific IDE, it is very easy to navigate through even complicated spaghetti. It makes debugging call traces simple, with a GUI.

However, many other file viewers and editors make this much more complicated, and it can be frustrating to follow code that is making heavy use of modularization.

If you are grepping your way through a deeply modular code base it can quickly become difficult to keep track of anything.


>> With a full-featured language specific IDE, it is very easy to navigate through even complicated spaghetti.

If you need a fancy IDE to navigate around code in order to understand it, that might be crappy or poorly organized code.

Not a dig at nice IDEs, just code that requires one to navigate and understand.


Yeah. This is one of those hammer cases. If you’ve got a fancy IDE then the temptation is to use it. Similar to the issue of game programmers being given top of the line gaming PCs with frequent upgrades. They then struggle to understand why the game they just released runs like crap on most people’s modest computers.


Not many people can understand a very large code base without taking notes, using an IDE, or similar tooling.

> Not a dig at nice IDEs, just code that requires one to navigate and understand

A nice IDE helps you reason about code, no matter what the underlying architecture is. That is why there is a market for them.


> Not if you are encapsulating and naming effectively...

Encapsulation is hard and a lot of what people call encapsulation isn’t. For example, taking a global variable and moving it to a class is not encapsulation. You have to actually do the hard work of removing the dependency on global shared state. Just changing everything to mutate the new global through an accessor to a “god” object that gets passed everywhere is accomplishing nothing at all. Worse than nothing, you’re complexifying without fixing the root problem: global mutable state.
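A tiny Python illustration of the difference (contrived names): the first version only relocates the global into a passed-around object, the second actually removes the shared mutable state:

    from dataclasses import dataclass

    # "Encapsulation" that changes nothing: one mutable object passed everywhere.
    class GodConfig:
        retries = 3                      # still effectively global mutable state

    def fetch(url, cfg):
        cfg.retries -= 1                 # any caller can silently mutate it
        ...

    # Actual encapsulation: the dependency is explicit and immutable.
    @dataclass(frozen=True)
    class RetryPolicy:
        retries: int = 3

    def fetch_with_policy(url, policy):
        for _ in range(policy.retries):  # nothing shared, nothing mutated
            ...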


It's funny how Singletons became such a meme pattern, and how about 80% of people in interviews only know about it when asked about patterns.

A cleverly-named way of disguising global mutable state does not make it better.


> "Not if you are encapsulating and naming effectively..."

When you only have to superficially skim the code, that works.

If there are incorrect abstractions, such as logging, transaction logic or manual error handling mixed with "well named function calls", then it is already very problematic even to skim.

If you have to debug, it quickly becomes torture. Especially if state is involved and shared between multiple methods or classes.

If you have to reimplement the code: you're probably fucked.


Adding indirection makes code less maintainable.


That's not correct as an unqualified statement. Sometimes indirection adds exactly the flexibility you need and that would otherwise require duplication (like generic collections/containers).

The more correct statement is that using more or less indirection than you need makes code less maintainable.


> Adding indirection makes code less maintainable.

This is why I hated Fortran (77 in particular) as an applications language (for tasks like scientific computing people seemed to use saner portions of it). Computed go tos were the bane of my existence.


Working with a 300 line method is not fun, believe me. Everything is in one place and you don't have to change many files, yes, but due to the cognitive load, it's so much more effort to maintain it.


There are some things that should be in one long function (or method).

Consider dealing with the output of a (lexical) tokeniser. It is much easier to maintain a massive switch statement (or a bunch of ifs/elseifs) to handle each token, with calls to other functions to do the actual processing, such that each case is just a token and a function call. Grouping them in some way not required by the code is an illusory "gain": it hides the complexity of the actual function in a bunch of files you don't look at, when this is not a natural abstraction of the problem at all and when those files introduce extra layers of flow control where tricky bugs can hide. Or see the "PLEASE DO NOT ATTEMPT TO SIMPLIFY THIS CODE" comment from the Kubernetes source[0]. A 300 line function that does one thing and which cannot be usefully divided into smaller units is more maintainable than any alternative. Attempting to break it up will make it worse.

That being said, I agree that nearly all 300 line functions in the wild are not like this.

[0] https://github.com/kubernetes/kubernetes/blob/ec2e767e593953...
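For anyone who hasn't written one: the shape being defended looks roughly like this in Python (hypothetical token kinds and handlers), long but flat, with every case being one call:

    from collections import namedtuple

    Token = namedtuple("Token", "kind text")

    def emit_identifier(tok): print("ident:", tok.text)
    def emit_number(tok):     print("number:", tok.text)
    def emit_string(tok):     print("string:", tok.text)

    def handle_token(tok):
        # one flat dispatch: in a real lexer this runs to hundreds of lines,
        # but each case is just "token kind -> one handler call"
        if tok.kind == "IDENT":
            emit_identifier(tok)
        elif tok.kind == "NUMBER":
            emit_number(tok)
        elif tok.kind == "STRING":
            emit_string(tok)
        else:
            raise SyntaxError(f"unexpected token {tok.kind!r}")

    handle_token(Token("NUMBER", "42"))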


There's a happy middle path here I think. Long functions are hard to grok. Spreading the logic across 20 files also increases cognitive load. There's a balance to strike.


Long functions are not hard to grok, if they have a logical flow and stay reasonably close to a common level of abstraction (which can be high or low, doesn't matter). You just read top to bottom and follow the story.

20 files with 20 functions each does not cause high cognitive load, if the scope of each file and each function makes sense. You easily find the file+function you need, whenever you need to look something up and the rest of the time it is as if the rest didn't exist.

Good code can come in any shape. It is not shape itself that is important, it is the "goodness" that is important.


Yep. Long functions can be easier to read and simpler to follow than multiple methods, if they're well documented.

Carmack has a good essay about it.

http://number-none.com/blow/john_carmack_on_inlined_code.htm...


My heuristic is that if logic is repeated at least 3 times, it's good to pull it out into its own function, and even then you still need to consider the Liskov substitution principle.


Right, but the article seems to imply that all code should be in a single file.

It seems the author indeed is not a SW Engineer and thus does not really grok the benefit of "modules".

This of course depends on the size of the program. Small program "fits" into a single module.

And I think that scientific programs are basically small and simple because they don't typically need to deal with user-interaction at all, they just need to calculate a result.

Further I think scientific programs rely heavily on existing libraries, and writing a program that relies heavily on calls to external libraries produces simple, short programs.

Scientists produce science, engineers produce code-libraries.


"It seems the author indeed is not a SW Engineer".

This is a pretty ridiculous notion if you just cursorily glance over the page. It is quite clear that this guy is more of a software engineer than most with that title will ever be. Hint: someone whose blog contains a post with a title like 'Coroutines in one page of C' is a software engineer.


I don't trust devs who praise clean code, much as I don't trust people who say they eat clean.


That's a slippery mantra. I can see putting everything in one file.


Everything in one file can be good organization for many even relatively large projects.


works until it doesn't


> Simple-minded, care-free near-incompetence can be better than industrial-strength good intentions paving a superhighway to hell. The "real world" outside the computer is full of such examples.

Overengineering is insidious - "It is difficult to get a man to understand something, when his salary depends upon his not understanding it". A team can sell a solution better than a single person fixing something without making a big deal out of it. You get organizational clout and inertia on your side when you make something big and expensive.

And then complex systems are by nature hard to reason about and by extension hard to critique.

So many things come down to "complexity is the enemy".


Prime Finance (at Amazon) did that.

Wrote fancy math on a PhD economist’s laptop.

Then, when we added testing while developing a platform, we found out we’d been 5% off in allocating Prime revenue between organizations - and had the correct amount gone to Retail, the 2018 hiring freeze might have been avoided. (According to a very angry Wilke.)

Whoops.

Turned out we did need a team and all those guardrails, processes, code reviews, etc.

There’s a time and a place for “move fast and break things” — but real trouble can come from taking those academic practices into the real world.


Did no-one check the maths? There's a difference between having a full test suite, and someone trying some choice values, but I'd expect both would pick up something like that.


The PhD economists on our team looked at it.

Economists from other teams looked at it and signed off.

There was manual testing — ie, trying some “choice values”.

We discovered their error in convexity when our test suite allowed us to randomly sample the models at scale. (Actually, I had questions before that — but unsurprisingly, when it was just me questioning a PhD economist, the lowly SDE was ignored.)

Good intentions aren’t enough; you need mechanisms.


If simple checks could find all bugs we would not have so much buggy software.


Any code base that evolves over time will have complexity. It can be either manageable or unmanageable complexity. Either sacrifice maintainability for early development velocity, or plan for medium-to-long-term velocity. You CAN have both velocity and maintainability, but the engineers will be expensive. Fast, good, cheap - pick two. A tale as old as bits.


I think it's not a zero-sum game, you can win by judiciously avoiding incidental complexity.


Best practices tend to be overkill for small codebases that have few users, which encompasses the majority of scientific code.

Sheer tenacity is typically sufficient for scientific codebases.


Is this article just two strawmen fighting?


> I've been working, ... in an environment dominated by people with a background in math or physics who often have sparse knowledge of "software engineering". ... Invariably, the biggest messes are made by the minority of people who do define themselves as programmers.

Interesting switch in language here from "software engineering" to "programmers". There is of course a long history of debate on these terms, whether there is a meaningful distinction, and what qualifies as engineering versus programming.

Wherever you stand on this debate, there are a number of practices of software developers that tend to be used more towards the "engineering" side. Two of the most essential in my mind are peer code reviews and automated testing of changes (with tests, linters, type-checkers, code formatters, profilers, fuzzers, etc.).

This post doesn't talk about any of these practices or whether the so-called "programmers" messing up the scientific code are using them. I'd say if the people messing up the code are not actually advocating for using software development tools to write better code they are not actually applying software engineering practices to their code.


I would never call myself "electrical engineer" or "mechanical engineer" because I did not study that. But everyone who is paid to write some amount of code calls themselves "software engineer".

Not sure how much of a problem it is, but I am frustrated when "other" engineers try to teach me about how networking works and never once consider that my intuition may possibly have more value than theirs, because I actually studied networking. Not that I am always right of course, but if we have an electrical engineering argument, I naturally get into a stance where I assume they know better and can teach me useful stuff.


I've seen this:

- Multiple/virtual/high-on-crack inheritance:

  and each function/class has 7 template specialization parameters and 3 macros which expand to templates which expand to macros
  
- Lookup using dynamic structures from hell – dictionaries of names where the names are concatenated from various pieces at runtime, etc.

  think maps of maps of maps loaded from configs of configs (a minimal sketch of this kind of lookup follows the list)
  
- Dynamic loading and other grep-defeating techniques

  obviously everything has to be a "plugin"
  
- A forest of near-identical names along the lines of DriverController, ControllerManager, DriverManager, ManagerController, controlDriver ad infinitum – all calling each other

  a 1000 times yes! yes!!! DriverController inherits from ControllerManager which extends ManagerController which contains a DriverManager which aggregates 3 ControllerManagers from different namespaces. I've seen a DriverController function going through 12 levels of stack passed between 3 threads to eventually call back a function from the same DriverController
  
- Templates calling overloaded functions with declarations hopefully visible where the template is defined, maybe not

  if a template technique exists, it had to be used!
  
- Decorators, metaclasses, code generation, etc. etc.

  Of course they define their own DDL with XMLs parsed by a combination of Python and awk, which generates C++ macros, which are used in templates to dynamically load plugins, which hold maps of maps of function pointers to create events dispatched on a pool of threads
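Purely for illustration, a hypothetical Python sketch of the dynamic-lookup / grep-defeating items above (all names invented): the method name only ever exists as concatenated fragments, so searching the codebase for the call never finds it.

  class Plugins:
      def driver_controller_start(self, cfg):
          return f"started with {cfg}"

  def dispatch(kind, action, cfg):
      # The name is assembled at runtime, so grepping for
      # "driver_controller_start(" finds the definition but no call site.
      name = f"{kind}_{action}"
      return getattr(Plugins(), name)(cfg)

  print(dispatch("driver_controller", "start", {"mode": "fast"}))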


Reminds me of the Java Spring library.

And on top of it, those DriverControllers and ManagerControllers keep getting deprecated.


I really do not understand these memes about overengineered FactoryFactoryFactories. I have 10 YOE, did I just get lucky? I've worked at enterprise Java shops as well, but even there I'd call the software pragmatic. Are these overengineered monstrosities REALLY still a thing, or is it "just" people suffering in legacy projects? Even the juniors I worked with were following KISS and YAGNI.


Yeah I think in the last 10 years things have definitely changed. One of the last Java projects I worked on was in 2013 and the lead was ex-Google; he deliberately pulled in the simplest Java libs to get the job done and we didn’t over engineer anything. Contrast that with 1990-2010, the era of struts and enterprise java beans, things were definitely different back then.


Yes, some people read Clean Code and think every file should have less than 20 lines. I recently inherited a React project that has all single-use utility functions, graphql queries and component types extracted out to different files. Having to edit five+ files to change things in one component is a nightmare experience and slows down changes a lot.


This is the battle I am currently fighting.

Specifically when working with numbers (something scientists do), I feel it is more important to know the numbers and have an intuition about what is important to get right and what is less important, rather than spreading unit tests and abstractions all over the place.

But scientists can create a hot mess too. This happens when they don't care about the code enough to pare it down, i.e. reduce it to what is necessary.

The number of lines of code matters.

So the perfect blend here is not necessarily the person who follows all engineering practices, but the person who has the domain knowledge and knows enough about software practices to write concise code.

If you are missing either the domain knowledge or the ability to reduce the code, then it will derail.

Shipping the notebook to the engineering department will not be the solution. They will break the code into pieces and follow best practices, but miss what is important.

And another aspect of this practice is that if someone finds a problem, the scientist will not be able to modify or re-run the engineered version of the notebook.


> Multiple/virtual/high-on-crack inheritance ... 7 to 14 stack frames composed principally of thin wrappers, some of them function pointers/virtual functions, possibly inside interrupt handlers or what-not ... Files spread in umpteen directories

Scientific code? You just described 99% of "enterprise" java code above.


That's from the list of software engineer sins, yes, so that tracks with your opinion.


Only thing I've struggled with, is when real software engineers whip up "enterprise" code for even the simplest and most trivial programs. If you've heard of the infamous "enterprise hello world/fizzbuzz", then imagine that type of structure.

I guess it stems from the ideology that it is better to do lots of groundwork now, in case the program blows up and needs to scale. Which is somewhat true...but in the world I work in, it is only true for maybe 1% of programs we write.

So in the majority of cases, if I need to fork some software at work and do easy modifications, I do prefer the one-file programs, compared to some behemoth where almost everything is boilerplate, spread over multiple source files, folders, etc.


Ulp. As a scientist who is a hobby "programmer", this struck close to home. I've got one project[1] with a huge mess of functions calling each other. It started out with good intentions, but then gradually descended as I wrote more and more hacks to add new analyses & robustness checks. I swear I meant well!

I think there's a genuine tension between writing good code and "shipping" a paper. At least, when I program "as a programmer" I think my code is mostly higher quality.

[1] https://github.com/hughjonesd/why-natural-selection/blob/mas...


I will believe that when scientists stop being too embarrassed to publish their code. Do people not remember mrc-ide/covid-sim?


No?


Some people incorporate antipatterns into their practice & reinforce these antipatterns with years of experience. All fields have this issue. Bad professional scientists are often worse than effective amateur scientists. Having good first principles with little experience often beats plenty of experience with bad first principles.

I have learned to appreciate codebases which break some rules & yet are easier to maintain for some reason(s). Distilling the reason(s) identifies areas where I can modify my technique... or at least helps identify questions & alternatives to some techniques that I regularly use.


When scientific code is not required to be published with its research literature, who really knows how bad it is?


A few responses come to mind. Who is requiring? What counts as code that needs to be published?

But perhaps the most relevant response is that few people read papers, even fewer are going to look into their zip. The whole idea of papers is to condense a whole lot of work into concise digestible information.


Aren't papers also supposed to be reproducible? Having access to the original code would allow you to see why you're not getting the same results if it's caused by a bug.


Good code is the simplest code you can write to get the job done.

Getting too excited about techniques is a form of scope creep


It depends on what it means to get the job done.

Writing some code that generates immediate results can be considered a done job. If you want to reuse the code a year later and find an unreadable mess, though, not so much.


Not sure of the exact context the blog refers to (is the author a scientist turned software engineer? is their field data/software intensive, with this seen as an improvement area?). Our team does engineering test and evaluation that includes aspects of R&D, and we struggle with this. Scientists have academic approaches (must have sufficient sampling/statistical significance, sometimes delaying findings/reports to check additional aspects when a sanity check would suffice), but that does give rigor. On the software side, we definitely have spaghetti code, tools that rely on some file sitting in someone's H-drive, and plug-ins built by someone's old collaborator with little documentation. This is juxtaposed with the PM types that must be agile (fine when tasks are understood and defined enough to fit nicely into a sprint, not always the case). Better communication on both sides would probably alleviate some of this, but that is the great challenge in any group.


Cost is paid during development and maintenance. Value is reaped during usable life. The two periods overlap. Then subtract cost from value and integrate over time. Maximize that number.
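Written out (just restating the framing above, with v(t) as the rate at which value is reaped and c(t) as the rate at which cost is paid over a usable life of length T):

  \text{net value} = \int_{0}^{T} \bigl( v(t) - c(t) \bigr)\, dt \qquad \text{(choose the design that maximizes this)}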

Easily leads to conclusions of "you don't need this to scale" vs "build this to scale" and "you need to make this extensible" vs "just ship the hacky thing".

Just need to know what your params are and place uncertainty on it.

Once I wrote a program in a day. Did a kind of pricing. Hacky af. Just tech demo. Over time people start depending on it. One year later it breaks. Too many products priced, internal ring buffer only has few slots. Pricing stalls for all products.

"This is a company-killing issue!" Everyone yells. Sure, my mistake. I let thing get depended on without productifying but original choice was fine since it allowed iteration on something else that made money. Just when situation changes you gotta adapt.

No rules about that, except keeping a short feedback loop between the quality required and the time spent.


I think a lot of developers naively rely on "software design" principles. They often can't state their reasons for adopting these principles and assume these principles are self-evident. This can be a problem when these principles conflict with external priorities, or when a complex "principled" design is used instead of an obvious, intuitive design. There's also a certain amount of trendiness to software design, so trends can be applied haphazardly.

That being said, I've also seen plenty of "scientific code" that's totally incomprehensible even to the point where the scientist who wrote the code can't debug it. So there's an extreme in the other direction.


A lot of scientific research code is execute-once-throw-away code. Once the code obtains a result you can put into your research paper, it has served its purpose and it'll never be used again. There is no need for any kind of abstraction, software design, architecture or development best practices. You're not going to be supporting this code or extending it in the future. If you're a software developer, the way researchers write code rubs you the wrong way. But you have to accept the reality. Applying software development principles to scientific code is just a waste of time.


Now I dare you to add some new functionality to badly designed software. Ask a scientist to change the initial hypothesis a little and update the code accordingly, and watch them cry while rewriting all the code.


They'll be crying on the same bench as the 'one line of code' people, when they're required to finetune their code.


A former physicist at our organization wrote a tool to convert a diagram into an algorithm in a domain-specific language.

He made it to make his own life easier and never considered any best practices: everything was thousands of lines of Jupyter notebook, it was not modular and painful to customize, but it got the work done, and it was kind of a black box in the way it generated the outputs.

Since then we've tried to re-write it with good programming principles, but still we were not able to re-create what he did. So yah sometimes done is better than perfect.


I'm not so sure that this claim is valid as a general observation.

Multiple times in my career, I have seen scientific code written by academics dramatically sped up by developers who used parallelisation, vectorisation using SIMD etc.

So naively written scientific code is generally easy for a reasonably adept programmer to beat in terms of performance.

That said, "enterprisey" Java/.NET software engineering shops can indeed make scientific code sub-optimal sometimes. Have come across that sometimes too but I wouldn't generalize.


One notable difference between scientific code and regular software development is that the code scientists write is an implementation of well-defined/documented mathematical models, while in, say, a web application there is no reference paper or research: the code _is_ the reference. That's why best practices are important, not for the person writing a piece of software now, but for the future. If you need to change scientific code, papers and specifications make an otherwise confusing structure more manageable.


> Files spread in umpteen directories

Tools and frameworks encourage this. Git and VS Code are built around directories. In VS Code the first thing in the sidebar is the explorer. When you press Ctrl+P you see an overview of files. File-system-based routing.

But directories lack a crucial feature compared to text: Ordering. If I put everything in one file, I can order it in a way that makes sense. If I put everything in different files and directories, it's all going to be ordered alphabetically.


Related:

Why bad scientific code beats code following “best practices” (2014) - https://news.ycombinator.com/item?id=12377385 - Aug 2016 (261 comments)

Why bad scientific code beats code following "best practices" - https://news.ycombinator.com/item?id=7731624 - May 2014 (168 comments)


Scientific programming and industry programming are distinct disciplines.

For scientific code certain things are just not important, hence you do not deal with them:

- Observability: You just care for the result of the run, not for the state of the running system
- Security: Your code is running in isolation, used by yourself

The result looks horrible to a normal programmer, even if it’s well maintainable, but it is exactly what is needed to do the job.


No it isn't and this idea needs to die.

Sure, you can ignore security if all you're doing is processing local text files, granted. But things looking horrible to programmers isn't just about security bugs, it's about the whole span of correctness bugs. And scientists need to write code that is both correct and maintainable. The frequency with which they don't is partly why results so often can't be replicated, making the money spent on academia wasted.

The idea that science code doesn't need to be maintainable or that they have some magic way to do it that looks wrong, isn't right either. It's not uncommon to find model "codes" that scientists have been hacking on for decades. The results have become completely untrustworthy many years earlier, but they deny/obfuscate/ignore, attack or even sue people who point out concrete problems. Sadly, often with the acquiescence of the media who are supposed to be ferreting out coverups.

Scientists need to collectively get a grip on this situation. They will happily attack anyone outside their institutions as being non-expert conspiracy theorists, but when it comes to software they suddenly know everything and don't need to hire professionals. Paper-invalidating bugs are constantly being covered up and the only reason the problem hasn't reached criticality yet is that many people don't want to hear about it. But the unreliability of academic output is now becoming a political problem and a divisive culture war issue, when it really shouldn't be. A good first step to solving the replication crisis would be for scientists to stop pretending it's OK to quickly knock together a program themselves instead of assigning a ticket to a trained full time SWE. Yes it would cost more (a lot more), and that's OK. Generate fewer papers but get them right!


Worst code I have seen has been written by self taught bioinformaticians.

But a lot of the time these scripts are used as one-offs to generate a result and are then done with, so quality doesn't need to be the same as for a server running 24/7. (Sadly, the ones I had to fix were being run regularly.)


Fortunately the list of "bad" code features attributed to each group is listed clearly. The title simply needs to read "Bad scientist code beats bad programmer code", or "A list of bad practices that programmers often think are good".


If the non-programmers commit correctness bugs and the programmers are just using patterns you don't like, maybe try to understand the patterns instead of balking at them.


The author states they are primarily a software engineer and have also been guilty of following these patterns, so the clear implication is that they understand the patterns. The author isn't making the case "I don't like it," they are making the case that these patterns actually lead to more and bigger problems in the field of scientific computing than the usually simple errors of ignorance committed by non-programmers.


They state clearly that they can't follow the call structures and often give up on understanding them.


Define "correctness bugs". Does the code leak memory (which is not ideal)? That's only an issue for the scientist if it prevents/invalidates the science. But if the pattern hides how something is expressed, or someone unfamiliar with the science tries refactoring the code, that's more likely to cause issues with the science than the memory leak.


> Access all over the place – globals/singletons, "god objects" etc. ... Crashes (null pointers, bounds errors), largely mitigated by valgrind/massive testing ... Complete lack of interest in parallelism bugs (almost fully mitigated by tools)


> Many programmers have no real substance in their work – the job is trivial – so they have too much time on their hands, which they use to dwell on "API design" and thus monstrosities are born.

Definitely getting this vibe from modern frameworks and design patterns.

I want to like SwiftUI, but the WYSIWYG editor doesn't even work for the default projects for me. Storyboards were great for creating everything UI except tables and collections, where they were still functional, just meh. The reactive UI in SwiftUI… mostly works, but sometimes doesn't, and when it doesn't I can't debug it because it's a magic black box; you can get similar results with a small amount of extra code in UIKit using { didSet } on your model property and each input control, and while boilerplate isn't great, it's better than magic which only works 98% of the time.

I'm trying things in JS in my spare time, no libraries or frameworks, and it's easier and faster than getting anything done in Xcode. And that's despite the Swift language itself being one I prefer over JS, and despite doing the development in BBEdit, which is a text editor, not a full IDE.

But this isn't just about Apple; the reason I'm not using any JS framework and libraries is that every single talk I've seen about web development has exactly the same problem, piling on layers of stuff to fill in the gaps missed (or created) by the previous layer of abstraction.


> piling on layers of stuff to fill in the gaps missed (or created) by the previous layer of abstraction.

I think the goal is to make it more accessible, so that more people with less knowledge can produce more crap with it.

People don't learn the basics, they want to write a comment in Copilot and have it assemble code that roughly does what they want. I believe that people who got into software in the 60s actually liked computers. People who get into software today just want to produce stuff, they don't care about their computer.


> I think the goal is to make it more accessible, so that more people with less knowledge can produce more crap with it.

If so, it fails at even this: for my A-levels[0], my teacher only knew VisualBasic[1], which was very easy to work with: drag and drop widgets onto a form, (double?)-click on a widget to get right into the code block that runs when a user uses the widget.

Back when JS was new and there were no extra layers of abstraction, yes, the language sucked, but you could get to work with it using only what you found in a £4.99 book from WHSmith[2] and a text editor; you didn't need to `install npm` and then some library and then…

> People don't learn the basics, they want to write a comment in Copilot and have it assemble code that roughly does what they want.

Agree. Heck, I do that, and I started learning the basics when I was about 5. :)

> People don't learn the basics, they want to write a comment in Copilot and have it assemble code that roughly does what they want. I believe that people who got into software in the 60s actually liked computers. People who get into software today just want to produce stuff, they don't care about their computer.

My dad probably got into computers some time in the 60s, a one or two day corporate training program about "this new thing called 'software'". The way he talked about them, it was clear he didn't really understand them, and just wanted to get stuff done.

[0] https://en.wikipedia.org/wiki/A-level

[1] It turned out that the copy of REALbasic I had on my Mac at home was almost copy-paste compatible, the only exception I ran into was that `Dim foo, bar As Integer` has `foo` and `bar` being `Integer` in REALbasic while our version of VB had `foo` being `Integer` and `bar` being `VarType`

[2] https://en.wikipedia.org/wiki/WHSmith


Most scientific code breaks when anyone but the author of the paper tries to run it


The listed issues are mistakes juniors make, not best practices.


I wonder why people complain about the replication crisis.


OP is using a strawman caricature of a programmer to make his point. While such bad (often junior) programmers exist, there also exist many reasonable ones who won’t commit all these over the top abstractions while also not falling into the "scientific" programming pathologies.


I'm a scientific coder, though I work in industrial R&D feeding product development. My work doesn't get published. I've studied good programming practices for 40 years, and I try to behave myself.

One thing I've noticed is that programming practices have evolved, not so much to make them better than before (though that's conceivable), but because practices have to keep up with the rising complexity of the code itself, and also of the operating environment and the social environment (e.g., work teams, open source projects, etc.).

Scientific programs tend to be easily 20 years behind software development in terms of complexity, and I think we can benefit from using older techniques that were simpler and easier to learn. I learned "structured programming" via Pascal, and to this day if I hew to the same practices that I learned in my Pascal textbook, my program will probably do what it needs to do and be tolerably maintainable.

Perhaps those practices have to come from the mouths of scientists. The software engineers have moved on, and are only interested in the latest and greatest toys. I don't blame them -- they have to own their careers and follow their interests just like we do.

I mentor younger scientists who come out of fields such as chemistry, and are beginners at coding. So I literally get to explain such basic things as putting code inside subroutines, and avoiding global variables. I haven't had to tell anybody about GOTO's yet.

About reproducibility: My parents were both scientists, though my mom spent a few years in mid-career teaching programming at a community college. I learned the scientific method sitting on my mommy's knee. "Reproducibility" was certainly a guiding principle, but it was also expected that reproducing a result would require some effort -- perhaps fabricating your own equipment from available materials, and gaining skill on a technique. You might get it wrong many times before finally getting it right.

What we expect now is "pushbutton" reproducibility, meaning that a project replicates itself from start to finish at the push of a button. This is a much higher standard than any scientist is trained to expect, even if software engineering requires it. A software project has to be at least 99% that way, or it would be unworkable, due to the high degree of complexity. The tradeoff is that it also requires complexity to make things that way.

I expect my results to be reproducible, but not pushbutton-reproducible. To overcome this issue, I'd rather spend my time documenting my code and its theory of operation, than making it bulletproof. Nothing that I write goes directly into production, and I expect the theory of operation to be more valuable to a project than my code. Often, the code just automates an experiment to test the theory, so it's a middleman rather than a product.


Numerical code isn't like application code; the rules of application code don't always apply. For example, I think one-character variable names are totally fine if they come from equations or papers, whereas in most applications they're generally frowned upon.
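A small illustrative sketch (not from any particular paper): when the code mirrors an equation, say Newton's law of gravitation, single-letter names that match the symbols are easier to check against the source than long "descriptive" ones.

  # F = G * m1 * m2 / r**2 -- names match the symbols in the equation
  G = 6.674e-11                       # gravitational constant, N m^2 / kg^2

  def F(m1, m2, r):
      return G * m1 * m2 / r**2

  print(F(5.972e24, 7.348e22, 3.844e8))   # Earth-Moon attraction, ~2e20 N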

This is why I'm a little skeptical of languages that blur the lines. Sometimes ideas from application programming can complicate numerical code, and sometimes numerical programmers don't fully understand the systems abstractions they're building on and end up reinventing wheels to avoid simpler solutions they feared or didn't know existed.

The moral of the story is to keep an open mind, not be a zealot, and avoid dogmatic thinking.


Java and its consequences have been a disaster for the human race.


TL;DR: a counter-productive rant against software engineers, claiming that bad code from software engineers is worse than bad code from scientists.

Did you consider hiring an experienced software engineer as a lead?


Bad code written by software engineers is worse than bad code written by scientists, as the former takes more effort to fix than the latter (given the pathologies mentioned). It's naturally preferable to not have bad code, but if that choice were actually on the table, then I don't know who would choose the bad code.

As for hiring software devs, that's not going to change absent significant changes in funding structure and rules (which are typically a government/public-service concern, and not up to researchers). In general, there are places where software devs write code used by scientists, but rarely is that code itself pushing research boundaries; it's the code on top of it that does.


> as the former takes more effort to fix than the latter

Disagree. Well, maybe it's still acceptable if the software is small / limited to a single paper. Having worked on a code base where the people writing it learned programming on that job, guessing their intentions is like archaeology. Each iteration tended to add some complicated interdependence, like when an int errorCode came from other, overlapping error ranges.

A part of the "new" code base is exactly as described by TFA. Including most interfaces having only one implementation. But while annoying, I am more able to work on it without things breaking...


It's just comparing two different things: the average "bad code from scientist" is for simple tasks, whereas the average "bad code from software engineer" is for a complicated task (otherwise you shouldn't pay the software engineer in the first place).

My point about the rant being counter-productive is that the solution is to learn how to do the task better, not to blame people. Maybe the best way to do a task with a very low budget is to not do it at all.


You know, 96% of businesses (and by extension codebases) have to get by software wise without high priced software engineers. They couldn't afford it. The vast majority of running code is produced by people whose understanding of computer systems and programming goes as deep as how much documentation they need to ctrl+f through to get some specific tasks done.


>The vast majority of running code is produced by people whose understanding of computer systems and programming goes as deep as how much documentation they need to ctrl+f through to get some specific tasks done.

I doubt this very much. Surely the vast majority of running code is some chunk of Chrome, Android, or the JVM ("billions of devices run Java...") or something. All those things were produced by software engineers with more than surface-level understanding.


For every well-engineered application used by millions of users, you've got thousands of poorly-engineered, bespoke applications in use by one or two users (often internal corporate tools or expensive middleware with minimal customization besides changing the corporate branding).


Still, I wouldn't consider "Chrome" as good software.

Let's be honest: web browsers are really bad. Overly complicated machines made to print images and text on the screen. The web got sideways long ago, first because it was cool to add crap in websites, then because it made profit and allowed monopolies to make more profit by moving everything to damn webapps. And finally it allows companies to screw users by renting software on the damn cloud.

And they profit by making it as accessible as possible, such that everyone and their dog can produce a crappy webapp that will show their ads or track their users.

Webtech is part of what makes software really bad, even if it was created by good engineers. Because people don't want quality: they want cheap new crap.


Maybe by instances of the same codebase, yeah absolutely.

But there are certainly more codebases out there running some Python or JS written by designers, data analysts, etc. than those produced and curated by software engineers.


TFA seems to have hit a nerve.


If it's about ranting at classes of people, I can do it too: most software is shit, that's true. But somehow users like shit, and it makes profit so software engineers get paid to write it. They get valued by writing a lot of shit, not by writing little good code. Also most software engineers are juniors; junior civil engineers would not be allowed to build a bridge, junior software engineers can do all the crap they want.

But let's be fair: most scientists do bad research. Have you ever read papers in a field you know? 99% is bullshit. Not "non-conclusive good research", no. Downright useless, non-reproducible bullshit (or paraphrasing something that already exists, often making it worse). Just like software engineers, most scientists are juniors (we call them "PhD candidates") who get valued by publishing papers (any paper) in "recognized" journals (with some definition of "recognized"), by journals who make profit by accepting papers (any paper). Again not totally their fault: they have to produce recognized stuff in the time they are given, they don't have to produce good research.

I don't have a solution to those problems, of course: that's how the system works ("make profit"). I wish we tried to solve actual problems in a good way, but we don't. I don't think scientists are better than software engineers, though: we are all part of the problem. Less of all of us would make the world a better place (or would have prevented us from spoiling it, at least).

I just don't think my rant is worth publishing on HN.


OP may have not clicked on the rest of the blog...


Meh.

Who actually works as a software dev at one of these academic institutions? The pay is beyond terrible. Presumably you have either:

- Young and keen. Young and keen are the source of all kinds of terrible things. That’s why you need old and bitter to balance it out. Old and bitter is off working for much more money in a boring corporate.

- Side hustlers/other incompetents. “I’m now a software dev!”

It’s not that proper software shops don’t struggle against “complexification” and all manner of other deviant behaviours. There are so many ways to turn software into hell - “All happy families are alike; each unhappy family is unhappy in its own way.” But if this is your problem, the problem isn’t “the software industry” or “best practice” or “devs jobs are so easy they just have to make up complexity”. The problem is incompetence. Managerial/leadership incompetence.


Every year that passes these “SWE is an idiotic field” posts come earlier. So annoying…


Oh look, another tiring craftsmanship debate that other disciplines long figured out!

A, say, physicist writing bad code could equally well be building a pergola for his garden. He doesn't really know woodworking, but god be damned if he couldn't calculate the forces acting on the beams and then add some screws: how hard can it be! And probably he'll even get the thing up, and it doesn't even look too bad. Now get a carpenter over, and they will be horrified by all the things the scientist did unusually, did not account for, or did just plain wrong. "Wall screws you still had around?? How could you not know you'd need structural screws for that?", he will scream. However, the thing does roughly what the scientist supposed it should do. Until the winter, that is, when the wood expands due to humidity and cracks appear, and he finally needs a professional to fix the problems.

It’s the same story, really: it is a software engineer’s job to build quality software. A scientist’s job is to solve problems. There’s a clear boundary here, where the latter will deliver a concept to the former, who will eventually create a production-grade implementation from it. Neither does a scientist have to build proper software, nor does a developer have to do cutting-edge research.

And all the words wasted on how one of them might be doing something badly are on the wrong path.


I'm an ex-software developer/engineer and current scientist. In my experience TFA makes a good point, even though it's quite strawmanish.

Most scientific code is horrible from any sane software developer's perspective. The quality is so bad that I think a huge proportion of published results are plain wrong due to bugs in the analysis. This applies to much of my code as well.

But a lot of "software engineering" code is horrible too. Mostly because most popular technologies and "best practices" are just plain bad. Overenginering is a pandemic and has been a long time. Much of the roots is from the gilded age of enterprise Java. Totally misunderstood OOP. Byzantine layers of pointless abstraction. Rampant premature "web scale". Counterproductive bondage and discipline (yes, including much of static typing).

These have become a cargo cult in software development, and these cosmetic features are deemed as "quality code".

And this leaks into scientific programming. Even Python, arguably the language of science nowadays, and a "quick and dirty one", forces some of this cargo cult. Modules are needlessly complicated (e.g. relative and absolute imports are quite a mess), let alone the horrible packaging system. And the current trend to push typing.

That said, scientific code is getting slowly better, largely due to switch to public/open source code, and away from specialized hacks like MATLAB and R. Especially in more technical fields.

In software engineering OTOH things are IMHO getting worse.


I'll add that "code reuse" was pushed as an idea too heavily (at my university at least). I think that leads to a fair bit of the over engineering and unnecessary complexity.

I saw a comment later saying not to aim for "code reuse", but rather to "avoid code duplication" which made more sense.

I also agree that a lot of best practices are great if you are a massive company with unlimited resources, but they are just overkill and add complexity for smaller projects and teams.


Code reuse was indeed the hype, especially of OOP and inheritance. In practice it just made real code reuse a lot worse. The class hierarchies become so tightly coupled that any "reuse" requires a lot more boilerplate than the actual code to use.

I know that the idea is not just to avoid writing more code, but to have a "single point of truth". But the boilerplate leads to a situation where it's very impractical, and it results in a tightly coupled mess that's really hard to change.

What actually increased code reuse was duck typing (i.e. implicit interfaces), made popular by Python. But that's becoming verboten nowadays. And I don't think current structural typing systems are gonna reach the same level of reuse.
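A minimal sketch of what that kind of reuse looks like (the function is hypothetical): the code only assumes its argument is iterable and that the elements support +, so it works across unrelated types with no shared base class.

  def total(xs):
      # Only assumes: xs is iterable and its elements support the + operator.
      acc = None
      for x in xs:
          acc = x if acc is None else acc + x
      return acc

  print(total([1, 2, 3]))                   # list of ints      -> 6
  print(total((0.5, 0.25)))                 # tuple of floats   -> 0.75
  print(total(["a", "b", "c"]))             # strings           -> "abc"
  print(total(n * n for n in range(4)))     # generator         -> 14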


I basically agree (except I will take static typing over dynamic typing any day).

I work in computational chemistry, and scientists here don’t necessarily have problems actually coding (we’ve been doing it for 70+ years). But the “other stuff” is taking more and more time.

Before, you wrote Fortran, put the files on disks or whatever, and sent them around. Now, you need to know:

C++ and Python (and maybe Fortran too)

git and github and github actions, and packaging (pypi/conda/conda-forge I guess is the standard now).

Documentation? Sphinx I guess, although mixed codebases are still a pain. And where do I host the docs again?

Also better make sure it works on all three OSs.

Wait, what is Docker? Suppose I'd better learn to make images. And they have to be hosted somewhere? Got to set that up too.

Oh wait again, Python 3.12 broke something. Have to fix that.

Does all of this make “better software”? I’m coming around to the idea that maybe it doesn’t, at least in aggregate. Either way, scientists don’t have time to really learn all this AND the science they are doing. And it all changes every few years or so.


> Either way, scientists don’t have time to really learn all this AND the science they are doing. And it all changes every few years or so.

This is how I see it as well. I'm an immunologist, and it feels impossible for me to keep on top of my field of research and just about anything else. I don't have to produce quality software, but it seems difficult to keep your research cutting edge while maintaining software. It's hard enough keeping the research up to date!


> (except I will take static typing over dynamic typing any day)

Why? What exact typesystem do you prefer over Python's dynamic typing? A lot of its idioms, and probably e.g. NumPy/SciPy infrastructure would be about impossible with current static typesystems (look at the mess that is C++ scientific/ndarray libraries). Ditto for much of the autodiff and GPU stuff like pytorch.

Julia could perhaps get there, but the implementation has too many warts for it to take over.


It's more to do with developer ergonomics. Knowing what something is or returns (and having some guarantees about it) makes it much easier to reason about code. And knowing (with guarantees) what a variable/object is means knowing right away what I can do with it.

> What exact typesystem do you prefer over Python's dynamic typing?

I just want to know what various objects are laying around in my code so I don't have to keep it in my head.

For example, if you are using a database library like psycopg, you run a query and execute it. What kind of object is returned from the query execution function? How do you check if it was null? You have to go looking at documentation, and often what you find is examples of what you can do. But those examples often don't include absolutely everything, so you have to go looking at the code itself. But library code in python is often pretty arcane.

With Fortran/C++/Rust/etc, you get all of that basically for free in your editor's autocomplete. And the compiler will check to make sure you didn't do anything truly dumb.

Dynamic typing also encourages some frustratingly bad habits. Lots of code out there will have functions that change return type depending on what arguments are provided. And occasionally someone will forget to return something in some branch and you won't know until you hit an edge case (during runtime) and suddenly you have a None coming from somewhere.
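A made-up illustration of that habit (the function and names are hypothetical): the return type depends on the arguments, and one branch silently falls through to None; with the annotation in place, a checker such as mypy flags both problems before runtime.

  def lookup(key: str, as_list: bool = False) -> str:
      if as_list:
          return [key]           # return type depends on the arguments
      if key != "missing":
          return key.upper()
      # this branch falls through, so the caller silently receives None

  print(lookup("missing"))       # None -- only discovered when the edge case hits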

My projects tend to be in Python now, but I recently started a side project in C++ again and found it very refreshing.

> probably e.g. NumPy/SciPy infrastructure would be about impossible

Partly true, but maybe not as true as you think. I've been using the nlohmann JSON library for C++, and it is amazing. The code almost looks pythonic, and its type conversions are done automatically (although at runtime of course).

For example,

  for (const auto &[key, value] : some_json["x"].items()) {
    double d = value["other_key"];  // implicit conversion from JSON to double
  }

EDIT: Also, I didn't realize how much I missed function overloading until I started the C++ project. Wow is that handy.


I find that "what can I do with the value" is a lot more important than "what the object is". Especially in scientific context this is usually "can I iterate this" "can I do arithmetic with this" etc. NumPy's broadcasting and support for "any" iterables is a great example of this. And very difficult to accomplish with at least the typesystems you listed.

E.g. going from a nested list in a JSON to a Eigen matrix is quite a pain even with (awesome) libraries like nlohmann/json.
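For comparison, the Python side of that example (requires NumPy; the JSON document here is invented): a nested list from JSON becomes an ndarray in a single call.

  import json
  import numpy as np

  doc = json.loads('{"x": [[1.0, 2.0], [3.0, 4.0]], "label": "toy data"}')
  A = np.array(doc["x"])      # shape (2, 2), dtype float64
  print(A.mean(axis=0))       # [2. 3.] -- broadcasting and friends just work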

I often have to resort to C++ for performance or libraries. I make a lot more bugs than in Python. And the bugs are often really tricky and often pass the typesystem (e.g. both nlohmann and Eigen have to really hack the templates and these cause a lot of edge cases).

Also serializing stuff in C++ is a real pain even with nlohmann/json because for god's sake there's still no reflection in C++ in 2024.

Of course you can abuse the power of dynamic typing. But you can just as well abuse anything else that's remotely powerful (see e.g. Boost).


"It’s the same story, really: It is a software engineer’s job to build quality software. A scientists job is to solve problems."

That's not the distinction. Good software engineers solve problems. That's what the paycheck is for. The distinction is whether code has to be maintained.

It's the scientist's job to solve a specific problem at a specific time. Who cares if the metaphorical wood rots next winter, the paper's been published.

It's _often_ the software engineer's job to build things that deliver business value over years, evolving and expanding requirements, in a development team, without grinding to a halt under the weight of accumulated complexity.

"Quality" software engineering is just heuristics for keeping the pace of change high over time without breaking things.


> Who cares if the metaphorical wood rots next winter, the paper's been published.

Isn't this why the replication crisis was able to be kept hidden for so long?


Not as far as I know. Much more problematic was the fact that replication is very hard to publish. That's because either you more or less confirm previous findings and therefore contribute little to the scientific record (or so reviewers seem to think), or your findings counter the original results and now it's on you to explain the discrepancy. Even if you satisfy the reviewers that you're right, they may not consider your result of sufficient caliber to accept for publication in this particular venue [1].

[1] I've seen this happen to a paper that proved that a theoretical framework for constructing proofs about RFID protocols was neither sound nor complete.


> It's the scientist's job to solve a specific problem at a specific time. Who cares if the metaphorical wood rots next winter, the paper's been published.

Sounds a little more like cargo-culting than proper reproducible research. But this is definitely a pest in academia. Many papers do not provide all required data, all required model parameters, etc. to get to the exact same result. Admittedly, they might nowadays need a software engineer to get that done.


Cargo-culting is about focusing on the process without fully understanding its purpose. Such as bureaucratic requirements for providing all data, software, parameters etc so that someone can reproduce exactly the same numbers with minimal effort.

Proper reproducible research is not like that. It's about providing sufficient details that other people in the field can extrapolate the rest. That they can use similar methods with similar data to achieve similar results.

Reproducing exactly the same result is not that valuable, as the "result" could be just an artifact of the specific data and specific methodology. Real validation depends on fully independent replications, with as little reuse of data and code as reasonably possible.


In software I have found getting the exact details and parameters very useful even if I don't intend to use them. When I try to do the same thing my own way and fail, I can reference the original ones and gradually make my own version more and more similar to theirs, and see when it starts working. Or make their version more and more similar to mine and see when it breaks. This allows quickly and easily identifying the critical difference.


If we cannot reproduce research results, due to missing data, parameters, code, or whatever else, is it not cargo-culting to take that research result, blindly believe it, and build on top of it? If I remember correctly, Feynman stated in the same video that one should actually reconstruct or reproduce the experimental results we rely on.


It is very true that people should stick to the domain they know, because otherwise they will have a higher-than-average chance to f up.

But that 'clear boundary' thing is naive bollocks! No such thing!

Both domain experts need to understand things beyond this imaginary boundary, which, when precisely drawn, is highly arbitrary and normally more like a gradient than a line (and also not something relevant in a good final product).

Teams whose members are unwilling to wander into foreign territory, and who expect to be fed and to deliver over a strict boundary, will do horrible things!

(We are not even trained this way, btw.; professions have quite a bit of overlap, and in practice we learn matters others are expected to take care of. In the case of software engineering - apart from the most notorious ignorants no one wants to work with - it is practically impossible to avoid learning a big chunk of a foreign profession on the fly to deliver a good product; heavy science is no exception!)


Perhaps I phrased this badly. My point wasn’t a clear boundary between professions, because you are right, that is difficult to impossible to draw. However, there’s a clear boundary between the goals of the code written by scientists vs. software engineers. Where a scientist aims to prove something, a software engineer builds code to produce business value. Both are trained very differently towards these goals.


This probably highlights why so many software engineers in scientific R&D are "bad" in the strictest sense: they often don't understand that business value doesn't come from production-perfect solutions, but instead 100% of the business value comes from doing stuff fast, good enough, and understandable (so that researchers, not software engineers, can read it in a paper and iterate off of it).


I still disagree a bit. Their goals are the same: producing a product that fulfils the intended purpose and brings value (financial or otherwise, or mixed). They may bring in their specialty learned beforehand (or even during), but this division of labour is not a goal but a tool for reaching the common goal.

Also, there is no such clear-cut separation as "a software engineer is trained to seek business value while a scientist seeks proof", not at all.


> It is a software engineer’s job to build quality software.

It’s our job to deliver value to the business in a rapid and maintainable way. Rapid changes are often worth more to the business than maintainability, even over a period of many years. In some cases you could sit a non-software-engineer down and let them build something with ChatGPT, and it would work perfectly fine for the next 5-10 years because it’s focused on something particular, doesn’t change much, and lives in isolation from the greater IT landscape. In other cases all your points are extremely valid.

That being said, we also work in an industry where a lot of “best practices” often turn out to be anti-patterns over a period of a decade. OOP is good in theory, and I know why we still teach it in academia, but in practice it very often leads to giant messes of complexity that nobody really understands. Not because the theory is wrong, but because people write code on Thursday afternoons after a week of no sleep and a day of terrible meetings.

After a few decades in the industry, what I personally prefer isn’t any particular approach. No, what I prefer is that things are built as isolated services, so that they only handle their particular business-related responsibilities. This way, you can always alter things in isolation and focus on improving code that needs it. It also means that some things can be built terribly and be just fine, because the quality of the tiny service doesn’t really get to “matter” over its lifetime.

I personally write code that follows most of our industry’s common best practices. Because it’s frankly faster once you get used to it, but I’ve seen really shitty spaghetti code perform its function perfectly and never need alteration in the 5-10 years it needed to live before being replaced.


Research software engineering is a hybrid field, and fixed roles do not work well. It requires a grasp (at some level) of a lot of different stuff, e.g. software engineering, the relevant scientific theories, statistics _and_, quite importantly, the culture of scientific practices in a field, in order to make something good. So it gathers a lot of people who, no matter where they started from, often have to converge by learning stuff outside their own discipline.

The problem is not that "physicists write code". Scientists end up learning a lot of stuff and getting good at it (software engineers the same). The problem is that writing software is often left as a job for the occasional PhD, postdoc or research assistant, i.e. people with temporary positions, and in general to people who see building software as a side duty at best, an annoyance at worst. This results in no generational knowledge building, mentors being hard to find, less learning and reflecting on practices for building software, constantly rediscovering the wheel, and too much effort put into … It is not that one graduates as a "software engineer" or as a "scientist" and then knows the best practices and everything of their respective field. People get to learn stuff. Software engineering practices should be part of the culture of scientific software building, not simply carried over from the software engineering world to science, but adapted, taking into account each field's own idiosyncrasies.


I agree, a lot of unmaintained code is due to turnover of staff and few considering the software as an important output in its own right. But I don't think it would be that hard to find grad students/postdocs who would care about code, if that were something the field properly incentivized. Not only does software contribution not check the right boxes for career progression, it is also often looked down on.

I find this especially laughable in biology... I've seen some (faculty) PhD committee members object to the student having a thesis chapter related to software contributions, because this is not "intellectual". But they are perfectly fine with one of the chapters being a wet lab paper where the student was a 3rd author who contributed purely through helping run experiments designed by the 1st author (e.g. handling mice, pipetting shit). There are PIs that simultaneously hold these two views, which to me signals a real misunderstanding of the challenges in and importance of writing decent software.


The first distinction here should be what kind of "code" are we talking about. As far as I see there are two main possibilities:

  1. code for performing a scientific simulation or analysis (a script).
  2. code for solving a specific problem generally (a program).
There are different "best practices" for the two situations above. And the article primarily talks about applying "best practices" from the 2nd scenario to the 1st. Of course they don't apply.


>Neither does a scientist have to build proper software, nor does a developer have to do cutting-edge research.

Sounds like the mission of Research Software Engineers (https://society-rse.org/).

I work as a software engineer (with a PhD in a field that is not CS) in a research setting, and there is a give and take. 50% of my job is reading, understanding, and adapting very bad code from non-software engineers into a production system. But another 50% is binning the overwrought inflexible code written by my software engineer predecessors in order to do all that more quickly than refactoring would allow.

In a research setting, in my opinion, MVP is king. Researchers seem to usually not produce viable long-term solutions. But software engineers do as well by virtue of not being domain experts (how could they be, they would need a PhD to understand the research domain!) and being unable to test 100% of the assumptions underlying the software themselves. Which is why it can help to have someone in between who is a domain expert, but knows just enough software engineering to produce production-good-enough code.


This might work for CERN. But a great deal of science is done by small teams who don't have a professional programmer available. Basically all of the social sciences, for a start; a lot of genetics too.


If you cannot model using decent code, is it worth writing models at all? What if bugs mean the model is simply wrong?

It has consequences too. There has been a lot of argument about how much impact the poor code quality of the Imperial College covid epidemiology model (which was the basis of British government policy during the pandemic) had on its accuracy. I do not know how bad it was, but it cannot be good that the code was bad.


One problem is that it's really hard to tell when you've just written bad code, which is also a problem for people whose job title is software developer, not just people who do it as a small part of their overall work.

Some genes have been renamed because Excel interprets the old names as dates. The people who put all their genetic analysis into Excel had no reason to expect that, just as the people writing Excel itself weren't expecting the app to be used like this.


Yeah, but in your example this is a bookkeeping issue that, while frustrating and time-consuming and costly, is just that. It’s not like a gas line in Manhattan blew up because someone in Toledo hit C-v in Excel. The scientists swapped around Excel files and imported stuff without checking, which then was fed into other systems. A clusterfuck, but one that is a daily occurrence at, minimally, every major non-tech company. It eventually gets unfucked with human labor, or it simply wasn’t important in the first place. The exact same thing happens in research and academia.

Point being, unknown unknowns are just that. But most unknowns are known and can be programmed defensively against for most serious use cases. All major fields are like this: like, you can hook up a car battery to light a methanol tank to boil two cups of water… or we can use a kettle. Perhaps for a brief point in time, due to our ignorance or just history, people lit containers of methanol on fire like it was sane, but that doesn’t mean it was, or is.


Given Ferguson's track record of being out by orders of magnitude on absolutely everything beforehand, it almost seems like he was chosen to give an over the top estimate.


It would be comforting to believe that because it'd mean there were other epidemiologists who were right but ignored. Go read the works by his counterparts though, and they're all out by similar orders of magnitude.


It is possible. It is certainly common to pick the expert who says what you want them to - and to get rid of experts who say the "wrong thing" (e.g. the dismissal of UK govt drug policy advisor David Nutt).


> I do not know how bad it was, but it cannot be good the code was bad.

I do know because I reviewed the code and its issue tracker extensively. I then wrote an article summarizing its problems that went viral and melted the server hosting it.

The Imperial College code wasn't merely "bad". It was unusable. It produced what were effectively random numbers distributed in a way that looked right to the people who wrote it (in epidemiology there is no actual validation of models; reinforcing the researcher's prior expectations is considered validation instead). The government accepted the resulting predictions at face value because they looked scientific.

In the private sector this behavior would have resulted in severe liability. ICL's code was similar to the Toyota engine control code.

A selection of bug types in that codebase: buffer overflows, heap corruption, race conditions, typos in PRNG constants, extreme sensitivity to what exact CPU it was run on, and so on. The results changed completely between versions for no scientific reason. The bugs mattered a lot: the variation in computed bed demand between runs was larger than the entire UK emergency hospital building program, just due to bugs.

The program was originally a 15,000 line C file where most variables had single letter names and were in global scope. The results were predictable. In one case they'd attempted to hand write a Fisher-Yates shuffle (very easy, I used to use it as an interview question), but because their coding style was so poor they got confused about what variable 'k' contained and ended up replacing the contents of an array meant to contain people's ages with random junk from the heap.
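
For comparison, a textbook Fisher-Yates shuffle really is only a few lines. This is a generic sketch in C (my own, not the ICL code); the key property is that both indices always stay inside the array:

    #include <stdio.h>
    #include <stdlib.h>

    /* Shuffle arr[0..n-1] in place: at each step swap element i with a
       randomly chosen element j in [0, i]. Every permutation is equally likely. */
    static void fisher_yates(int *arr, size_t n)
    {
        if (n < 2)
            return;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % (i + 1);  /* j is always <= i < n */
            int tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
    }

    int main(void)
    {
        int ages[8] = {23, 31, 45, 52, 60, 18, 27, 70};
        srand(42);                 /* fixed seed, so the run is reproducible */
        fisher_yates(ages, 8);
        for (int i = 0; i < 8; i++)
            printf("%d ", ages[i]);
        printf("\n");
        return 0;
    }

Swap using an index taken from the wrong variable and you read or write outside the array, which is exactly how an array of ages ends up full of heap junk.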

There were tests! But commented out, because you can't test a codebase that's overrun with non-determinism bugs.

The biggest problem was the attitudes it revealed within academia. Institutionalized arrogance and stupidity ruled the day. Bug reports were blown off by saying that they didn't matter because the "scientists" just ran their simulation lots of times and took the average. Professional programmers who pointed out bugs were told they had no right to comment because they weren't experts. ICL administration claimed all criticism was "ideological" or was irrelevant because "epidemiology isn't a subfield of computer science". Others were told that they shouldn't raise the alarm, because otherwise scientists would just stop showing their code for peer review. Someone claimed the results must have been correct because bugs in C programs always cause crashes and the model didn't crash. One academic even argued it was the fault of the software industry, because C doesn't come with "warning labels"!

The worst was the culture of lying it exposed. The idea you can fix software bugs by just running the program several times is obviously wrong, but later it turned out that their simulation was so slow they didn't even bother doing that! It had been run once. They were simultaneously claiming determinism bugs didn't matter whilst also fixing them. They claimed the software had been peer reviewed when it never had been. The problems spanned institutions and weren't specific to ICL, as academics from other universities stood up to defend them.

The coup de grace: ICL found an academic at Cambridge who issued a "code check" claiming in its abstract that in fact he'd run the model and got the same results, so there were no reproducibility problems. The BBC and others ran with it, saying the whole thing was just a storm in a teacup and actually there weren't any problems. In reality the code check report went on to admit that every single number the author had got was different to those in Report 9, including some differences of up to 25%! This was considered a "replication" by the author because the shape of the resulting graph was similar.

That's ignoring all the deep scientific problems with the work. Even if the code had been correct it wouldn't have yielded predictions that came close to reality.

Outside of computer science I don't believe science can be trusted when software gets involved. The ICL model had been hacked on for over a decade. Nobody had noticed or fixed the problems in that time, and when they were spotted by outsiders, academia and their friends in the media collectively closed ranks to protect Prof Ferguson. Academia has no procedures or conventions in place to ensure software is correct. To this day, nothing was ever done and no fault was ever admitted. There was a successful coverup and that was the end of it.

Again: in the private sector this kind of behavior would yield liabilities in the tens of millions of dollars range, if not worse.


Wow. It's pretty unbelievable. Is there a place where I can read the whole article?

The one I found the funniest/craziest is "Bug reports were blown off by saying that they didn't matter because the "scientists" just ran their simulation lots of times and took the average", because this is exactly how some scientists I know think.

This thinking is not limited to software. My father was by trade involved in building experimental apparatus ("hardware") for scientific experiments. Often they were designed by scientists themselves. He told me about absurd contraptions which never could measure what they were intended to measure, and extreme reluctance/defensiveness/arrogance he often met when trying to report it and give some feedback...


Yeah confusion between simulation and reality can be observed all over the place. Multiple runs can be needed if you're doing measurements of the natural world, but for a simulation that doesn't make sense (you can do Monte Carlo style stuff, but that's still replicable).
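
To make that concrete, here's a minimal sketch in C (the pi estimate is just a stand-in for any stochastic model): fix the PRNG seed and the "random" simulation produces identical output on every run.

    #include <stdio.h>
    #include <stdlib.h>

    /* Crude Monte Carlo estimate of pi: the fraction of random points in the
       unit square that land inside the quarter circle, times four. */
    static double estimate_pi(unsigned int seed, long samples)
    {
        srand(seed);               /* fixed seed => the whole run is deterministic */
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = rand() / (double)RAND_MAX;
            double y = rand() / (double)RAND_MAX;
            if (x * x + y * y <= 1.0)
                inside++;
        }
        return 4.0 * inside / samples;
    }

    int main(void)
    {
        /* Two runs with the same seed agree to the last digit; change the seed
           and you get a different, but still perfectly repeatable, estimate. */
        printf("%.10f\n", estimate_pi(1234, 1000000));
        printf("%.10f\n", estimate_pi(1234, 1000000));
        return 0;
    }

So if two runs with the same seed and the same inputs disagree, that's a bug (uninitialized memory, a race, and so on), not stochastic variation you can average away.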

You could see the lines being blurred in other ways. Outputs of simulations would be referred to as "findings", for example, or referenced in ways that implied empirical observation without it being clear where they came from unless you carefully checked citations.

Here are some of the articles I wrote about what happened (under a pseudonym)

https://dailysceptic.org/2020/05/06/code-review-of-fergusons...

https://dailysceptic.org/2020/05/09/second-analysis-of-fergu...

https://dailysceptic.org/2020/06/11/how-replicable-is-the-im...

After that people started sending me non-Imperial models to look at, which had some similar problems:

https://dailysceptic.org/2020/08/08/schools-paper/

I don't write for that website anymore, by the way. Back then it was called Lockdown Sceptics and was basically the only forum that would publish any criticism of COVID science. Nowadays it's evolved to be a more general news site.


Thank you! This is very interesting.

> Outputs of simulations would be referred to as "findings"

Yeah, a recent brouhaha about creating (!) a traversable wormhole in a quantum computer comes to mind...


Eek! That makes the worst code I've ever seen, seem good in comparison.


Aside from the code quality, his models have never been close to accurate on anything.


Right. That's not unique to Ferguson; epidemiology doesn't understand respiratory virus dynamics and doesn't seem particularly curious to learn anymore (I read papers from the 80s which were very different and much more curious than modern papers, not sure though if that's indicative of a trend or just small sample size).

Other models I checked didn't have the same software quality issues though. They tended to use R rather than C and be much simpler. None of them produced correct predictions either, and there were often serious issues of basic scientific validity too, but at least the code didn't contain any obvious bugs.


How are people supposed to do science without running statistical models?


This is asked in good faith of course, but that question really gets to the heart of what's been corrupting science.

Statistical techniques can be very useful (ChatGPT!) but they aren't by themselves science. Science is about building a theoretical understanding of the natural world, where that theory can be expressed in precise language and used to produce new and novel hypotheses.

A big part of why so much science doesn't replicate is that parts of academia have lost sight of that. Downloading government datasets and regressing them against each other isn't actually science even though it's an easy way to get published papers, because it doesn't yield a theoretical understanding of the domain. It often doesn't even let you show causality, let alone the mechanisms behind that causality.

If you look at epidemiology, part of why it's lost its way is that it's become dominated by what the media calls "mathematicians"; on HN we'd call them data scientists. Their papers are essentially devoid of theorizing beyond trivial everyday understandings of disease (people get sick and infect each other). Thousands of papers propose new models which are just a simple equation overfitted to a tiny dataset, often just a single city or country. The model's predictions never work but this doesn't invalidate any hypothesis because there weren't any to begin with.

How do you even make progress in a field if there's nothing to be refuted or refined? You can fit curves forever and get nowhere.

In psychology this problem has at least been recognized. "A problem in theory" discusses it:

https://www.nature.com/articles/s41562-018-0522-1


Right, statistical models are not sufficient for science. I agree. But they are necessary. So I return to my original question.


Most teams at CERN don't have a professional programmer available either. In a few of the larger projects (those with a few hundred active contributors) there might be one or two more tech-savvy people who profile the code regularly and fix the memory leaks. But few (if any) are professional programmers: most contributors are graduate students with no background in programming.


And this is scary. At least with "high-energy experiments" (like the one that discovered the Higgs) in colliders, a lot depends on so-called triggers, which dismiss 99.9% of the information produced in a collision "on the spot", so that this information is never recorded and analyzed.

They have to: there is way too much information produced. So the triggers try to identify "trivial" events and dismiss them immediately, relaying only the ones that may be somewhat unusual/unexpected.

Essentially, the triggers are computers with highly specialized programs. Very smart people work on this, and supposedly they figure out problems with triggers before they affect the results of experiments...
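
To make the idea concrete, here is a deliberately toy sketch in C. The event struct, threshold, and numbers are all invented for illustration and bear no resemblance to the real trigger software, but the shape of the job is the same: look at each event once, keep the rare interesting ones, discard everything else immediately.

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy "trigger": keep only events whose summed transverse energy exceeds a
       threshold, drop the rest on the spot. Purely illustrative numbers. */
    typedef struct {
        long   id;
        double total_et_gev;   /* summed transverse energy, in GeV */
    } event_t;

    static int trigger_accept(const event_t *ev, double threshold_gev)
    {
        return ev->total_et_gev > threshold_gev;
    }

    int main(void)
    {
        srand(7);
        long kept = 0, total = 1000000;
        for (long i = 0; i < total; i++) {
            event_t ev = { i, 500.0 * rand() / (double)RAND_MAX };
            if (trigger_accept(&ev, 499.0))   /* keeps roughly the top 0.2% */
                kept++;                       /* a real system would write these out */
        }
        printf("kept %ld of %ld events (%.3f%%)\n",
               kept, total, 100.0 * kept / total);
        return 0;
    }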


The triggers are the most fun part of the experiments!

The composition of teams working on triggers might be a bit of an exception in the "engineer" : "data scientist" ratio. Most of the talent is still from a physics background, but there's more of an engineering bent, where around half the team can probably write performance-critical code when they need to. Elsewhere that ratio is much lower.

Determining which data to save is a mix of engineering, algorithms, physics, bits of machine learning, and (for better or worse) a bit of politics. Surprisingly we're always desperate for more talent there.

As you say, the goal is to try to stop problems before they affect the data, but it's not always perfect. Sometimes we discover sampling biases after the data comes in and need to correct for them, and in the worst case we sometimes blacklist blocks of data.


> It is a software engineer’s job to build quality software. A scientists job is to solve problems. There’s a clear boundary here, where the latter will deliver a concept to the former, who will eventually create a production-grade implementation off of that. Neither does a scientist have to build proper software, nor does a developer have to do cutting-edge research.

Precisely! See my relevant comment from another thread here - https://news.ycombinator.com/item?id=38821679

References:

1) Why science needs more research software engineers - https://www.nature.com/articles/d41586-022-01516-2

2) Research software engineering - https://en.wikipedia.org/wiki/Research_software_engineering


The real difference here is between hobbyists and mere workers. Programming happens to be one of the disciplines which has a lot of hobbyists. But that doesn’t mean you won’t find hobbyists in other disciplines.

Look at machining for example. Long dominated by people working in machine shops making tools and parts on the clock. But in the background there’s a strong hobbyist contingent and there you’ll find endless debates over whether a beginner should buy a decades old Bridgeport milling machine or a brand new Chinese-made one.

Similarly I find endless debates about what sort of frying pan to use (nonstick, stainless, carbon steel, or cast iron) among amateur chefs, but you’ll never see people working in a restaurant waste time on that.

In other words, it’s hobbyists who really obsess over tools and craftsmanship. They do it because they love it. Programmers just happen to be among the odd sort who can get paid to practice their hobby.


I don't know about that. Pros obsess over it too, but they don't bother to endlessly talk about it; they just do it. But if a better tool comes along, they'll consider it.


In computer science, for some reason, people struggle to distinguish between an engineer and a scientist.


I think that is because we are still at the frontier and the lines between research and developing something new are quite blurry.

By now there are already lots of fields in IT that are quite standardized, but others not so much.

For example, what is the fastest way to draw lots of shapes on a canvas on the web?

There is no definite and fixed answer, as the field is still evolving and to find out the fastest way for your use case, you have to do research and experiments.


> as the field is still evolving and to find out the fastest way for your use case

No one in the world at large cares about the fastest way, they care about the lowest budget :)


Depends. If gaming is what you do, the better the performance, the bigger the market, since more people can play your game.


HN is funny. I've been told in a previous discussion that AAA publishers don't optimize for potatoes. Now you tell me it's a business requirement.


I told you it depends. There are AAA games for console and gaming computers and there are casual mobile games for example. Very different markets.


Oh the free to play stuff doesn't exist for me.


I guess the main problem is that many self-proclaimed "software engineers" are surprisingly bad "programmers" ;)


Just like most "research" is absolutely useless, when not counter-productive ;)


It's often very useful for those funding it.


Universities? Their main goal is to get better in the rankings, isn't it?


>It is a software engineer’s job to build quality software.

I think that the job of most software engineers is not to build quality software; it is to make money. When you are trying to make money, the goal is not necessarily to make the best quality software that you can. Often, it is to make acceptably good software as soon as possible. A company that writes software that is half as good and ships it twice as fast might outcompete a company that writes software that is twice as good and ships it half as fast. I sometimes wish that my job was to build the best quality software that I can, but that is not the case. What I really get paid for is to make my employers' company successful, to make them money. And it's not that my employers don't care about making high quality software, it is just that if they cared about it too much they might get outcompeted by others who care less about it.


The problem is that most “professional” developers — i.e. people who write software as a career — are terrible at those things. Even (especially, sometimes) those that call themselves Software Engineers and talk endlessly about the right way to do things.

That is not to say the kind of software engineer that does what you say and tends to build quality software (or at least move it in that direction) doesn’t exist, but the demand for people who can make computers do stuff (loosely, developers) so far outstrips the supply that outside of a few bubbles (HN being one of them), they appear to be so vanishingly rare they might as well not exist.

I’m sure some scientists get to work with useful and highly valuable professional developers, but I’d be amazed if they were the majority.


If the bad woodworking would jeopardize the results of his professional, salary-earning work, then he should probably consider learning woodworking, no?


Don’t think so, no. A physicist has other stuff to learn and spend their time on. Instead, they should partner with a carpenter to do their woodworking from a rough sketch.


If your point is that the physicist should partner with someone who is a "professional programmer" ("carpenter") to do the coding, I couldn't disagree more, speaking as a former research physicist who wrote many programs while I was in academia. A "rough sketch" is not enough for a "carpenter" to go off of for the programs a physicist using computational techniques is writing. They'd need to have a sophisticated understanding of the physics and the mathematical model involved: which almost no "carpenters" have.


Now we’re lost in metaphors. A carpenter should definitely be able to build a pergola from a rough paper sketch, and a software engineer should be able to build a machine learning system from an algorithm paper by a data scientist.


That really depends on what you mean by "machine learning system." For a business, implementing some researched algorithm and essentially using a template? Sure, it's possible. The research that goes into producing the algorithm in the first place? Probably not.

There may be some classes of scientific programming which have simple enough models for a software engineer to implement. Scientific programming that relies on a deep understanding of the domain and the mathematical models employed to study it doesn't fit in that category.


Of course it always depends on the context. But my point still stands: a software developer is someone specialising in creating and maintaining code for an application. Any domain knowledge they may need for a particular application doesn't matter here; their core skill set is software, whereas a scientist's core skill set is active research.

Just because some scientific disciplines require extensive programming doesn’t make those scientists good software engineers: Producing an algorithm is different from generating revenue from it.


It’s like saying that a scientist doesn’t need to know how to write, and they should just pair up with ghostwriters/copywriters.

Many professions have tool/skill requirements that are not related to that profession on a strict sense, but are still necessary to do the job properly.

When I learned engineering we were taught how to draw diagrams by hand and write in a technical font. Computer code, for many science fields, is like diagrams but for theories.


> It’s like saying that a scientist doesn’t need to know how to write, and they should just pair up with ghostwriters/copywriters.

And I'd say that illustrates my point even better! A scientist needs to know how to write a paper, but that doesn’t make them a great author. When they aim to write a book, they should get help from a professional publisher. Both tasks require writing text, but a book is very different from a paper and requires heaps of additional training and knowledge that scientists usually don’t need to have.


> but that doesn’t make them a great author

This is also widely considered a huge issue in academia. The dissemination is so bad that a lot of great research is never read by others.

Academia is not a business. They can barely sustain themselves; therefore, they need to do things themselves. It is tough to be a physicist, mostly because, besides being a great physicist, you also need to be a good programmer and a good author.


It depends on what you consider a core skill. One has to specialize; that means leaving secondary tasks to others.


This is only true for systems where comparative advantage applies. Academia is not one of these places (as they explicitly don't want to be a part of the economy, which I understand).



