Consider working on genomics

lebovic · on Nov 19, 2022

I'm a software engineer who works on genomics. I see a lot of negativity in this thread, which mirrors my experience: in most places, you'll be paid like a researcher with the respect of a lab assistant – unless you have a PhD and a postdoc.

That said, it's possible to find work that's respected and pays well. Most of that kind of work is happening in the context of startups or freelancing. My favorite example of this is Robert Edgar: he's a freelance computational biologist with over 100k citations who has made a living for the past 20 years by selling licenses to his bioinformatics software (https://scholar.google.com/citations?user=RzVMRc0AAAAJ).

To find those kinds of jobs, I'd try YC's Work at a Startup, Flagship Pioneering's portfolio companies, and emailing founders of companies that have a bioinformatics component (my email is in my profile!).

I think the issues with the field are because it's a new and growing space. We do need better tooling, respect for engineering, and established best practices, but that seems to have been the case in the past for other domains that moved from research to industry – including software engineering itself.

otoburb · on Nov 20, 2022

>>I see a lot of negativity in this thread, which mirrors my experience: in most places, you'll be paid like a researcher with the respect of a lab assistant – unless you have a PhD and a postdoc.

As an outside observer to this area, something doesn't add up. It sounds like software tooling is desperately needed to advance the entire field across the board, yet it seems that few startups or founders are attempting to tackle this problem or, if they exist, aren't having much of an impact (perhaps, yet?).

One would imagine that all of this inefficiency, suffering and bottlenecking of incredible therapies to cure diseases and advance human knowledge would be a siren call for capital allocators to unlock value by solving this pain point -- but here we are, in some cases still 20 years and counting.

I can buy the argument that FAANGs have had amazing compensation packages over the past 2 decades, but this still doesn't address the reason why nobody else has bothered or been able to "disrupt" (yes, air quotes) the industry in this regard and harvest such seemingly low-hanging fruit.

I see a few comments talking about the PI and grant-funding model -- but if the promised value was sufficiently large then I find it hard to believe that this wouldn't have been a competitive candidate alongside other recent buzzword-laden investment trends such as blockchain & AI that pulled down so much VC funding over the past decade.

Clearly, I'm missing a piece of the value puzzle as to why founders and startups are few and far between to specifically address the dire straits that biological software engineering (computational biology, bioinformatics, systems biology, etc.) finds itself in.

x0x0 · on Nov 21, 2022

> As an outside observer to this area, something doesn't add up. It sounds like software tooling is desperately needed to advance the entire field across the board

This is hardly the only place in society where X is desperately needed, but people don't want to pay for it, and continue to suffer through the underprovision of X.

Here, software engineering is simply not viewed as prestigious or important, or often, even as valuable. The science is viewed as valuable. A lack of bugs or rigorous software engineering practices... nobody cares.

Some of it is that a PhD is a prestige competition, and dirty engineers being comped on par with people who spent (cough wasted cough) a decade or more of their life in college/post-docs just won't do.

And a piece of it is the scale of the investment needed. Imagine a couple million LOCs with only manual testing. Your two weeks of writing tests is a tiny drop in the bucket. Retrofitting reasonable software dev standards on these projects is enormously expensive.

Finally, there's an inescapable volume issue. Suppose I build a hot new confocal microscope and I sell 300 of them for low hundreds of thousands each. I have 2 teams of devs ($2M/year/team) for 2 years on analysis code, for an $8M investment. That's $27k/machine. That's real tough math to make work. Whereas Google pays Gmail engineers really well, in part because they spread those costs over a billion users.
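To make the arithmetic above concrete, here is a minimal sketch in Python. The microscope figures are the ones quoted in the comment; applying the same spend to a billion users is only an illustrative assumption for comparison, not a real Gmail budget, and the function name `per_unit_cost` is made up for this example.

```python
def per_unit_cost(teams, cost_per_team_year, years, units):
    """Return (total dev spend, spend amortized over each unit sold)."""
    total = teams * cost_per_team_year * years
    return total, total / units


# Figures from the comment: 2 teams, $2M/year/team, 2 years, 300 microscopes.
total, per_microscope = per_unit_cost(
    teams=2, cost_per_team_year=2_000_000, years=2, units=300
)

# Hypothetical comparison: the same spend spread over a billion users.
_, per_user = per_unit_cost(
    teams=2, cost_per_team_year=2_000_000, years=2, units=1_000_000_000
)

print(f"total ${total:,}")
print(f"per microscope ${per_microscope:,.0f}")
print(f"per user ${per_user:.3f}")
```

The per-machine figure comes out at roughly $26.7k, which rounds to the $27k quoted above, while the same spend over a billion users is under one cent each.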

ethbr0 · on Nov 22, 2022

> Here, software engineering is simply not viewed as prestigious or important, or often, even as valuable. The science is viewed as valuable.

Specifically, publishing is viewed as valuable. And publishing is a frozen-in-time snapshot.

Ergo, all the things that drive quality software elsewhere (SRE, maintainability, interpretability) simply don't exist as incentives.

> [Low product sales count is] real tough math to make work. Whereas Google pays gmail engineers really well, in part because they spread those costs over a billion users.

Also a great observation. Most cutting edge is, by definition, mostly custom. It's really hard to amortize even incredibly valuable things over tiny population counts and still pay market software engineering wages.

mfld · on Nov 20, 2022

I have a practical example to explain what, at first glance, doesn't seem to add up. Many genomics analyses contain a step to clean up DNA sequences. For this "Read Trimming" step there exist no fewer than 40 open source tools. Let's say you need this step in your project and use one of these tools. You find an issue the original creator is unwilling to fix: just choose another tool, problem solved. You find that the tool does not perform well enough on your data: choose another tool, problem solved. So while the article's points are true, in practice they often don't lead to real pain.

mike_hearn · on Nov 20, 2022

I recall once hearing from a VC about why they hardly invest in biotech (or it might have been reading it somewhere, memory is fuzzy). It boiled down to: way too much non-replicable research, often with suspicions of fraud by the original labs. It can easily be the case that a biotech startup burns through millions setting up a lab from scratch, then attempting to replicate some academic paper that they thought they could commercialize, only to discover that the effect doesn't really exist. This problem doesn't affect the software industry, so that's where the money goes.

Why so few tooling companies - is there actually a market for good software in science? For there to be such a market most scientists would have to care about the correctness of their results, and care enough to spend grant money on improvements. They all claim to care, but observation of actual working practices points to the opposite too much of the time (of course there are some good apples!).

In 2020 I got interested in research about COVID, so over the next couple of years I read a lot of papers and source code coming out of the health world. I also talked to some scientists and a coder who worked alongside scientists. He'd worked on malaria research, before deciding to change field because it was so corrupt. He also told me about an attempt to recruit a coder who'd worked on climate models who turned out to be quitting science entirely, for the same reason. The same anti-patterns would crop up repeatedly:

- Programs would turn out to contain serious bugs that totally altered their output when fixed, but it would be ignored because nobody wants to retract papers. Instead scientists would lie or BS about the nature of the errors e.g. claiming huge result changes were actually small and irrelevant.

- Validation would be often non-existent or based on circular reasoning. As a consequence there are either no tests or the tests are meaningless.

- Code is often write-once, run-once. Journals happily accept papers that propose an entirely ad-hoc and situation specific hypothesis that doesn't generalize at all, so very similar code is constantly being written then thrown away by hundreds of different isolated and competing groups.

These issues will sooner or later cause honest programmers to doubt their role. What's the point in fixing bugs if the system doesn't care about incorrect results? How do you know your refactoring was correct if there are no unit tests and nobody can even tell you how to write them? How do you get people to use tools with better error checking if the only thing users care about is convenience of development? How do you create widely adopted abstractions beyond trivial data wrangling if the scientists are effectively being paid by LOC written?

The validation issue is especially neuralgic. Scientists will check if a program they wrote works by simply eyeballing the output and deciding that it looks right. How do they know it looks right? Based on their expertise; you wouldn't understand, it's far too complicated for a non-scientist. Where does that expertise come from? By reading papers with graphs in them. Where do those graphs come from? More unvalidated programs. Missing in a disturbing number of cases - real world data, or acceptance that real data takes precedence over predicted data. Example from [1]: "we believe in checking models against each other, as it's the best way to understand which models work best in what circumstances". Another [2]: "There is agreement in the literature that comparing the results of different models provides important evidence of validity and increases model credibility".

There are a bunch of people in this thread saying things like, oh, I'd love to help humanity but don't want to take the pay cut. To anyone thinking of going into science I'd strongly suggest you start by taking a few days to download papers from the lab you're thinking of joining and carefully checking them for mistakes, logical inconsistencies, absurd assumptions or assertions etc. Check the citations, ensure they actually support the claim being made. That sort of thing. If they have code on github go read it. Otherwise you might end up taking a huge pay cut only to discover that the lab or even whole field you've joined has simply become a self-reinforcing exercise in grant application, in which the software exists mostly for show.

[1] https://github.com/ptti/ptti/blob/master/README.md

[2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3001435/

ninala · on Nov 19, 2022

If you want to find good jobs like this, you should check out the monthly HN "Who is hiring" threads. There aren't that many biotechs, but there are a few (like ours). I agree about the need for better tooling. There's also a need for a stronger conceptual understanding of cell biology generally, and an ability to build an ecosystem of APIs that work well together. My email is in my profile if you'd like to talk more about this space.

ababaian · on Nov 20, 2022

Genomics today is the internet of the early 90s or programming of the late 70s. There is going to be an enormous boom in genomics-derived technologies in the coming decade which has been driven by the exponential decay in data generation costs. You're absolutely right that places with a "startup" mentality are where to be right now.

On that note :) I'm starting a forward-looking research lab at UofT to advance massive-scale (think petabytes) genetic analyses and am looking to find the right few individuals who have a similar vision. It's difficult to find passionate engineers with a solid CS and HPC background who are willing to meet halfway and work _together_ with biologists in getting the analysis right. Robert does this _very_ well, and that's why we recently co-wrote a landmark Nature paper: https://www.nature.com/articles/s41586-021-04332-2.

Job post: https://jobrxiv.org/job/university-of-toronto-27778-full-sta...

bmitc · on Nov 19, 2022

Usually, scientifically oriented companies or organizations have little regard for software as a domain, craft, etc. It’s just a thing that gets in the way, despite being vital. It’s almost just a utility to them rather than a differentiator and active component of the advanced work going on.

For example, the Broad Institute is super interesting, but having applied there several times, they are esoteric, to say the least, in their hiring. They pay well below market, and their process is opaque and slow and sometimes downright non-communicative. They are also not really open to remote work, so you gotta move there and commute to the heart of Cambridge. Budgets are set by folks maybe a couple years out of a PhD program, who will also make technical decisions in terms of the software design (the latter an assumption given my experience in similar places).

These organizations are also pretty traditional in their selection of stacks. Good luck trying to use a functional-first language, aside from maybe Scala (usually lots of Java stacks), and be prepared to write lots of Python, the only language that exists to many scientists. I once saw a Python signature (function name and arguments) spill over 10-20 lines, in a file over 10,000 lines long. They had given up on another software stack because “it wasn’t working for them”.

This is all painting with broad strokes, of course. But I think scientific organizations that would embrace software as a major component of their technological and scientific development would do well. There’s a lot of opportunity.

hobofan · on Nov 19, 2022

> Good luck trying to use a functional-first language

Good luck trying to use a functional-first language at any company (be it in bioinformatics or otherwise).

agumonkey · on Nov 20, 2022

it happens :)

and the coming years will be interesting, rust is placing a lot of functional bits on the map, just like closures were an obscure thing 10 years ago, there might be a rise in abstraction in the mainstream

neilv · on Nov 19, 2022

> I once saw a Python signature (function name and arguments) spill over 10-20 lines,

Quote I liked (can't find attribution; maybe Alan Perlis?):

"If your function has 10 arguments, you're missing some."

geoffjentry · on Nov 19, 2022

> Good luck trying to use a functional-first language, aside from maybe Scala

While they've moved away from it in the last few years, the Broad Institute had a huge investment in Scala. It's been in use there since at least 2010 and I believe longer. The primary software department was almost entirely Scala based for several years. That same department had pockets of Clojure as well.

aednichols · on Nov 19, 2022

Current Broad SWE with 5 years’ tenure. Feel free to ask any questions.

I’m in the “bunch of software people together” department so it’s not as insular or PI driven as working in a lab.

I still mostly like the role but it has become more generic over the years as the department acquiesced to the working ways & programming languages of outside private funders.

maxFlow · on Nov 20, 2022

1. Could you share a bit about your stack? I'd be especially interested in the data engineering side of things if possible.

2. As a SWE, how deep into biology/genetics concepts have you had to go during your tenure?

aednichols · on Nov 23, 2022

1. I'm on the product engineering side rather than data, but from what I know, Scala is still heavily entrenched. New projects are all in Java; other languages are effectively disallowed.

2. I learned a fair amount in the 2017-2019 time frame but with the pandemic and increased specialization my pace has decreased. Lunchtime talks are just not as fun to attend on Zoom. Another possible explanation is that my curiosity has been satisfied, and someone with more curiosity could get more out of it.

laidoffamazon · on Nov 19, 2022

I live next to Broad's offices and see people leaving/entering the office at odd hours on Saturday and Sunday. That (and the fact that they pay about 75% what I made as a new grad) prevented me from ever applying there.

geoffjentry · on Nov 19, 2022

Keep in mind that there are wetlabs with experiments being conducted in them. Lab techs will be coming and going at all hours.

enraged_camel · on Nov 19, 2022

Yeah, I’d love to work in scientific computing and write Elixir, but it seems non-existent.

DonsDiscountGas · on Nov 19, 2022

I've worked on genomics, at the Broad. Can confirm there is a ton of toxicity. There's also a lot of smart and great people, if you can find a team specifically dedicated to software I would recommend it. Also the pay is good...for a non-profit.

More broadly...shortages like this aren't because SWEs just love ad-tracking and hate health improvements. People need to be willing to pay for these services (case in point: the jobs link for AWS has 3 links which appear unrelated to genomics, for Microsoft there are 2 for interns, and for Google it's empty).

There are a lot of opportunities in biotech for SWEs, and many firms (though not all) really do respect the power of software. Worth looking around if you're interested in the area.

noname123 · on Nov 19, 2022

Hmm as an ex-Broad employee (and now in another genomics center), what did you find really toxic about the Broad?

FWIW, I really loved the Broad people, my direct line manager and co-workers. The management was horrible and the management at DSP (not the line folks/managers) was the worst.

noname123 · on Nov 19, 2022

Since my post was upvoted a tad... I'll give more feedback about the Broad.

When I joined, it was running really like an academic center. Like literally in my lab, if I wanted to go into the lab and pipet and do library prep, the wet lab scientist would teach me and vice versa. It was lit. a place where anybody could pivot their career to anything. We worked on NIAID/NIH grants and went to conferences even as SWE's and I felt we were doin' important research - not just pipeline monkeys but actually performed important analysis like RNASeq differential analysis, ChIPSeq peak calling, metagenomics etc. on publications along with PhD scientists even if I didn't have the academic credentials. Groups within the Broad were running courses for Software Engineers to learn Biology... and you could literally take off middle of day to go across the street to Stata Center to attend lectures on ML or audit Comp Bio at MIT. Nobody would bat an eye. 75% of my group got a Masters degree on the job where we spent more time some months on classes than actual work.

The culture somewhere shifted around 2018-2019... where they brought in new management to run DSP (Data Science Platform where most SWE's likely end up). The DSP management (not the chief guy) but lieutenants ran the "tech playbook"... get PM, Scrum/Agile coaches in; make software and comp biologists line workers. A lot of my fav. people either left or got pushed out. A lot of institutional knowledge about sequencing and biology got lost, self-driven people left and line workers to work on Portal web development and Data pipeline management came in. To the point where I presented once to the software engineers of DSP and nobody in the room even knew the basics, like what a long or short read is. I left soon afterwards to a place where I wouldn't be silo'd. I wouldn't recommend the Broad to anybody now... unless you're working for an academic group. Avoid platform groups (DSP mostly; other platforms are still good) if you want to learn & grow.

bmitc · on Nov 19, 2022

This is a great read. Thanks for the information. What you originally described is basically my dream job: software engineers working alongside scientists and engineers, where the software engineers become domain knowledgeable if not experts in certain areas.

I had a job similar to that at a similar place (actually places), but I ended up leaving because I was a one man team and got burnt out. Writing software for scientific purposes and true R&D is very fun and interesting, and I think there is a lot of untapped potential for doing some interesting things there. But there is a balance between the wild west, then what you first described, and then what you later described. Keeping things organized enough to not be chaos but loose enough to not get siloed.

geoffjentry · on Nov 19, 2022

A really hard aspect to this is that there's a massive impedance mismatch between the research & production side of things. Working in the research side is pretty straightforward - although software development practices are going to be a lot looser & faster. Working in a production environment is straightforward, it's like any other software job. But - working at the confluence of those two states is incredibly difficult.

noname123 · on Nov 20, 2022

Fwiw I acknowledged the good people directly from the Cromwell team on a presentation recently due to the incredible support/help somebody & their team provided my team. The WDL/Cromwell community has grown and I've heard people mention it everywhere now (far away from the Broad) and it's in no small part due to that team and its former leadership.

aednichols · on Nov 23, 2022

Hey, that's my project! (And geoffjentry is my former boss.)

Nice to hear the praise, thank you. The project has changed a lot over time and inevitably left some disappointed people filing Github issues (CWL, non-cloud backends, etc).

It's really unique and enjoyable working on OSS that has a strong community, it is the highlight of my career.

aednichols · on Nov 19, 2022

Current DSPer since 2017 and broadly agree with OP.

noname123 · on Nov 20, 2022

Yea I liked almost everybody in DSP.

I disagree strongly however with the management's tech approach, the Broad cannot compete with the other companies in the Boston area in terms of comp. What worked in the past is smart people came to the Broad for the fun, autonomy and intellectual challenge over pay.

If you reduce people to just worker-bee's finishing CRUD tickets on an Agile board - you'll get efficiency for the 1st 2 years but you'll lose so much institutional/domain knowledge and collaboration; the PMs will get promoted in 1 year while projects on a 5 year timeframe get uber-delayed; the smart people will get bored and leave and even the good efficient engineers in the system will also leave for the better well paid tech jobs that Broad cannot compete with. I've seen it in every company who has run this playbook.

Melting_Harps · on Nov 20, 2022

> respect of a lab assistant – unless you have a PhD and a postdoc.

This applies even to those with Biology backgrounds, as an undergrad that entered the Industry after an expensive and precarious 5 years of University and exiting during the aftermath of the financial crisis with tons of debt I knew I was never going to enjoy or like my time there within the first months.

I had aspirations to be CLS (you need to be sponsored by a corporation for the training/licensing process) but the truth is the Industry is rife with petty political rivalries where you can get sucked into for no other reason than being assigned to someone's lab that didn't cite them years back--you and your career can easily become collateral damage as result or some other bitter rivalry.

I found most in that Industry to be passive-aggressive cowards who would never confront an issue with anyone or anything and would rather create and foster this toxic atmosphere where it's typical that unless you did a PhD or a Post-doc you might as well be a mindless drone who carries out the edicts of your superiors who graduated in the 70s or 80s.

I will offer this advice: don't enter the Industry unless you get paid extraordinarily more to do so than any other offer you get, and if you love the life/health sciences (as I once did) please find some other outlet because the Industry will quickly steal any passion and leave you without much recourse.

Work in Genomics is promising, as is most Health Sciences in the 21st Century, but it is in DIRE need of a cultural shift (most boomer aged researchers need to die or retire already) and since the best ones are bio-hackers for a reason despite the lack of funding, there are other options albeit not lucrative ones.

> The culture somewhere shifted around 2018-2019

Your experience sounds like the brochure version of what we were sold as an undergrad in the Health Sciences, the ability to have on the job cross-discipline job experience, the reality was way more toxic, we didn't have agile or PM back then but we had Lab Directors and the thumb of corporate which in my view was way more hostile towards such an environment. Anything that deviated from your workload was seen as an unnecessary distraction and misuse of company resources.

I'm glad I made the pivot to tech when I did despite the turmoil to get there, but sadly now that I'm focused on AI/ML in order to come back to tech industry outside of my narrow discipline, it's now imploding on itself with mass layoffs or hiring freezes and it seems that the recession will be used as a reason to up-end the many reasons why tech was better than the health sciences, where apparently it's already becoming more normal for even a role as an intern for a YC backed company to require a Masters/PhD student!

noname123 · on Nov 20, 2022

Yea if there's an unsolicited take-away from me... don't ever become a lifer or think a company/field will never change. Nobody, even if it's GOOG circa 2010's, IBM circa 1970's, MSFT 1990's, ever offers a good combination of intellectual and comp forever. Only you and maybe your mother care about your well-being... I don't feel sad anymore that the good jobs/research opp. are now gone. I realize I was lucky and I aim to always try to adapt to get into those environments, accepting that things are always changing.

aWidebrant · on Nov 19, 2022

"From my experience, what works incredibly well is a partnership between biologists and software engineers: the biologists first come up with the first concept of the tool, which is purely focused on ensuring good results. After this first iteration is completed, engineers then come in and rewrite the tool using modern engineering practices with things like speed and reliability in mind."

Like others have pointed out, this really makes the engineer's end of the bargain sound like janitorial work. There's no lack of fields where researchers and engineers both sit at the table from the beginning to pick which projects to pursue and how to implement them.

kzuberi · on Nov 19, 2022

> this really makes the engineer's end of the bargain sound like janitorial work

I don't think you should interpret it that way. Another take would be that it's like collaborating with a domain expert outside your specialization.

Important is that your potential impact as an engineer can grow as you become more knowledgeable in the relevant bio. Most of the scientists I've worked with were happy to teach background (and some were just exceptional, fun times if you also found the field interesting as I did!). Obviously some allowance must be made for differences in culture from org to org, and that likely accounts for some of the disappointed voices - but I'm not convinced this is endemic to the field as opposed to organization specific. Just like with an opportunity with any particular company, do your research.

Incidentally, working on a well defined engineering+optimization problem, if you are lucky enough to bump into one, is just candy for lots of engineering types. Ok quick & simple one: a scientist I worked with was doing some analysis that involved intersecting piles of genomic intervals with each other, which was taking many hours for a single run - super painful to tweak and re-execute. Our team showed them how to use interval-trees and made these available integrated in our internal tools, and the problem transformed into ~10 min execution runs. See, a wee bit of comp-sci where suddenly you're the domain expert. And appropriately appreciated!
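For readers who haven't met the data structure, here is a minimal sketch of an interval tree query in Python. It is not the internal tool described above; the names (`Node`, `build`, `overlaps`), the half-open coordinate convention, and the sample intervals are all assumptions for illustration. A real genomic version would keep one tree per chromosome.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end); a convention assumed here


@dataclass
class Node:
    interval: Interval
    max_end: int  # largest end coordinate anywhere in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(sorted_intervals: List[Interval]) -> Optional[Node]:
    """Build a balanced tree from intervals already sorted by start."""
    if not sorted_intervals:
        return None
    mid = len(sorted_intervals) // 2
    node = Node(sorted_intervals[mid], sorted_intervals[mid][1])
    node.left = build(sorted_intervals[:mid])
    node.right = build(sorted_intervals[mid + 1:])
    for child in (node.left, node.right):
        if child is not None:
            node.max_end = max(node.max_end, child.max_end)
    return node


def _query(node: Optional[Node], start: int, end: int, out: List[Interval]) -> None:
    # Prune: nothing in this subtree ends after the query starts.
    if node is None or node.max_end <= start:
        return
    _query(node.left, start, end, out)
    s, e = node.interval
    if s < end and start < e:
        out.append(node.interval)
    # Everything to the right starts at or after s, so it can only
    # overlap the query if s itself is before the query end.
    if s < end:
        _query(node.right, start, end, out)


def overlaps(tree: Optional[Node], start: int, end: int) -> List[Interval]:
    """Return all stored intervals overlapping [start, end), ordered by start."""
    out: List[Interval] = []
    _query(tree, start, end, out)
    return out


# Made-up feature coordinates on a single chromosome.
features = [(100, 200), (150, 300), (400, 500), (450, 460), (600, 700)]
tree = build(sorted(features))
hits = overlaps(tree, 180, 420)
print(hits)
```

Each query prunes whole subtrees through `max_end` instead of comparing against every stored interval, which is typically where the hours-to-minutes kind of speedup comes from when one large set is intersected against another.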

clmcleod · on Nov 19, 2022

Yeah I think this is fair enough after reading it back. However, that was not exactly my intention here, and I think this is a case of me needing to be more careful in my wording.

When I said that software engineers add in the speed and reliability, I didn't mean they _only_ add in the speed and reliability: just that these two tenets of good software engineering were accounted for in this "correct" way of doing things (as opposed to the state of most genomics software that I described above).

However, I can see how my phrasing can give the wrong impression about the contributions an engineer makes when the biologist and engineer sit down to create the real thing together. In a positive environment, both sides (biologists and software engineers) share enough information with one another that either can make contributions to the scientific/software engineering domain.

sargstuff · on Nov 19, 2022

software engineer provides/develops the appropriate level of abstraction for the non-software engineer to make use of.

Which if there's no standard for field, and working outside of a given field, makes writing grant(s) without pairing up with someone who can develop field standards to be included in grant necessary. Hard to find/compete for scarce applicants using limited resources.

aka startups vs. big company funding for pure research lab (Bell Labs, Xerox PARC, etc)

cratermoon · on Nov 20, 2022

This rings all kinds of alarm bells and flies all the red flags.

How many of us have heard from some guy who has said, essentially, "I have an idea for the next <fill in the blank with whatever is hot>, and I've already sketched out a prototype. There's just the small matter of programming and we'll make $LARGE_SUM"?

Imagine this happening in the business world. A partnership between SMEs and software engineers. Oh, we do this all the time, that's why software engineers get paid well: we turn ideas into working code. Anyone ever heard of a product manager "banging out a prototype" and then handing it off to the software engineers to rewrite?

The more I re-read the passage from the article, the worse it sounds.

sseagull · on Nov 19, 2022

I’m more familiar with chemistry, but a lot of times the scientist is the one who needs to make the first iteration to prove their idea. It’s often the case you really don’t understand the problem until you actually program/run the idea in at least a quick and dirty way.

But the role of the software engineer after that is invaluable in making that idea accessible and reproducible.

npteljes · on Nov 20, 2022

Janitors maintain, not create or optimize. It's literally an engineer's job to plan and realize a concept.

whimsicalism · on Nov 19, 2022

Call me when non-tech fields learn to treat engineers as equal partners rather than disposable labor.

Academia? Yeah they're going to be one of the last to realize, PIs don't want to cede any power in their little fiefdom. Very familiar with the dynamics there.

sargstuff · on Nov 19, 2022

Scope of PI interest / funding defined by research grant.

So, unless research is ground breaking / exploratory across disciplines, supporting disciplines tend to be extremely limited by PI research interest.

aka (wording sanitized a bit) tends to come across as being tight wad / ham fisted

sargstuff · on Nov 19, 2022

side note: Biological sciences tend to have way too many applicants for available positions. Typically not enough software engineers for available positions.

Treating a position with limited available applicants as if there were too many available applicants is always a recipe for issues.

hobofan · on Nov 19, 2022

> the biologists first come up with the first concept of the tool, which is purely focused on ensuring good results. After this first iteration is completed, engineers then come in and rewrite the tool using modern engineering practices with things like speed and reliability in mind.

I think that is already accepted as good practice, and the way most people in the field work, which is part of the reason why the field is in this shoddy state right now. Because in reality, most of the time those engineers don't exist and it will never advance to the second stage, but will still be used regardless. And even if you manage to find an engineer for your team, the same problem exists in many layers down your stack.

As with most other kinds of software, the biologists should be treated as customers (or trained up to be skilled-enough engineers), as it is done in other disciplines. To create good accounting software you also wouldn't propose to have the accountant write the initial version of the software, would you?

> Many of the projects that are critical to the foundation of genomics are reaching or have eclipsed the ten-year mark. How much longer can we expect these individuals to single-handedly maintain these code bases?

What you propose sounds more like "hey, be the next idiot that commits to maintaining critical software for nothing", rather than any systemic change. The ugly secret of bioinformatics is the same one as in broader tech: Most of it runs on the backs of unpaid OSS maintainers (in this case a handful of motivated PIs that carve out some of their time for that).

If you want to have good software in the sciences, you first have to solve the OSS funding problem.

PS: the `user-select: none;` on your page is really annoying

jltsiren · on Nov 19, 2022

> As with most other kinds of software, the biologists should be treated as customers (or trained up to be skilled-enough engineers), as it is done in other disciplines. To create good accounting software you also wouldn't propose to have the accountant write the initial version of the software, would you?

Accounting is a bit different, because it has already been invented. There are standards and best practices for it. In bioinformatics, writing software is often a research activity. You write software to determine what the software should do, and then you adjust your ideas and rewrite it. The person writing the first version(s) of the software is a researcher – at least in practice if not by job title.

hobofan · on Nov 19, 2022

Accounting is not a static thing, and is also constantly changing with new legislation and financial instruments popping up. Most bioinformatics tasks nowadays are not any more "creative" in their research. Specifically in the last few years a good chunk of the research is just okayish application of ML research to their field of research.

For many specific problem sets in the natural science informatics disciplines, you can just stay up-to-date on ML trends and release a new paper that applies them every few years, in an almost automatable way.

jltsiren · on Nov 19, 2022

There is a good chunk of research like that, but there is also a good chunk of research where the "biologist as a customer" model does not work. In research like that, it's the job of the person writing the software to figure out which biological problems they are trying to solve and how.

uvesten · on Nov 19, 2022

Anecdata, but I did a master’s in bioinformatics a few years ago, and as part of that I spent about a year in a human genomics lab.

I fully agree that the software used is really bad in general, but what is worse is the level of IT literacy among the PhD’s and post-docs from the biology side. (Also statistics, I guess a lot of p-hacking is the result of authors simply being clueless…)

After finishing my thesis, I was offered to stay and work at the lab. After thinking about it, and accepting, I was told that funding wasn’t secured yet, but that it should come ”any day now”…

Thanks, but no thanks.

Anyway, I fully see the need for professional software engineers in this field, but job security and even job availability (aside from the low salaries) in academia is abysmal, so I don’t think the current situation will change any time soon.

rafiki6 · on Nov 19, 2022

Honestly, I think people like you are the right kind of people to start private enterprises and bring along professional software engineers to help you build something incredible in the space.

rleigh · on Nov 19, 2022

There might be an opportunity for a company to enter this space, but then again maybe not.

When everything is a mess of ad hoc Perl, Python and R scripts to solve unique one-off problems, you might well find that there isn't a sufficiently common subset of functionality that people are prepared to pay money for. That is, while the need may be there, the business case may not be. It might be that most of the field are quite content with the status quo.

It's easier and cheaper to get some poorly-trained PhD students to wrangle badly-written and poorly-maintainable scripts than it is to pay a company to provide a robust and well-written solution instead. The "indentured labour" also distorts the supporting ecosystem. [I say this after having done a PhD in biomedical science.]

I remember one of my colleagues asking me to help him get some special software from a particular group working [for DNA methylation analysis]. They wanted $10K for it, IIRC. It was a complete mess, wouldn't work, had no documentation, and it was in such a state that I didn't trust it was genuinely functional. For a one-off, maybe $10K was worth it, but if you only have 2-3 customers worldwide who will pay, it's not a viable business even if the product works perfectly, let alone if it's a fragile disaster that barely works at all.

neilv · on Nov 19, 2022

When considering software roles in science organizations, forget assumptions you might make about a typical tech job, joining a bunch of other software and hardware people -- or you'll risk accidentally ending up on the other side of a distorted status system (not the side that normally pampers techbros).

You need to feel out the particular person you'll be reporting to on how well they personally respect and understand the role, and also whether they'll have clout/funding and have your back if the org turns out to be rough (think AMZN). And also try to feel out respect within the organization, and some of the people/teams with whom you'll be collaborating.

You also need to check compensation, so you don't wind up a low-paid person who later discovers they're competing for local house offers with others in the org who are getting big-bucks TC (plus consulting on the side).

You also probably have to be OK with never being the star (like you hypothetically could someday be in a software company). Supporting actors should still get respect and get paid.

Find the right science situation, and you might have much more positive impact on the world than you could have in a software company, while also being happy and comfortable.

Some more quick off-the-cuff comments about this (sorry for run-ons, but I need to get back to my weekend)...

* RESPECT -- Whether or not the organization is university-affiliated, a lot of the researchers and administrators might have only worked in academia-like environments before. Academia is very hierarchical, software engineering might be considered commodity technician or support staff, and the high-status people almost certainly don't understand your discipline, though they might think they do. (They often think software is relatively easy grunt work, and that software people just have oversized egos, which has some truth to it, but not that much.)

(Some real-life instances of this I've heard of include: someone with no understanding overriding software engineering technical decisions, because a colleague from their academic caste made an offhand comment, and they assume an academic who hasn't even looked at the system knows more than an experienced practitioner developing it; not wanting to include people who made key software contributions as coauthors on a paper for a software system, but making sure professors who had near-zero involvement were included; scientists openly speaking of the software people as having commodity interchangeable skillsets, in a way they'd never speak about peers in their domain; getting an unsalvageable monstrosity of pasted-together incompatible frameworks and Stack Overflow posts done by a summer intern, dumped on a software engineer to "clean up" or "extend", and being unable to convince anyone that this is orders of magnitude harder to fix than to just make a viable system in the same time the intern took; in an academic environment, a grad student being higher status than key software people, and bossing them around with bad decisions, while treating their own obligations like homework they were trying to sneak past a grader rather than as a system that has to actually work.)

* COMPENSATION -- Related to the above. If you're very experienced and marketable in tech, and would be making key enabling contributions, are you getting paid like it?

(The most recent life sciences software engineering opportunity I talked with, with a high-profile organization, they needed FAANG-like Staff/Principal experience in multiple areas, all-in-one person, for key bespoke computational infrastructure on which a lot was riding. When we got to salary, it was capped at less than new grads were getting offered elsewhere, despite being in a top HCOLA city. The recruiter half-heartedly argued about it being for the science, etc. I said, if they're thinking of this as an academic non-profit, that would be OK, so long as everyone there is making this level of money. But that wasn't the case: the science domain people were considered the valuable assets, making good money, and software was seen as more a commodity support skill by whoever set the pay grade. Maybe within a decade that will agree with the market, everyone will decide that someone who can learn organic chemistry should get paid more than someone who doesn't seem to do much more than fingerpaint in a Web framework builder and type nonsense in Jira, :) and maybe then most software people will be thankful for any job at all, but not yet.)

(I did actually look at a science company with a strong software tech company influence. But, though they claimed to be rethinking how the tech company did things, they seemed to carbon-copy the single most obvious bad side of that company. Talking with colleagues after I withdrew my application, the gossip was that they were getting lots of software people who'd burnt out on the tech company. So I guess maybe the rethinking was on what had been bothering those people, who were already at the tech company, and so who weren't entirely representative of the talent pool that included people for whom the tech company had showstoppers.)

_dain_ · on Nov 19, 2022

your parenthetical paragraphs are bigger than your paragraph paragraphs

clusterhacks · on Nov 19, 2022

I work in this field at a large medical research institution. There is a significant amount of genomics analysis that occurs here on a day-to-day basis. The genomic processing pipeline work all falls directly into my group.

There is next to zero demand for tool development internally. I do it on the side of "normal" IT data management because I love high performance computing, algorithms, and multithreaded hackery. But even at my large, well-funded institution, there isn't a specific role where that is all that you do by design.

I do suck at marketing - meaning, despite having some success with big improvements in research tools that folks have definitely appreciated, no one comes to me asking for help with better engineering of genomic applications. Partly that is because many researchers may only know R, so they will default to whatever packages are already available in Bioconductor, install those, and throw the resulting mash-up for their current research effort onto the compute cluster and simply wait for hours or days for the jobs to finish.

PIs are often insulated from software engineering problems too - if work is completed before the next bi-weekly meeting and update session, well, it must be ok.

mfld · on Nov 20, 2022

Great post - which contains the answer to many of the questions raised in this thread. I am working in this area as well. There is "next to zero demand for tool development" because there are great open source tools. Only in rare cases (e.g. Illumina Dragen) does commercial software add significant value that the audience is willing to pay for.

sargstuff · on Nov 19, 2022

Ah, sounds so much like history of programming. At the 50's stage of straight up statistical manipulation.

DNA base units are not viewed as a base-4 number system that can be transformed into an abstract software language, where one can select the abstraction level to use. Much like musical notation is not viewed as a numeric system.

Although most software engineers don't view systems as numerical language development, either.
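To make the "bases as a base-4 number system" idea concrete, here is a minimal sketch that packs a DNA string into an integer using two bits per base. The names `encode`, `decode`, and the A/C/G/T digit assignment are my own assumptions for illustration, not any particular library's convention.

```python
# Hypothetical digit assignment: A=0, C=1, G=2, T=3 (an assumption, not a standard).
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}
DIGIT_TO_BASE = "ACGT"


def encode(seq: str) -> int:
    """Read a DNA string as a base-4 number, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_TO_DIGIT[base]
    return value


def decode(value: int, length: int) -> str:
    """Recover the DNA string; the length is needed because leading A's are zeros."""
    bases = []
    for _ in range(length):
        bases.append(DIGIT_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))
```

The length has to be stored separately, since "ACGT" and "AACGT" map to the same integer under this scheme.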

sargstuff · on Nov 19, 2022

difference in view between qualitative & quantitative usage; NP vs. P type problem(s).

anonymous_bio · on Nov 19, 2022

The author completely neglects the downsides:

- The compensation absolutely does not match the workload and education required.
- The sheer number of disreputable PIs and their unrealistic goals for software.
- The data is likely questionable and often underpowered.
- Institutional politics. Everywhere.
- Marketing ("Curing Cancer"). The role is actually just juggling various bioinformatics file formats.

cratermoon · on Nov 19, 2022

> just juggling various bioinformatics file formats

Your other points are spot on. This one I want to address specifically. The file formats. Academics love their incredibly over-engineered file formats. MARC. SGML. DICOM. HL7. RDF. Those are just the ones I know. Universally, they try to cover every corner case that anyone could ever imagine. Academics absolutely love their ontologies. Just implementing one of them is a nightmare. Going from one to another is an exercise in the philosophies of ontologies.

zmmmmm · on Nov 19, 2022

Actually I think genomics / bioinformatics is a counterpoint there. One of the things I like about the field is nearly every file format is under-engineered. It's TSV all the way down and if you need compression gzip it. If you need to index that, sort it (literally often with unix sort command) and block-gzip it. Anything more engineered arose specifically because the above failed and something more is actually needed.

The downside is it's a giant hellscape of unstructured, poorly specified formats where data types are barely specified at all or if they are most of the schema is published on some rambling blog post by some rando scientist. You will spend most of your time understanding it by empirical reverse engineering of the data that you are trying to deal with.
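A rough sketch of the "sort it, then compress it" workflow described above, using only the Python standard library. Assumptions on my part: the first column is a chromosome name, the second is an integer position, and the function name `sort_and_gzip_tsv` is made up. Plain gzip stands in for block-gzip here; real pipelines would typically use the htslib tools (bgzip, tabix) for the block-compressed, indexed version.

```python
import gzip


def sort_and_gzip_tsv(lines, out_path):
    """Sort tab-separated records by (chromosome, position) and gzip them.

    Assumes column 0 is a chromosome name and column 1 an integer position.
    """
    records = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    # Chromosomes sort lexicographically here (chr10 before chr2), like plain `sort`.
    records.sort(key=lambda rec: (rec[0], int(rec[1])))
    with gzip.open(out_path, "wt", encoding="utf-8") as handle:
        for rec in records:
            handle.write("\t".join(rec) + "\n")
    return records
```

Nothing in this checks types or column counts beyond the integer cast, which is roughly the point being made about the formats.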

cratermoon · on Nov 20, 2022

Oh, then eventually they'll get a committee together and after a few years they'll produce a unified file format that somehow manages to cover all the cases in the different existing formats (or at least the ones used by well-funded PIs) and is a hellscape of optional properties and required elements so poorly specified that it's impossible for any two implementations to communicate.

manv1 · on Nov 19, 2022

HL7 isn't technically an academic file format, it's an industry standard interchange format for health data.

DICOM is for radiology.

RDF and SGML, well, they're from the same era as XML, so yeah.

kweingar · on Nov 19, 2022

> Just implementing one of them is a nightmare. Going from one to another is an exercise in the philosophies of ontologies.

Good thing there are lots of competing implementations! It would be a shame if these files were actually portable.

the_only_law · on Nov 19, 2022

> The role is actually just juggling various bioinformatics file formats.

I need an advanced degree for that?

cratermoon · on Nov 19, 2022

You do in academia. Otherwise you might as well be washing dirty labware for all the respect you get.

tonto · on Nov 19, 2022

it is a meme that bioinformatics is just about converting different file formats but it's a shallow take

BeFlatXIII · on Nov 19, 2022

> - Marketing ("Curing Cancer")

Nothing like putting that boilerplate pablum on research grant proposals. Either that or something about green energy. Some PIs just want to play with ligands, man.

boppo1 · on Nov 19, 2022

hobofan · on Nov 19, 2022

Principal investigator: https://en.wikipedia.org/wiki/Principal_investigator

The person that runs a research lab, which at a university is usually a (tenured) Professor.

rleigh · on Nov 19, 2022

Principal Investigators. Basically, the academics who get the grant funding and run their own research groups.

photochemsyn · on Nov 19, 2022

Related HN discussion (May 2022) on similar article:

https://news.ycombinator.com/item?id=31577376

https://www.nature.com/articles/d41586-022-01516-2

> "Fundamentally, RSEs build software to support scientific research. They generally don’t have research questions of their own — they develop the computer tools to help other people to do cool things."

ninala · on Nov 19, 2022

This does not have to be true. You can certainly pursue interesting biology research questions informed by a software engineering POV.

AlbertCory · on Nov 19, 2022

20 years ago I got interested in "bioinformatics." I loved learning something about molecular biology, after all those years of hearing about DNA and not understanding it. And "Molecular Biology of the Cell" is, hands down, the greatest textbook ever written.

That said: a lot of the comments are spot on. You're working in a field where the hard scientists and business people rule and you're a helper. Maybe they're grateful for your help OR maybe they regard you as an overpaid lab assistant. After all, they have PhD's and postdocs, and you don't.

I've never actually worked in that field. I'd guess that it might be very satisfying, despite the low pay. Or not.

tifik · on Nov 19, 2022

I have worked in a genomics lab after finishing a bioinformatics master's.

It was my first full-time job, and by far the most chill. People were great. The PI was laid back, the whole lab went out for beers every now and then - and not because of a mandatory startup-style 'bonding' event. We genuinely enjoyed each other's company and hung out outside of work. I never had that in any other job, which were/are all commercial operations.

The vibe and the power structure felt very different. More level. There weren't any purely managerial roles, everyone was doing at least a bit of 'science'. And even junior ICs like me got to coach undergrads every now and then. Most of the operational budget comes from grants, on which you have to deliver. The pay is not amazing, so most employees really are in it for the science.

Or I was still young and naive and was lucky that both layers of management were nice people.

Ultimately I left, as the grant money couldn't keep up with offers I was getting.

It is still the job I am most proud of. I love talking about it, and it really sucks that even a well funded lab can't really afford market engineering rates.

jonnycomputer · on Nov 19, 2022

I'm in a role that is very similar (different field though). However, I know enough about academia to know that a lot depends on the culture the PI fosters. Also, I spend a lot of time learning the field.

rafiki6 · on Nov 19, 2022

I wish more fields would just start adopting the product/engineer partnership that Software companies have perfected. Engineers are very good at what they do. Product people are very good at what they do. They need each other to build things. Sure, engineers might know enough about product to get by and product people might know enough about coding to get by, but the reason it works is because each one is an expert in what they do and are equal.

It's no different in finance, healthcare, genomics etc. I'd love to work in a setting where I'm paired with an SME product manager in a domain I have no clue about and they respect my work and I respect theirs and we are partners.

This is one of the biggest factors that made software/internet companies explode. They respected people who build software. They didn't need to. A bunch of MBAs could have easily just decided that the best way to run the company was to treat the people building the product as a cost center. Many did. I think that's probably one of the reasons for the lack of innovation and downfall in many old tech companies like HP/IBM.

The ones that treated SWEs properly and valued them accordingly, did very well.

jonathanyc · on Nov 19, 2022

I have heard from a friend who's a doctor that in hospitals there's a very adversarial relationship between doctors and MBAs. The MBAs see the doctors as a cost center, and the doctors resent people without MDs being above them.

Your comment reminds me to be thankful that at many software companies engineering, product, and design do respect each other as equal partners. I totally agree that to do otherwise is business suicide.

sargstuff · on Nov 19, 2022

two very opposing philosophies:

MD's -> patient interest comes first

MBA's -> company interest comes first

rossdavidh · on Nov 19, 2022

Having worked (as a consultant/contractor) for a few businesses in the field, I can say that my experience was closer to "grateful for your help" than to "an overpaid lab assistant". I even recall once, in a meeting, being referred to (by a senior staff scientist with a Ph.D.) as "the technical guy", causing me to wonder at how someone who does gene sequencing thinks of programming as being more technical.

But, YMMV.

blep_ · on Nov 19, 2022

> causing me to wonder at how someone who does gene sequencing thinks of programming as being more technical

Everything you don't understand looks complicated from the outside.

bluejellybean · on Nov 19, 2022

Molecular Biology of the Cell got me extremely excited about genetics and bioinformatics, highly, highly recommend this book to any software person I meet who is interested in biology.

As to the work environment, it seems to be extremely varied depending on the lab and team you're on. I came from a number of years doing web development in marketing and finance before joining an R1 university research lab, and in many ways the day-to-day is quite similar in both fields. You are not the 'go-to' person for most things, but with that said, even as an individual contributor I feel my voice is heard on technical decisions where appropriate. As for pay, it's the biggest aspect that will make me leave at some point. If you do not have a PhD, or even a degree in my case, you can't expect to get paid a lot. As to the speculation on the satisfaction of the work, it is indeed deeply satisfying!

I got to have a conversation with one of the hero donors that gave a kidney biopsy after a life-saving transplant. It's hard to overstate just how impactful your work feels when talking to someone like that. Even as a small cog in the larger machine (our lab is around 50 strong with many people being at the top of their sub-fields), the end results of the effort will be massive improvements in individuals' quality of life. This alone makes it quite easy to get out of bed in the morning.

harles · on Nov 19, 2022

Any particular edition of Molecular Biology of the Cell you’d recommend? I just looked up the 7th edition on Amazon (seems like the latest) and it’s $300 USD. Oof.

rleigh · on Nov 19, 2022

I've still got my 3rd edition copy (from 1999 when I was an undergrad molecular biologist). Most of the basic biochemistry and molecular biology will be exactly the same--it hasn't changed much if at all. While there have been lots of additional details added over the last two decades, the fundamentals are unchanged for the most part.

This wouldn't apply to other fields such as Immunology (Janeway's Immunobiology) where I have purchased multiple copies over the years due to the field changing so fast.

AlbertCory · on Nov 19, 2022

I'm on #3.

An awful lot has changed since 2000. RNA is now a Thing, where it was just a poor stepchild before. Protein folding, of course.

But yeah. The pictures are shining examples of what a scientific diagram can be.

somedudetbh · on Nov 19, 2022

Go on eBay and buy the "international" edition

wwweston · on Nov 19, 2022

What level of chemistry do you need to know in order to benefit from reading the text?

clmcleod · on Nov 19, 2022

> That said: a lot of the comments are spot on. You're working in a field where the hard scientists and business people rule and you're a helper

This definitely was the culture when I started working in the field 6 years ago. However, the culture has shifted (at least where I work) to where biologists and engineers are equal partners that work together on solving these problems. For those organizations that are not this way, I think they’re going to have to change if they want to innovate.

jghn · on Nov 19, 2022

Agreed. Huge change over the last 10-15 years. My first job in the space had a view that obviously a mere software developer wouldn't be paid more than even a postdoc scientist. And as postdocs weren't paid all that well, you see where this is going.

These days more biotech companies are computationally/software focused. They understand that to pull in strong talent they're not operating in the same academic science world.

pclmulqdq · on Nov 19, 2022

That may be the case for engineers with PhDs and scientific credentials, but I'm not so sure that is true of normal developers who did not play the academic game. I'm not going to take a job based on the eventuality of a culture shift, and I don't think you should either.

This isn't just genomics, by the way. Scientific computing folks are very similar.

AlbertCory · on Nov 19, 2022

That's always been my impression, but it does sound like "software eats the world" has had some effect. At least in some places.

Looking at it from their point of view: CS people tend to think that "everything is just information, and now that we're here you're all going to be working for us."

You can see why a PhD in mol bio would resent that. Everything is not just information.

x0x0 · on Nov 19, 2022

I worked in the field. Leaving to work on ads immediately tripled my salary, and gave me more room to grow.

Everyone who says you're the hired help and treated about as well as a secretary that the organization dislikes is dead on. At best, you're viewed as an overpaid cost center.

Which is sad, because I'd love to work in these areas... but I'm not giving up 66% to 75% of my income as charity to private corporations.

boppo1 · on Nov 19, 2022

>Molecular Biology of the Cell

Tangential, but what are the chemistry prereqs to grasp this book?

timr · on Nov 19, 2022

MBC is readable by someone with an undergraduate background in science. You'd probably want basic knowledge of biology, general chemistry and organic chemistry.

It's essentially an upper-level undergraduate textbook.

wenderful · on Nov 19, 2022

Fwiw, I studied humanities, do a lot of pop science reading in my spare time, and I'm able to appreciate it. There are more detailed and technical sections I skim or skip, but the overviews are fantastic. Incredible description of, for example, the sheer wonder we should all experience at the fact that all life starts as a single cell.

ninala · on Nov 19, 2022

The chemistry requirements are minimal. You should understand the difference between ionic and covalent bonds, how van der Waals forces work, hydrophobicity, solubility, and the effects of catalysts on reaction transition states. It will also be important to understand what reaction kinetics are and what pH means. An understanding of buffers might be useful.

I would argue that, to understand the book, you specifically don't need to know electrochemistry, organic chemistry, analytical chemistry, organometallics, spectroscopy, or even physical chemistry.

tranzudao · on Nov 19, 2022

Probably just a college level gen chem class. Pretty accessible, albeit technical, textbook from what I remember of reading it for a course a few years ago.

AlbertCory · on Nov 19, 2022

I did get a book or two out of the library, plus I had Chem 101 in college, but really, not very much.

olalonde · on Nov 19, 2022

Is it feasible to do any meaningful work in this field without joining a team? (e.g. as a solo hobbyist/entrepreneur)

boldlybold · on Nov 19, 2022

If you have a software background and can get some basic domain knowledge, there's lots of open source projects that could use your contribution.

Doing fundamental research is a taller order. But lots of software, tools, pipelines etc need maintainers, optimizations...

quest88 · on Nov 19, 2022

Which projects? That seems like a good place to start.

boldlybold · on Nov 19, 2022

I contribute to Nextflow core (https://nf-co.re/) It's more of a collection of pipelines than traditional software, but there are users all around the world and a good community.

Most of the packages on bioconda (https://bioconda.github.io/) are open source. But you probably want to find a sub-field that interests you most before finding a project.

In grad school, we also had an ex-google software engineer volunteer with us one day a week. It was very impactful for many members of the lab to learn good engineering practices, and it wasn't at all like the sentiment others in this thread are expressing where engineers were "janitors".

Koncopd · on Nov 20, 2022

https://github.com/scverse But this is mostly about transcriptomics (RNA), not genomics.

attractivechaos · on Nov 19, 2022

Difficult but possible. For example, Robert Edgar [1] works alone and is one of the most productive developers in this field.

[1] http://drive5.com

dekhn · on Nov 19, 2022

I worked with Bob some ~20 years ago at Berkeley. He showed up one day to check out the seminars and see if he could "help out" after having sold his database company to Intel. He said he'd been trained as a physics guy in the 80s but there were no real jobs so he started a software company instead. He joined my advisor's group (it helped a lot, because at the time most journals wouldn't publish a paper submitted from a home address).

He proceeded to completely understand hidden Markov models and protein sequence alignment and was immediately hacking improvements to HMMER. However, Sean Eddy couldn't understand his optimizations (Sean has to know how HMMER works at all times) and so Bob went off and made his own tools like MUSCLE.

One of the reasons he can do this is, well, he's a programmer/math genius, and the other reason is that HMMs and protein alignments are a fairly well understood and programmable thing these days.

Still blows me away we train up all these people to be scientists when there are no jobs for them in that role.

njbooher · on Nov 19, 2022

I don't work in this space anymore, but just want to say kseq (and the rest of klib) is such an awesome time saver. Thank you.

samtho · on Nov 19, 2022

I’ve had a growing interest in the power of DNA and what the data can be used for since discovering no less than 3 family secrets (one of which pertaining to me) after taking an Ancestry DNA test. Did I know I was going to find 18 half siblings the moment my results came in? Nope, but yet there they are, listed in order of most shared DNA.

Despite my interest, I’ve found that landing a job in this field at my desired compensation level is very difficult, especially if you do not have the “correct” academic background. Who does a double degree in computer science and forensic genealogy? I’m sure some people do, but for $75k/yr you’d think the companies would at least adjust their expectations.

foooobaba · on Nov 19, 2022

Yeah, I agree, I looked into this before, and the pay doesn’t come close to other swe jobs, as far as I have seen whenever I look. It is usually like 2x less, it’s hard to want to choose that just to work on something a bit more interesting. I even have a background in bioinformatics, but I never found anything that compensates it as much as pure swe roles.

the_only_law · on Nov 19, 2022

Yeah the low pay is one thing, but in my experience a lot of the academic jobs seem to want a domain scientist who can do programming, not the other way around.

ninala · on Nov 19, 2022

Not always true - but finding a very good programmer who knows the domain well enough to make a significant impact is challenging.

cuttothechase · on Nov 19, 2022

I've been a software engineer in this space. I just want to say that there is exactly 1 (non-intern) job between Microsoft, Google and Amazon listed according to the search links provided in the article.

rhn_mk1 · on Nov 19, 2022

What is the significance of that observation?

the_jeremy · on Nov 19, 2022

> Often, it's not required to know the domain before you join a group, and they will teach you on the job.

I looked. There are zero full-time, remote roles that don't require previous genomics experience at any of the companies listed.

clmcleod · on Nov 19, 2022

I know there are at least a few, because positions on my team offer remote and don't require any previous genomics experience.

CoastalCoder · on Nov 19, 2022

Perhaps you and the GP could compare notes about where they searched and where you advertise.

the_jeremy · on Nov 20, 2022

https://talent.stjude.org/careers/jobs?keywords=software%20e... , i.e., taking your search and entering my zip code, shows only 3 positions at St. Jude, none of which mention the word genomics at all.

debacle · on Nov 19, 2022

Science programming jobs suck. You get all the bad parts of academia, including less money, plus you're seen as a janitor rather than an engineer, and you get to deal with scientists all day.

Tooling roles in SWE in every other field are highly regarded. Why not here?

tgv · on Nov 19, 2022

Because it's not what sells. It's literally a tool, and if you don't deliver the level of perfection they're used to getting from sequencing, NMR or assay testing machines, you're the PITA. You really have to bring something very interesting to the table to earn some status, and software engineering just doesn't. It's too far from the core business. Think of the attitude SWEs have towards sales people...

firstplacelast · on Nov 19, 2022

The thing is many will pay big bucks to contractors/consultants/IT services/LIMS systems, but if you’re an employee, nope.

They have a hard time having someone with a BS or MS making 50-75k more than a freshly-minted PhD.

I just left a job in pharma because I cannot do it anymore (salary being a big one, but my experiences reflect many in this thread).

They spent 500k on a consulting company to build a few NGS processing pipelines. These were built using a framework I was unfamiliar with. I refactored one of them and was able to improve its runtime by 60% in a couple of weeks. I was paid in the low 100’s.

They would rather contract out the high-paid work and pay orders of magnitude more for it.

convolvatron · on Nov 19, 2022

it depends. I've been in science support roles where people are genuinely grateful for the help, and it's _really_ interesting to get to peek in on people's research.

it depends on the role. it worked out really well for me when I got to drop in and do piece work on lots of projects in different fields. working on a larger software development project can be really painful and demoralizing because the people running it don't really understand how the sausage gets made.

sargstuff · on Nov 19, 2022

Well, the math / computational power needed for simple, static protein modeling is horrendous.

metalforever · on Nov 19, 2022

Look, I did this at multiple places for a number of years. The issue is that you often form an adversarial relationship with the scientists. They don't really want you there. They are perfectly happy just organizing everything by hand with post its and excel spreadsheets. They do not want you to mess with their flow with your software, even if it would help them to be more efficient.

boppo1 · on Nov 19, 2022

Can you elaborate with some anecdotes? Why is their current workflow wrong? Why would an organization hire someone to build software if they can achieve goals with spreadsheets?

maximus-decimus · on Nov 19, 2022

The fact they renamed human genes because they were importing them into Excel, which thought they were dates and changed them, says a lot.

Can you really trust the scientific results if they depend on software made by people who don't care about code quality?

boppo1 · on Nov 19, 2022

>renamed human genes because they were importing it in Excel in a way it thought they were dates

Just at your shop, or the field in general?

jghn · on Nov 19, 2022

Field in general [1]

[1] https://www.theverge.com/2020/8/6/21355674/human-genes-renam...
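
For anyone who wants to check a gene list before it goes anywhere near a spreadsheet, here is a minimal sketch of a pre-import check. It assumes the risky symbols are the ones that look like a month name or abbreviation followed by one or two digits (for example `SEPT2` or `MARCH1`); the prefix list and the function names `is_date_like` and `flag_risky_symbols` are illustrative assumptions, not a description of Excel's actual parsing rules.

```python
import re

# Month-like prefixes that a spreadsheet may read as dates when followed by digits.
# Assumption: this list is illustrative, not an exhaustive model of Excel's behavior.
_MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "MAY", "JUN",
    "JUL", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month-like prefix, an optional hyphen, then one or two digits (e.g. SEPT2, MARCH1, DEC-1).
_DATE_LIKE = re.compile(
    r"^(?:%s)-?\d{1,2}$" % "|".join(_MONTH_PREFIXES), re.IGNORECASE
)


def is_date_like(symbol: str) -> bool:
    """Return True if the gene symbol looks like something a spreadsheet could turn into a date."""
    return bool(_DATE_LIKE.match(symbol.strip()))


def flag_risky_symbols(symbols):
    """Return the subset of symbols that should be protected (e.g. imported as text)."""
    return [s for s in symbols if is_date_like(s)]
```

Running `flag_risky_symbols` over a column of symbols before export gives a list of entries that should be imported as plain text rather than left to auto-detection.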

rleigh · on Nov 19, 2022

The whole field! This was actually a thing(!)

myaccount9786 · on Nov 19, 2022

Why indeed.

lowbloodsugar · on Nov 19, 2022

Even large corporations in this space pay relatively little for software engineers, and treat them with little importance.

I also experienced "software engineers" who had no idea what they were doing being given more credence because they had a PhD in some bio-related field. Oh, you got a PhD in some molecular aspect of some tiny piece of biology, and that makes you qualified to build big data systems? It did not. Apparently what that gives you is an adherence to reading decades old textbooks about database design. It was like working with a first year software engineering undergrad from twenty years ago.

To be fair, it looks like the same can be said for machine learning. Many software engineers I know are in the "machine learning space", but report that they are just operations support for data scientists, and don't actually get to learn about, let alone be involved with creating, the models they support.

If you are a software engineer, work in a software company, where engineering is the value proposition.

fxtentacle · on Nov 19, 2022

Google already axed all job offers, Microsoft and AWS are searching student interns...

I used to work in genomics and computational biology. It was incredibly interesting. But it's university research and gets paid as such. 2-year time-limited contracts, lots of interns and students, extremely low salaries.

cowsandmilk · on Nov 19, 2022

The AWS jobs aren’t even related to genomics. They just have genomics in the description of types of workloads performed by customers of AWS. The jobs are hard core CS automated reasoning jobs.

jesse__ · on Nov 19, 2022

Shameless self-promotion incoming.

I'm interested in contributing to this field. I have significant experience in 3D graphics, game engines, compilers and language runtimes. I'm a competent low-level engineer.

There are a lot of red flags in this thread about adverse working conditions, but I'm running under the assumption there are a handful of companies out there that work with a software-minded approach, i.e. respect SWEs for who they are and what they do. If you represent one such company, and are looking for engineers who have a keen eye for performance and architecture, I'd love to hear from you.

jesse@scallywag.software

https://scallywag.software/resume.html

EDIT: Largely interested in remote roles, but could relocate for the right offer.

spacemadness · on Nov 19, 2022

Sorry, no. This was my dream area to work in and I obsessed over degree programs in bioinformatics many years ago. Then I realized it’s incredibly low paying for the work, finding work in the area was a chore, and a masters might not even get your foot in the door. Nothing communicated that you would be valued. The harsh reality of the world won out in the end.

arnaudsm · on Nov 19, 2022

I hope this FAANG downturn will push software engineers to new industries, and bring some cross-pollination.

What happens when the world's most brilliant minds do something other than making us click on more ads?

zach_garwood · on Nov 19, 2022

If academics embraced software and software developers as heartily as advertisers did, you'd see that result. Until they do, I expect you'll continue to see a bunch of skepticism from developers.

clmcleod · on Nov 19, 2022

Exactly my thoughts, and what I hope this post makes some consider.

pengwing · on Nov 19, 2022

Can you provide a list of the top problems in that space? Much rather try to understand them deeply myself and build a company solving them than just getting a job.

whitepaint · on Nov 19, 2022

This please. I would love to start working on (or create from scratch) some software that helps people in that field.

331c8c71 · on Nov 19, 2022

Creating pipelines is still a problem. Typically one needs to call a bunch of other tools in order to get to the final result. There could be map/reduce behavior in the middle where chunks of data are processed in parallel in order to gain speed. And you need some kind of data management/tracking as well (putting samples in groups, ingesting raw data, exporting results). And sane monitoring especially if something breaks/fails.

There are probably 100s of tools written for this but no clear winner so far. The traditional software engineering approaches like git, ci/cd seem too heavyweight (or rather too low-level) especially during development. IMHO there could be space for a fully remote/cloud solution where one would code/debug/deploy from the browser optimized for writing/maintaining pipelines.
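
The map/reduce shape described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: the "tool" is a stand-in function that counts bases, the names `split_into_chunks`, `count_bases`, `merge_counts` and `run_pipeline` are hypothetical, and a thread pool is used for simplicity where a real pipeline would more likely dispatch external tools to processes or cluster jobs and add the tracking and monitoring mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Scatter step: cut the input into fixed-size chunks."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_bases(chunk):
    """Map step: a stand-in for a real tool; counts each base in one chunk of reads."""
    counts = {}
    for read in chunk:
        for base in read:
            counts[base] = counts.get(base, 0) + 1
    return counts


def merge_counts(partials):
    """Reduce step: combine the per-chunk results into one summary."""
    total = {}
    for partial in partials:
        for base, n in partial.items():
            total[base] = total.get(base, 0) + n
    return total


def run_pipeline(reads, chunk_size=2, workers=4):
    """Scatter the reads, process chunks in parallel, then gather the results."""
    chunks = split_into_chunks(reads, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, which makes failures easier to trace to a chunk.
        partials = list(pool.map(count_bases, chunks))
    return merge_counts(partials)
```

The hard parts in practice are everything this sketch leaves out: retries, provenance of each chunk, and knowing which sample a failed step belonged to.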

kzuberi · on Nov 19, 2022

I also found the quality & proliferation of data pipeline tools to be baffling. Somehow always more painful to put these together than it seemed like it ought to be.

At one point we wrote an internal tool (I think lots of organizations do this, since all the 100s of existing tools somehow don't fit, so you invent #101) and while it was tremendously satisfying getting batch jobs with 1000's of cpu's churning away, that kind of data infrastructure needs to be standardized. I think some companies are doing this, e.g. saw a presentation about Arvados/Curii that seemed interesting (but haven't used it so not sure). Maybe CWL will turn out to be the way forward here?

jinto36 · on Nov 19, 2022

Protein structure prediction was a huge deal, which is why AlphaFold received so much fanfare. It is actually pretty good. The next step is to predict where multi-protein complexes would interact, which is not as simple as predicting the structure of two proteins independently and then trying to fit them together like a puzzle, because the interactions can also change the structure. While it's not as hard as it used to be to experimentally determine protein targets of, for example, a protein kinase, it's still not an arbitrary or cheap experiment, and to do that for the many thousands of such proteins, across different conditions (stress, presence of co-factors, etc) and in different organisms would be rather a lot of work. Something like AlphaFold that makes reasonable predictions and can be used to help you focus on what's most likely to be relevant to your disease or process of interest helps quite a bit.

There's also more need for integrating "multi-omics" data, where you have data from multiple assays (gene expression, phospho-proteomics, lipidomics, epigenetics, small RNA expression, etc etc) with the goal of somehow combining all these different assay results from various levels of gene regulation, to get closer to figuring out actual mechanism for complex processes. Building on that, we can also do single-cell multi-omics to some extent- where you have results from different sequencing-based assays on the level of the same individual cell. This is still pretty limited, but it's exciting and advancing pretty quickly. This will eventually be combined with things like spatial transcriptomics, which is useful for mapping out what's going on in heterogeneous tissue samples like tumors, for example, so we'll end up with spatial single-cell multi-omics, at which point you're looking at 1) some quantitative trait for multiple genes/loci/molecules, and often 10k+ of such features at the same time per assay, 2) multiple assays, such as DNA accessibility and gene expression, in 3) single-cells, of which you might have 10k of in a single sample, 4) across a physical tissue sample where individual cells are spatially mapped, and where you probably want to figure out how cells might influence the state of those around them, and 5) in multiple different samples, where you might want to compare disease vs control, or look for correlation to heterogeneity of results within one group.

There's a lot of public data already available for single-cell gene expression projects if you want to get a feel for how these things are structured and how (passable but not amazing) the existing tooling is. One of the main repositories for this data is the NCBI's SRA https://www.ncbi.nlm.nih.gov/sra but you'll quickly note that searching and browsing is not as easy as you might think it would be, because one of the main limiting factors in bioinformatics is how bad everyone is at keeping terminology consistent. For many bioinformaticians, a majority of time is spent in the data cleaning phase. It's awful. Sometimes the experimental parameters make it into SRA or GEO, but sometimes you have to read through the associated paper to pull that out. Often it's only large consortium projects like The Cancer Genome Atlas (TCGA) or the Genotype-Tissue Expression project (GTEx), which have enough funding for staff dedicated to data management, that end up publishing datasets that are easy to "consume" without having to jump through a whole bunch of hurdles to figure out how the data was produced.

I have a BS/MS in bioinformatics and I'm presently a PhD candidate in genetics and computational biology defending in February.

pengwing · on Nov 21, 2022

So if I understood you correctly then further lowering the cost of experimentally determining protein targets could be a viable way forward that is completely orthogonal to computational methods?

whage · on Nov 19, 2022

I'd like to hear about this too!

YouWhy · on Nov 19, 2022

I am a career SW engineer that has worked on genomics in a startup. The field is genuinely exciting.

The endemic disease of the field is the leadership. A leadership made up of Principal Investigators forged in academia appears simply incapable of producing anything that is not an article (or an equivalent thereof).

ninala · on Nov 19, 2022

Do you think that's true of pharmaceuticals/biotechs as well? Or just academia?

tejtm · on Nov 19, 2022

Decades ago my very, very bright HCI prof commissioned a psych study for a database we were building for some biologists next door, you know, so we could better address their needs in ways that would be useful to them. The details are pretty fuzzy now, but its predictions proved correct many times over.

Things the study said would not work never worked, e.g. biologists wanted "temporarily" private data, say until published; as the study predicted, they would never freely share it.

but the biggest thing I will try to paraphrase:

Biology is an observational science; the work is in interpreting, which lends itself to group dynamics and politics, leaders, etc.

Which is at odds with Math/CS which is constructive where if something can be proved then that is that.

So when a CS person states a fact from their perspective a biologist might see it as just another opinion subject to hierarchical ranking.

So I would argue it is a function of the individual's proclivities and correlated training in the cultural environment they end up in.

So a healthy work environment could value both fact and opinion where each has a complementary role whether academic or industry.

But as a longtime academic, I am now sadly looking towards industry.

YouWhy · on Nov 20, 2022

The people staffing senior leadership in pharma/biotech are typically either former PIs from the academia, or people who could have been PIs but chose to go straight for the industry.

They have more cash to play with, but their leadership fails in the same pattern.

foobiekr · on Nov 19, 2022

My first job offer out of college for compsci was for a genomics research company that desperately needed software engineers. At the time they were storing sequences as ATGC strings in an oracle database using perl scripts. It was really below even undergraduate-level basic stuff.
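
To make the "ATGC strings" point concrete: each base carries only two bits of information, so one byte per character is four times larger than it needs to be. Below is a minimal sketch of 2-bit packing. It is an illustration of the idea, not what that company did or should have done; the function names `pack` and `unpack` are hypothetical, and it assumes the sequence contains only A, C, G and T (no N or other ambiguity codes).

```python
# Two bits per base. Assumption: only unambiguous bases appear in the input.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASE = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte, first base in the high bits."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for j, base in enumerate(seq[i:i + 4]):
            byte |= _CODE[base] << (6 - 2 * j)  # raises KeyError on anything but ACGT
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the original string; `length` is needed because the last byte may be padded."""
    bases = []
    for byte in data:
        for j in range(4):
            bases.append(_BASE[(byte >> (6 - 2 * j)) & 0b11])
    return "".join(bases[:length])
```

The sequence length has to be stored alongside the packed bytes, since the final byte is zero-padded and zero also encodes `A`.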

The offer was $38k a year. About two days later, I got my second offer, $50k from a game company, and then a real offer, $60k, which I took. This was in the late 1990s.

That was 20+ years ago, of course, but I sort of wonder if things have changed. I frankly think a lot of SWE work for fundamentally evil, socially destructive companies, and I honestly don't think you have to to earn a good living, but you also don't have to work for companies that deliriously underpay you.

j7ake · on Nov 19, 2022

Genomics is still predominantly a research field. In research, software development and hence software engineers are not valued much, because technologies change rapidly, new ideas come every day, so it is about being able to hack together a workable solution enough to write a paper or get funding.

Software development becomes important when certain data processing methods have been standardised, eg mapping sequencing data to mouse or human genome, differential expression analysis, pca visualisations.

faizshah · on Nov 19, 2022

This is very true and I loved working with bioinformaticians but the pay is so much lower than a normal SWE role which is why SWEs will pick tech over genomics companies.

asciimov · on Nov 19, 2022

Not quite a decade ago, I took some work for a lab to replace some aging software (circa 1990) used to do peptide synthesis.

It was an enlightening experience. While I was the programming expert with a CS degree, I wasn't trusted for anything, because I didn't have a PhD or a background in bioinformatics. However, I did get to work with lots of smart people, and I fixed and improved the code and processes that the PhD-level statisticians and bioinformaticians used.

It is a real joy to work in hard science, with brilliant people who love their work. I learned a ton and gained a healthy respect for the people that do this kind of work.

However, the downsides are pretty bad. Pay and compensation are awful. Most people, myself included, could have made as good if not better pay waiting tables. There end up being different levels of people: administrators, principal investigators, and lab workers (peons). Unless you are an admin or a high-level PI you're not gonna be getting much money.

Everybody lives and dies by the grant. If funding dries up, you will be out of a job.

Ethics. Us CS people are woefully undereducated on ethics. You will find yourself asking why we can't simply do something; often the answer will be ethics.

Regulations, like ethics, you will have to bend to regulations. It's not a bad thing, just a different thing.

Unless you find yourself in an admin role, you will just be another lab peon. It's not a bad place to be, but you will never be at the top of the totem pole.

Loads and loads of ego. You will work with very smart and sometimes unreasonable people. Learning to navigate this with tact is important.

danking00 · on Nov 19, 2022

I don't have any funding to hire right now, but I'm always happy to chat about the industry and my experience building Hail (https://hail.is, https://github.com/hail-is/hail), a tool widely used by folks with large collections of human sequences.

The other posters are not wrong about compensation. Total compensation is off by a factor of two to three.

However, it is absolutely possible to work with a group of top-notch engineers on serious distributed systems & compilers in service of an excellent scientific-user experience. I know because I do. We are lucky to have a PI who respects and hires a diversity of expertise within his lab.

I enjoy being deeply embedded with our users. I do not have to guess what they need or want because I help them do it every day.

I also enjoy enmeshing engineering with statistics, mathematics, and biology. Work is more interesting when so many disciplines conspire towards the end of improved human health.

UncleOxidant · on Nov 19, 2022

Yes, Genomics may be important, but are there really that many jobs for software developers? (same could be said for many other important fields - I recently saw an article about how software engineers should move to green energy - but who is going to pay them?)

jghn · on Nov 19, 2022

There are a lot more positions for people with advanced mathematics and/or science backgrounds with strong programming chops than there are typical software positions. But they do exist.

jugg1es · on Nov 19, 2022

I have a BS in neurobiology but have been working in software for 20 years. I'd always wanted to get into a more biology-focused software role after interning at NIDA (NIH) and seeing how bad the software support was. I spent most of my internship developing software to make it easier to digitize the dozens of giant drawers full of index cards where they recorded all their raw data.

The problem is that the organizations involved in this sort of work often still consider software development as a cost center and therefore do not offer competitive salaries.

dottedmag · on Nov 19, 2022

This field does not _need_ software engineers.

This field needs marketing, product and project managers (for-profit or non-profit variety) that could figure out:

1. what product to build to have the biggest impact

2. how to build it.

Once 1. and 2. are clear it will be equally clear that if you have a bunch of scientists you won't get a great product, as nobody will build the product; everyone will build a prototype.

So then it will follow that the project needs to hire (=attract) software engineers to be in charge of software, and attracting software engineers means giving them competitive compensation.

amrx101 · on Nov 19, 2022

Would love to, but I think academia will want a master's at least, and years of industry experience will be discarded completely. I have 6 years of experience in data-intensive IoT applications and yet that would not be considered useful by academia.

Cupertino95014 · on Nov 19, 2022

The bio & pharma & medical fields value academic credentials very, very highly. Too highly.

That's their whole life: "where did you do your PhD? Who did you do your postdoc under?"

Many world-class hackers would do pretty poorly on those questions.

Marsymars · on Nov 19, 2022

I left academia for that reason; there was no advancement path that didn’t involve more advanced degrees, and that wasn’t something I was interested in at the time.

wesleywt · on Nov 19, 2022

The code is bad because transient PhDs and postdocs are writing it. If there were money in it, the best software developers would already be working on it. Sadly there is none.

zmmmmm · on Nov 19, 2022

yep ....

One of the borderline fraudulent aspects of the field is the pretense that method publications are real software.

That is, you come up with a breakthrough statistical or algorithmic method, you get it to run exactly once based on whatever random walk of exploratory code got you to a result that looks better than competing/prior methods, and then you dump your workspace into a script and put it on Github and pretend this is something anybody else could or should responsibly use in your Tier 1 publication. The minute the publication is approved there is zero benefit to the authors in maintaining the software, and in fact it's better if nobody can run it because that way they can't disprove your results. Then naturally nobody can get this to work afterwards and 50% of software engineering time and effort is spent trying to run code that can't and never will work outside the context it was created in, but you have to try because this is now the accepted best practice method of doing X or Y based on its publication.

The bigger problem is that this whole cycle actually shapes the view of software engineering by academics to the point where they really do think that most software engineering is a waste of time. A small number of 10x engineers manage to prosper in the environment, but it's mainly because they have the sheer technical capability to deal with ALL of that while still doing something useful, and it actually makes the problem worse because the academics then see that as the baseline for software engineering capability.

runeblaze · on Nov 20, 2022

Yes just to second this -- every time I wrote decent code for my bioinformatics software I regret it because my PI does not really care.

Sometimes I really don't understand. Much of the field's code does not even have testing, and it is baffling for me to think how the results are believed to be correct in the first place if there is no rigorous testing.

mherdeg · on Nov 19, 2022

What is the opportunity here -- writing new algorithms, implementing them accurately, optimizing them for special execution architectures, or just building more usable tools?

I remember Manolis Kellis sprinkled some pretty interesting genomic questions into his Algorithm class's problem sets. There were a number of cool problems about optimally aligning strings, searching within text, etc.
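
For readers who haven't seen the string-alignment problems mentioned here, this is a minimal sketch of the classic dynamic-programming approach to global alignment (Needleman-Wunsch), computing only the optimal score. The scoring values (match +1, mismatch -1, gap -1) and the function name `global_alignment_score` are assumptions chosen for illustration, not anything taken from that class.

```python
def global_alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best global alignment score of a and b with a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

Keeping only two rows makes memory linear in the length of `b`; recovering the alignment itself would need the full table or a divide-and-conquer variant.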

This was like 15 years ago and I haven't kept up with the discipline at all. But is there still algorithmic low hanging fruit?

I do keep reading about an ongoing series of problems with Microsoft Excel distorting analysis in the scientific literature (https://www.nature.com/articles/d41586-021-02211-4) and wondering if the tooling is having trouble..?

jltsiren · on Nov 19, 2022

> But is there still algorithmic low hanging fruit?

Algorithmic bioinformatics has become a separate research field, because there is so much low-hanging fruit. Biotech companies create new instruments producing new kinds of data, researchers find new uses for the data, and new algorithmic problems emerge all the time. There is also a steady migration of people from theoretical computer science to bioinformatics, because it's often easier to get research funding for something bioinformatics-related than for pure CS.

331c8c71 · on Nov 19, 2022

> But is there still algorithmic low hanging fruit?

I would say no unless looking at the frontiers of what is done in the wet lab which might require new analytical tools. But this stuff is probably much easier for and much better aligned with someone doing CS in academia.

My impression is that there is quite some space for ML-based approaches including DL. But even there I would not call it low-hanging.

boldlybold · on Nov 19, 2022

We're only starting to see the age of genomics accelerated by GPUs. I think it's still early if you have the technical background.

331c8c71 · on Nov 19, 2022

Edico developed an FPGA-based processing solution for common bioinformatics processing tasks (e.g. DNA/RNA mapping, variant calling) and the company was bought by Illumina.

The product (Dragen) has been around for a few years and now will be integrated in the new generation of sequencers. Extremely impressive technology and a better fit for the niche compared to GPU-based solutions I have seen. More downstream processing and analytics is sometimes closer to traditional ML and naturally there are lots of GPU-based algos.

boldlybold · on Nov 19, 2022

I'm more excited about NVIDIA's acquisition of Parabricks and the version 4.0 of the software that makes it free to use, than I am about DRAGEN. At the very least it's good to have some competition in the space, Illumina's stuff is always SO expensive. We'll have to see what hardware will win in the end.

331c8c71 · on Nov 20, 2022

I have tried both and dragen was more polished and also faster (that depends on the GPU for parabricks of course). Also more features and they keep adding them.

Agreed that competition is good to have. There is also Sentieon and similar solutions which run on common hardware but are optimized.

Speaking of costs (both upfront and licensing), dragen imo is not expensive relative to sequencing costs (e.g. sequencers and flowcells). Surely it would be expensive to buy for occasional use.

By making parabricks free to use Nvidia tries to gain a market share I guess. In professional settings you still end up buying support and likely dedicated hardware (which is comparable in pricing). Good fit for the cloud and research environments that already have access to GPUs and/or are decoupled from actual sequencing.

mshockwave · on Nov 19, 2022

Are there any open source projects on genomics that I can start looking into as a hobby rather than jumping right into a full-time position in this field?

ababaian · on Nov 20, 2022

Serratus (https://github.com/ababaian/serratus) is an OSS bioinformatics project created by a passionate group of volunteers. Short story is we're re-analyzing all of the world's DNA/RNA sequencing data to find new viruses that other people have missed. It works surprisingly well, but there's a ton left to do.

dankle · on Nov 19, 2022

> There is a significant gap between how software is currently developed in this space versus how it should be developed. The vast majority of genomics-related software is not written with speed or reliability in mind.

True, but working in academia is very VERY different from working in a tech/product company.

pestatije · on Nov 19, 2022

> This state of affairs makes it difficult for anyone other than the original author to contribute to these code bases, further cementing the one-maintainer policy.

Who wants to fix other people's code mess? This is a no-no if you want to promote a job opening.

cratermoon · on Nov 19, 2022

I do. It's my bread-and-butter. I call myself a code janitor. I live by books like "Working Effectively With Legacy Code" and "Kill it With Fire". But I have my limits. Academic code has.. coded, in the medical sense, and it can't be revived. Put a DNR on it.

CoastalCoder · on Nov 19, 2022

I've seen this also in several software systems that started life in a CS grad department. (Not all the same university.)

The original authors' quirks get enshrined in the code base, and it's nigh impossible to fix until they leave the company that commercialized it.

sargstuff · on Nov 19, 2022

sorta like the original calculus thesis.

d4nyll · on Nov 19, 2022

I have a degree in biochemistry. Would love to combine my passion in software and biology, but academic research is often funded by governments which means the salary is (super) low.

It's the same reason why there's a lack of qualified computer science teachers in schools.

raphaeljlt · on Nov 19, 2022

Quick plug here for Atomic AI ( https://atomic.ai/ , https://boards.greenhouse.io/atomai ), which could be added to the list. We value and respect (and pay) our engineers—I myself trained as a SWE and worked at FAANG.

Shoot me a message at raphael@atomic.ai if you want to learn more.

aschleck · on Nov 19, 2022

(chiming in here as a founding engineer at Atomic)

So I spent more than 8 years as a SWE at Google, and now work here with both experimental biologists and machine learning scientists. And yes, a lot of the concerns mentioned in this thread are also things I have had anxiety about.

Most obvious to me, being a software engineer at Google felt like being the center of the universe. Coming here, the focus is the scientific research. And yes, the scientists all managed to complete their PhDs so they don't necessarily need me to unblock them every second of their day. But contrary to my expectations, this has been remarkably freeing. I think one particularly important part of our company that makes this work is that, even on the science side, we're multidisciplinary (at a high level, emphasizing both experimental biology and ML.) And so engineering feeling like another arm of that multi-discipline nature is fairly... natural.

The reason I feel it's freeing, and the reason I enjoy working here, is also the greatest challenge. Because the scientists are focused on the science, because they respect me and trust me to figure it out, and because they aren't constantly blocked by me, my job is mostly about dreaming extremely expansively about what I can do to reduce toil and make the scientists more productive. Of course they have feedback and input, but how I use my time and what I build is ultimately my decision because I am the engineer. And I have been able to do some things I am very proud of, like rolling out Bazel and Kubernetes and finding ways to seamlessly bring them into the cloud (we're even multi-cloud now without them even noticing!) On the other hand, it's very challenging because when you work on a product, say Google Photos, as a SWE, you always have some direct tether to the product ("what should we build next? ahhhh, well I guess we could just embed Stable Diffusion and a million people would immediately play with it".) At Atomic, my tether is very ambiguous. If I do my job successfully, they'll be able to do research more quickly (? effectively?), and eventually we'll be able to produce a therapeutic that hopefully changes the world. Identifying what I can do today to speed up that far outcome in the future is very challenging, but it is a far more interesting challenge than gluing some pre-existing software into my UI or running A/B tests to turn a red button blue.

If, like me, you enjoy being given ownership over incredibly ambiguous problems, please do reach out!

This role focuses on directly partnering with the biologists: https://boards.greenhouse.io/atomai/jobs/4726839004

This role is expansive cloud infra: https://boards.greenhouse.io/atomai/jobs/4531035004

And this role is directly partnering with the ML scientists: https://boards.greenhouse.io/atomai/jobs/4191285004

elric · on Nov 19, 2022

Would love to, both out of interest and out of a belief that it might one day improve the world. But it's not happening. I have 20+ years of experience as a software engineer, but I don't have a degree, so anything that has even a whiff of academia rejects me outright. Not to mention that it would involve a big paycut over fintech.

roughly · on Nov 19, 2022

I work at a SynBio company and heartily second this. If you're looking for interesting work where you can make an impact, it's an incredible field to be in.

I'm a nerd about everything - I love learning, and this field is incredible for it. The complexity and depth of biological systems dwarfs what we're doing in the software industry. I work with brilliant people doing absolutely fascinating work, and I get to learn more every day. At the same time, I get to build things that make a genuine contribution to the people I'm working with - I can see the value and impact of my skillset in a way that was a lot harder when I was working at a software company. The leverage that good software folks can provide to folks outside the industry is almost impossible to overstate - our ability to scale up what the practitioners in the field are doing can offer an almost category change in what they can attempt.

At the same time, there's still really, really knotty software problems to be had - computer science has benefited quite a lot from our ability to segment and structure our problems, but biology doesn't allow for that - everything that we're working with is operating at every scale, from molecular interactions up through genomics into protein design and folding and into metabolic modelling. Add to that that the data structures you're dealing with can vary from a few characters up to a couple megabytes (within the same represented "object"), distant elements within the same object can interact meaningfully, the objects themselves tend to be embedded in larger structures with which they meaningfully interact, and you've got some fiendishly complex problems.

And at the end of all that, you've got a field which offers a legitimate possibility of helping us move past petrochemicals; an enormous expansion in the kinds, potency, and specificity of healthcare; and a new and novel set of tools for shaping our world. It's an incredibly exciting place to be, and I've found people are genuinely thrilled to have good software folks along.

mdlm · on Nov 19, 2022

Those who are interested in the Broad Institute can reach out directly to me at mdelamaz@broadinstitute.org

isoprophlex · on Nov 19, 2022

Do you do fully remote positions, from non-US timezones (eg. Western Europe)?

aednichols · on Nov 23, 2022

[Also at Broad] Must be a US resident except for exceptional cases (e.g. world-renowned scientist).

joefreeman · on Nov 19, 2022

Thanks for posting this, and your learngenomics.dev resource looks great - I'm looking forward to reading through it. I recently started working as an engineering manager/lead in a genomics startup (https://www.genpax.co), and I've been picking this up as I go. I've also started working my way through the 'Micro binfie' podcast, which is great.

Our company values software quality and we're very product focussed. We're actively hiring in London: https://news.ycombinator.com/item?id=33423547

mehphp · on Nov 19, 2022

I did do this, there were a lot of great people on my team but it paid (a lot) less and is more stressful than just building another CRUD app.

iainctduncan · on Nov 19, 2022

I worked for a while at a consultancy supporting genomics through LIMS (lab info management software) customization, so not really genomics, but in the genomics biz (big genomics companies were our clients). For me, it was the least interesting software work I have done in my 20 year coding career. On the other hand, for people who just wanted a steady pay cheque and to go home at 5pm, it was a good gig. But man, software that moves samples and test tubes and their data around, it could be cars in a parking lot for all that the science makes it interesting.

We had bad attrition to both more interesting and higher paying work. (I left for both after a year at the consultancy)

brofallon · on Nov 19, 2022

Most of the discussion here seems to assume bioinformatics / genomics jobs are academic, but I work for a clinical testing lab where production-quality code is a must. We're probably a 10/12 on the Joel test.

If you're into bioinformatics or genomics, but aren't excited about an academic setting, take a peek: https://recruiting2.ultipro.com/ARU1000ARUP/JobBoard/62cc791...

We hire fully remote positions and starting salaries are about US$100k.

jdeaton · on Nov 19, 2022

As someone who puts tremendous value in technical mentorship when considering a role this is about the worst possible advertisement for being a swe in genomics as it amounts to "all our code is awful- come fix it!"

fastaguy88 · on Nov 19, 2022

It may be worth pointing out that several of the leaders in the Genomics field started off in commercial software development. I agree that it does not make monetary career sense to move into genomics -- academic labs cannot pay you more than the lab head makes, which is probably much less than many software developers are worth in other markets.

But I've known several financially successful developers who have gone back for a PhD in bioinformatics and genomics, and, after getting over their distaste for existing tools, have made important and well-recognized contributions. But they did not make more money.

pabs3 · on Nov 20, 2022

I wonder how popular open source is in genomics, there does seem to be a lot of open source genomics/med/science related software.

https://wiki.debian.org/DebianGenomics https://blends.debian.org/med/tasks/ https://blends.debian.org/science/tasks/

wdwvt1 · on Nov 20, 2022

Somewhat self-interested plug here: consider working in metabolomics as well. Metabolomics is where sequencing was in ~2008. The physics and chemistry are pretty well worked out (though many improvements are surely coming in the same way that 454 gave way to Illumina, PacBio, Nanopore, etc.). The software and computational workflows are truly awful, like hard to describe bad. The company that figures out metabolomics well is going to command a much larger market than genomics - genomics tells you what's possible, metabolomics tells you actually what's happening.

bambax · on Nov 19, 2022

> Google Genomics. Careers link. > Microsoft Genomics. Careers link.

Google and Microsoft probably know how to make software?

Side note: why does this page have user-select: none on body? It's annoying; what does it accomplish?

jghn · on Nov 19, 2022

Google Genomics is now aka Google Cloud Life Sciences. There's also their sister company Verily that operates in this space.

cosentiyes · on Nov 19, 2022

There is also a research group: https://health.google/health-research/genomics/

theGnuMe · on Nov 19, 2022

Couple of things I know.

Bioinformaticians come in two flavors: those who studied biology and then took up coding, and the even rarer computer scientists who learned biology. The latter are so rare that they are almost all professors or founders or work at DeepMind etc... Then, there are the biomedical engineers, etc...

The computer scientists will go off and solve protein folding when the bioinformaticians and chemists worked on it for years.. I am exaggerating a little here, I imagine there were plenty of bioinformaticians on the AlphaFold team, but the fundamental breakthrough was DNNs.

sargstuff · on Nov 19, 2022

biologist / chemist will take the architecture studio approach, then develop math to shorten the write-up.

research software engineer will develop the mathematics to describe things, then use the numerical system to write software to determine things.

throwawaysleep · on Nov 19, 2022

I’ve worked at a lot of places, and working for researchers was my worst job ever by far. I’ll never work for someone with a PhD again, as Sheldon Cooper’s attitude towards engineers is no joke.

denvaar · on Nov 19, 2022

I'm most definitely not an expert in this area, but I have recently taken interest in learning about "succinct data structures", which from what I understand have their place in bioinformatics.

It's been a challenging topic to learn about, because most of the information comes from Computer Science papers and articles where the information is presented in a very formal, mathematical way, which I am just not used to.

Normally when thinking about data structures and algorithms, we're mostly concerned with optimizing for speed. Space complexity is not usually as big of a consideration. Succinct data structures are all about creating ways to achieve good runtime performance while representing the data in a "compressed" format. I think this comes in handy when doing things like DNA sequencing since data sets are so large.
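To make the idea concrete, here is a minimal sketch of one common building block in this area: a bit vector with "rank" support. It is only an illustration, not a truly succinct implementation (Python integers and lists carry a lot of overhead), and the class name `RankBitVector`, the method `rank1`, and the 64-bit block size are all assumptions made for this example. The point it shows is the trade-off: the bits stay packed, and a small table of per-block counts lets a rank query look at one block instead of scanning the whole vector. Rank queries of this kind are, as far as I understand, part of what indexes like the FM-index rely on for searching large sequence collections.

```python
class RankBitVector:
    """Packed bit vector with precomputed block counts for rank queries.

    Illustrative sketch only: names and block size are assumptions.
    """

    BLOCK = 64  # bits per block (illustrative choice)

    def __init__(self, bits):
        self.n = len(bits)
        self.words = []          # one packed integer per block
        self.block_ranks = [0]   # number of 1 bits before each block
        for start in range(0, self.n, self.BLOCK):
            word = 0
            for offset, bit in enumerate(bits[start:start + self.BLOCK]):
                if bit:
                    word |= 1 << offset
            self.words.append(word)
            self.block_ranks.append(self.block_ranks[-1] + bin(word).count("1"))

    def rank1(self, i):
        """Return the number of 1 bits in positions [0, i)."""
        if not 0 <= i <= self.n:
            raise IndexError(i)
        block, offset = divmod(i, self.BLOCK)
        count = self.block_ranks[block]
        if offset:
            # Count only the low `offset` bits of the block containing i.
            mask = (1 << offset) - 1
            count += bin(self.words[block] & mask).count("1")
        return count


bv = RankBitVector([1, 0, 1, 1, 0, 0, 1])
print(bv.rank1(4))  # 1 bits among the first four positions
```

A real succinct structure would add a second level of counts and a compact word layout so the extra index stays a small fraction of the bit vector itself; the lectures and papers listed below cover those details.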

I'm excited to check out some of the links in the post, and in case anyone else is interested in learning more about succinct data structures, here's a few resources I'd recommend:

Prof. Ben Langmead's YouTube channel: https://www.youtube.com/user/BenLangmead/featured

Alex Bowe's blog has some good content: https://www.alexbowe.com/articles/

Prof. Erik Demaine's "succinct" lectures from his adv. data structures course at MIT on YouTube: https://www.youtube.com/watch?v=3Y2weLDiUWw

Edward Kmett's Haskell live coding session going into some details about succinct: https://www.youtube.com/watch?v=9MKEmNNJgFc

There's also a lot of research papers, which you should be able to find by searching for "succinct data structures" (Jacobson, Munro, Brodnik, Raman, Rao, Navarro, Sadakane just to name a few). I at least have a basic CS undergraduate degree, and although many of these papers are over my head, I have still been able to slowly understand more and more. Some I had to purchase.

moron4hire · on Nov 19, 2022

This is tangentially related to what I'm currently doing.

I basically work in EdTech. The company is not an EdTech company, it's an education services company. I was hired on to develop software that we couldn't find in the market[0].

In the process of building this thing, we've been attending and speaking at conferences in our industry. And I'm seeing a lot of the same stories: academia is trying to do research, the research fundamentally requires software to make the research happen, the quality of the software can have a huge impact on results, but because software development is tangential to the research goals, there's little to no allocation to software developers. This leaves the researchers to cobble together a solution that maybe kinda fulfills their need, not correctly, and certainly not perpetually (a lot of reliance on trial software and services).

We would love to offer our software to researchers in our field. We've gotten feedback from several that what we are building is exactly the sort of thing they need. But they have no money, and even if we were in a position to give it away for free, we can't even make those connections come to fruition.

So I don't know what to do. I really am thinking of starting to give it away for free, because at least we'd benefit from more research results in our field proving the efficacy of our approach. But that's a really slow burn.

[0] Specifics don't matter, but if you're curious, I make a VR environment for foreign language training emphasizing culture.

jefftk · on Nov 19, 2022

I recently switched from software engineering on ads and web performance at a FAANG to (meta)genomics at a nonprofit startup; happy to answer questions

boppo1 · on Nov 19, 2022

From a genomics layperson with decent dev skills:

A. What are the broad and medium goals you work on?

B. What are your daily activities? How do they fit into (A)?

C. What does nonprofit genomics vs for profit look like from a revenue standpoint?

D. What specific technologies/stacks are you using?

E. The CRUD frontend+backend+database to serve users (and sell ads) is pretty ubiquitous in 'tech', with some branches. How does your field compare?

jefftk · on Nov 19, 2022

A: The goal of my current project is to identify novel pandemics, even if they're caused by something we've never seen before, most likely by looking at growth patterns. At a broader scale, I'm trying to learn enough about working in this field that I'll be able to contribute on whatever future projects seem most appropriately important to me.

B: unlike my previous work, I'm back to being an individual contributor. Very few meetings, mostly coding and analysis. Current thing is trying to understand what drives per-sample variability in wastewater sequencing data.

C. Our group is currently philanthropically funded, and is focused on determining whether/how this is possible/practical.

D. I'm mostly working in (bio)python, with a few bits written in C and gluing things together with bash.

E. I was most recently working on (a) JS infrastructure to fetch and render ads and (b) working with browsers on platform features that would improve privacy, security, and efficiency.
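As a rough illustration of what "looking at growth patterns" in (A) could mean in code, here is a hedged sketch in plain Python. It is not the project's actual method: the function names `log_growth_rate` and `flag_growing`, the pseudocount, the `min_rate` threshold, and the toy counts are all assumptions made for this example. The idea is just that something spreading exponentially shows a roughly constant positive slope when its log-abundance is plotted against time, whatever the sequence turns out to be.

```python
import math


def log_growth_rate(counts, pseudocount=1.0):
    """Least-squares slope of log(count + pseudocount) versus sample index.

    `counts` is a hypothetical per-time-point abundance for one sequence.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two time points")
    xs = range(n)
    ys = [math.log(c + pseudocount) for c in counts]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def flag_growing(series_by_id, min_rate=0.3):
    """Return ids whose fitted log-growth rate is at least `min_rate`."""
    return sorted(
        seq_id
        for seq_id, counts in series_by_id.items()
        if log_growth_rate(counts) >= min_rate
    )


# Toy data (made up): one series doubling each step, one flat.
samples = {
    "seq_a": [1, 3, 7, 15, 31],
    "seq_b": [5, 5, 5, 5, 5],
}
print(flag_growing(samples))
```

Real wastewater data is far noisier than this, which is where the per-sample variability mentioned in (B) comes in; a plain log-linear fit would likely need normalization and some handling of uncertainty before it could be trusted.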

boppo1 · on Nov 19, 2022

F. The parent article mentions solutions are often custom made by one person. Can problems in the field be reduced such that extensible open-source frameworks could be applied? The way we have frameworks for webdev?

jefftk · on Nov 19, 2022

I'm very new to this area, and am really not the right person to ask, but I'll try my best ;)

In general you have frameworks when lots of people are trying to solve a large number of problems that look similar at the start and then will diverge. That's pretty web specific. I think instead in bio you mostly get (and will keep getting) modular tools and pipeline standardization.