
I continue to be impressed by how much DeepMind has managed to progress in such a short time. CASP13 was a shocker to all of us I think, but many were skeptical as to the longevity of the performance DeepMind was able to achieve. I believe with the CASP14 rankings now released, it's safe to say that they've proven themselves.

Congratulations to the team! This work will have far reaching impacts, and I hope that you continue to invest heavily in this area of research.




> but many were skeptical as to the longevity of the performance DeepMind was able to achieve

For a non-biologist, on what is this skepticism based?

Just purely based on following ML news, it looks like the trend for ML solutions has been that they've overtaken expert systems once they've gained a solid foothold in a field. Maybe this is some perception bias. Are there any cases where ML performed decently but then hit a ceiling while expert systems kept improving?


It's because for many researchers, ML just means taking a standard Keras or scikit-learn model, shoving their data in, getting some table or number out, and seeing if that solves their problem. If that's your only ML experience, then I suppose that's how sceptical you'd be of ML in general.

It looks like DeepMind invented a completely new method for this round that's not just an extension of their previous work, showing how much you can gain if you don't shoebox yourself into just trying to improve existing methods.

That all the scientists were highly skeptical about the scope of ML (and these are computer scientists to begin with, mind you) just shows how little they knew of what a computer or a program can possibly do, which is a bit appalling to be honest.


"It looks like DeepMind invented a completely new method for this round that's not just an extension of their previous work, showing how much you can gain if you don't shoebox yourself into just trying to improve existing methods. That all the scientists were highly skeptical about the scope of ML (and these are computer scientists to begin with mind you) just shows how little they knew of what they did know of what a computer or a program can possibly do, which is a bit appalling to be honest."

My PhD (now over a decade ago...yikes) was in applying much simpler ML methods to these kinds of problems (I started in protein folding, finished in protein / nucleic acid recognition, but my real interest was always protein design). Even back then, it was clear that ML methods had a lot more potential for structural biology (pun unintended) than they were being given credit for. But it was hard to get interest from a research community that cared little about non-physical solutions. No matter how well you did, people would dismiss it as a "black box solution", and that pretty much limited your impact.

Some of this is understandable: even today, it's not at all clear that a custom-built ML model for protein folding is of much use to anyone -- particularly a model that doesn't consider all of the atoms in the protein. The traditional justification for research in this area is that if you could develop a sufficiently general model of protein physics, it would also allow you to do all sorts of other stuff that is much more interesting: rational protein design, drug binding, etc.

The AlphaFold model is not really useful for any of this, so in a way, it's kind of like the Wienermobile of science: cool and impressive when done well ("hey! a giant hot dog on wheels!"), but not really useful outside of the niche for which it was designed. So it's hard to blame researchers in this field -- who generally have to chase funding and justify their existence -- for not pursuing the application of deep learning to this one, narrow problem domain.

Obviously there will now be a wave of follow-on research, and it's impossible to know what methods this will spawn. Maybe this will revolutionize computational structural biology, maybe not. But I think it's a little unfair to demonize the entire field. Protein folding just traditionally hasn't been a very useful or interesting area, and like all "pure science", it leads to a lot of small-stakes, tribal thinking amongst the few players who can afford to compete. This is right out of Thomas Kuhn: a newcomer sweeps into a field, glances at the work of the past, then bashes it over the head, dismissively.


We don't know too much about the exact model they made but it looks sufficiently generalizable to be able to give a candidate protein structure for any given sequence. It doesn't automatically cure cancer and inject the drug but that by itself is an amazing tool that if available to everyone will revolutionize biology experimentation.

I will definitely blame the protein structure field on multiple levels though. It was always frustrating to me to open up Nature or Science and see it filled with papers about structure - like they are innovating so much that half of the top science magazines every week have papers in that field, yet it's not going anywhere? Or is it simply just a bunch of professors tooting their own horns about ostensible progress in a field whose methods are years if not decades out of date? The overall protein structure field internalised some dogmas in self-defeating ways to everyone's detriment, and finally events like this (and cryo-EM, maybe) will jolt them out or make them fully irrelevant so we can move on. It's only doubly ironic that this came from a team in a company with minimal academic ties, showing how toxic that entire system is. I only feel pity for the graduate students still trying to crystallize proteins in this day and age.


The reason for your second paragraph is pretty straightforward. There has been an immense amount of support for proteins as "the workhorses of the cell" for a hundred-plus years. I call it the "protein bias". We've seen it many times - for example, when it was first hypothesized and then proved that DNA, rather than protein, is the heredity-encoding material, and in the denial that RNA could act as an enzyme or that the functional core of the ribosome could be a ribozyme.

I think what basically happened is that a very influential group of scientists, mainly in Cambridge around the '50s and '60s, convinced everybody that reductionist molecular biology would be able to crystallize proteins and "understand precisely how they function" by inspecting the structures carefully enough.

What I learned, after reading all those breathless papers about individual structures and how they explain the function of a protein, is that in the vast majority of cases they don't have enough data to speculate responsibly about the behavior of proteins and how they implement their functions. There are definitely cases where an elucidated structure immediately led to an improved understanding of function:

"It has not escaped our notice (12) that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."

but most papers about how cytochrome "works" aren't really illuminating at all.


"We don't know too much about the exact model they made but it looks sufficiently generalizable to be able to give a candidate protein structure for any given sequence. It doesn't automatically cure cancer and inject the drug but that by itself is an amazing tool that if available to everyone will revolutionize biology experimentation."

They say on their own press-release page that side-chains are a future research problem, and nothing about their method description makes me believe they've innovated on all-atom modeling. This software seems able to generate good models of protein backbones; these kinds of models certainly have uses, but a backbone model is not enough for drug design.

This is certainly an advancement, but you're exaggerating the scope of the accomplishment.

" I only feel pity for the graduate students still trying to crystallize proteins in this day and age."

Nothing about this changes the fact that protein crystallography is a gold-standard method for determining a protein structure. CryoEM has made it possible to obtain good structures for classes of proteins we could never solve before, and it's certainly interesting if we can run a computer for a few days to get a 1Å ab initio model for a protein sequence, but we could already do that for a large class of proteins with homology modeling. These predicted structures still aren't generally that useful for drug design, where tiny details of molecular interactions matter.

To put it in perspective: protein energetics are measured on the scale of tens of kcal / mol. Protein-drug interactions are measured in fractions of a kcal. A single hydrogen bond or cation-pi interaction or displaced water molecule can make the difference between a drug candidate and an abandoned lead. Tiny changes in backbone position make the difference between a good structure and a bad one. Alphafold isn't doing that kind of modeling.


Of course they haven't solved everything, but you seem to be doing exactly what I accuse that entire field (and academia in general) of doing - which is to insist a problem is intractable or hard and undermine someone potentially challenging that. When they released the 2018 results the field did embrace it (for sure I'd consider the groups organizing CASP as at least forward thinking) but was still skeptical about how much more progress could be made; now they blow everyone's minds again with a monumental leap, and again people want to come say of course this is the last big jump!

I understand the self preservation instincts that kick in when there's a suggestion that the entire field has been in a dark age for a while, but I hope you can see that there might be something fundamentally wrong with how research is done in academia and that is to blame for why this didn't happen sooner, and why it's so hard for many to embrace it.

Regarding your comments on the inapplicability of this current solution for docking, I'm sure that's the next project they're taking up, and let's see where that goes.

This is exactly the same type of progression that happened with Go, where when their software beat a professional player everyone's like "yeah but I bet he wasn't that good". A few years later and Lee Sedol just decided to retire. I am interested to see what happens to that entire academic field in a similar vein, though my interests are more in knowing how science can advance from more people thinking this way.


> Nothing about this changes the fact that protein crystallography is a gold-standard method for determining a protein structure.

Yes it does. Protein crystallography is/was the gold-standard. Once this result is verified and accepted by the scientific community as a whole, that changes.


Are you always so dismissive of Nobel-level achievements?


ML is a super overloaded term.

There are definitely cases where machine-learned statistical solutions do not perform as well as systems tuned by experts, but if you can define the task well and get the data for a deep solution, the learned systems will usually overtake them.


This. I believe technically just linear regression could be considered "machine learning".


I've seen people at bio conferences actively calling linear regression machine learning.


This is likely because linear regression meets most widely accepted definitions of machine learning. [0][1] It is simple and very effective when the relationship being learned is (close to) linear; a minimal sketch follows the references below.

[0] https://en.wikipedia.org/wiki/Machine_learning

[1] https://www.cs.cmu.edu/~tom/mlbook.html
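To make the point concrete, here's a minimal sketch (synthetic data, scikit-learn assumed to be available): the model's parameters are estimated from examples, and its performance on the prediction task improves with experience, which is exactly the textbook definition.

    # Linear regression "learns" in Mitchell's sense: performance at the task
    # (prediction, measured by test MSE) improves, on average, with experience
    # (more training examples).
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error

    rng = np.random.default_rng(0)

    def make_data(n):
        X = rng.normal(size=(n, 3))
        y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=n)
        return X, y

    X_test, y_test = make_data(1000)
    for n_train in (10, 100, 1000):
        X_train, y_train = make_data(n_train)
        model = LinearRegression().fit(X_train, y_train)   # the "learning" step
        mse = mean_squared_error(y_test, model.predict(X_test))
        print(f"trained on {n_train:5d} examples -> test MSE {mse:.3f}")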


Sorry, I don’t get it. Are you saying fitting a linear regression model to data and making predictions somehow isn’t machine learning? I am confused.


> Are there any cases where ML performed decently but then hit a ceiling while expert systems kept improving?

Yes, this describes the entire history of AI, including several boom-bust cycles. In particular the '80s come to mind. Yes, the practitioners think that there are no technical barriers stopping them from eating the world, but that's exactly what people thought about other so-called revolutionary advances.

Although to be pedantic, "expert systems" is the technology behind the AI boom of the '80s. At the time people were saying expert systems couldn't be as good as existing algorithms (including what we would now call "machine learning" techniques), then suddenly the expert systems were better and there was rampant speculation that real AI was around the corner. Then they plateaued.

We appear to be at the tail end of the maximum hype part of the boom-bust cycle. Thinking that the rapid gains being made by the current deep learning approaches will soon hit a wall is a reasonable outside-view prediction to make: nearly every time we've had a similarly transformative technology in the AI space and elsewhere, hitting the wall is exactly what happened. The onus would be on practitioners to show that this time really is different.


I think the disconnect this time around is in productionization. We're getting breakthroughs in a wide range of problems, and translating those gains in the problem space into 'real' stable, practical solutions we can use in the world is the remaining gap, and often takes years of additional effort. It's still really expensive to launch this stuff, and often requires domain expertise that the ML research team doesn't have.

We're seeing a lot of this pattern: ML Researcher shows up, says 'hey gimme your hardest problem in a nice parseable format' and then knocks a solution out of the park. The ML researcher then goes to the next field of study, leaving (say) the doctors or whatever to try to bridge the gap between the nice competition data and actual medical records. It also turns out that there's a host of closely related but different problems that ALSO need to be solved for the competition problem to really be useful.

I don't think this means that the ML has failed, though; it's probably similar to the situation for accounting software circa 1980: everything was on paper, so using a computerized system was more trouble than it was worth. But today the situation in accounting has completely flipped. Apply N+1 years of consistent effort improving data ecosystems, and the ML might be a lot easier to use on generic real world problems.


Next time you fly through a busy airport, think about the system which assigns planes to gates in real time based on a large number of variable factors in order to maximize utilization and minimize waits. This is an expert system designed in the '80s, which allowed a huge increase in the number of planes handled per day at the busiest airports.

Or when you drive your car, think about the lights-out factory that built it, using robotics technologies developed in the '80s and '90s, and the freeways which largely operate without choke points, again due to expert system models used by city planners.

These advances were just as revolutionary before, and people were just as excited about AI technologies eating the world. Still, it largely didn't happen. To continue the example of robotics, we don't have an equivalent of the Jetsons' home robot Rosie. We can make a robot assemble a $50,000 car, but we can't get it to fold the laundry.

These rapid successes you see aren't literally "any problem from any field" -- it's specific problems chosen specifically for their likely ease in solving using current methods. DeepMind didn't decide to take on protein folding at random; they looked around and picked a problem that they thought they could solve. Don't expect them to have as much success on every problem they put their minds to.

No, machine learning is not trivially solving the hardest problems in every field. Not even close. In biomedicine, for example, protein folding is probably one of the easiest challenges. It's a hard problem, yes, but it's self-contained: given an amino acid sequence, predict the structure. Unlike, say, predicting the metabolism of a drug applied to a living system, which requires understanding an extremely dense network of existing metabolic pathways and their interdependencies on local cell function. There's no magic ML pixie dust that can make that hard problem go away.


Well, we can agree that world peace is off the table!

Beyond that, let's notice that expert systems did indeed change how airports and freeways work: They improved the areas where they solved problems. Deployment happened.

What we're seeing now is new classes of previously unsolvable problems falling. Deployment in medicine is known to be particularly hard, but not impossible. My read on the situation is that there have been a number of ML applications in the current round that have been kinda-successful 'in vitro' and failed in deployment. That doesn't mean that all deployments will fail.

Furthermore... Neil Lawrence points out that in most cases we change the world to fit new technologies. For example, mechanized tomato pickers suck, so we develop a more machine-resistant tomato. Cars break easily on dirt roads, so we pave half the planet. ML/AI somehow flips people's expectations of how technology works, and they expect the algorithms to adapt to the world. This is almost certainly wrong.

"it's specific problems chosen specifically for their likely ease in solving using current methods. DeepMind didn't decide to take on protein folding at random; they looked around and picked a problem that they thought they could solve."

I'm actually not sure this is at all true. Protein folding is a long-standing grand challenge on which no current methods were working. My guess is that it was initially chosen for potential impact, and chased with more resources after some initial success.


> We appear to be at the tail end of the maximum hype part of the boom-bust cycle. Thinking that the rapid gains being made by the current deep learning approaches will soon hit a wall is a reasonable outside-view prediction to make: nearly every time we've had a similarly transformative technology in the AI space and elsewhere, hitting the wall is exactly what happened. The onus would be on practitioners to show that this time really is different.

What a take. Neural networks just took a huge bite out of protein folding and your take is: This just in, the Deep Learning boom is about to go bust! Asinine.


It's not asinine to have realistic expectations and not give in to hyperbolic claims.


Progress like this was, in my view, inevitable after the invention of unsupervised transformers.

It'll be genetics next.

e: although AlphaFold appears to be convolutionally based! I suspect that'll change soon.


> It'll be genetics next.

Which part of genetics are you thinking of? Much of genetics isn’t amenable to this kind of ML, because it isn’t some kind of optimisation problem. And many other parts don’t require ML because they can be modelled very closely using exact methods. ML does get used here, and sometimes to great effect (e.g. DeepVariant, which often outperforms other methods, but not by much — not because DeepVariant isn’t good, but rather because we have very efficient approximations to the exact solution).


What do you mean?

Genetics is amenable because the genome is a sequence that can be language modeled/auto-regressed for depth of understanding by the network.

There are plenty of inferences that you would want to do on genetic sequences that we can't model exactly and there is some past work on doing stuff like this, although biology is usually a few years behind.

https://www.nature.com/articles/s41592-018-0138-4

e: for clarity


I meant, which specifics are you thinking of?

> Genetics is amenable because it is a sequence

Not sure what you mean by that. Genetics is a field of research. The genome is a sequence. And yes, that sequence can be modelled for various purposes but without a specific purpose there’s no point in doing so (and furthermore doing so without specific purpose is trivial — e.g. via markov chains or even simpler stochastic processes — but not informative).

> There are plenty of inferences that you would want to do on genetic sequences

I’m aware (I’m in the field). But, again, I was looking for specific examples where you’d expect ML to provide breakthroughs. Because so far, the reason why ML hasn’t provided many breakthroughs is less about the lack of research and more because it’s not as suitable here as for other hard questions. For instance, polygenic risk scores (arguably the current “hotness” in the general field of genetics) can already be calculated fairly precisely using GWAS, it just requires a ton of clinical data. GWAS arguably already uses ML but, more to the point, throwing more ML at the problem won’t lead to breakthroughs because the problem isn’t compute bound or vague, it’s purely limited by data availability.

I could imagine that ML can help improve the spatial resolution of single-cell expression data (once again, ML is already used here) but, again, I don’t think we’ll see improvements worthy of being called breakthroughs, since we’re already fairly good.
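(For concreteness: once GWAS effect sizes are in hand, the score itself is essentially a weighted sum of risk-allele dosages, something like the toy sketch below; the numbers are invented purely for illustration. The expensive part is the cohort data needed to estimate the weights, not the computation.)

    # Toy polygenic risk score: dosages of each risk allele (0/1/2 copies) weighted
    # by per-variant effect sizes estimated from a GWAS. Values below are made up;
    # real scores use thousands to millions of variants.
    import numpy as np

    gwas_effect_sizes = np.array([0.12, -0.05, 0.30, 0.08])   # per-variant betas
    genotype_dosages  = np.array([2, 0, 1, 1])                 # one individual's genotype

    prs = float(genotype_dosages @ gwas_effect_sizes)
    print(f"polygenic risk score: {prs:.2f}")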


> Not sure what you mean by that

I spoke loosely, my mind skipped ahead of my writing, and I didn't realize that we were parsing so closely. "Genetics (the field) is amenable because the object of its study (the genome) is a sequence" would have been more correct but I thought it was implied.

> without a specific purpose there’s no point in doing so

Well yes, prior to the success of transfer learning I could see why you would think that is the case, but if you've been following deep sequence research recently then you would know there are actually immense benefits to doing so because the embeddings learned can then be portably used on downstream tasks.

> it’s purely limited by data availability.

Yes, and transfer learning on models pre-trained on unsupervised sequence tasks provides a (so-far under-explored) path around labeled data availability problems.
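Roughly the kind of setup I have in mind, as a sketch (PyTorch, toy sizes, positional encodings omitted; none of this is what the linked paper actually uses): train a small causal model on unlabeled nucleotide sequence with next-token prediction, then reuse what it learned downstream.

    # Toy next-nucleotide language model: unsupervised pretraining on unlabeled DNA.
    # The learned representations could later be transferred to supervised tasks.
    import torch
    import torch.nn as nn

    VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}

    class TinyGenomeLM(nn.Module):
        def __init__(self, d_model=64, nhead=4, nlayers=2):
            super().__init__()
            self.embed = nn.Embedding(len(VOCAB), d_model)   # positional encodings omitted
            layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=128,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, nlayers)
            self.head = nn.Linear(d_model, len(VOCAB))

        def forward(self, tokens):
            n = tokens.size(1)
            # causal mask: each position may only attend to earlier positions
            mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
            h = self.encoder(self.embed(tokens), mask=mask)
            return self.head(h)                              # next-nucleotide logits

    seq = torch.tensor([[VOCAB[c] for c in "ACGTACGGTTAC"]])
    model = TinyGenomeLM()
    logits = model(seq[:, :-1])
    loss = nn.functional.cross_entropy(logits.reshape(-1, len(VOCAB)),
                                       seq[:, 1:].reshape(-1))
    loss.backward()   # in practice: loop over a large unlabeled corpus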

I already linked to a paper showing a task that these sorts of approaches outperform, and that is without using the most recent techniques in sequence modeling.

Maybe read the paper in Nature that uses this exact LM technique to predict the effect of mutations before assuming that it doesn't work: https://sci-hub.do/10.1038/s41592-018-0138-4

I am not directly in the field, you are right - but I think you are also being overconfident if you think that these approaches are exactly the same as the HMM/markov chain approaches that came before.


Thanks for the paper, I’ll check it out; this isn’t my speciality so I’m definitely learning something. Just one minor clarification:

> Maybe read the paper … before assuming that it doesn't work

I don’t assume that. In fact, I know that using ML works on many problems in genetics. What I’m less convinced by is that we can expect a breakthrough due to ML any time soon, partly because conventional techniques (including ML) already have a handle on some current problems in genetics, and because there isn’t really a specific (or flashy) hard, algorithmic problem like there is in structural biology. Rather, there’s lots of stuff where I expect to see steady incremental improvement. In fact, in Wikipedia’s list of unsolved biological problems [1] there isn’t a single one that I’d characterise specifically as a question from the field of genetics (as a geneticist, that’s slightly depressing).

But my question was even more innocent than that: I’m not even that sceptical, I’m just not aware of anything and genuinely wanted an answer. And the paper you’ve posted might provide just that, so I’ll go and do my research now.

[1] https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_b...


Not being in the field, I would term what I see in this story as a ‘bottom up’ approach to understanding genetics/molecular biology. More akin to applied sciences than medicine or health. This, for example, seems to be very important but it still leaves us with a jello jigsaw puzzle with 200 million pieces and probably far removed from immediate utility in health outcomes.

Then there’s the more clinically oriented approaches of looking at effects, trying to find associated genes/mutations whatever mechanisms exist in between to cause a desirable or undesirable outcome. I’d call that ‘top down’.

I’m sure the lines get blurred more every day, but is there a meaningful distinction into these and/or more categories that are working the problem from both ends? If so, are there associated terms of art for them?


[flagged]


Rude. I would appreciate substantive criticism, especially when I'm linking papers in Nature starting to do exactly what I'm talking about.


I cannot give constructive feedback to something which is incomprehensible.

"the genome is a sequence that can be language modeled/auto-regressed for depth of understanding by the network"

The genome is not a sequence so much as a discrete set of genes which are themselves sequences which specify construction plans for proteins. That distinction is important.

Language modeling in the context of machine learning typically means NLP methods. Genetics is nothing like natural language.

Auto-regression is using (typically time series) information to predict the next codon. This makes very little sense in the context of genetics since, again, the genetic code is not an information carrying medium in the same sense as human language. Being able to predict the next codon tells you zilch in terms of useable information.

"Depth of understanding by the network" ... what does that even mean???

The above sentence is a bunch of popular technical jargon from an unrelated field thrown together in a nonsensical way. AKA word salad.


> The genome is not a sequence so much as a discrete set of genes which are themselves sequences which specify construction plans for proteins. That distinction is important.

aka a sequence. "a book is not a sequence so much as a discrete set of chapters which are themselves sequences of paragraphs which are themselves sequences of sentences" -> still a sequence

these techniques are already being used, such as in the paper I just linked.

> Being able to predict the next codon tells you zilch in terms of useable information.

You have absolutely no way of knowing that a priori. And autoregressive tasks can be more sophisticated than just next-codon prediction.

> bunch of popular technical jargon from an unrelated field thrown together in a nonsensical way

Okay, feel free to think that.

There's always this assumption that it "will never work in my field." I've done work on NLP and on proteins and read others' work on genetics. I think you will end up being surprised, although it might take a few years.


It is incomprehensible to you, because you just simply do not understand what your parent is talking about. You are the ignorant one here and indeed quite rude. Doesn't matter that genetics is not natural language. The point is we can train large transformers auto regressively and the representation it learns turns out to be useful for a) all kinds of supervised downstream tasks with minimal fine-tuning and b) interpreting the data by analysing the attention weights. There is a huge amount of literature on this topic and what your parent says is quite sensible.


That statement you quote is completely understandable.

Let's say you have discrete sequences that are a product of a particular distribution.

Unsupervised methods are able, by just reading these sequences, to construct a compact representation of that distribution. The model has managed to untangle the sequences into a compact representation (weights in a neural network) that allows you to use it for other, higher level supervised tasks.

For example, the transformer model in NLP allowed us to not have to do part-of-speech tagging, dependency parsing, named entity recognition or entity relationship extraction for a successful language-pair translation system. The compact transformer model managed to remap the sequences into a representation that allows direct translation (people have inspected these models and figured out the internal workings of it and realized it does have latent information about a parse tree of a sentence or part-of-speech of a word).

Another interesting note is that designers of the transformer architecture did not incorporate any prior linguistic knowledge when they were designing it (meaning that the model is not designed to model language but just a discrete sequence).
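A toy sketch of that reuse step (PyTorch; the encoder here is just a stand-in for whatever was pretrained on the unsupervised objective, and all names and sizes are placeholders): freeze the pretrained representation and train only a small head on the limited labeled data.

    # Reusing a pretrained sequence encoder for a downstream supervised task.
    import torch
    import torch.nn as nn

    # Stand-in for a pretrained encoder: anything mapping token ids to per-position
    # hidden states; in practice you would load weights learned unsupervised.
    pretrained_encoder = nn.Sequential(
        nn.Embedding(4, 64),
        nn.TransformerEncoderLayer(64, 4, batch_first=True),
    )

    class DownstreamClassifier(nn.Module):
        def __init__(self, encoder, d_model=64, n_classes=2):
            super().__init__()
            self.encoder = encoder
            for p in self.encoder.parameters():
                p.requires_grad = False               # keep the learned representation fixed
            self.head = nn.Linear(d_model, n_classes)  # only this part is trained

        def forward(self, tokens):
            h = self.encoder(tokens)                  # (batch, seq_len, d_model)
            return self.head(h.mean(dim=1))           # pool over positions, then classify

    clf = DownstreamClassifier(pretrained_encoder)
    tokens = torch.randint(0, 4, (8, 20))             # a small labeled batch
    labels = torch.randint(0, 2, (8,))
    loss = nn.functional.cross_entropy(clf(tokens), labels)
    loss.backward()                                   # gradients reach only the head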


FWIW, transformers are to sequences what convnets are to grids, modulo important considerations like kernel size and normalization. Think of transformers as really wide (N) and really short (1) convolutions. Both are instances of graphnets with a suitable neighbor function. Once normalization was cracked by transformers, all sorts of interesting graphnets became possible, though it's possible that stacked k-dimensional convolutions are sufficient in practice.
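At the shape level, a toy illustration of the analogy (PyTorch; nothing here is specific to any real model): a 1-D convolution mixes each position with a fixed local neighborhood per layer, while a single self-attention layer lets every position attend to every other position.

    import torch
    import torch.nn as nn

    x = torch.randn(1, 64, 100)                          # (batch, channels, seq_len)

    conv = nn.Conv1d(64, 64, kernel_size=3, padding=1)   # local neighbor function
    y_conv = conv(x)                                      # (1, 64, 100)

    attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
    x_seq = x.transpose(1, 2)                             # (batch, seq_len, channels)
    y_attn, weights = attn(x_seq, x_seq, x_seq)           # weights: (1, 100, 100), global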


I work in the field, I don't need the difference explained to me.

> Think of transformers as really wide (N) and really short (1) convolutions

Modern transformer networks are not "really short" and you're also conflating the difference between intra- and inter- attention.

There is still a pitched battle being waged between convnets and transformers for sequences; although it looks like transformers have the upper hand accuracy-wise right now, convnets are competitive speed-wise.


> e: although AlphaFold appears to be convolutionally based! I suspect that'll change soon.

“For the latest version of AlphaFold, used at CASP14, we created an attention-based neural network system”

?



