The Decade of Deep Learning (bmk.sh)
133 points by signa11 on April 15, 2023 | 72 comments



"The AlexNet paper is generally recognized as the paper that sparked the field of Deep Learning"

Uh not really. That would have been "A fast learning algorithm for deep belief nets" in 2006.

Also weird how this list completely ignores speech recognition. Deep learning's success in speech recognition predates AlexNet and motivated Google to create TPUs [1], and, more generally, invest in deep learning.

[1] https://www.wired.com/video/watch/the-story-behind-google-s-...


Shouldn't this argument be easy to settle with a simple citation search?

Step 1 - Rank papers by number of citations

Step 2 - Which of the most cited papers is the earliest?
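
A rough sketch of that two-step idea in Python, assuming you already have the paper metadata in hand (the entries below are placeholders, not real citation counts):

  # Hypothetical data; a real attempt would pull this from a citation database.
  papers = [
      {"title": "Paper A", "year": 2006, "citations": 12000},
      {"title": "Paper B", "year": 2012, "citations": 90000},
      {"title": "Paper C", "year": 2014, "citations": 45000},
  ]

  # Step 1: rank by citation count.
  ranked = sorted(papers, key=lambda p: p["citations"], reverse=True)

  # Step 2: among the top-k most cited, take the earliest.
  earliest = min(ranked[:2], key=lambda p: p["year"])
  print(earliest["title"])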


One paper will be the most cited. One paper will be the earliest.

Unless they are the same, there is not _one_ answer; what you get is an efficient set, i.e. a Pareto frontier (the earliest paper with at least X citations / the most cited paper as of date D).

And of course there is the problem of making the list of relevant papers in the first place.


You are mistaken. Ladley's papers detailing the explorative method, and his figurative routines adopted by “A fast learning algorithm for deep belief nets”, were the firecracker that started deep learning.

Also anyone who references wired.com should be shown the door


> anyone who references [Source] should be shown the door

Well - fair enough as a general input - but we are past the age of editorial boards (except, probably, for The Economist) and into the age of independent journalists lending their efforts inconsistently to different publishers.


Honest question: which paper is this? The 2012 one? Surely the 2006 Hinton paper was the trigger - that's what I used, and it was filled with things that worked very quickly on GPUs.


As of a couple of years ago the main sentiment was: deep learning is neat but your problem could be solved by statistical learning techniques so check those out first. Does this still hold up?


90% of the problems can be solved by not fucking up your data, modeling your tables correctly, and a SQL query.

Of the 10% that can't be solved that way, 90% are solved with data cleaning + a linear model.

Of the 1% that can't be solved either way, 90% are solved with other statistical techniques (timeseries modeling, decision trees and so on).

For the remaining 0.1%, sure, deep learning I guess.
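
For the "data cleaning + a linear model" tier, a minimal sketch with pandas and scikit-learn (the table and column names are purely hypothetical):

  import pandas as pd
  from sklearn.linear_model import LinearRegression

  # Hypothetical toy table standing in for a real, cleaned-up dataset.
  df = pd.DataFrame({
      "price":      [9.9, 12.5, 8.0, None, 11.0],
      "ad_spend":   [100, 150, 80, 120, 130],
      "units_sold": [240, 200, 310, 260, 225],
  })
  df = df.dropna(subset=["price", "units_sold"])  # the "cleaning" part

  model = LinearRegression().fit(df[["price", "ad_spend"]], df["units_sold"])
  print(model.coef_, model.intercept_)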


You keep making these very negative posts, where you sound very confident, but resort to either word salad or making things up.

> not fucking up your data, modeling your tables correctly, and a SQL query

And getting a pony.

> Of the 10% that can't be solved that way, 90% are solved with data cleaning + a linear model.

87% of statistics are made up.

> Of the 1% that can't be solved either way, 90% are solved with other statistical techniques (timeseries modeling, decision trees and so on).

But why?

This reminds me of how some people in the early '80s sneered at people who did their calculations using computers - recommending instead to memorise a billion mathematical shortcuts that would take longer to learn than programming a computer.


Why do you think this post is "negative"? "Negative" towards what?

> 87% of statistics are made up.

Yes, of course, I didn't mean "lower integer part of nine tenths of the total number of problems". Did that really need to be specified?

>> Of the 1% that can't be solved either way, 90% are solved with other statistical techniques (timeseries modeling, decision trees and so on).

> But why?

Is it really controversial that you should go for the simplest model that works?


For structured data this is valid, but the power of deep learning is for unstructured data, where the embeddings and features need to be learned from raw data.


90% of what problems?

I guarantee that 90% of the things I want to do have nothing to do with a table lookup.


And what of the 0.01% that deep learning is unable to solve?


Really? Can SQL queries do image recognition? How about self driving or, more recently, natural language processing?


Select * from images where item='car'

Select * from images where color='red' and item='light' and tag='traffic'.

Select * from voice where token='brrrrr'


Depends on the task. NLP? Audio? Video? DL is probably best. Classification, regression etc.? Don't bother (in my experience). You can still utilize deep learning-esque tools like embeddings (which aren't deep learning at all) and put an SVM on top.
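
Roughly what "embeddings with an SVM on top" looks like, assuming the embeddings are already computed by some pretrained encoder (random placeholders here), with scikit-learn:

  import numpy as np
  from sklearn.model_selection import train_test_split
  from sklearn.svm import LinearSVC

  # Placeholder 384-dimensional embeddings and binary labels; in a real setup
  # these would come from a pretrained text or image encoder.
  X = np.random.randn(1000, 384)
  y = np.random.randint(0, 2, size=1000)

  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
  clf = LinearSVC().fit(X_train, y_train)
  print(clf.score(X_test, y_test))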


That will almost always be the case. If basic stats could solve a problem then they still can.

All this AI development may open up new business opportunities or close existing ones - but existing businesses, assuming they survive the disruption caused by AI, would probably still be best served by focusing on getting some basic stats into their processes. If they can't do that, then a neural network isn't going to make the situation any better for them.


Lots of organizations that are a bit clueless and trying to catch up quickly, driven by hype, think they need deep learning when they actually need Bayesian learning.

Instead of looking at stuff from Murphy volume I, they should look at volume II or Gelman's books.

There are neat combinations of ideas from both fields. Aside from volume II, Pyro's documentation provides some interesting use cases.
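
For a taste of that, a minimal Pyro sketch: a toy Bayesian linear regression fitted with SVI on synthetic data (purely illustrative, not taken from the Pyro docs):

  import torch
  import pyro
  import pyro.distributions as dist
  from pyro.infer import SVI, Trace_ELBO
  from pyro.infer.autoguide import AutoNormal
  from pyro.optim import Adam

  def model(x, y=None):
      # Priors over slope, intercept and noise scale.
      w = pyro.sample("w", dist.Normal(0., 1.))
      b = pyro.sample("b", dist.Normal(0., 1.))
      sigma = pyro.sample("sigma", dist.HalfNormal(1.))
      with pyro.plate("data", x.shape[0]):
          pyro.sample("obs", dist.Normal(w * x + b, sigma), obs=y)

  # Synthetic data: y = 2x + 0.5 plus noise.
  x = torch.randn(100)
  y = 2.0 * x + 0.5 + 0.1 * torch.randn(100)

  guide = AutoNormal(model)
  svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
  for _ in range(1000):
      svi.step(x, y)

The point being that you get posterior distributions over the parameters rather than point estimates, which is often what those organizations actually want.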


If you don't have a lot of data, then statistical learning. If your data is structured and has well-defined interpretations, then statistical learning.

So if you have lots of data and no reasonable way to extract information from or process it, you go for the DL stuff.

You can get a lot of mileage out of gradient boosted trees and other forms of ensembles.


Small data is better handled by greedy methods. XGBoost is still the go-to in tabular Kaggle challenges.
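
For reference, that baseline is only a few lines with xgboost's scikit-learn API (the feature matrix and labels below are synthetic placeholders):

  import numpy as np
  from xgboost import XGBClassifier

  # Placeholder tabular data; swap in a real feature matrix and labels.
  X = np.random.randn(500, 10)
  y = np.random.randint(0, 2, size=500)

  model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
  model.fit(X, y)
  print(model.predict(X[:5]))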


lmao no. imagine trying to do text to image with anything other than deep learning. nothing else comes close.


We have artists for that. I heard they can compete with SOTA methods.


OP was about classical statistical techniques. I'm pretty sure human artists are not logistic regression?


Compete on quality, certainly not on price or speed.


Of course it does. King's paper “probabilistically statistics” and his method of folding squares are the motivation behind this original advice.


Link?


Three pillars of deep learning: training dataset, computing power, and machine learning models


All of this "throwing GPUs at problems" approach has brought us an online casino, domination over transparent games (chess, go) and trivia spitting chatbots (Watson, ChatGPT) and not much more.

It's brisk business for hardware makers, operators and engineers in general. Society keeps getting promised freedom, amazing new tech. And people keep using datacenters-as-a-service to drain speculative VC funds and government subsidies (depending on which side of the Atlantic they are located).

The core promise of silver bullet tech that can defeat real world limitations is such an obvious ploy. Well guess what they are just hiding the entropy exhaust somehow. And we keep falling for it.


All this science nonsense just got us these mechanical horses that were unnecessary anyway, and bigger explosions; who needs all that shit anyway. We keep believing this all leads to something; I'd say drop it all and go back to the woods.

Man, I don't even know where to begin to counter all your unreasonable framing. Watson and ChatGPT in the same list, just dismissing chess and go as "transparent" (whatever that means) games... just, wow.


> All this science

"Throwing experimentation and modelling at problems" is really not on the same plane as "throwing GPUs at problems". Also because "trying" is part of the preliminary phase.


The dismissive (and ignorant) part of this is the assumption that throwing GPUs at problems is all they are doing.


Please, spare us this hard sell on complicated boondoggles that will save us from the complexities of life. The concept of AI is as old as computers. Maybe older. People have been claiming we are just some form of AI since writing began. The turn of phrase may have been cast in anger, and it clearly displeased people since it got downvoted to oblivion.

You can call it dismissive, but calling it ignorant is uncalled for. Producing the first half-decent chatbot is not a good outcome given the time and effort put into this.


I think the poster was trying to denounce the hype as having a detrimental effect on the AI industry's search for actual solutions. That may come from confusion between the phases and outcomes of "Research" and "Development", where the "exciting" results of the "R" have created a fever around the "D" - also because funding remains critical.


Maybe for general consumers. But deep learning is really a fundamental advance that has impacted science in important ways, e.g. AlphaFold. In general, deep learning learns function approximators that scale smoothly with data and resources, a fundamental advance from prior ML methods like SVMs, with extraordinarily general use.


A perspective illusion comes from the fact that the area is still very much in development. See for example the matter of "temporal quality degradation" - a commentary on it appeared yesterday ( https://news.ycombinator.com/item?id=35566145 )


Wow, what an incredibly dismissive and jaded take.


And you feel like you've added something worthwhile to this conversation?


ChatGPT is the silver bullet, you can't convince me otherwise.

If the goal of deep learning is to achieve artificial intelligence, we have almost achieved it at this stage.


It boggles my mind how people can think this when Alexnet came onto the scene over a decade ago and not much has fundamentally improved since then. OpenAI just said they’re not planning on training GPT-5 for a while because the scaling trick has hit diminishing returns. This isn’t “AGI is 2 years away”; it’s “we’ve milked this advancement as best we can, and bigger advancements are going to need a fundamental paradigm shift”, which usually doesn’t happen overnight. Not to say GPT-4 or vision models aren’t amazingly useful, but they’re no silver bullet. I’ve been using GPT-3 for a while now and it makes a lot of fundamentally stupid mistakes.


"Not much" has fundamentally improved since Alexnet???

Honestly, half the comments in this thread boggle my mind. What is even the word for someone who is shown something amazing that they don't understand and then dismisses it on completely superficial grounds?

Like if I go back in time and show someone a computer and they think it's just a typewriter, and who ever needed one of those anyway? It's like ignorance, but in the most willful, dismissive way possible. Gross.


And the entire point of this article is to show what has changed since AlexNet (spoiler: a lot!), let alone GPT-3/DALL-E/Midjourney/Whisper/ChatGPT/Imagen/Imagen Video, which aren't even in this article!

I wonder if what we're seeing is anxiety - that AIs are going to take my job, or create huge problems with spam/deepfakes - manifesting in frankly ignorant, jaded comments that completely ignore the huge positive potential.


People have been rewarded in the past for being skeptical, and have learned this trick to the detriment of their own intellect. It is a very easy trap to confuse skepticism with expertise on the internet, and HN is one of the worst about this - most of the comments that feign skepticism barely even understand the topic they are commenting on, and are simply generating skeptical words because that's what has rewarded them in the past. There is also unsubstantiated hype, which is not helpful either.

Occasionally someone has actually read a few hundred fundamental papers on ML and can give an actually educated response, but it is quite rare. Typically they don't feign skepticism, but rather notice there are noteworthy improvements provided by meta-RL, RLHF, etc.


I work in vision. Go look up the ImageNet leaderboard. Look at the results of Alexnet vs the top result today. The trend is a log line. The top contending architectures still include CNNs trained with backprop; they’ve just had a decade of tricks applied to eke out some improvements. The transformer-based vision models aren’t much better.

Talk to any machine learning expert and they’ll tell you the math and fundamentals haven’t really changed since the 90s; we’ve just gotten better at scaling. Transformers came onto the scene half a decade ago and we could scale them much better than CNNs, but like the CNNs of today, we’ve hit the diminishing-returns limit.

Maybe look at actual data instead of being dismissive of different opinions.


Ok, I checked. Alexnet 63%. Top rated 91%. That's a big difference.

And what did you expect other than a log curve? The maximum is obviously 100%.


So interestingly, you can actually have linear or exponential curves on your way from 0 to 100. And you completely ignored how the basic building blocks and algorithms are more or less the same. I think I'm done discussing with non-experts.


I find it delightfully ironic that you just accused me of not having an intelligent take on AI, just trying to fake one.

That out of the way: the very term AI has been applied to automatic computation since its inception. And the current hype drive is nothing but marketing for software engineering done the hard way. You get one good chatbot by turning 8 years of internet into 1 TB of parameters in memory; it costs nearly a million a day to run and... it can regurgitate semi-coherent prose. Wow. Talk about hype.

I'm not skeptical of AI to sound smart. Hell, what do I get from some random anonymous account I use to read some blog aggregator? I am deeply skeptical of people hard-selling some shiny new compute silver bullet that will do away with all the nasty complexity. Because it won't. We were warned about that nearly 60 years ago.

Since you don't know squat about my background, maybe you're the one slinging snark around here.


The straight denialism around what transformer models can do has been very disappointing, though maybe not surprising. Acknowledging it means acknowledging that your world is about to change in massive and unpredictable ways, and that the future you had planned and expected is not going to happen.


Vision models are still CNNs trained with backprop. Now they’ve begun to incorporate transformers into vision tasks, but the results aren’t 10x better. Please explain to me how vision models today are massively different from Alexnet and not just a bunch of slight optimizations and tricks to eke out some marginal accuracy improvements.


If you're talking about vision, fine. But you were replying to a comment about ChatGPT and then brought up AGI. No offence, but although some vision tasks are AI-hard, in general vision has little to do with AGI. That's why I quit the field. Transformers in vision may not be very interesting, but they certainly are a breakthrough in language.


What about Transformers?

And GPT-3 is a fundamental improvement over earlier LLMs. The same goes for speech recognition, which went from highly parametric to end-to-end in the last decade. Many other things too, like Stable Diffusion, TTS, etc.

It's like GitHub's UI. If you use it daily, you maybe notice a little change here or there, but it doesn't disrupt you much. But if you check GitHub screenshots from 10 years ago, they look so different.

Take, for example, YouTube's auto-transcription feature. I remember when it launched; it wasn't that bad but got many things wrong. Now it's really good - still making mistakes, but much better.


Define silver bullet. Because it doesn't really bring us closer to stuff like self driving cars.


ChatGPT is AN answer to how to create AI.

Simple as that.

It works as a natural language parser that works the same way as a human.

It is the holy grail the NLP community has been trying to find: one mechanism that can be commanded in plain English.

If it can program, then it can drive cars. Driving is inherently a lower-level task compared to programming; hundreds of millions of Americans can drive, far fewer can program.

But ChatGPT is just blind at the moment. The next step is to make it see, walk, and use its fingers. That won't be too far away.


> works the same way as a human

Most certainly not, until you show that it has reflection - critical thinking - which it has many times been shown not to have.

> Simple as that

As already expressed, you cannot just dump personal positions "porque [t]e sale de l'alma" ("because it comes out of your soul", Borges).


"Same way" can mean "using the same principles", in addition to "identical in all ways"

For instance, shirts and parkas work the same way (trapping warm air near the skin), but have radically different capabilities.


> using the same principles

That is exactly what should be proven, if one dreamt of proposing that.

Especially because the "guess the next token" mechanism risks becoming a gnoseological functional paradigm after the past few months, as if it were a theory of mind. Which is more than an issue, because automation of thought (acritical repetition) is directly satanic.


My point was that the fact human brains are much more capable does not falsify the idea that they work the same way as these transformer-style LLMs.

It might be we're special or substantively different. Or it might not. The fact we have additional capabilities just means we are not exactly the same as, e.g., GPT-4 in all ways.


Reflexion*


Thanks! That brings us to some checking. But I read:

> Spelling with -ct- recorded from late 14c., established 18c., by influence of the verb. OED considers the version with -x- to be "the etymological spelling", but Fowler (1926 - «A clear differentiation being out of the question, and the variation of form being without essential significance...») points out that -ct- is usual in the general senses and even technical ones

Actually, I would say that 'reflection' is the more proper. 'Reflexion' reflects the French spelling, so it is "«etymological»" in the sense of "post-Hastings 1066" (i.e. tracing the history), but in Latin it is 'flectere' (hence 're-flectere'). "Flectere" is already an action, so "reflection" is proper for the faculty.


Sorry, this was meant to be a semi-sarcastic reference to the paper from Northeastern. They use "Reflexion" in the title. It's a paper about how these systems improve their responses by simply allowing their outputs to be 'reflected upon' by the system. For instance, they are able to improve from something like ~60% to ~87% simply by allowing 'reflection' upon what was just output.


There is an ironic possibility that the poster confused the expression "silver bullet" with something other than "fake solution placing hopes on shiny idols".

(Probably - sorry, just analytically - by negating the expression "it is no silver bullet". A process that does not work, since it is the actual silver bullet that does nothing special.)

Edit: although, reading more closely, the later poster reprised the expression, probably ironically, from the original poster. Sorry, I missed that in the original.


I don't quite understand your comment, but I think you have your definition of "silver bullet" backwards. It means a solution that works incredibly, almost magically, well.


No, it's just a bit complex, but it depends on use - and today it is for a number of reasons more common to see "silver bullet" for "ineffective but seducing". It is a (non-)"solution" that /is believed/ to «work incredibly, almost magically, well». It works in a fable and not outside.

One of the first occurrences of the idea was the response of the Oracle of Delphi to Philip II of Macedon (father of Alexander), "With silver arms you may conquer the world". In this case the idea made sense on all planes: Philip understood that he could use bribery to achieve victory, and it worked.

But then the idea went on in folklore as a miraculous solution: you "could" use them against spells, then witches, then werewolves, vampires etc. Many threads conflate in the whole idea.

What happened next was the capillary spread of the scientific mindset, which placed silver and the said "evils it cures" somewhere else, farther from "dreamlike" perspectives.

So, today something is often called a "silver bullet" when it is presented as a "miraculous solution" - but in reality is just a piece of shiny, alluring metal that works well as an excuse to seduce a public in search of hopes, however ill-placed. Populists, e.g., are said to have "silver bullets" ready against any sociopolitical illness - they have "solutions" presented as miracle cures, but those just do not stand up scientifically or effectively, though they may be effective in prolonging a career as an elected representative.


> No, it's just a bit complex, but it depends on use - and today it is for a number of reasons more common to see "silver bullet" for "ineffective but seducing". It is a (non-)"solution" that /is believed/ to «work incredibly, almost magically, well». It works in a fable and not outside.

Yes, you're re-asserting that, but I still disagree, at least in my dialect of General American English. Maybe you're in the UK and it's different there?

In my experience, you almost always hear it as part of the negative: "X is not a silver bullet". That phrase has your meaning of "ineffective but seducing". The implication then is that the silver bullet is effective. Magically so. Everyone associates it with the only thing that can kill a werewolf.

The thing that began this whole inane conversation was the mildly interesting use of "silver bullet" in the positive:

> ChatGPT is the silver bullet, you can't convince me otherwise.

The poster is taking the "no silver bullet" idiom and flipping it on its head. The "standard" usage would say "ChatGPT is no silver bullet", meaning it's not a magical solution to all problems. By negating it (and using the idiom unusually in the positive), they're saying that it is magically effective and can work on all problems.


> inane

Well, if you find it uninteresting, do not keep it going. I just showed you a bit of the history of the expression, including the warning - do not assume that (especially globally) it is used your way. (Yes, if you are in Tennessee it may more frequently go one way, in Singapore maybe the opposite...) You write «Everyone associates it with the only thing that can kill a werewolf» - and I am telling you that in fact that covers the slice of the population that also holds that silver bullets are in fact completely ineffective, in spite of the story, although the story has consideration among people who mentally live in a world where werewolves roam.

> The poster is

It is clear what the user did: in fact, posts ago, I noticed that it is «ironic». The poster said "it is a silver bullet", and that is peculiar because some would reply to that "do not forget your magic cloak to approach more confidently", or similar.


Look, there's nothing unusual or ironic about the use of "silver bullet" in either the top level comment or the first response. That's just how the idiom is used.

I would be curious if you could offer an example of "silver bullet" used as you're saying some people do. All the examples I can find are the way I've been saying. For example, the top three Google News hits for "silver bullet":

* Be excited about EPR, even if it’s not a silver bullet [0]

* Immigration: Silver Bullet for Nursing Shortage? [1]

* Offsets Are Not A Silver Bullet [2]

In all of these, "silver bullet" could be replaced with "totally effective solution". That's how OP used it as well.

You're saying that a "silver bullet" could mean "ineffective but seducing". If that were the case, the headlines would instead be:

* Be excited about EPR, because it's not a silver bullet.

* Offsets Are A Silver Bullet.

I've never seen it used that way. And that's not how the OP used it or the first commenter interpreted it.

It's not impossible that it's used differently somewhere; if so, I'd be curious to know. I know, for example, that "the point is moot" and "let's table the discussion" have opposite meanings in British and American English.

[0] https://www.greenbiz.com/article/be-excited-about-epr-even-i...

[1] https://www.globest.com/2023/04/11/immigration-silver-bullet...

[2] https://eugeneweekly.com/2023/04/06/offsets-are-not-a-silver...


> ironic

No, I meant that it comes to be ironic (not that it was used ironically by the poster).

> It's not impossible it's used differently somewhere, if so I'd be curious to know

I would gladly provide: in fact, I did check on my RSS DB before my last reply - but I had rotated it only days before, so it contained a fraction of the usual data. I only found an interesting occurrence on the National Interest, but it was ambiguous, and an ironic use on the Guardian, but again it was not exactly in the way that I meant.

Let me assure you: there are authors that use the expression directly, without the negative, e.g. "Populists always have silver bullets to propose". You can see that the expression makes sense: silver bullets work in tales, but this is reality. But the foremost example I have in mind does not write in English. I will check: if I can find a few references, I will post.


Nope, you made up some history of the origin of the phrase.

Practically everyone uses "silver bullet" to mean something that works incredibly well. If you have examples of your alternative usage, feel free to show us some. You're the only example so far.


> you made up some history of the origin of the phrase

Providing information that you can directly check - you can find sources around with a search engine?!

Unbelievable.

Post scriptum: and there we are, with a sniper. Sniper: this poster was accused of "making things up", very gratuitously (even Wikipedia contains references). This is very offensive. And if you have anything to say, say it directly instead of just being uselessly irritating.


It doesn’t? Given all the data about the surroundings, isn’t “driving” just completing the next “token” (action)?


> you can't convince me otherwise

So? Are you trying to make this a poll?

> the goal of deep learning is to achieve artificial intelligence, we have almost achieved

DL has already """achieved""" AI, which was there before the perceptron. In fact, "Artificial Intelligence" is there to solve problems without direct human work, and it has worked pretty well in the past decades in many different domains.

This is to say that our goal has been to have automated problem solvers - producing general intelligence (and, before that, general problem solvers) is kind of a different goal.


> It's brisk business for hardware makers, operators and engineers in general.

Gold Rush, sell shovels... you get it.


> has brought us

Also heavy privacy issues ("Throw everything we know about everyone in the computation - and more of it please").



