As someone who has worked in the field of AI/ML for quite a while now, the problem with current AGI predictions is that ML hasn't done anything fundamentally new since the 80s (or arguably earlier).
At the end of the day, all ML uses gradient descent to learn some sort of non-linear projection of the data onto a latent space, then does some relatively simple math in that latent space to perform some task.
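For what it's worth, here's a deliberately tiny sketch of that framing (the projection is random rather than learned, purely to keep it short; normally the projection itself is what gradient descent fits):

```
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(x) + 0.1 * rng.normal(size=(500, 1))   # a target no linear model fits well

# Non-linear projection of the data onto a 50-dimensional latent space.
W, b = rng.normal(size=(1, 50)), rng.uniform(-3, 3, size=50)
H = np.tanh(x @ W + b)

# "Relatively simple math" in the latent space: an ordinary least-squares
# readout (which is also what gradient descent on the readout converges to).
w, *_ = np.linalg.lstsq(H, y, rcond=None)
print(float(np.mean((H @ w - y) ** 2)))   # far lower than any linear function of x gets
```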
Personally I think the limits of this technique are far beyond what I would have thought 10 years ago.
However, we are only near AGI if this is in fact how intelligence works (or can work), and I don't believe we've seen any evidence of this. And there are some very big assumptions baked into this approach.
Essentially all we've done is push the basic model proposed by linear regression to its absolute limits, but, as impressive as the results are, I'm not entirely convinced this will get us over the line to AGI.
> Essentially all we've done is push the basic model proposed by linear regression to its absolute limits
No, we haven't pushed linear regression to its limits. If it were only linear regression, it wouldn't work. Neural networks need a non-linearity to model complex things.
The beauty is that, given an infinite series of nonlinearities, one can model any mathematical function. In practice we found it takes much less than "infinite": a handful will already get you a long way.
A stack of this basic building block, as you describe it, is really all it takes - we know that mathematically already. The interesting question is: how complex are the functions we need to model? So if we create a neural network of a certain size, is that size large enough to model the problem space?
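As a toy illustration of how little it takes (hand-placed sigmoids rather than anything learned, so take it as a sketch of the universal-approximation idea, not of training):

```
import numpy as np

# One steep sigmoid is already a near-perfect step function, and a dozen of
# them (in hand-placed "bump" pairs) give a passable piecewise fit of sin(x).
sigmoid = lambda z: 1 / (1 + np.exp(-z))

x = np.linspace(-3, 3, 601)
step = (x > 0).astype(float)
print(float(np.abs(sigmoid(50 * x) - step).mean()))   # tiny mean error

centers = np.linspace(-3, 3, 12)
width = centers[1] - centers[0]
bumps = (sigmoid(50 * (x[:, None] - (centers - width / 2)))
         - sigmoid(50 * (x[:, None] - (centers + width / 2))))
approx = bumps @ np.sin(centers)   # each bump's weight is the target height there
print(float(np.abs(approx - np.sin(x)).max()))   # crude fit; shrinks as units are added
```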
> However, we are only near AGI if this is in fact how intelligence works (or can work), and I don't believe we've seen any evidence of this. And there are some very big assumptions baked into this approach.
I think ChatGPT is good evidence of this. What evidence do we have that this isn't how intelligence works?
I like what you are saying but not the last line. Angels exist - what evidence do we have that they don't?
ChatGPT points to us possibly going in the right direction. It is so good that people need to write long articles about why it is not as good as a human. Contrast to older efforts, which were pretty clearly not so great. I find this pretty compelling, and GOTCHA examples that show that ChatGPT isn't as good as humans in everything miss the point.
Birds fly and drones don't look like birds - but they fly. If the goal is flying, it's okay that we achieve it through something that doesn't exactly mimic what inspired it. Do we need a "full human" for all we do? Our billions of machines, many performing work previously done by humans, show that we don't.
If we can largely replicate intelligence, it's not super important whether "this is how human intelligence works".
You inverted the argument accidentally with your angel analogy. I think something like "we have labradors - what evidence do we have that they are not angels?" matches the thrust of their point much more closely, while also highlighting what you find lacking about the argument.
Edit: the important bit is that the subject already has some qualities in common with the proposed classification, which is a significantly different proposition from a purely hypothetical orbital teapot.
A transformative aspect of human verbal intelligence involves dealing with concepts and their combinations and intersections. ChatGPT does this very well. I think we can agree that ChatGPT provides intelligent completions for an astonishing range of human concepts?
It seems appropriate to describe what ChatGPT understands and what it doesn't understand through evals or assessments (in the same way that we can use assessments to determine what a student understands or doesn't). So if we have to call it "computational understanding", fine - but clearly ChatGPT understands an incredible range of concepts and their combinations.
It's terrible at math and logic, but ChatGPT is amazing at concepts - that's why it is so powerful.
It doesn’t work programmatically—that’s why it fails at logic. But it can reason inductively very very well. Do you have an example besides logic/math where it doesn’t understand simple concepts?
> Do you have an example besides logic/math where it doesn’t understand simple concepts?
All the time. It often fails to understand simple concepts. It doesn't really seem to understand anything.
For example, try to get it to write some code for a program in a moderately obscure programming language. It's terrible: it will confidently produce stuff, but make errors all over the place.
It's unable to understand that it doesn't know the language, and it doesn't know how to ask the right questions to improve. It doesn't have a good model of what it's trying to do, or what you're trying to do. If you point out problems it'll happily try again and repeat the same errors over and over again.
What it does is intuit an answer based on the data it's already seen. It's amazingly good at identifying, matching, and combining abstractions that it's already been trained on. This is often good enough for simple tasks, because it has been trained on so much of the world's output that it can frequently map a request to learned concepts, but it's basically a glorified Markov model when it comes to genuinely new or obscure stuff.
It's a big step forward, but I think the current approach has a ceiling.
> try to get it to write some code for a program in a moderately obscure programming language. It's terrible: it will confidently produce stuff, but make errors all over the place.
Is that really any different than asking me to attempt to program in a moderately obscure programming language without a runtime to test my code on? I wouldn't be able to figure out what I don't know without a feedback loop incorporating data.
>If you point out problems it'll happily try again and repeat the same errors over and over again.
And quite often if you incorporate the correct documentation, it will stop repeating the errors and give a correct answer.
It's not a continuous learning model either. It has small token windows where it begins forgetting things. So yea, it has limits far below most humans, but far beyond any we've seen in the past.
How about this? Flip it into training mode, feed it the language manual for an obscure language, then ask it to write a program in that language? That's a test that many of us here have passed...
I think you missed my point. It's understandable that it doesn't know how to program in a moderately obscure language. But the model doesn't understand that it doesn't. The specific concepts it fails to grasp are what it is, what its limitations are, and what it's being asked to do.
It doesn't seem to have any "meta" understanding. It's subconscious thought only.
If I asked a human to program in a language they didn't understand, they'd say they couldn't, or they'd ask for further instructions, or some reference to the documentation, or they'd suggest asking someone else to do it, or they'd eventually figure out how to write in the language by experimenting on small programs and gradually writing more complex ones.
GPT4 and friends "just" produce an output that seems like it could plausibly answer the request. If it gets it wrong then it just has another go using the same generative technique as before, with whatever extra direction the human decides to give it. It doesn't think about the problem.
("just" doing a lot of work in the above sentence: what it does is seriously impressive! But it still seems to be well behind humans in capability.)
I agree it has very minimal metacognition. That's partially addressed through prompt chaining, i.e. having it reflect critically on its own reasoning. But I agree that it lacks self-awareness.
I think artifacts can easily reflect the understanding of the designer (Socrates claims an etymology of Technology from Echo-Nous [1])
But for an artifact to understand — this is entirely dependent on how you operationalize and measure it. Same as with people—we don’t expect people to understand things unless we assess them.
And, obviously we need to assess the understanding of machines. It is vitally important to have an assessment of how well it performs on different evals of understanding in different domains.
But I have a really interesting supposition about AI understanding that involves its ability to access the Platonic world of mathematical forms.
I recently read a popular 2016 article on the philosophy of scientific progress. The author defines scientific progress as increased understanding and calls it the "noetic account." [2] That's a bit of theoretical support for the idea that human understanding consists of our ability to conceptualize the world in terms of the Platonic forms.
Plato ftw!
[1] see his dialogue Cratylus
[2] Dellsén, F. (2016). Scientific progress: Knowledge versus understanding. Studies in History and Philosophy of Science Part A, 56, 72-83.
No, it's amazing at words. Humans use words to encode concepts, but ChatGPT doesn't get the concepts at all - just words and their relationships.
To the extent that humans have encoded the concepts into words, and that text is in the training set, to that degree ChatGPT can work with the words in a way that is at least somewhat true to the concepts encoded in them. But it doesn't actually understand any of the concepts - just words and their relationships.
I disagree. If you play around with ChatGPT4 enough you can see it understands the concept. That is, it's able to model it and draw inferences from the model in a way that's impossible through just words and their relationships. The "Sparks of AGI" paper gives some good examples, for instance where it's asked to balance a book, 9 eggs, a laptop, a bottle and a nail. I recently asked it to design a plan for a bathroom. It wrote an SVG mockup and got most of the details right. For instance, it understood that the sink and tub required both cold and hot water lines but the toilet only requires a cold water line. These things are not possible with just words; you can see it's able to create an underlying world model.
I don't think this is the case at all. Language is how we encode and communicate ideas/concepts/practicalities; with sufficient data, the links are extractable just from the text.
I don't see how the two examples I gave are possible with just text. They require understanding spatial relationships between objects and their physical properties.
Our own understanding of spatial reasoning is tied in many respects to our hand-eye coordination, muscle memory and other senses: we learn to conceptualize "balance" by observing and feeling falling and when things are "about to" fall.
What GPT does is not "text" - although it centers that as the interface - but "symbols". The billions of parameters express different syntaxes and how they relate to each other. That's why GPT can translate languages and explain things using different words or as different personas.
So when we ask it to solve a spatial problem, we aren't getting a result based on muscle memory and visual estimation like "oh, it's about 1/3rd of the way down the number line". Instead, GPT has devised some internal syntax that frames a spatial problem in symbolic terms. It doesn't use words as we know them to achieve the solution, but has grasped some deeper underlying symbolic pattern in how we talk about a subject like a physics problem.
And this often works! But it also accounts for why its mathematical reasoning is limited in seemingly elementary ways and it quickly deviates into an illogical solution, because it is drawing on an alien means of "intuiting" answers.
We can definitely call it intelligent in some ways, but not in the same ways we are.
This is the debate, isn’t it? I think if we create tests for understanding and deliver them to people, we will find variations in what people understand. I think we will find the same for chatGPT.
But I suspect your notion of understanding is not measurable, is it? For you, chatGPT lacks something essential such that it is incapable of understanding, no matter the test. Or do you have a way to measure this without appeal to consciousness or essentialism?
Well, consciousness is part of the question, isn't it? We know that we are conscious (even if we can't precisely define what that means). Is ChatGPT conscious? I'm pretty sure the answer is no, but how do you prove it?
Does understanding require consciousness? Maybe yes, for the kind of understanding I'm thinking of, but I'm not certain of that.
How do you measure understanding? You step a bit outside the training set, and see if whoever (or whatever) being tested can apply what it has learned in that somewhat novel situation. That's hard when ChatGPT has been trained on the entire internet. But to the degree we can test it, ChatGPT often falls down horribly. (It even falls down on things that should be within its training set.) So we conclude that it doesn't actually understand.
Another part of all this is that ChatGPT was trained on the entire internet and still does a mediocre job even when it's doing well. That's an amazing amount of resources required to arrive at it being able to do what it does, writing code or whatever, when a person typically requires a couple of slices of pizza and general access to the internet, which that person has
Read almost none of.
How come humans are so efficient when these computers are using enormous amounts of energy? When will we have a computer that only requires a few slices of pizza worth of energy to get to the next step?
What exactly do you mean by "besides math/logic"? Because logic underpins everything we think.
For example, it cannot identify musical chords because, despite (I presume) ample training material including explanations of how exactly this works, it cannot reasonably represent this as an abstract rigorous rule, as humans do. So I ask what C E G is and it tells me C major correctly, as it presumably appears many times throughout the training set, yet I ask F Ab Db and it does not tell me Db major, because it did not understand the rules at all.
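For concreteness, the "abstract rigorous rule" being described is just interval arithmetic. A rough sketch of my own (root-position triads only; handling an inversion like F Ab Db properly would mean also trying each note as the root):

```
# Name a major triad purely from interval structure, independent of the root.
PITCHES = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
           "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10,
           "Bb": 10, "B": 11}

def is_major_triad(notes):
    """True if the notes, in the given order, stack a major third (4 semitones)
    and then a minor third (3 semitones) on the first note."""
    root, third, fifth = (PITCHES[n] for n in notes)
    return ((third - root) % 12, (fifth - third) % 12) == (4, 3)

print(is_major_triad(["C", "E", "G"]))    # True  -> C major
print(is_major_triad(["F", "Ab", "Db"]))  # False -> not major with F as the root
print(is_major_triad(["Db", "F", "Ab"]))  # True  -> Db major in root position
```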
I hate to break it to you, but humans aren't thinking logically, or exclusively logically. In fact I would say that humans are not using logic most of the time, and we go by intuition most of our lives (intuition being shorthand for experience, pattern matching and extrapolation). There is a reason why we teach formal logic in certain schools...
I dunno man, I asked it about that chord and it told me it was an F diminished 7th chord. I asked it what a D flat major chord was and it told me. I then asked it what the relationship between the two was. It didn't catch it immediately but when I told it to think about inversions it got it. That's decent for a music student.
It even told me about the role of each in a chord progression and how even though they share the same notes they resolve differently
Humans clearly don't think logically anyhow. That's why we need things like an abacus to help us concretely store things; in our heads everything is relative in importance to other things in the moment.
> I asked it about that chord and it told me it was an F diminished 7th chord. I asked it what a D flat major chord was and it told me. I then asked it what the relationship between the two was. It didn't catch it immediately but when I told it to think about inversions it got it.
So it gave you a wrong answer and when you spelled out the correct answer it said "OK" x) Was that it? Or am I missing something.
It gave me a correct answer (but not the one GP expected), and then I asked it about another chord GP wanted it to say (D flat major) which can be stylistically replaced with this one (by "inverting" the top note). I asked it what the relationship between the two was and it told me correctly about how they're used in songwriting (gave information about the emotions they invoke and how they resolve to other chords), but didn't tell me the (frankly, trivia) fact that they share notes if you happen to invert it in a particular way.
In music theory a set of the same notes can be one of several chords; the correct answer as to which one it is depends on the "key" of the song and context, which wasn't provided, so the AI decided to define the root note of the chord as the bottom one, which is a good and pretty standard assumption. In this case major chords are much more common than weird diminished 7th chords, but I think you'll agree the approach to answering the question makes sense.
It's kind of like asking the AI about two equivalent mathematical functions expressed in different notation, and it saying a bunch of correct facts about the functions, like their derivatives, x-intercepts and how they can be used, but needing a prod to explicitly say that they are interchangeable. It's the kind of trivial, barely-qualifies-as-an "oversight" that I would expect actual human people who fully understand the material to make.
A diminished 7th above F is B, not B flat. Also a 7th chord is understood to be a triad plus the 7th (and therefore the diminished 5th above the F is also missing). Unless I'm missing something it did indeed produce a wrong answer.
Sure. Let’s take quantum theory. There are lots of concepts that are based in math but can be reasoned about non-mathematically.
The reason that ChatGPT can write quantum computer programs in any domain (despite the lack of existing programs!) is that it can deal with the concepts of quantum computing and the concepts in a domain (e.g., predicting housing prices) and align them.
Very little of human reasoning is based on logic and math.
>There are lots of concepts that are based in math but can be reasoned about non-mathematically.
Can you be more specific? I literally don't know what you mean. What can you say about quantum mechanics that is not mathematical or logical in nature? Barring metaphysical issues of interpretation, which I assume is not what you mean.
Rarely do I think responding with chatGPT is appropriate, but this is one of those times.
* Of course! When discussing concepts from quantum mechanics without getting into the mathematical details, we can focus on the general ideas and principles that underlie the theory. Here are some key concepts in quantum mechanics that can be explained in a non-mathematical way:
1. Superposition: In quantum mechanics, particles can exist in multiple states simultaneously, until they are measured. This is called superposition. It's like a coin spinning in the air, being both heads and tails at the same time, until it lands and shows one face.
2. Wave-particle duality: Particles like electrons, photons, and others exhibit both wave-like and particle-like properties. This means they can sometimes behave as particles, and at other times, as waves. This dual nature has been experimentally demonstrated through phenomena like the double-slit experiment.
3. Quantum entanglement: When two particles become entangled, their properties become correlated, regardless of the distance between them. If you measure one of the entangled particles, you'll immediately know the state of the other, even if they are light-years apart. This phenomenon is often referred to as "spooky action at a distance."
4. Heisenberg's uncertainty principle: This principle states that we cannot simultaneously know the exact position and momentum of a particle. The more precisely we know one of these properties, the less precisely we can know the other. This inherent uncertainty is a fundamental aspect of quantum mechanics.
5. Quantum tunneling: In quantum mechanics, particles can "tunnel" through barriers that would be insurmountable in classical physics. This is because the particle's wave function, which describes its probable location, can extend beyond the barrier, allowing the particle to appear on the other side.
6. Quantum superposition of states: Quantum systems can exist in multiple states at once, and when you measure a property of the system, it "collapses" into one of the possible states. This is a fundamental difference between quantum and classical mechanics, where systems have definite properties even before measurement.
These concepts can be discussed and reasoned about without delving into the complex mathematical equations that govern quantum mechanics. While a mathematical understanding is necessary for rigorous study and application of the theory, non-mathematical discussions can still provide valuable insights into the strange and fascinating world of quantum mechanics.*
Mate, not only is every single one of those concepts a mathematical one, the explanations it gives are misleading or incorrect! E.g. the typical pop-sci misleading lines of "it's a wave and a particle at the same time" (it isn't both, it's neither) or "it's in two states at the same time, like a coin which is heads and tails" (no it's not, the point is precisely that it doesn't behave according to classical probabilities).
Claiming these concepts are not mathematical is like saying addition is not mathematics because you can explain it with words or diagrams to a child!
Ask a person on the street and they will either say "I don't know", "I don't have time for this", or on rare occasion you'll find some nerd who starts juggling numbers out loud, eventually reaching some rational terminus (at worst with an error in recall, but not in principle, along the way).
Ah so it is! I guess my point was that even this simple function is enough; you don't need any 'magic' beyond that. In particular, this continuous function can be used (infinitely) to represent discontinuous functions - the step function being the usual example. That is the more interesting and relevant mathematical fact, I think.
Well, yes. The 'magic' is the nonlinearity. It's because the composition of linear (actually also affine) functions is still just linear (affine). You don't get any additional power by combining many of them - which is also the reason why linear functions are so well understood and easy to work with.
You sprinkle in a tiny nonlinearity (e.g. x^2 instead of x) and suddenly you can get infinite complexity by weighted composition - which is also the reason why we're so helpless with nonlinear functions and reach for linear approximations immediately (cf. gradient descent).
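A quick numerical illustration of both halves of that point (a toy example of my own):

```
import numpy as np

# Composing affine maps never buys anything: the composition is itself affine.
rng = np.random.default_rng(0)
A1, b1 = rng.normal(size=(3, 3)), rng.normal(size=3)
A2, b2 = rng.normal(size=(3, 3)), rng.normal(size=3)
f = lambda x: A1 @ x + b1
g = lambda x: A2 @ x + b2

x = rng.normal(size=3)
# g o f collapses to a single affine map with matrix A2 @ A1 and offset A2 @ b1 + b2.
print(np.allclose(g(f(x)), (A2 @ A1) @ x + (A2 @ b1 + b2)))   # True

# Sprinkle in a tiny pointwise nonlinearity (x**2) and the collapse stops:
# any affine h satisfies h(2x) == 2*h(x) - h(0); this composition does not.
h = lambda x: A2 @ (f(x) ** 2) + b2
print(np.allclose(h(2 * x), 2 * h(x) - h(np.zeros(3))))        # False: not affine
```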
>It's because the composition of linear (actually also affine) functions is still just linear (affine).
Except you can "compose" affine functions using Horner's scheme to get any polynomial... No need to sprinkle in tiny non-linearities. It's a buffet. Grab as much as you need.
That would suggest that 1-Hidden-Layer neural nets would work fine, since they are also universal function approximators. But no -- when people talk about "deep learning", the word "deep" refers to having lots of hidden layers.
I'm not an expert, but the motivation seems more like this:
- Linear regression and SVM sometimes work. But they apply to very few problems.
- We can fit those models using gradient descent. Alternatives to gradient descent do exist, but they become less useful as the above models get varied and generalised.
- Empirically, if we compose with some simple non-linearities, we get very good results on otherwise seemingly intractable problems like OCR. See kernel SVMs and Kriging.
- Initially, one might choose this non-linearity from a known list. And then fit the model using specialised optimisation algorithms. But gradient descent still works fine.
- To further improve results, the choice of non-linearity must itself be optimised. Call the non-linearity F. We break F into three parts: F' o L o F'', where L is linear, and F' and F'' are "simpler" non-linearities. We recursively factorise the F' and F'' in a similar way. Eventually, we get a deep feedforward neural network. We cannot use fancy algorithms to fit such a model anymore.
- Somehow, gradient descent, despite being a very generic optimisation algorithm, works much better than expected at successfully fitting the above model. We have derived Deep Learning.
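If it helps, here's a minimal sketch of what that recursive factorisation bottoms out as: a plain feedforward stack of affine maps with one fixed, simple nonlinearity between them (illustrative only, not any particular published architecture):

```
import numpy as np

# F is factored as F' o L o F'', recursively: each factor ends up being an
# affine map followed by a fixed simple nonlinearity. Stacking them is the net.
relu = lambda z: np.maximum(z, 0)

def deep_net(x, layers):
    """layers: list of (W, b); the learned part is purely the affine maps."""
    for W, b in layers[:-1]:
        x = relu(x @ W + b)       # one "factor": simple nonlinearity o affine
    W, b = layers[-1]
    return x @ W + b              # final affine readout

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 1]
layers = [(0.1 * rng.normal(size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]
print(deep_net(rng.normal(size=(3, 4)), layers).shape)   # (3, 1)
```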
> I think ChatGPT is good evidence of this. What evidence do we have that this isn't how intelligence works?
The fact that ChatGPT doesn’t behave like an intelligence. (Though it converses like one with radical deficiencies in certain areas, which highlights the narrowness of the Turing Test, which itself is a big step.)
OTOH, it gets less bad at this when you wrap it in a loop that provides recall and other capacities with carefully constructed prompting as to how (“thought” process-wise, not just interaction mechanics) to integrate those capacities, so there’s maybe a decent argument that it models an important component of general intelligence.
1) Given enough nonlinearities, one can approximate essentially any (continuous) function to an arbitrary degree of precision - the universal approximation theorem. This has been proven and is accepted by any working mathematician.
2) The human brain / intelligence can be simulated by a sufficiently complex mathematical function. It is somewhat accepted that simulating the human brain is sufficient for intelligence. Disagreements usually boil down to: it won't have a soul, you can't simulate biological / quantum processes on digital computers, or something about qualia I don't really understand.
3) How big / complex is the function that we need to get to AGI / simulate human or above intelligence? TBD, it seems like experts disagree but many working in the field have been surprised by the capabilities that GPT has and we are likely closer to AGI than anyone thought 2 years ago.
Andrej Karpathy, the former director of AI at Tesla, published an AMAZING video about how the current models are built, which I think everyone in the tech space should watch. It is intuitive, easy to follow and quite frankly the best video I have seen on the topic. Here is the link: https://www.youtube.com/watch?v=kCc8FmEb1nY&list=FL2tbfd7UpJ...
If you conclude, after watching this, that this is all intelligence is - namely cascading optimization and data structure problems chained in a row, then you and a fair number of people here won't ever find common ground. That is not to say that you're wrong; it just seems as if there is a missing component here that we haven't yet discovered.
I'm ok with believing that all it takes to make intelligence (probably not the only way) is a sufficiently large neural net with the right architecture and weights.
I think it is easy to get distracted by the specific mechanisms by which these models work, but most of the technological detail is there because we want something to happen on systems at the scale of what we can actually build. We simply can't build a human-brain-scale neural net yet. We build what we can, and besides, maybe with all this research we will figure out something significant about what intelligence actually is.
The notion "This can't be all thought is" is as old as the idea of AI. I think it informed Turing when he proposed the Imitation Game. The insight is that people would be resistant to the idea of a bunch of simple things stuck together becoming thinking until they were faced with something that behaves in a way sufficiently indistinguishable from themselves that to doubt it was thinking would be akin to doubting everyone you meet.
In the end some people won't even accept an AI that does everything a human does as actually thinking, but then again some people are actually solipsistic.
>The notion "This can't be all thought is" is as old as the idea of AI.
Older still:
It must be confessed, moreover, that perception, and that which depends on it, are inexplicable by mechanical causes, that is, by figures and motions. And, supposing that there were a mechanism so constructed as to think, feel and have perception, we might enter it as into a mill. And this granted, we should only find on visiting it, pieces which push one against another, but never anything by which to explain a perception. This must be sought, therefore, in the simple substance, and not in the composite or in the machine.
> If you conclude, after watching this, that this is all intelligence is - namely cascading optimization and data structure problems chained in a row, then you and a fair number of people here won't ever find common ground.
I watched the first video in the series, but I immediately got stuck - I don't know what a loss function is. Can you recommend a book/free course that goes over the more basic concepts?
Humans "hallucinate" in the AI sense (it's an awful word that obscures how often we do it) all the time too. You'll find people confidently making claims not supported by data, or indeed by any data of their own, all the time; when you poke around in their justifications you'll find they have none.
The key difference appears to be a combination of two factors: we appear to be more likely to have learnt which subjects we're not very knowledgeable about through extensive feedback, be it through school or conversations where we're told we're wrong, which also appears to teach us to be more cautious in general. We also appear to have a somewhat better (though far from perfect) ability to separate memory from thoughts about our knowledge. We certainly can go off on wild tangents and make stuff up about any subject, but we get reinforced from very young that there's a time and a place for making stuff up vs. drawing on memory.
Both go back simply to extensive (many years of) reinforcement telling us it has negative effects to make stuff up and/or believe things that aren't true - and yet we still do both, just usually not as blatantly as current LLMs, and often without being aware of it.
So I'd expect one missing component is to add a training step that subjects the model to batteries of tests of the limits of their knowledge and incorporating the results of that in the training.
Yes, but there's a critical difference. When humans hallucinate, it's normally because they misremember something: did you know, for instance, that our brains have to piece every memory together from scratch every time we recall something? It would be a miracle if humans did NOT hallucinate.
In other words, the human has a concept of truth or facts and simply had a memory lapse. This thing has no concept of truth at all.
> Humans "hallucinate" in the AI sense (it's an awful word that obscures how often we do it) all the time too.
Agreed. I'd like to add another point to the discussion. It seems to me as if LLMs are held to a higher standard regarding telling the truth than humans are. In my opinion the reason for this is that computers have traditionally been used to solve deterministic tasks, and that people are not used to them making wrong claims.
LLMs are also probed a lot more for the limits of their knowledge. Consider the thousands of hours of peoples time that have gone into poking at the limits of the understanding of ChatGPT alone.
Imagine subjecting a random human to the same battery of conversations and judging the truthfulness of their answers.
Now, imagine doing the same to a child too young to have had many years of reinforcement of the social consequences of not clearly distinguishing fantasy from perceived truth.
I do think a human adult would (still) be likely to be overall better at distinguishing truth from fiction when replying, but I'm not at all confident that a human child would.
I think LLMs will need more reinforcement from probing the limits of their knowledge to make it easier to rely on their responses, but I also think one of the reasons people hold LLMs to the standard they do is that they "sound" knowledgeable. If ChatGPT spoke like a 7-year-old, nobody would take issue with it making a lot of stuff up. But since ChatGPT is more eloquent than most adults, it's easy to expect it to behave like a human adult. LLMs have gaps that are confusing to us because the signs we tend to go by to judge someone's intelligence are not reliable with LLMs.
> It seems to me as if LLMs are held to a higher standard regarding telling the truth than humans are.
Paradoxically, it seems as if the people who are pushing this the hardest are the same people who flat out deny even the slightest flicker of what could be considered intelligence.
Nobody is suggesting we shouldn't improve on it. The point is that the jump to a significant improvement is not necessarily very big once you add reinforcement of the limits of the models knowledge, be that through conversations or senses (they're both just data).
The second thing is a very recent invention, which most people even now understand vaguely if at all.
If we made a ChatGPT that has some “senses” it can use to verify its perceptions, and it does as well with this as humans generally do, I’m sure we’ll get interesting results but I’m not sure we’ll be any closer to solving the problem.
Just the other day, my friend asked me if I knew what this bug she took a picture of was. I replied it looked like a house centipede, and it was indeed one. Two things:
1) My confidence was around 80%, without any sense of absolute certainty.
2) I had no idea why I knew that, or where I originally learned what a house centipede looks like.
If you have genuinely no idea why you know that and where you learned it, you should probably seek medical advice concerning maybe early onset dementia or some such.
There's a good argument to be made that you learned about potentially dangerous insects as a child, from books aimed at children. A reasonable person would make this argument. It's also likely that you retained that information because your would-be peers who aren't around, aren't around because their ancestors passed their genetic material on to humanity's descendants to a lesser extent than your extant peers' ancestors did.
You might not be able to immediately recall where and when you learned something, and that shouldn't be equated with genuinely having no idea.
Hallucinations are a feature, not a bug. GPT pre-training teaches the model to always produce an answer, even when it has little or no relevant training experience, and on average it does well at that. Part of the point of RLHF in ChatGPT was to teach the model not to hallucinate when it doesn't have good supporting experience encoded in its weights. This helps but is not perfect. However, it seems like there might be a path to much less hallucination with more RL training data.
As others pointed out, humans hallucinate all the time we just have better training for what level of hallucination is appropriate given supporting evidence and context.
> I think ChatGPT is good evidence of this. What evidence do we have that this isn't how intelligence works?
Animals have bodies and have to survive in an environment. They're not just sitting there waiting on a prompt to generate text. They do things in the real world. Language is a later invention by one particular social animal which serves our need to communicate, which is different from waiting on prompts.
That is just taking data in and generating data out, though. Language is just data, not something fundamentally different.
There are certainly some missing elements, such as ongoing training and memory, but it's not at all clear that there's any major conceptual differences. There could be, we just don't really know.
>I think ChatGPT is good evidence of this. What evidence do we have that this isn't how intelligence works?
People don't need the gigantic amount of input data that ChatGPT needs to learn. However I'm not sure what exactly "this" is you and GP are referring to, and it may be possible to improve existing ideas so that it works with less input data.
You have ingested truly astronomical amounts of data. Your body is covered with sensors of various kinds. Millions of datapoints streaming in day and night.
You are not storing it all, but you are finding patterns and correlations in it from the day you were born. This all forms a base where after a decade+ you can learn things fast from books, but that’s comparable to an LLM’s in context learning. It’s fast, but it depends on a deep base of prior knowledge.
My body may have "ingested" it in some way but filtering starts even at the sense ("sensor") level. So there is no way I have learned from it. I didn't even have to learn how to filter, it is part of my genetic makeup.
And even if I did, most of it would not be of the same category as what LLMs are learning; information about how a fabric feels or the sound of an ambulance doesn't teach me anything about programming or other things GPT can do. So when comparing the inputs, it makes no sense to count all the information that our senses are getting, to all the input for an LLM.
>This all forms a base where after a decade+ you can learn things fast from books, but that’s comparable to an LLM’s in context learning
I think I implied a strong connection between LLMs and how biological intelligence works, but I certainly didn’t intend to.
I think you discount general knowledge acquired non-verbally too easily. The sound of an ambulance and how it moves through space and how it correlates with visual information represents an astronomical amount of information that can be generalized from handsomely. All these data streams have to be connected, they correlate and intertwine in fantastically complex ways.
I think the sound of an ambulance does teach you things that eventually help you “program”. You have witnessed similar (non-)events thousands if not millions of times. Each time it was accompanied with shitloads of sensory data from all modalities, both external and internal.
The base of general patterns you start out from once you are ready for language is staggering.
Again not saying LLMs work like that, because they do not. All I mean to do is put their information requirements in perspective. We ingest a lot more than a bunch of books.
That is a highly questionable statement. For walking that is intuitively not true; most animals can walk almost immediately after birth. It would be very strange if human brains were so different that we had to learn walking from scratch. Indeed we do not; one of the most important reasons we can't walk from birth is because the brain has not grown enough yet. It grows by itself and then provides foundations needed for walking. Of course there is also a component to walking that is a learned skill that is refined.
For reading, the same applies. Our brains are equipped with many of the foundational aspects required for reading, and we only _learn_ a part of what is necessary for the skill of reading.
Unlike computer models, brains are no tabula rasa. So we don't need the same input as computer models to learn.
But I have learned to program and do complex math having read and analyzed ridiculously less source material on those subjects (and in fact in the case of programming having seen very little code as I mostly only had reference manuals at the time).
We are trained in a way that involves far more back and forth, giving corrections to specific failure modes in our thinking, though. It'll be fascinating to see to what extent interleaving "independent study" (training on large chunks of data) with interaction fed back in as training data could reduce the amount of training needed.
Before you learned how to code from a book, you had to learn how to read and write English. You also had to learn how to follow instructions, how to imbibe and compose information etc. How many books and hours of instruction did that take?
Not as much as it took GPT to process all its input.
>Let us consider the GPT-3 model with 𝑃 =175 billion parameters as an example. This model was trained on 𝑇 = 300 billion tokens. On 𝑛 = 1024 A100 GPUs using batch-size 1536, we achieve 𝑋 = 140 teraFLOP/s per GPU. As a result, the time required to train this model is 34 days.
I'm not sure expressing brain capacity in FLOPs makes much sense, but I'm sure if it can be expressed in FLOPs, the amount of FLOPs going to learning for a normal human is less than that.
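For what it's worth, the quoted 34 days roughly checks out with the usual back-of-the-envelope estimate; I'm assuming the common ~8*P*T training-FLOPs rule of thumb (forward, backward and activation recomputation), which isn't stated in the quote itself:

```
P = 175e9                     # parameters
T = 300e9                     # training tokens
flops = 8 * P * T             # ~4.2e23 FLOPs (assumed 8*P*T rule of thumb)
throughput = 1024 * 140e12    # 1024 GPUs at 140 teraFLOP/s each
print(flops / throughput / 86400)   # ~33.9 days
```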
This is actually a Noam Chomsky linguistics question: how much of language is innate/genetic and how much is learned.
The common perception has been that children aren't exposed to enough data to arrive at their grammatical language skills, implying there's some proto language built in. Comparative analysis of languages has looked for what aspects are truly universal but there's actually not a lot of concrete true universals to ascribe to our genetic innate language.
But if it is genetic that doesn't really mean it's fundamentally different than ChatGPT, it just took a different and drastically longer training period and then transfer learning when children learn their mother tongue.
>But if it is genetic that doesn't really mean it's fundamentally different than ChatGPT, it just took a different and drastically longer training period and then transfer learning when children learn their mother tongue.
It doesn't necessarily mean it's fundamentally different, but it doesn't mean it is comparable either. Geoff Hinton doesn't think the brain does backpropagation. Training a neural net uses backpropagation. So if Hinton is correct, then saying "it just took a longer training period" while brains don't learn like our current neural nets is glossing over a lot of things.
You have millions of years worth of training data. The model is just pruned, tuned, and specialized. More importantly, you have a lot of control over the tuning process. But it is naive to think that your life is all the training data you have and I'm not sure why this discussion is almost non-existent in threads like these.
> Personally I think the limits of this technique are far better than I would have thought 10 years ago.
Honestly, I've always been surprised at the scepticism AI researchers have had about the limits of training large neural nets with gradient descent. Where I've had my doubts is in the architecture of existing models, and I still think this is their primary limiting factor (more so than compute and network size).
I think the question that remains now is whether existing models are actually capable of producing a general intelligence that's well rounded and reliable enough to be competitive with human general intelligence. Personally, I think LLMs like GPT-4 are generally intelligent in most ways and should be considered AGI already (at least in a weak sense of the word), but they clearly have notable gaps in their intelligence such as their ability to be consistent, long-term memory, and ability to discern reality from delusion.
I don't think scaling existing models could possibly address these limitations – they seem to be emergent properties of an imperfect architecture. So I suspect we're still a few breakthroughs away from a general intelligence as well rounded as human general intelligence. That said, I suspect existing models (perhaps with a few minor tweaks) are generally intelligent enough that larger models alone are likely still able to replace the majority of human intellectual labour.
I guess what I'm touching on here is the need to be more nuanced about what we mean by "AGI" at this point. I think it's quite likely (probable even) that in a few years we'll have an AI that's generally intelligent and capable enough that it can replace a large percentage of existing knowledge work – and also generally intelligent enough to be dangerous. But I suspect despite this it will still have really clear limitations in its abilities when contrasted with human general intelligence.
For me AGI is achieved when AutoGPT is at a point when it's able to improve its own algorithm (specifically improve on the GPT architecture).
Flash attention came out a bit less than a year ago (27 May 2022), and it was a great scaling improvement, getting rid of the O(n^2) memory requirement in the attention sequence length.
I guess the next one we need will be one of the solutions that changes the FLOPS for computing attention from n^2 to n*log(n) (there was already a paper that achieved this using resizable Gaussian filters, where the size and the base convolution filter are learned separately).
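For anyone wondering where the n^2 comes from: vanilla attention materialises an n-by-n score matrix, which is exactly what flash attention avoids holding in memory. A naive sketch (not how flash attention computes it):

```
import numpy as np

def naive_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n, n) - the quadratic part
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V

n, d = 2048, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
print(out.shape, f"score matrix holds {n * n:,} floats")   # (2048, 64) 4,194,304 floats
```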
A few of these kinds of "breakthroughs"/algorithmic improvements are the only ones needed to get to AGI (a self-improving machine), in my opinion.
> I don't believe we've seen any evidence of this.
I think ChatGPT did show proof of what was, at least until very recently, considered "intelligence": understanding enough about context and concepts to provide relevant answers to very complex and open questions.
Humans are confidently wrong about all kinds of things all the time. We don't call it hallucination other than for very specific, limited subsets. We call each other names over it, call it mistakes, stupidity, or lies. In fact, we structure large parts of society around it (the presence of multiple contradictory world religions means the majority of the world's population goes through life being confidently wrong). It's just that being confidently wrong about some subjects is more socially acceptable than others.
ChatGPT clearly needs reinforcement to know its limits the way humans are constantly reinforced from childhood that making confident claims about things we don't actually know well is often going to have negative consequences, but until we've seen the result of doing that, I don't think we have any foundation to say whether ChatGPT's current willingness to make up answers means its level of understanding is in any way fundamentally different from that of humans or not.
I think that the ChatGPT model (at least chatgpt-3.5-turbo) has gotten impressively better at this.
- In my most recent tests, it will tell you when the data you've provided doesn't match the task (instead of inventing an answer out of thin air).
- It will also add (unprompted) comments/notes before or after the result to disclaim a plausible reason why certain choices have been made or the answer isn't complete.
You have to take into account that not everyone wants the model to not hallucinate. There is a lot of competing pressure:
- Some people would like the model to say "As an AI model trained by OpenAI, I am not qualified to provide an answer because this data is not part of my training set" or something similar, because they want it to only talk when it's sure of the data. (I personally think this use case - using LLMs as search engines/databases of truth - is deeply flawed and not what LLMs are for; but for a large enough GPT-n, it would work perfectly fine. There is a model size where the model would indeed contain the entire uncompressed Internet, after all.)
- Some people want the model to never give such a denial, and always provide an answer in the required format, even if it requires the model to "bullshit" or improvise a bit. An example: if, as a business user, I provide the model with a blog article and ask for metadata in a JSON structure, I want the model to NEVER return "As an AI model..." and ALWAYS return valid JSON, even if the metadata is somewhat shaky or faulty (sketched below). Most apps are more tolerant of BS than they are of empty/invalid responses. That's the whole reason behind all of those "don't reply out of character/say you are an AI" prompts you see floating around (which, in my experience, are completely useless and do not affect the result one bit).
So the reinforcement is constantly going in those opposite directions.
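To make that second camp concrete, here's a rough sketch of that kind of setup (the prompt wording and helper names are made up by me; the only point is that both the instructions and the fallback push towards "always valid JSON"):

```
import json

SYSTEM_PROMPT = (
    "You extract metadata from blog articles. Respond ONLY with a JSON object "
    'of the form {"title": str, "tags": [str], "summary": str}. '
    "Never apologise, never mention being an AI, never return anything but JSON."
)

def build_request(article_text):
    """Messages in the usual chat-completion format; sending them is left to
    whichever client library you use."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": article_text},
    ]

def parse_or_fallback(model_output):
    # Apps tolerate shaky metadata better than invalid/empty responses,
    # so fail closed to an empty-but-valid structure.
    try:
        return json.loads(model_output)
    except json.JSONDecodeError:
        return {"title": None, "tags": [], "summary": None}
```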
I agree it's better, but it's still awful at it. I do think it's one of the low-hanging fruits to improve, though, just by reinforcing the limits of what it knows.
With respect to the competing pressures here, I'm not sure they necessarily compete that much. E.g. being able to ask it to speculate but explain its justifications, or to provide a best guess, or to just be as creative as it wants, would be sufficient. Alternatively, one that knows how to embed an indication of what it knows to be true vs. what is speculation or pure fiction. Of course we can "just" also train different models with different levels or types of reinforced limits. An "anything goes" model and a constrained version that knows what it does know might both be useful for different things.
You seem to be proposing an "if it quacks like a duck" scenario here (aka the Chinese Room), and applying it to people who are mistaken or acting on incorrect information.
Those people are reasoning, however rightly or wrongly, based on that wrong information. LLMs are not.
We do not have sufficient knowledge of what reasoning entails to be able to tell whether there's any meaningful distinction between human reasoning and LLMs. It's likely there is some difference. It's not clear whether there's any major qualitative difference.
Claiming we can tell that there's a distinction that merits saying people are reasoning and LLMs are not, is "hallucination" to me. It's making a claim there is insufficient evidence to make a reasoned statement about.
EDIT: Ironically, on feeding ChatGPT (w/GPT4) my comment and your reply and asking it to "compose a reply on behalf of 'vidarh'" it produced a reply that was far more willing to accept your claim that there is a fundamental difference (while otherwise giving a reasonable reaffirmation of my argument that reinforcement of the boundaries of its knowledge would reduce its "hallucinations")
It may be true that we don't have a real fundamental understanding of what human reasoning involves but I think we can be pretty certain that it's something different from just stringing words together.
The problem with this statement is that if LLMs are "just stringing words together", then we can't be pretty certain of that, because LLMs can demonstrably do things that it is wildly unintuitive that you'd be able to get from "just stringing words together". Either we can acknowledge that something more is going on, or our understanding of what is required for reasoning is even more woefully lacking.
E.g. ChatGPT can (badly) perform maths or symbolically evaluate code that is guaranteed not to be in its input in a way that would be very hard to explain as "just stringing words together".
Here is an example where I loosely describe a language that is a mix of an INTERCAL-inspired "come from" construct and Ruby, then give it an example program and ask it what it will output:
> We're going to write a program in a "new" language that is a mix of Ruby and INTERCAL. We're going to take the "come from" statement and use it to allow "hijacking" the return of a function. Furthermore, we're going to do it conditionally. "come from <method> if <condition>" will execute the following block if <method> was executed and <condition> is true. In <condition>, "result" can be used to refer to the result of executing <method>.
>
> Can you deduce what this program will print, and why?
>
> def foo(str)
> str
> end
>
> come from foo if result == "hi" do
> print "foo returned hi"
> end
>
> puts "1:"
> puts foo("some string")
> puts "2:"
> foo("hi")
I explicitly avoided passing strings that'd indicate the expected output. There's an ambiguity introduced by passing a block rather than just letting the statement stand by itself (passing a block might indicate I'd like the statement to return at the end of the block). This was unintended from my side. I intended execution to just continue after the "come from" block, as that is consistent with INTERCAL. I'll get back to that ambiguity.
Here's ChatGPT(GPT4)'s output:
> Sure, let's analyze this program step by step. The new language is a mix of Ruby and INTERCAL, using the "come from" statement to conditionally hijack the return of a function.
>
> Here's the program:
[snipped a verbatim copy of the program I input for brevity]
> 1. The `foo` method is defined, taking a single argument `str` and returning it.
>
> 2. The "come from" statement is introduced with a condition to hijack the return of the `foo` method if the `result` is equal to "hi".
>
> 3. The program outputs "1:".
>
> 4. The program calls `foo("some string")` and outputs its result, which is "some string". The "come from" statement is not executed because the result is not equal to "hi".
>
> 5. The program outputs "2:".
>
> 6. The program calls `foo("hi")`. Since the result is "hi", the "come from" statement is executed, and "foo returned hi" is printed. Note that the return value of the `foo` method is not printed here, as there's no `puts` statement wrapping the call to `foo("hi")`.
>
> So, the output of the program will be:
>
> ```
> 1:
> some string
> 2:
> foo returned hi
> ```
This is what I intended.
This isn't a very complex problem, but it's the kind of problem that requires stepping outside how you're used to thinking, to the point where I'd like to e.g. pose it as an interview question. But I'd worry about using it, as I know from experience of conversations with people about INTERCAL's "come from" that a surprising number of people struggle with reasoning about it.
I asked a followup question:
> "come from" alters the control flow, and prevents a method call from returning to its call site. Suggest a couple of implementation and specification issues specific to dealing with the control flow and its implications.
It produced a detailed explanation of how it'd require stack unwinding, pointed out that you'd need to define how it is affected by dynamic dispatch in Ruby (e.g. which method should it actually intercept?), and gave an example of a scenario where "come from" could affect exception handling that I frankly hadn't thought of. If someone gave an answer like it did to this in a technical interview, I'd hire them.
I then followed up with one more:
> Are there other options to unwinding the stack? E.g. when the block passed to "come from" exits, what are our options for where code execution will continue?
It correctly ascertains that returning to the original call site is another option, and then deduced that this would make "come from" behave like an aspect-oriented "around" mechanism. It also produced several more esoteric suggestions and explained why they'd be a bad idea.
Because of looking through this I recognised that the block I'd added to the "come from" syntax made it somewhat ambiguous and might hint at different semantics, I asked this:
> Does the fact I passed a block to the "come from" statement give any hints as to which of these options I might have intended?
ChatGPT replied:
> Yes, the fact that you passed a block to the "come from" statement does provide some hints about your intended behavior. It suggests that you might want the "come from" statement to behave like a method interceptor or an "around" aspect, executing the block of code when the specified condition is met and then returning control back to the original call site. [snipped it pointing out that this matches one of the options it had described before and copying its previous explanation]
I was hopeful it'd pick up on this when specifically asked, but I'm frankly surprised it did. It's a subtlety of how idiomatic Ruby differs from INTERCAL, and something I didn't think about myself when I wrote the example, and so given I told it the "come from" was inspired by INTERCAL, it was right to assume at first that my intent was for it to pass through. But it's also right to recognise that the block given might imply other intent (and indeed either this construct should be named something else, like "around" or "after", or the block ought to go)
If this is just "stringing words together", then I know a lot of developers who are bad at just stringing words together.
> Humans are confidently wrong about all kinds of things all the time
there is a qualitative difference: humans may be wrong about facts because they think they are true, whereas ChatGPT is wrong because it does not know what anything means. You cannot fix that, because it's just the way LLMs work.
For example, if asked about a URL for something, a human may remember it wrongly, but will in general say "I don't know, let me check", while ChatGPT will just spew something.
> humans are constantly reinforced from childhood that making confident claims about things we don't actually know well is often going to have negative consequences
You seem to be saying that because it misunderstands certain things, everything else that is evidence of correct understanding also doesn't count as real "understanding". I don't see how this is a coherent position to hold, as we can just as easily apply it to humans who have misunderstood something.
Edit: and I also don't see why it has to be so black-and-white. IMHO there is no problem with saying it understands certain things, and doesn't understand other things. We are talking about general intelligence, not god-like omniscience.
I think you're conflating being self-conscious and being intelligent. In AGI, the A is also very important. A universal oracle stuck in a machine, able to correctly predict anything and answer all questions, but still having no "desire" or "will" or consciousness, would be an AGI (imho).
Maybe I am, but I would agree that the universal oracle is generally intelligent; I simply don't think our current tools are like that.
E.g. consider this exchange with ChatGPT:
> hey assistant, can you provide me with a link to learn about LLMs?
> Certainly! LLM stands for Master of Laws, which is a postgraduate degree in law. Here is a link to a website that provides information about LLM programs, including what they entail and how to apply:
> https://www.llmstudy.com/what-is-an-llm/what-is-an-llm-.aspx
> I hope you find this helpful! Let me know if you have any further questions.
> that is not a real link, can you provide another?
ChatGPT makes up a URL, twice. Does it understand URLs? Does it understand that the URL is wrong? I don't think it does, and I don't think it requires consciousness to do this interaction in a sane way.
Still, this does not mean it's useless; it's super useful!
Understanding and assimilation can lead to generating relations between disjoint sets of tokens.
For example, "squeeze a tube to cut water flow" and "put pressure on a deep wound to stop blood loss" can only be related, if not already in the training data, if there is understanding and intelligence.
The ability to do that is intelligence.
Otherwise, it's just a search and optimization problem.
Firstly, transformer architectures are not "just a search and optimisation problem". They do generalise. Whether that generalisation is sufficient to be structurally equivalent to what we consider intelligence is an open question, but getting them to demonstrate that they generalise is easy (e.g. ChatGPT can do math, albeit badly, with numbers large enough that it is infeasible for them to have just occurred in its training set).
Secondly, this poses the problem of 1) finding examples like the one you gave that it can't understand (regarding your specific example, see [1]), 2) ruling out that there was something "too close" in the training data driving the equivalence, and 3) ruling out that the failure to draw the equivalence is something structural (it can't have real understanding) rather than qualitative (it has real understanding, but it just isn't smart enough to understand the specific given problem).
So I'm back to my original question of how we would know if these are structurally different things in the first place.
[1] vidarh: How does squeezing a tube to cut water flow give you a hint as to what to do about a deep wound?
ChatGPT (GPT4): Squeezing a tube to cut off water flow demonstrates the basic principle of applying pressure to restrict or stop the flow of a fluid. Similarly, applying pressure to a deep wound can help control bleeding, which is a critical first aid measure when dealing with serious injuries.
[followed by a long list of what to do if encountering a deep wound]
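On the generalisation point above (arithmetic on numbers too large to have appeared verbatim in training), here is a minimal sketch of how one might generate such probes. The model call itself is left out; how you send the prompt is up to whatever interface you have.

```python
import random

def arithmetic_probe(digits: int = 12) -> tuple[str, int]:
    """Build an addition question whose operands are long enough that the exact
    pair is vanishingly unlikely to appear verbatim in any training corpus."""
    a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
    b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
    return f"What is {a} + {b}? Answer with just the number.", a + b

prompt, expected = arithmetic_probe()
print(prompt)
# Compare the model's reply against `expected`; consistent success (or near-misses
# with the right number of digits) is evidence of generalisation rather than lookup.
```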
It's all moot until we can agree on a definition of intelligence. I don't necessarily think it's relevant to the discussion whether an AI is truly intelligent or not. What matters is whether the technology is good enough to fool an "intelligent" human. It's good enough to do that today. I'd say that training a "dumber" model that can respond with a simple "I'm not sure" instead of the typical responses we get today would make it even more believable, not less.
It's obvious it's an AI because it's too smart. It's too intelligent. Dumb it down, remove the AI branding, and it would fool the majority of the world.
> It's all moot until we can agree on a definition of intelligence.
We did agree on a definition of intelligence! For 50 years, the Turing test was the unquestioned threshold beyond which machines would be considered "intelligent". The simple fact of the matter is that we've reached that point, but folks remain unimpressed, and so have set about moving the goalposts.
I am of the opinion that, when the dust settles, the point in time which will be selected as "the moment we achieved Artificial Intelligence", or even rudimentary AGI, will not be in the future, but in the past. This will be true because, once we take a step back, we'll remember that our definition of intelligence should not be so strict that it excludes a significant percentage of the human population.
Consider this: is there any reasonable definition of intelligence which excludes ChatGPT 4.0, but which includes all human beings who fall in the bottom ~5% of IQ?
> We did agree on a definition of intelligence! For 50 years, the Turing test was the unquestioned threshold beyond which machines would be considered "intelligent".
First of all, it is now more than 70 years. Secondly, the question of the paper isn't whether a computer is intelligent but whether a computer can win the imitation game, so naturally it doesn't contain any definition of intelligence within it.
> Secondly the question of the paper isn't whether a computer is intelligent but whether a computer can win the imitation game, so naturally it doesn't contain any definition of intelligence within it.
The game IS Turing's definition of intelligence! From the paper:
> I PROPOSE to consider the question, ‘Can machines think?’ This should begin with definitions of the meaning of the terms ‘machine’ and ‘think’. The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous. If the meaning of the words ‘machine’ and ‘think’ are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, ‘Can machines think?’ is to be sought in a statistical survey such as a Gallup poll. But this is absurd. Instead of attempting such a definition I shall replace the question by another which is closely related to it and is expressed in relatively unambiguous words. The new form of the problem can be described in terms of a game...
Isn't intelligence an emergent behavior? We use gradient descent because it's faster than other techniques like evolution. Also, aren't systems that play strategic games at the highest level already intelligent in their tasks?
Lol, let me ask ChatGPT what it thinks about that. :)
> I don't believe we've seen any evidence of this
What kind of evidence would you like to have? Do you want a mathematical proof or what? There is evidence that we are making forward progress in solving problems which were previously in the domain of human cognition. There is evidence that yesterday's "impossible" problems become "possible" while the "hard" problems become "easy" or even "trivial". (Just look at this xkcd[1]. When it was published in 2014, telling whether or not a picture contained a bird was indeed a "5 years and a team of researchers" project. Today it is what, an afternoon? A tutorial you pick up to learn a new ML framework?)
There is also evidence that our solutions are tending toward more generalised ones. Previously you would need a "sentiment detection" network, and a separate "subject disambiguation" network to parse the meaning out of a text. Today you can achieve the same or better with an LLM trained to follow instructions.
Obviously these are not "hard evidence" that this path will lead to AGI. But it is certainly not unreasonable to think that it might.
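To make the "more generalised solutions" point concrete, here is a minimal sketch. `complete()` is a hypothetical stand-in for whatever instruction-tuned model you call, not a real API; the point is simply that one instruction-following prompt can cover what used to need two task-specific networks.

```python
import json

def analyse(text: str, complete) -> dict:
    """One instruction-following call covering both sentiment classification
    and subject disambiguation, tasks that previously needed separate models."""
    prompt = (
        "For the following text, return JSON with keys "
        "'sentiment' (positive/negative/neutral) and 'subject' (what the text is about).\n\n"
        f"Text: {text}"
    )
    return json.loads(complete(prompt))

# Usage, with any completion function you have access to:
# analyse("The battery life is dreadful but the screen is gorgeous.", complete)
```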
But seriously, it's partially true that most techniques we use today can be found, at least in embryonic form, in 90s ANN-related papers. I think Geoffrey Hinton summarized most of them well in his famous "Neural Networks for Machine Learning" lectures. Essentially, what is available today is compute unimaginable in the 90s, so scaling up from a shallow 2-layer Multi-Layer Perceptron to something like a 96-layer deep architecture has only recently become possible. We also found that some tricks work better than others in practice when we scale up (like *ELU non-linearities, layer norm, residual connections). What stays the same, however, is the general approach: training and validation sets, cross-entropy loss, softmax, learnable parameters fit to data-in/data-out training pairs, and the differentiation chain rule. IMO this requires some innovative revision, especially since generalization is still very weak in all architectures today.
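To make the "same building blocks, bigger stack" point concrete, here is a minimal sketch (assuming PyTorch; the sizes and names are illustrative, not any particular published architecture) of the kind of block that gets repeated ~96 times in modern deep models:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """One 'modern' building block: layer norm -> linear -> GELU -> linear, plus a skip connection."""
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        return x + self.ff(self.norm(x))  # residual connection keeps gradients flowing through deep stacks

# Stack the block 96 times; everything else is the same recipe as the 90s:
# softmax + cross-entropy loss, and gradients via the chain rule (autograd).
model = nn.Sequential(*[ResidualBlock(64) for _ in range(96)], nn.Linear(64, 10))
x, y = torch.randn(8, 64), torch.randint(0, 10, (8,))
loss = F.cross_entropy(model(x), y)  # softmax is folded into cross_entropy
loss.backward()
```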
I think a problem here is that we don't actually know how our 'intelligence' works; it's not what we think it is. Our brain takes a huge number of shortcuts to arrive at the results it gets. It's not outright intelligent in the way most assume, hence we are victim to all sorts of cognitive biases and tricks.
Our ability to reason, from an evolutionary perspective, came about from needing to communicate with others, i.e. to give reasons for things. It's this same system we use to be 'logical', or to think we are being intelligent, but it's effectively just a system for making excuses. That's why we might choose something in a store because we fell for some form of marketing, but give a completely different logical-sounding reason for doing so, and why conspiracy theorists, when confronted with evidence against their theory, don't change the theory but change their excuse.
I believe this could be more profound and have bigger implications than we have realised.
This is no different to what a lot of AI models are doing right now when they are wrong: people call that out and say there is no intelligence behind it, as the model spouts some nonsense reason for being right. But the problem is that humans do this all the time, and I think we do it way more than we realise; the trouble is we are so caught up in it that we cannot see it.
What we have with the human brain is something that wasn't designed logically but via evolution, which gives the illusion of intelligence, or arrives at mostly the same result, but in completely different ways than expected. Therefore I think the path to AGI might not be quite what some are saying or expect it to be, because the human brain certainly doesn't work like that.
Our brains are different from our ancestors'. Proto-humans had less frontal cortex and more ability to process information in the immediate spacetime around them, more evenly distributed across the five senses.
> The parameters θ of the deep neural network in AlphaZero are trained by self-play reinforcement learning, starting from randomly initialised parameters θ. Games are played by selecting moves for both players by MCTS, a_t ∼ π_t. At the end of the game, the terminal position s_T is scored according to the rules of the game to compute the game outcome z: −1 for a loss, 0 for a draw, and +1 for a win.
In this case sounds like better algorithm + a lot of self-generated data (but no prior knowledge other than rules).
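For intuition, here is a minimal sketch of the self-play data generation described in that quote. `Game` and `mcts_policy` are hypothetical placeholders, not a real library API; the point is only that the training data is generated by the network's own play plus the rules.

```python
import random

def self_play_game(net, game, mcts_policy):
    """Play one game by sampling moves a_t ~ pi_t from MCTS search probabilities."""
    trajectory = []                                   # (state, pi) pairs for later training
    state = game.initial_state()
    while not game.is_terminal(state):
        pi = mcts_policy(net, game, state)            # search probabilities over legal moves
        move = random.choices(game.legal_moves(state), weights=pi)[0]
        trajectory.append((state, pi))
        state = game.apply(state, move)
    z = game.outcome(state)                           # -1 loss, 0 draw, +1 win, from the rules alone
    return trajectory, z                              # (state, pi, z) triples become the training targets
```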
All of the observable data, implicitly, plus an evaluation criterion. They can operate on that, including starting to mine it in a strategically effective order, because the whole game system except the other player is defined by a small set of rules.
Unfortunately this argument, while imho entirely valid and also frequently seen in discussion, is unable to stop the massive train that has been set in motion.
There are various types of retorts: i) the brain is also doing gradient descent; ii) what the brain does does not matter (if you can fake intelligence, you have intelligence); iii) the pace is now so fast that what has not happened in decades will happen in the next five years; etc.
None of them is remotely convincing. I would really start getting worried if indeed some of the AI fanboys had a convincing argument. But we are where we are.
Somehow this whole episode will be just another turn of the screw in adopting algorithms in societal information processing. This in itself is fascinating and dangerous enough.
The brain is definitely not doing gradient descent. That is the biggest issue with using artificial neural networks as a neuroscience modeling tool. The fact other learning algorithms have not come close to gradient descent performance in most cases has therefore largely kept neuro theory as a niche in modern academic research -- despite its entanglement with the early development of artificial neural networks.
Not taking a stance on whether "intelligence" can emerge out of gradient descent but it's certainly not biological.
I don't think there is any strong argument for AGI (i.e., that it is somehow coming any time soon). In my explanatory framework all developments make sense as purely statistical algorithmic advances. The surprising and interesting applications involving images, language are not really changing that fundamental reality.
There is a case to be made that with sufficient ingenuity at some point people will expand the algorithmic toolkit into more flexible and powerful dimensions. It may integrate formal logic in some shape or form, or yet to be conceived mathematical constructs.
But the type of mental leap that I think would be required just to break out of the statistical-fitting straitjacket cannot be invented to satisfy the timing of some market craze.
If you look at the broad outline of the development of mathematics there are many areas where we have hit a complexity wall and generations of talent have not advanced an iota.
Even if we condition on some future breakthrough, the next level of mathematical / algorithmic dexterity we might reach will follow its own intrinsic logic, which will probably be very interesting but may or may not have anything to do with human intelligence.
AGI ≠ a lot of AI. They are fundamentally different things.
The first computer was designed in 1837, long before a computer was ever built. We know how fusion reactions work, now we’re tweaking the engineering to harness it in a reactor.
We don’t know how human intelligence works. We don’t have designs or even a philosophy for AGI. Yet, the prevailing view is that our greatest invention will just suddenly “emerge.”
No other field I’m aware of so strongly purports it will reach its ultimate breakthrough without having a clue of the fundamentals of that breakthrough.
It’s like nuclear scientists saying “if we just do a lot of fission, we think fusion will just happen.”
Your hypothesis is that there is a qualitatively different property to AGI than whatever we have now. Most people here are just saying "if ChatGPT were smarter, that would be AGI".
The difference between GPT-3.5 and GPT-4 is really interesting: GPT-4 is better able to reason about basically any problem I throw at it. Personally I think that a hypothetical system that is able to generate a sufficiently good response for ANY text input is AGI.
There aren't really any "gotcha" cases with this technology that I'm aware of where it just can't ever respond appropriately. Most clear failings of existing systems involve ever more contrived logic puzzles, which each successive generation is able to solve, and eventually at some point the required logic puzzle will be so dense few humans can solve it.
This isn't a case of "studying for the test" of popular internet examples either. I encourage you to try and invent your own gotchas for earlier versions, then try them on newer models. Change the wording and order of logic puzzles, or encase them within scenarios, to ensure it's not just responding to the format of the prompt.
There are absolutely cases of people overhyping it, or of it overfitting to training data (see the debacle about it passing whatever bar exam, university test, etc.). But despite the hype there is an underlying level of intelligence that is building, and I use it to solve problems pretty much every day. I think of it at the moment as like a 4-year-old that has inexplicably read every book ever written.
In many ways, AGI and AI are opposites. We don’t actually “train” humans. We present them information. And a human can choose to ingest that information or - just as importantly - choose not to.
A 3yo learns English not through being force fed, but through their own curiosity and motivation to learn.
GPT and the like are proving to be amazing tools. The compute model of probabilistic algorithms (“plain” AI) is going to transform industries.
But the more sophisticated these tools get, the further from AGI they will become.
When we create AGI, it will begin with a blank slate, not the whole of human knowledge. It will study some topics deeply and be uninterested in others. It might draw its own conclusions instead of just taking the conclusions presented to it. Or it may decide not to study at all, preferring to instead become a hermit.
Again, I believe the opposite of you: I literally think a smarter ChatGPT would be AGI. However I agree it wouldn't be the thing you describe. But recognise that your disagreement is not because of a misunderstanding of the existing technology, but over what an AGI is.
I think some people explicitly don't want AGI to simulate emotion or self-defined goals, because that would be little more than an artistic curiosity. Why intentionally make a tool less useful? Less willing to do the things you want it to do? Perhaps you hold some belief that simulating such things is necessary for some goal you have in mind, but personally I don't think that's true, for literally any goal.
Sorry, I must have missed the rest of your prior comment (or posted before you made an edit to it?)
LLMs are not able to create new knowledge, only organize existing knowledge. Critically, we don't use induction to create new knowledge - but induction is all these LLMs can do. (e.g. To explain the origin of the universe, there is nothing we can induce, as the big bang is not observable.)
I don't see how training these AI models more will cause this property to emerge. But knowledge creation is the thing that we want when we say we want AGI.
The reason I describe AGI as such is because we only have one example of what an AGI is, namely humans. A popular idea is that machine AGI will look very different from human AGI (e.g. no consciousness or intrinsic motivation or qualia).
But this is a bold claim. It contends that there are multiple kinds of general intelligences vs some universal kind. Other universal concepts don't look like this. Consider the universality of computation: a computer is either Turing-complete or it isn't. There is no other kind of general purpose computer.
Once we created a Turing-complete computer, it could perform all conceivable computation. This is the power of universality. The first Turing-complete computer could technically play Call of Duty (though no one would have had the patience to play it).
It's not like there is some mode of ChatGPT that will produce AGI-level responses if given months or years of time to compute it.
Fundamentally, its ability to do whatever it is that AGIs do (AGI's version of Turing computation) is missing.
What would you consider a test of the ability to create new knowledge? I don't think there exists a meaningful distinction here. If I gave an AI the same information astronomers had to figure out the big bang conclusion, it would probably be able to figure it out. They "induced" from a bunch of data points that there was one likely explanation.
Similarly, if I gave it a phenomenon I wanted to know about, it could propose experiments to run to understand that thing.
Taking independent action in the world is indeed beyond the limits of ChatGPT on its own, but it's not hard to build systems relying heavily on ChatGPT to do that: a simple loop calling it with information from sensors and a prompt to experiment with outputs is enough to get some kind of agent. It can process any information, and given more power it can draw ever-higher inferences from that information. This is analogous to a Turing machine, something that can eventually get anywhere, but may take an arbitrarily long time to do it.
I think of it kind of like the CPU in a Von Neumann architecture. A CPU on its own doesn't do much of anything unless you tell it to. We have that part solved; we just need to settle on a good structure for repeatedly querying it.
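A minimal sketch of the kind of loop described above; `query_llm`, `read_sensors`, and `act` are hypothetical placeholders for whatever model API and hardware you actually have, not real library calls:

```python
import json
import time

def agent_loop(query_llm, read_sensors, act, goal: str):
    """Repeatedly feed sensor readings to the model and execute whatever action it proposes."""
    history = []
    while True:
        observation = read_sensors()                    # e.g. {"temperature": 21.5, ...}
        prompt = (
            f"Goal: {goal}\n"
            f"Recent history: {json.dumps(history[-5:])}\n"
            f"Current observation: {json.dumps(observation)}\n"
            "Reply with a JSON object {\"action\": ..., \"reason\": ...}."
        )
        decision = json.loads(query_llm(prompt))        # assumes the model returns valid JSON (it often won't)
        act(decision["action"])
        history.append({"obs": observation, "action": decision["action"]})
        time.sleep(1)
```

It is a sketch, not a claim about robustness: in practice you would need output validation, retries, and hard limits on what `act` is allowed to do.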
Depends on the definition. In my mind, a system that can write code, poetry, and suggest a recipe based on a list of ingredients has a form of “general” intelligence. So by that measure we are there.
If we’re talking about something like agency or free will, that’s trickier.
The problem in this space is people use terms that nobody agrees on.
Just make humans dumber and more predictable. Problem solved. (This isn't that hard since humans will adapt to the stupidity level of the "AI" they use daily on their own without special social engineering efforts.)
Ultimately you are just a system who maintains homeostasis by modeling your environment and reprojecting it to find patterns which can be used to make actions with predictable outcomes. Is this intelligence? Is there only one correct way to achieve this goal?
Ultimately we're just a collection of atoms obeying the laws of physics, but reducing all the complexity that entails doesn't really accomplish anything
This is my frustration put really well. When people talk about how humans are also just a form of LLM (not that the above comment did exactly that): that might even be true, but simplifying things to that degree doesn't help actually discuss what is going on. The original comment is, as far as I know, correct... while I don't have a PhD, some of my undergraduate work was in control theory and ML, and it works really well. The underlying methods we used were from NASA in the late 70s... surely there is something more in the field, no?
What frustrates me is when people say "neural networks cannot show intelligence because they are just a succession of linear layers and methods that have existed since the 70s".
I don't understand this argument, and how this has anything to do with intelligence.
I'm not saying they can't show intelligence. I am saying that the techniques are very old, and throwing more hardware at them seems to have done wonders. It's been surprising to see how far it's come. I don't think that simplifying intelligence in humans and saying we are LLMs helps the conversation; that, and these techniques, while clearly improved, are fairly old. From an engineering standpoint it leaves me wondering whether there are more elegant solutions. Nature actually is fairly cool; it doesn't help the convo to downplay wetware, I think.
You can look at very small creatures on earth and think "surely there is more to a human than to this ant". And sure, there is, but also, there isn't. Just as basic life evolves into more complex life, why shouldn't the underlying methods from the NASA in the late 70s evolve into ChatGPT and eventually AGI?
Simplifying things to this degree can show how you get from A to B.
We can see how an ant is a more primitive form of the same kind of thing that a human is, and though they're not on the same evolutionary path, we can see how humans evolved from the most primitive forms of life, because we know what the endpoint looks like.
We don't know what 70s AI and ChatGPT will evolve into. That's why everyone keeps debating and prognosticating about it, because nobody actually knows. But whatever people mean by AGI, we don’t know if it will or can evolve from the AI platforms that have come before.
We do know that the thing that evolved into ChatGPT is entirely different from primitive cellular life, so there's no reason to believe that ChatGPT will keep evolving into some kind of humanistic, self-aware intelligence.
I'm not the only one who thinks there is something missing though. That simplification does work to show how to get from A to B, but I don't think it is analogous when we don't really understand how we got to human intelligence. None of us were there. All I am saying is that simplifying even mammalian intelligence down to just an LLM doesn't explain much to me. It might be the case that old 70s math proofs from NASA evolve into a mechanical intelligence, maybe; I just personally think that isn't the case. Humans run on a lot less overall power (though biological necessities like bathrooms and food are real), and we can learn from less text than a machine and infer from experience. That might not convince you, or you might have your own ideas about what intelligence is.
There's a good summary in the Introduction to Kevin Murphy's second tome [1]. He quotes Josh Tenenbaum, who said: "Intelligence is not just about pattern recognition and function approximation. It’s about modeling the world".
I agree, and I think Goodman & Tenenbaum [2] is a great place to see other things that may pop up on the road to AGI. LLMs are great, but they do too much at once. I think moving towards AGI requires some form of symbolic reasoning, possibly combined with LLMs, which may play the role of intuition and kitchen-sink memory.
All claims and questions in your answer are still open research and full of inconclusive evidence. Ultimately all of this leads down the spiral to the Ship of Theseus question. I guess we will find out!
+1. What's new "since the 80s" are faster computation, larger datasets, and a handful of mathematical breakthroughs that enable once-intractable algorithms. It's obvious that human general intelligence operates in frontiers that are plainly outside the scope of computation.
>It's obvious that human general intelligence operates in frontiers that are plainly outside the scope of computation.
IMO, the two most important properties of human cognition are that it evolved through natural selection, and that it is embodied, meaning that it is part of a feedback loop between perception and action. I don't see why either of these things requires having a biological brain or body. It's true that current methods aren't close to achieving these things, but there's nothing in principle stopping us from creating evolved, embodied intelligences either in simulations or in robots.
Despite our best efforts, we are deeply irrational. Our thinking is based on instinct, not on core principles; it's a top-down approach driven by feelings.
Off topic - a pet peeve of mine is seeing humans termed as “irrational”. Please forgive my rant, as it is not personally targeted at you.
We only seem “irrational” when we are talking about a narrow view of “rationality”, i.e., as defined by the cold hard logic of machines. We do not question why we have this definition of rationality. Our “irrationality” simply seems so because we have not bothered to understand the larger complexity of our evolutionary programming. It’s the same as not bothering to understand how a car works, and then claiming that the car works on magic. If one understands how it works, then it is no longer magic. In the case of humans, we may never fully understand how we work, but we can work towards a compassionate understanding of the same.
Absolutely on point. I would argue that irrationality isn't even possible. If person A thinks or behaves "irrationally" according to person B, there is simply a difference in perception between the two. A large percentage of those perceptions are created inside one's own mind which may or may not be aligned with the rest of the universe.
It sounds like your main objection is with conflation of "irrational" with "wrong". It's helpful to describe an extreme fear of heights as "irrational" but from an evolutionary point of view it might in fact be correct because it keeps those genes that give the rest of the population a healthy fear of heights in the gene pool. I think these are different concepts and a person can be both irrational and correct in their behaviour.
Why is instinct not computable? That seems way easier to compute than rational thinking based on principles, it's just "if this, do that" and machine learning should be able to do that easily
Everything in the universe must by definition be rational. Chemical reactions always happen a certain way, physics always works a certain way, numbers add up the same every time.
If something seems irrational, that's because there are rational things layered on top of one another in a way that creates a misapprehension in a partially-informed observer.
> Everything in the universe must by definition be rational.
Why? You assume nature must be comprehensible to us because you define it that way? Nature's not under any obligation to us.
> If something seems irrational, that's because there are rational things layered on top of one another in a way that creates a misapprehension in a partially-informed observer.
Possible irrational things: Collapse of the wave function, consciousness, why anything exists. There's also non-computable functions, paradoxes like the liar paradox, and the inability to refute various skeptical possibilities like living in a simulation.
Combine that with the fact that all models are wrong, some are just more useful. We can't model the entire universe with perfect accuracy without the model being the universe, which means there are always things left out. Our rational explanations are approximations.
We can't look at quantum mechanics and call it irrational. The way the universe works is by definition the right / correct / only way. If logic points one way and reality points another, the logic isn't taking everything into account.
You're confusing the map for the territory. Rationality is something that makes sense for us. That need not apply to nature. It's misplaced to say that nature is rational. Nature just is.
So nature cannot be irrational. For instance: we had the very logical Newtonian mechanics. But Mercury's orbit did not make sense. Later on Einstein saw another layer of logic that put both Newton and Mercury in more context, and now both make sense again.
When we see things now that make no sense, it indicates faults in our mental models, not in the things.
In the context of a person behaving or thinking "irrationally": their brain is just atoms and chemicals. It has to work right. If their behaviour and thoughts do not line up to reality, that's a feature of said brain.
Perhaps the irrationality is due to the brain giving too much weight to certain factors, or perhaps they are in fact behaving perfectly logically given what they know and you don't. Perhaps its workings can be improved in software, or in the next hardware generation.
But there's no spooky mechanism that can't in principle be modelled and implemented in a different architecture, ie, transistors.
Quantum mechanics works the same way every time. Being random doesn't mean they're not working the same way every time, the distribution never changes.
Given that a sufficiently resourced computer ought to be able to run a subatomic-level simulation of an entire human brain, and while acknowledging the usual counterpoint vis a vis C. elegans/OpenWorm but deeming it irrelevant on longer timescales, your take seems quite arrogant. “Outside the scope of computation” is an awfully broad claim.
No, you don't, but if you're capturing all the data at a "subatomic level" you will need so much memory that, unless you have some kind of memory technology I haven't heard of, you will have to take that into account, because the memory alone will have a volume of significant light-time dimensions.
You need something like 3 petabytes just to model the neurons. There are 100 trillion atoms in a neuron. And he said subatomic, so you need some significant factor of that, plus you have to save some state.
So 300 trillion petabytes+ might be a memory the size of the moon. I think relativity does matter at that scale. Also, you will probably have to mine silicon asteroids and invent an AI factory to make the memory, and power it with what, fusion?
So yeah, just model the brain at the subatomic level, bro, it's easy bro, and you're just a naysayer for saying it can't be done, bro.
But, please. If you have a solution to this, don't post it on HN, file a patent. Not just for the model, but for the revolutionary new dense memory, the location of silicon asteroids, the planetary memory factory, the power source for it, and so on.
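For what it's worth, a rough back-of-envelope check of the figures above, taking the commenter's assumptions (3 PB for a neuron-level model, ~100 trillion atoms per neuron) at face value:

```python
# Back-of-envelope only; the inputs are the rough figures quoted in the comment above.
NEURON_LEVEL_MODEL_BYTES = 3e15        # "3 petabytes just to model the neurons"
ATOMS_PER_NEURON = 1e14                # "100 trillion atoms in a neuron"

atom_level_bytes = NEURON_LEVEL_MODEL_BYTES * ATOMS_PER_NEURON  # scale per-neuron storage up to per-atom
print(f"{atom_level_bytes:.1e} bytes = {atom_level_bytes / 1e15:.1e} PB")
# ~3.0e+29 bytes, i.e. ~3.0e+14 PB: the "300 trillion petabytes" figure in the comment,
# before adding anything for subatomic state.
```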
If you were to try to build such a simulation, given the natural limits of real hardware, wouldn’t some form of relativity be a necessary feature of the simulation to avoid segfaults, rather than something simply to take into account while constructing the hardware?
In the sense that rather than something to take into account while building the system, it is an emergent property from the fact that any such system has resource limitations.
The other few things that have been "possible in the next few years" for the past few years are:
1. Fusion
2. Self driving cars.
3. Ubiquitous, generally available and stable 8192 bit quantum computers.
4. High energy density grid scale batteries.
Just a few more "this may be the... what could revolutionize..." articles away.
What we forget is that it is in these people's business interest (executives making such statements) to state an exaggerated version of the future, which drives investors at all levels to allocate funds under FOMO, which pumps up the share price, which is the end goal anyway, breakthrough or not.
EDIT: Merged another comment to keep all in one place.
I think we'll have full-blown AGI before fusion and before world-changing levels of quantum computing. Which is to say, we won't be around to see fusion and world-changing quantum :(
Or, more optimistically, AGI will accelerate scientific progress and we will have fusion, quantum computers, and self driving cars shortly after.
If AIs had the tendency to recursively devour worlds you would expect at least one alien AI to have conquered our galaxy by now. But the sky is quiet, so either we are the very first species to get this far, or there’s a natural ceiling to the pace of technological progress.
The directives that mandated eliminating its creators don't necessarily lead to interstellar conquest. Or more likely IMO, we're just too far apart and interstellar travel isn't feasible even for artificial super intelligence.
There is no evidence that an actually existing thing like a Von Neumann probe can survive interstellar travel. I think that the answer to all of these paradoxes is that special relativity just makes it basically impossible. That’s anathema to us nerds, I know.
That’s a very thin argument to hang a Pandora’s Box opening on. It only needs to devour one planet, though we appear to be doing that ourselves too.
There is a slight possibility that this is an unintelligent question, with humanity being ignorant as to why that's the case.
The universe we live in is only around 14 billion years old, which, compared with how long a universe could be around, is an immeasurably small blip on eternity. Before our universe started there was a vacuum metastability event somewhere that we at least assume had a nice even entropy. Again, that instability happened pretty damned recently...
So what if the first thing any super intelligence realizes is that at any moment our universe could have another such instability event? It's not going to work on exploring the vast distances of space... it's going to figure out if it can develop a plan to GTFO of here to a plane of reality that's more stable than this one.
Why do we need the G to do any of these things? I assume non general intelligence should be further along, so shouldn’t this recursive self improvement happen before the general part?
The future is here, it's just unevenly distributed. If you live in a specific collection of neighborhoods in San Francisco or Phoenix, robotaxis are already here.
Honestly, I think robotaxis would already exist everywhere by now, provided every vehicle were a robotaxi. IIRC, robotaxis tend to perform smoothly in environments where they alone operate; I think Google proved that in some test concept many years ago.
This chart shows how much people in the 1970s estimated should be invested to have fusion by 1990, to have it by the 2000s, and to "never" have it.
We ended up spending below the "never" amount on research over four decades, so of course fusion never happened as predicted.
I think the main difference is that no one was interested in investing in fusion
back then, while everyone is interested in investing in AGI now.
That is actually a really interesting find. Some of the verbiage parallels the hype we have around AGI right now too: "fusion by 1990 instead of 2000..." Does that not sound exactly like the "AGI in X years, up from Y years" claims we read now?
That wasn't the case at all for fusion. To be viable it needed separate scientific breakthroughs in particle physics and lasers, especially to reach the point where the recent demo showed its viability. If you read up on it you'll see there were a lot of important steps that led up to the short demo that was achieved. That is my point though: if you read the article, even after the ellipses, it wasn't a monetary issue.
We're not reading news, we're reading press releases. And to be honest I would have hoped for a healthier dose of skepticism from HN. Vaporware and hype are common things in software. Ship it, let me play with it, then we'll talk about it.
Grid scale energy storage is basically solved once you realize that hydrogen storage is an option already. It's in the same boat as when photovoltaics got very cheap, but skeptics continued to deny that possibility for some years afterwards.
>once you realize that hydrogen storage is an option already
With an unavoidable ~70% energy loss (only ~30% round-trip efficiency), requiring an overcapacity factor of 2-3...
That's the same argument used against solar, that it is too inefficient. It's also greatly exaggerated, another thing it has in common with solar. Fuel cell systems are electrochemical systems and can be easily improved beyond that level. Especially with large facilities where waste heat can be recaptured and reused.
Personally I'm running the latest roomtemp superconducting, 4Giqb quantum memristor computer. It even remotely drives my fusion-powered car over the quantum internet!
Many things thought to be a distant dream have now become reality - cheap solar electricity, pocket computer for everyone, mRNA vaccine, affordable electric car with acceptable range, reusable rocket, satellite based broadband internet, image recognition, translation, and now generative AI.
Why didn’t we hear much about them from CEOs before they were available?
Also, which CEO promised fusion being available in the next few years? Or only wildly optimistic scientists?
Grid scale batteries don’t need to be “high energy density” like EVs. I would argue that cost is probably the biggest factor, and it’s already across the cost/benefit barrier in many markets.
This. It surprises me every time when battery or any "green" technology is discussed here and 90% of Americans keep parroting Trump's talking points. For example one of the comments above is talking about how EVs are just "a rich boy's toy".
We don’t all forget about incentives people might have in saying something; but we also don’t all assume that the first convenient explanation is adequate.
Numbers 1-4 obviously don't have the enormous recent progress, nor the funding and mindshare that has poured in, and keeps pouring in, as a consequence.
That's heavily dependent on where you live. We're getting high speed internet pretty much everywhere in France. It's a program full of controversies, but here I am in a 3k inhabitant village with a 10gbps connection for 40€ a month... (Yes, 10Gbps, I'm throttled by my gigabit ethernet)
If you can't get fiber here you get a subsidised 4G / 5G router for a very reasonable price.
I really wonder what the next AI winter will be like.
Edit:
Perhaps OpenAI becomes a major tech player and we just see a cooling off of other AI investments as LLMs become a known quantity in terms of their strengths and weaknesses. Their abilities reach a natural limit which is still generally very useful.
Or maybe folks realize the degree of lies/mistruths inherent in its content is actually unmanageable and can’t be improved. After the hype wears off what it will be used for gets greatly curtailed and we see a big contraction.
And there are so many interesting side tracks along the way. I'm hoping for another AOL Time Warner style shit show of a merger. That'd be fun and could really happen down any path.
> Or maybe folks realize the degree of lies/mistruths inherent in its content is actually unmanageable and can’t be improved. After the hype wears off what it will be used for gets greatly curtailed and we see a big contraction.
I think that's the future that's ahead of us. There's enormous faith being put in next-token predictors as the intelligence breakthrough just because their output coincidentally resembles something derived through a really intelligent process.
LLMs do not hallucinate sometimes. They hallucinate all the time; it is just a coincidence that sometimes this autocompletion of tokens aligns with reality. Just by chance, not by craft.
> I don't understand the "coincidentally" argument.
Nothing is coincidental about those models. They were designed after processes in the brain. They underwent rigorous training to generate a function that probabilistically maps inputs to outputs. Eventually, it exceeded the threshold where most humans consider it to be intelligent. As these models grow larger, they will surpass human intelligence by far. Currently, large language models (LLMs) have fewer weights than human brains, with a difference of a factor in the thousands (based on my superficial research). But what happens when they have an equal or even 100,000 times more weights? These models will be able to model reality in ways humans cannot. Complex concepts like the connection between time and space, which are difficult for humans to grasp, will be easily understood by such models.
> LLMs do not hallucinate sometimes. They hallucinate all the time; it is just a coincidence that sometimes this autocompletion of tokens aligns with reality. Just by chance, not by craft.
That is such a weird way to think about them. I'd rather say they always provide the answer that is most probable according to their internal model. Hallucination simply means that the internal model is not good enough yet and needs to be improved, which it will be.
>we just see a cooling off of other endeavors as LLM becomes a known in terms of its strengths and weaknesses and its abilities reach a natural limit which is still generally useful.
This is my prediction
If the training set is just the internet, it will not be much different from someone who spends their whole life in their room in front of their computer.
I don't think that's a limit. If you give me $10M and 5 years, I can tell you a half-dozen ways to train models better than GPT4, primarily by having them do richer tasks than text completion, by having them evaluate themselves in richer ways, by having workflows around them, and by having shared models perform multiple types of tasks (e.g. text, image, controls, etc.).
It's not that I'm especially smart; many others could tell you ways to do the same thing now that there's an initial proof-of-concept. It's that these things are new enough no one has had those five years yet. GPT4 came out two months ago, ChatGPT maybe seven months ago, and GPT3 three years ago.
I can't predict if these things will level off, grow linearly, grow exponentially, Moore's Law style, or explode off into the singularity.
This reminds me of reinforcement learning and the Bellman equations.
Let a neural network based agent act in an environment. Record its rewards. Update the neural network so that the best actions become a little more likely, and the worse actions become a little less likely. Thus, we have demonstrated a training step that improves the agent.
Now just run that training step in a loop forever and you'll improve at each step and reach Godlike intelligence, right? Right? Well, no: there is instability in "pulling yourself up by the bootstraps", and while I expect to see more GPT improvements, I believe there may be limitations we don't know about yet.
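For concreteness, here is a minimal sketch of the training step described above, on a toy 3-armed bandit rather than a real environment (plain NumPy; no claim that this is how GPT-style models are trained):

```python
import numpy as np

# A toy 3-armed bandit: the "environment" pays out 1 with a fixed hidden probability per action.
rng = np.random.default_rng(0)
true_reward_probs = np.array([0.2, 0.5, 0.8])

logits = np.zeros(3)   # stand-in for the neural network: one logit per action
lr = 0.1

for step in range(2000):
    probs = np.exp(logits) / np.exp(logits).sum()      # softmax policy
    action = rng.choice(3, p=probs)
    reward = float(rng.random() < true_reward_probs[action])

    # REINFORCE-style step: make the chosen action a little more likely when it was rewarded.
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0                          # gradient of log pi(action) w.r.t. the logits
    logits += lr * reward * grad_log_pi

print(np.round(probs, 3))   # the highest-paying arm should dominate by now
```

Each step improves the policy a little, which is exactly the seductive part; the catch the comment points at is that looping a locally-improving step is not guaranteed to keep improving forever, especially once the thing being improved also generates its own training signal.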
That sounds like technobabble, which is to say big words like "instability" and "Bellman equations," but with little connecting logic.
If there is meaning behind those words, please write something with multiple paragraphs.
I don't see any reason why there would be a limit, instability, or otherwise, any more or less than there is on biological evolution. I also don't see any reason why there wouldn't be a limit. We just have no idea at this point.
I do think a diversity of tasks is critical. An AGI should be able to not just complete human language but:
- Play a diversity of strategic games like chess or Starcraft
- Perform machine vision like Stable Diffusion
- Predict the output of Python programs (and vice-versa)
- Write (correct) mathematical proofs and arguments
- Control a robotic arm or airplane
... and so on.
We can train all of those independently -- and we could do so in a single network now -- but we're at the very early stages of architecture for how those things would integrate.
At some point, the best solution is a very sophisticated intelligence. How close deep learning in current -- and future -- architectures gets us to super-intelligence is an open question.
Absolutely. There's no doubt teams are already training multi-modal models on subsets of the videos on YouTube (there are already published examples training on Minecraft videos).
There are plenty of efforts at the moment which indicate that you’re right.
I think we’ll learn to compose models for more diverse and capable outputs, validate those outputs better, and do all of it more efficiently. The end result seems like it could be far greater than the sum of its parts.
I don't get all the negativity. Unlike most humans, an LLM can be made to discard bad or incorrect data, and to quickly transition to new ways of doing something. A human spending all their life in a room in front of a computer could not go through all the millions of scientific papers and journals. They would not look at an unlimited amount of code or go through all the past and present law of every country. No human is capable of that.
We are entering into the real culmination of the information age where previously information was available but not exactly accessible or usable for a common person. Which like all tools will result in good and bad outcomes.
Ask any LLM about the cheese inventors of far-east China. It'll make up something very plausible. Then all you have to do is dig through libraries and historic records to verify that this all really is true and is really rooted in reality.
That's why you are better served asking LLMs for factual information when they are connected to tools like search. ChatGPT with search enabled, Bing.chat or https://www.phind.com/ will give you sourced and much more reliable output.
I'm not saying it's not useful. I'm saying that once it learns the content of the data sources that are fed to it, that's it. It will spit out some overlooked connections then stop. Maybe the cure for cancer can be found just by reading in between the lines of millions of old scientific papers but I doubt it.
I predict that the fight will be more about defining intelligence than inventing it.
No one knows what AGI is. There isn't going to be some switch that flips to take us from AI to AGI. These tools we have today will just keep getting incrementally better, and some new ones will pop up, and at some point we'll have to stop and say "yeah, this is good enough to qualify". And everyone will have their own opinion on what that point is. Plenty of people even think that we are there today, and there's nothing stopping Google or OpenAI from claiming it if they want.
And you can say the exact same for consciousness, sentience, self-awareness etc.
And when we run out of clues what AI can't do yet, it's AGI?
(You imply that a "can't do yet" will remain forever, which is the open question. If you ask me, AGI is only possible if the tech has ~unlimited agency, which implies control over computer and energy production facilities.)
No matter how good an AI system gets, we can rely on Hacker News commenters highlighting hallucinations and saying "it's only doing a simple <whatever> task" while totally ignoring whatever the system is actually capable of.
I agree that the goalposts will be moving for a long time (at least for some people)
That's right, whether we call it almost intelligence or asymptotic intelligence, or just plain artifice.
There are reasons Ray Kurzweil used the term "spiritual" in the Age of Spiritual Machines. Among those reasons is that "spiritual" is much more difficult to define with any consensus among experts.
And indeed, there's an inflection point coming. What it is, is not at all clear. However, I'd predict that the answer lies in the realization that, given the limits of conversing with LLMs and GPT, there's a human-computer sensemaking loop:
The difference with this HCI is that you'd not hire a human collaborator who lied to you, whether or not they were aware of their own lack of veracity. Here, we'll burn fields full of GPUs at massive cost to get an answer, even though the outcome may be advertising the fact that the AI is wrong. There is learning, but it's going to be costly and painful.
I see this argument all the time and it perplexes me. A thing huge enough to rival the wheel, the control of fire, etc., and yet people say that we might not be sure when it's here. If it is AGI it will change the world in dramatic ways; you will not spend time arguing that we rolled things on top of logs before we had the wheel. The freaking war chariot is charging your ranks, taking your home and your family into slavery; there is no room for doubt about the impact.
Is this like when the Google self driving car director Chris Urmson said in 2015 his 11 year old son would never need to take a drivers exam?
I have a hunch when they decide who to hire for upper management, they select for whoever promises the moon. The person making the promises may not even believe it.
It's easier to call it AGI to "normies" rather than what it will probably be IRL, which is many thousands of niche disruptions via the recent specific innovations (LLMs/deep learning/GPU + AI tooling + education investments)… and, as usual, we will have a difficult time predicting what it looks like even a decade from now.
Stuff like programming assistance, new markets driven by GPT style APIs, and the mountain of productivity gains will help subsidize and accelerate the next tranche of investment until we reach the next R&D milestone that will again totally “revolutionize” the world and lead to mass employment for the 10th time this century.
I look forward to having this conversation on HN again in 5-10yrs as the world slowly gets better at a slightly faster (yet less news worthy) rate.
Translation: “I realize that Google is behind on the AI race (and on our corporate strategy in general), so I’m going to fear monger you, the investment community, with claims of “AGI soon” to attract investments and prevent our stock from plummeting.”
I am curious to do a poll of what AGI means to each person. It stands for "Artificial General Intelligence" but some have different definitions.
For some, it might mean any that is as capable as a normal human. For some, non-biological life axiomatically cannot become AGI. For some, it might require literal omniscience and omnipotence and accepting anything as AGI means, to them, that they are being told to worship it as a God. For some, it might mean something more like an AI that is more competent than the most competent human at literally every task. For some, acknowledging it means that we must acknowledge it has person-like rights. For some it cannot be AGI if it lies. For some it cannot be AGI if it makes any mistake. For some it cannot be AGI until it has more power than humans. These are several definitions and implications that are partially mutually conflicting but I have seen different people say that AGI is each different one of those.
Yeah, even the linked article acknowledges the problem while defining AGI as a computer with “human-level cognitive abilities”, which is broad enough as to be borderline useless.
Whenever a major company references AGI, as is the case here, I mentally replace the term with “Skynet”, because I expect the statement’s aim is to instill fear.
If Google ever develops AGI, you can be sure they’ll call it something else.
It seems to me that software developers are all too eager to attribute properties such as "intelligence" to LLMs, which leads to strange, reductive conclusions that this is all humans are: token-matching algorithms.
There is much to intelligence that those who’ve been studying it in areas such as biology and ecology still do not understand. The role of emotions and how they work have a strong influence on our cognitive abilities and consciousness as a whole. This is rarely ever considered a part of intelligence in the AI space.
And I think that’s rather the interesting part of all of this: we skip the artificial part and leap straight into deus ex machina. It is artificial and limited to what we choose to implement.
Even unsupervised learning isn't technically that marvellous under the hood, in the sense of being unknowable and seemingly magical.
I don't agree that we're a few years away from Data (a character from Star Trek: The Next Generation and an artificial life form), and we have no idea if we'd ever be able to implement Lore (a related character from the same show that has the benefit of being able to simulate emotions).
I don't think it is necessarily important how the human brain works.
An interesting metaphor here is that aeroplanes don't work in a similar way to birds, the end goal is flight, not having wings that flap and are covered in feathers.
Precisely. LLMs are engineered systems. Artificial.
What I keep hearing from the AGI folks is that we’re on the verge of replacing humans. That these systems, “think,” on their own and will be superior to us in every way: dangerous even!
I highly doubt they will be a danger on their own. Bing isn’t going to decide one day that it thinks you’ve been a bit distant lately and doesn’t want to answer your query until you apologize. It will answer the query because it’s an algorithm run on a computer that is designed to answer queries.
The danger of AGI still comes from people and corporations that wield them.
Although it would be very convenient if future cases against Google could absolve them of responsibility because of a “rogue AI.”
> I highly doubt they will be a danger on their own. Bing isn’t going to decide one day that it thinks you’ve been a bit distant lately and doesn’t want to answer your query until you apologize. It will answer the query because it’s an algorithm run on a computer that is designed to answer queries.
And what happens when someone designs an algorithm to run a company by itself and make money? Or to orchestrate military strategy? Or to be any sort of general purpose intelligence that tries to achieve complex goals? Once these systems prove themselves useful, there is going to be intense pressure for companies to make them smarter, and for governments to use them to gain advantages in geopolitics and military applications.
Yes, people wielding AGI is dangerous, AND it has a lot of potential to be dangerous all by itself, as soon as it becomes smarter than us. When we make something more intelligent than we are, I fear we might as well start a clock and start taking bets on how long humanity will be a) relevant and b) alive. I hope I'm wrong, that alignment turns out to be easier that I think, and that these machines turn out to be gentle, benevolent stewards of humanity, guiding us into the future. Or at least, that we have more time than I think to work on the problem.
By the way, about your scenario of Bing not answering your query until you apologize -- have you read the transcripts of Sydney? https://plainenglish.io/blog/bing-chats-sydney-do-you-believ.... It's not dangerous, except to people with existing mental health issues, but if you can see how unpredictable this tech can be when it's not smart, how do we expect to keep it on a particular safe path when it's a new form of machine intelligence smarter than we are?
Is there any way that a less intelligent species can control a significantly more intelligent one for any more than a very short time? Or even if not controlling the smarter species, just maintaining control of their own future. It feels as likely as designing a working perpetual motion device, to me.
That's probably a truer insight into the current discourse than it might initially appear. I believe when people were trying to develop flight there were people on the sidelines pointing to the lack of flapping and feathers and saying it would never work.
That is precisely the point I am making. OP says we are a long way from achieving intelligence because we don't understand many of the components of human intelligence.
We don't need to believe that artificial intelligence will arise from similar properties as human intelligence. In much the same way aeroplanes don't fly using the same mechanics as reached via evolution.
Not sure what you mean when you say a statistical model is not a neural net. If we are talking LLMs, they are neural nets.
That's the trick: it just means it can do various tasks at a human level. I think the original meaning of "general" is a reference to the human kind of general intelligence; in other words, Artificial Human Intelligence could be a better name for "true AGI".
Personhood has nothing to do with a hypothetical AGI happening. It actually boils down to whether any hypothetical architecture even leads to AGI in the first place. It's a technical question, not a political question.
First it would need to actually have independent thought to be granted anything.
Well, in real terms, isn't all it takes something that's a convincing enough forgery? That's why the Turing test was considered interesting at one point. I don't think drawing a line is necessary for us to quickly become caught off guard by developments in this space.
For the record, I don't believe we're close to AGI but I'm also pretty far from knowing anything about that field.
Seems like a no lose bet no matter the odds. If you win, life goes on and you can collect a bit of money. If you lose, money becomes worthless and the surface of the earth gets transformed into a computing substrate.
I don't know. Having interacted with LLMs at different levels, they resemble a very sophisticated, alien intelligence trying to pretend to be human. It's like me pretending to be a dog; even if I were to emulate a dog perfectly, I wouldn't have the same emotions; I'd be pretending.
We have no idea what emotions, motivations, behaviors, or goals AIs have, will have, or if they'll have something as yet unconceived that's not emotions or motivations, but just alien.
We evolved to self-preserve and breed. Modern AIs evolve to pretend to write human text. It's not clear there is any intention to survive, reproduce, or turn the surface of the earth into a computing substrate.
There's a million different dangers -- and I suspect the real ones are ones we haven't conceived of. Whether they'll materialize, or how, depends on how we evolve them, and I expect we can't predict it.
To me, much more likely than earth-as-a-computing-substrate is humans-as-brainwashed-consumers. Market forces will push for AIs to write text which draws eyeballs. Those models won't care about truth, ethics, or much of anything other than getting you addicted to reading what they write (or watching what they create). At that point, we can destroy ourselves just fine.
But even more likely is something no one has thought of.
> We have no idea what emotions, motivations, behaviors, or goals AIs have, will have, or if they'll have something as yet unconceived that's not emotions or motivations, but just alien.
AIs don't have emotions, motivations or goals.
They don't pretend, because pretending implies intent, they don't have intent. They do what they're created to do.
Humans are already brainwashed consumers. Welcome to marketing/advertising and late-stage capitalism. The ability of human beings to do what you're describing is much more effective than that of AI at present, ergo the "danger" has been here for decades. Smoking? Junk food? Radium water? Fast fashion? Equestrian ivermectin? Shall I continue?
The level of confidence both sides give here is not warranted. We have no idea about what internal structures emerged within LLMs. We only know outer behavior.
From a humanist / secular perspective, humans evolved to make babies. Emotions are an emergent behavior to maximize the number of babies made, and their survival. Nothing less, and nothing more.
What analogues emerge when we train machines not to survive but to complete text?
We have no idea.
There's the "ghost in the machine" crowd, the sentient machine crowd, and the mechanical machine crowd. None have presented any compelling evidence, but all speak with complete confidence in their hypotheses.
1) The first AGI we create will immediately break free of our control
2) It will either have already been given, or will find some way to take, control of physical systems
3) It will create the Singularity
4) Its goals will be to advance itself at our expense
None of these are remotely givens. Even if we grant that AGI is possible with our current level of technology (which is also not at all a given), the Singularity is nothing but science fiction. It's an interesting idea, but there's no real reason to think it's close to what an AGI's capabilities would be like in reality.
There's lots of science fiction that doesn't have A.G.I. destroying the planet.
Commander Data, HAL 9000 (a bad guy, but not that bad), Asimov's robots; even the all-powerful A.I. of Neuromancer has no particular ill will towards humans.
Belief in AGI hardly means you need to believe in Armageddon.
There are betting markets that have odds up for this; you generally have to pay in crypto, and the odds suck given your money is locked up for a decade plus (if they pay out at all). Importantly though, what do you think the odds are? I don't think anyone is saying 50/50 in a few years, just greater than 0%.
If an AGI means equivalent to a human (not just online, but in full robotic form, even if compute is not done in the body), then I'd put 0% in the next 5 years. Maybe 5% in 10 years, but that human equivalent will be very much limited, since it will most likely be trained using traditional methods, which means for the physical interaction part it will have to be trained on a representative subset of physical tasks, and just like ChatGPT, it will fall short in some areas.
If an AGI means something sentient enough to be able to rewrite itself to be better, I'd put 0% in the next 50 years. There needs to be a way to run accurate simulations of reality faster than reality, which is a fundamental physics problem as well as a computational one.
What is going to happen is more and more efficient information compression that will appear like AI, but under the hood it will basically be just emergent software. It will be totally possible in the future to ask an agent and get a full step-by-step plan for building a personal VTOL in your garage without specialized equipment, but all that is, really, is information from different domains rolled into one compression algorithm with efficient language-based search.
I think we might also assume that we will believe we have created it before we actually do, but then simultaneously deny that we have created it after we already have.
I wouldn't be surprised if, after this interview, his stock holdings are worth a few million dollars more. It's damage management. They, and he personally, didn't see the potential of LLMs.
The thing that's interested me about the last 5 or so years is how much it's become the case that everyone is now seeking to be a thought leader. It seems like at some point the VC crowd decided that because money was basically free they needed other ways of sourcing investment opportunities, so they started trying to become "thought leaders". Which kind of makes sense: if you do it well, the branding can be an advantage. But it runs into this problem of toxic positivity. No one wants to hear "Well, this technology, like every technology before it, is promising but unlikely to fully solve 99.99% of the problems it looks promising on today". So instead of having a real conversation about the viability of autonomous vehicles, you've got motivated actors and lobotomized supporters making end-of-days style claims about technology, and each time these claims turn out to be totally unfounded we all just move on to the next topic. Can we stop and dissect the failing gig-economy model, the web3 movement, and self-driving before we listen to those same people about AI?
That's kind of what I'd like to know, before the CEO of DeepMind opines on AI I'd like to hear his thoughts on what he learned about his last predictions.
AI, when it was originally coined, was equivalent to what is meant by AGI now. Through enough hype and misapplication of the term to any new algorithm, AI eventually acquired in practice the meaning "any new nontrivial algorithm that does things that humans can perform" (until the shine wears off, at which point it's no longer AI - even A* was originally considered AI). Looks like AGI is going to go down the same path. I wonder what the replacement term for AGI will be.
Like many discussions around AI/AGI, the comments below show just how lacking we are in even basic vocabulary... to describe & communicate basic concepts. "Superintelligence" vs. "Useful Cognitive Function" etc.
Since 'AI' and 'AGI' are so poorly defined, it's allowed marketers to make un-disprovable claims that they have it for sale. These terms are now hyperbolic synonyms for 'algorithm'.
If I'm trying to solve a generic problem, I try to break it down into pieces I'm familiar with, and barring that I will look to Google to fill the gaps. I might try to think of alternative approaches and compare them, and reflect on individual steps to try to improve them. I'll review my work and test it as I go to ensure it is actually solving the problem.
Given some of the capabilities I have seen with chain-of-thought processing, embeddings, and vector databases - it does seem to me conceivable that computers could be made to do all of this. One gap maybe is being able to pick up that the problem statement is wrong, i.e. not solving the right problem, or being aware of subtle undocumented details about the system and the business that need to be factored in. The AI needs to actually know about these subtle details somehow in order to factor them in.
Perhaps that is the big problem that remains - capturing subtle details about the world.
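To make the retrieval part concrete, here is a minimal sketch of the embeddings-plus-vector-lookup idea. It's only an illustration: embed() is a toy stand-in for a real embedding model, and the notes about billing jobs and test accounts are made-up examples of the kind of subtle, undocumented details you'd want the model to factor in.

    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Toy stand-in for a real embedding model (e.g. a sentence-transformer
        # or a hosted embedding API): a hashed bag-of-words vector.
        vec = np.zeros(256)
        for word in text.lower().split():
            vec[hash(word) % 256] += 1.0
        return vec

    # Hypothetical "subtle undocumented details", captured as short notes.
    notes = [
        "The nightly billing job assumes all timestamps are UTC.",
        "Customers on the legacy plan are billed through a separate service.",
        "The 'active user' metric excludes internal test accounts.",
    ]
    note_vectors = np.stack([embed(n) for n in notes])

    def retrieve(question: str, k: int = 2) -> list[str]:
        # Return the k notes most similar to the question (cosine similarity).
        q = embed(question)
        sims = note_vectors @ q / (
            np.linalg.norm(note_vectors, axis=1) * np.linalg.norm(q) + 1e-9
        )
        return [notes[i] for i in np.argsort(sims)[::-1][:k]]

    def build_prompt(question: str) -> str:
        # Prepend the retrieved context so the model can factor those details in,
        # then ask for step-by-step (chain-of-thought style) reasoning.
        context = "\n".join(retrieve(question))
        return f"Context:\n{context}\n\nProblem:\n{question}\n\nThink step by step."

Real systems use a proper vector database and a real embedding model, but the shape of the idea is just this: embed the details, retrieve the relevant ones, and stuff them into the prompt.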
I get that AGI has a formal definition, but if I take the words "Artificial General Intelligence" on their own, I would say they describe GPT4 perfectly. It can plan my holiday, write a cover letter, code up scripts, explain music theory to me, etc. It is artificial, clearly a form of intelligence, and has a general application.
AGI is the point when humans lose control, but between now and then they will lose a lot of white collar employment jobs. Based on the increase in productivity of anyone equipped with AI tools, we may see a lot of self-employment become more and more viable and lucrative.
I'll consider us lucky if losing control means something intelligent has taken control. Like most car accidents, it could end up just being a series of dumb mistakes.
ZiiS told us all our approaches to this problem are fruitless and will never result in anything resembling AGI. How exactly he knows this is unknown to us, because we are still researching the basic properties of the systems we are currently producing at a staggering rate.
Nevertheless it is quite a relief. We can now let this all go and return to frolicking in the meadows again.
My point was not that the current approaches are fruitless; it is that we have no information to decide if they will lead to AGI in a few years or not. Whilst this fits a dictionary definition of "possible" they are trying to make it sound "likely" with no evidence.
Excuse me, I was being snarky. I agree that we are low on information, but I guess we disagree on how low. You imply we have so little information the outcome is basically random, which I think is a bit too extreme.
I don't find it far-fetched to imagine current AI research - not just LLMs - to get us to AGI-likeness in years (5-10). I deliberately use "likeness", because I also think we will never agree on what it means and if we ever reach it.
Digressing here because no one will read it anyway, but if my knowledge of humanity is anything to go by, we will be surrounded by robots and software systems of varying levels of intelligence doing just about anything a normal human could do and we would still be discussing if we ever reach mythical AGI.
Personally I think abstract, pure form AGI is impossible, even in biological systems. We don't scale infinitely and I don't even mean in the physical "too-much-data-to-handle" sense, I mean qualitatively. I think there is a very real ceiling to what kinds of mentation are possible by us and thus what results are achievable. Again, not just quantitatively, but qualitatively.
We have examples of this in the animal realm. Some of them show signs of for all intents and purposes "general" intelligence. But they will never scale to human levels of cognition. Even simple human concepts are out of reach, yet they outperform us on other problems (maze solving, memory, etc). You cannot teach a great ape what you can teach our children. People have tried. They are thus general, because you can teach them just about anything within their "bandwidth", but they are still constrained. I very strongly suspect the same holds for us. It's just that we have no superiors (or even peers) to compare to.
Yes, 100% they will decide when they have surpassed us; not just because we will be emotional about it, but because by definition they will be better at deciding things by then.
The current approach to AGI, as seen with projects like AutoGPT and BabyAGI, is to call an LLM (GPT-4 is the current choice) recursively to solve tasks and generate useful content. Tasks are part of the output, so for every task generated the LLM is asked in turn to solve it, which can involve creating further tasks.
So far only some small wins have been seen. Common failure modes are looping and doing nothing. It's still early for these projects.
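Stripped of the memory, tools and prompt engineering those projects layer on top, the core loop is roughly the sketch below. This is only an approximation: call_llm is a placeholder for whatever model API gets plugged in (GPT-4 in their case), and the step cap is there because, as noted, these loops are prone to spinning forever.

    from collections import deque
    from typing import Callable

    def run_agent(objective: str,
                  call_llm: Callable[[str], str],
                  max_steps: int = 20) -> list[str]:
        # Naive AutoGPT/BabyAGI-style loop: work on a task, harvest any new
        # sub-tasks the model proposes, repeat until the queue is empty or capped.
        tasks = deque([objective])
        results = []
        for _ in range(max_steps):
            if not tasks:
                break
            task = tasks.popleft()
            result = call_llm(
                f"Objective: {objective}\nCurrent task: {task}\n"
                "Solve the task, or break it into smaller steps."
            )
            results.append(result)
            follow_ups = call_llm(
                "Given the result below, list any remaining sub-tasks, "
                f"one per line (or nothing if done):\n{result}"
            )
            tasks.extend(t.strip() for t in follow_ups.splitlines() if t.strip())
        return results

The looping/doing-nothing failure mode falls straight out of this structure: nothing forces the sub-task queue to shrink.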
I'm confused about what the goal of "AI" is anymore. Does anyone else feel similar?
Researchers seem fixated on creating something like "a really smart person or intellectual slave", and LLMs make it seem like we're close to that, I guess?
Is there not a whole lot more usefulness in solving actual problems with "AI", without the potential risks and baggage of creating, I don't know, a super-capable idiot savant?
The approach DeepMind was taking before entering this sort of "talking computer" arms race seemed actually quite useful and less "disruptive". Things like AlphaFold are more in line with what I was hoping to see going forwards. Now I just don't really know what the plan is.
Is it to one day be able to sit at your computer and say things like: "I want you to solve cancer" and "replace my secretary, she's too expensive", "build as many paper clips as possible but don't hurt anyone?", and expect that it will kind of do it? Do we expect that everyone will freely have access to these systems or just a few people? Is this sustainable and practical long term?
In one interview I saw, Ilya Sutskever claimed every town, state and country will have some sort of AI representative, and in another he basically stated we might not have to work and could become "enlightened beings". Why on earth do we need an AGI to become "enlightened beings"? Has this gentleman never read any literature on "enlightenment"?
I'm really struggling to see how the type of world some of these researchers seem to be gunning for is actually ideal or even wise even if achievable, especially given the mounting levels of anxiety around LLMs and their implications. Are the ethics ever actually considered?
It seems like "building AGI" amounts to building something (loosely defined) which could have a lot of negative unintended side effects, but for what now? Just intellectual curiosity? Fulfilling a sci-fi fetish?
Personally, I think this is what has spooked Geoff Hinton: he has seen an acceleration toward something, but he realizes we have zero idea what to do when, or if, we build the "something". He now realizes the military or bad actors will take advantage of this AGI(?), and he might be alive to have to see the consequences of that.
>Why on earth do we need an AGI to become "enlightened beings", has this gentlemen never read any literature on "enlightenment"?
I think you might be misconstruing his definition of 'enlightenment'. In this instance I believe he is referring to humanity having the time and freedom for the pursuit of knowledge and intellectual reasoning. There are a number of instances in fictional literature where authors describe worlds in which civilization has achieved a state of enlightenment and people dedicate their lives exclusively to science, philosophy and the arts (Olaf Stapledon's Last and First Men and Star Maker provide several examples).
>but for what now? Just intellectual curiosity?
The purpose is partly the exercise of understanding it. But beyond that, if we can produce an AGI which is capable of digesting massive amounts of knowledge and reasoning over it, we might be able to achieve certain goals like fusion energy sooner than expected. IMO solving the energy problem is in itself worth the risk, as our uncontrolled use of fossil fuels is an existential threat to life on this planet, not to mention it would rebalance the distribution of wealth and power around the world.
As for Hinton, I think you are spot on about why he reacted this way. However, I facepalm every time the 'killer robots' example comes up, because it is the worst example of the threats these things pose compared to the following:
- AI being used to manipulate public opinion on a massive scale creating a new age of demagoguery.
- AI being used to fabricate different realities within an information sphere by generating text, imagery and audio/video media supporting a certain set of narratives. Essentially a type of super propaganda.
- AI being granted trust by humans to perform certain functions which it is not capable of, because humans are too ignorant to subject it to proper scrutiny.
- AI being used to manipulate either markets, financial institutions or economies at a massive scale, dramatically shifting the balance of power around the world.
Essentially, the general public's concept of warfare needs to be broadened to consider non-kinetic conflict. Warfare is effectively a pursuit in the change of policy in other nations(or groups) by many different means. The last resort of these methods is inevitably a kinetic engagement, where all other approaches have failed to achieve the desired result.
Altering this viewpoint allows us to reassess what we consider to be weapons rather than tools, and gives us a better idea of where we should expect these threats to emerge first.
I don't think there's been any progress on AGI, actually. Even though there's a lot of powerful AI programs out there, they're really nothing more than complex tools created to solve problems. A few other commenters mentioned that there's no good definition for what AGI would actually mean, but a good starting point that I think we can all agree on is that an AGI would need to have some level of autonomy. That being said, even if you did make an AI that was autonomous, I still think that it's a leap to call it an AGI. The way I look at it, if you did all the math by hand, you wouldn't call it intelligent, so you shouldn't call it intelligent if the math was done by a computer.
I don't think that doing the math by hand to simulate a machine intelligence makes it not intelligent, any more than doing a simulation of all the electrochemical signals in a human brain by hand would make a human not intelligent. Aside from the infeasible amount of time it would take, it's the same thing.
As for autonomy, LLMs don't have autonomy by themselves. But they can be pretty easily combined with other systems, connected to the outside world, in a way that seems pretty darned autonomous to me. (https://github.com/Significant-Gravitas/Auto-GPT). And that's basically a duct-tape-and-string version. Given how new these LLMs are, it's likely we're barely scratching the surface.
Several comments have touched on this, but wouldn't you define an AGI as something that could learn to drive a car? I think every (reasonable) person understands self-driving is >10 years away. So shouldn't AGI be?
Self-driving cars are colloquially held to a much higher standard than human drivers are. Nothing currently exists that meets that standard. But humans meet the GI standard of AGI. And strong evidence shows that humans exist.
Thus, the standard for AGI is lower than the standard for self-driving cars. Existence of an AGI does not imply existence of self-driving cars.
To put it another way: take the worst human driver who still qualifies as "could learn to drive a car." (It's assumed the person is intelligent.) Construct a Turing-like test where we observe a car possibly being driven by that person, but we don't know who or what is actually driving the car. The car drives over a curb, kills a cat crossing the street, narrowly misses a dozen pedestrians in crosswalks, and finally parks poorly in a mall parking lot -- just like we would expect from the world's worst human driver. After observing that car trip, would we celebrate the long-awaited arrival of self-driving cars?
Aren't the best self driving cars causing traffic jams in SF because they can't handle everyday tasks? And can any self driving car handle suboptimal environments like snow? Or just unmarked roads? Or... left turns in traffic?
Just because you can't handle every situation that could occur when driving it doesn't mean you don't have the ability to drive. I can't parallel park, but I would still consider myself able to drive. I make mistakes driving where I'm not supposed to yet I'm still a driver.
But you're not stopping in the middle of the street for long periods in confusion because of a common situation. There's levels to this and the self driving cars are not yet at an acceptable level.
Not necessarily. It could be that AGI exists and can learn how to drive a car, but we don't have good enough cameras/LIDARs/other sensors to make it work. Or the computation/battery power needed to do so isn't possible to fit in a car.
AGI is about inventing the brain, while the car is an entire body.
Disabled people who have functional brains still have substantial body function (otherwise, their brain – which is part of the body and dependent on a number of other parts – would stop functioning.)
“G” means general. If it can learn to drive a car, build a car, sell cars, write poetry about cars, and teach a class about the history of cars, then it’s scratching the surface of AGI, because there’s more to life than cars.
5 years ago L5 autonomy was being released any day now, but now nobody talks about it anymore. It seems like a handful of self driving cars have been driving around in circles in a couple cities forever now and everyone just keeps giving the same demo.
I tried to classify images using TensorFlow and Inception with a pre-trained model, and it was awfully slow. I asked almost everywhere for a faster, less accurate model; no answer (one lighter option is sketched below).
I'm starting to suspect that GPU vendors are the ones selling AI.
If you think about it, software is often used by the hardware industry to increase sales; it's Wirth's law. Seems like that's the norm now.
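On the original complaint about slow Inception inference: assuming the TensorFlow/Keras route, one common answer is to swap in a lighter pre-trained ImageNet model such as MobileNetV2, which gives up some top-1 accuracy for a model that is several times smaller and noticeably faster on CPU than InceptionV3. A minimal sketch (the image filename is made up):

    import numpy as np
    import tensorflow as tf

    # MobileNetV2: a smaller, faster ImageNet classifier than InceptionV3,
    # at the cost of some accuracy.
    model = tf.keras.applications.MobileNetV2(weights="imagenet")

    def classify(path: str, top: int = 3):
        # Load and preprocess the image at the model's expected 224x224 input size.
        img = tf.keras.preprocessing.image.load_img(path, target_size=(224, 224))
        x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...]
        x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
        preds = model.predict(x)
        # Map the 1000-way softmax back to human-readable ImageNet labels.
        return tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=top)[0]

    print(classify("some_image.jpg"))  # hypothetical file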
These people are confusing learning with intelligence. The latter is a property of a conscious being, while many more beings have the ability to learn; basic examples would be genetic adaptation, or immediate environmental adaptations in plants, neither of which could be considered intelligence.
Learning is an automating activity, when you've learnt something it means you can do it without being intelligent about it, on 'autopilot' as it were.
A crucial aspect of intelligence is the denial of learning when appropriate, i.e. not acting according to learning, which seems to be a conscious, prefrontal cortex related ability, and hence closely connected with anxiety and suspicion.
Evolutionarily learning works well with slow changes in evolutionary pressure, intelligence gets useful when the pressure is erratic and evolutionary signals are unreliable in mediating information about how to survive and reproduce efficiently.
The AGI believers in the capitalist class aren't alone in confusing this; learning is much simpler and more immediately rewarding than intelligence, so the bourgeoisie has made learning their ideal for all of modernity and replaced educational exams based on more or less intelligent conversation with formal measurements of learning.
Always found it interesting that the GPT-4 "Sparks of AGI" paper was written mid-2022, meaning they probably knew of everything Twitter AI is discussing 1.5 years ago. Explains why it feels like Microsoft is cranking out so much software these days.