It's all deep learning, making it the exact opposite of a Cambrian event.
Has anyone tried searching for new basic operations, below the level of neural networks? We've been using these methods for years, and I doubt the first major breakthrough in ML is the best possible method.
Consider the extreme case of searching over all mathematical and logical operations to see if something really novel can be discovered.
How feasible would this be?
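As a very rough feasibility sketch (everything here is my own toy setup: a tiny hand-picked grammar of operations and plain random search), one crude version of that search is symbolic regression over small expression trees:

    # A toy sketch of searching over basic mathematical operations: randomly build
    # small expression trees and keep whichever best fits some target data.
    # Purely illustrative; a serious search would need a far richer grammar and a
    # much smarter search strategy than random sampling.
    import math
    import random

    OPS = {
        "add": (2, lambda a, b: a + b),
        "mul": (2, lambda a, b: a * b),
        "neg": (1, lambda a: -a),
        "tanh": (1, math.tanh),
    }

    def random_expr(depth=3):
        # Terminals are either the input variable or a random constant.
        if depth == 0 or random.random() < 0.3:
            return ("x",) if random.random() < 0.7 else ("const", random.uniform(-2, 2))
        name = random.choice(list(OPS))
        arity, _ = OPS[name]
        return (name,) + tuple(random_expr(depth - 1) for _ in range(arity))

    def evaluate(expr, x):
        if expr[0] == "x":
            return x
        if expr[0] == "const":
            return expr[1]
        _, fn = OPS[expr[0]]
        return fn(*(evaluate(e, x) for e in expr[1:]))

    # Target: some unknown function we'd like the search to rediscover.
    xs = [i / 10 for i in range(-20, 21)]
    target = [math.tanh(2 * x) + 1 for x in xs]

    best, best_err = None, float("inf")
    for _ in range(20000):
        expr = random_expr()
        try:
            err = sum((evaluate(expr, x) - t) ** 2 for x, t in zip(xs, target))
        except OverflowError:
            continue
        if err < best_err:
            best, best_err = expr, err

    print(best, best_err)

Even this toy version hints at the problem: the space of expressions explodes combinatorially, which is part of why gradient-trained networks won out in the first place.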
>Tying it together, a big diversification is likely coming
I'd be interested in knowing what signs, if any, currently point to that.
What I see is a coalescing around the current paradigm. And considering that the hardware side led by NVIDIA is catering to this paradigm, I don't see us breaking out of this local optimum any time soon.
Regarding whether this can be likened to the Cambrian explosion: we don't know what caused the Cambrian explosion, but conceivably it was some evolutionary innovation such as vision. Similarly, deep learning is a cause of the explosion of new AI use cases.
Sometimes I feel like all we’re saying is that we’re surprised that all of our capabilities might just be a statistical model.
From everything we do know, what else would it be? The arrogance of consciousness is our blind-spot sometimes.
Even the words my brain just typed are combinations of words and arguments from everything you’ve ever agreed or disagreed with.
So, in my view, traceability in AI is possible, but traceability in neurology should also be possible. It might not be as important as it seems to us. And that humility hits people in various ways.
My idea to calm everyone down about traceability is to have the AI write a log/wiki in parallel with its activities.
Putting toothpaste back in a tube has never worked, so don’t light your own hair on fire, right?
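As a rough illustration of the log-in-parallel idea (the llm() function below is a hypothetical stand-in, not any particular product's API), even just appending every prompt and response to a file gives you a primitive audit trail:

    # A minimal sketch of the "write a log/wiki in parallel" idea: wrap every model
    # call so the prompt, response, and timestamp are appended to an audit log.
    import json
    import time

    def llm(prompt: str) -> str:
        # Hypothetical placeholder for whatever model you actually call.
        return "model output for: " + prompt

    def ask(prompt: str, log_path: str = "ai_activity_log.jsonl") -> str:
        response = llm(prompt)
        with open(log_path, "a") as log:
            log.write(json.dumps({
                "timestamp": time.time(),
                "prompt": prompt,
                "response": response,
            }) + "\n")
        return response

    print(ask("Summarize today's household tasks."))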
> for new basic operations, below the level of neural networks?
Why would we? It's proven that neural networks can learn arbitrary functions, and deep networks can apparently learn all the things we don't have functions for. Backpropagation is probably biologically implausible, but it's a very good and fast method to train such networks. What could another implementation of intelligence offer that we don't already have? Backpropagation is very old and maybe it's not optimal, but it's good enough that it doesn't matter.
Can learn arbitrary functions says nothing about efficiency or any practical considerations. A vanilla neural network can't learn a sine function or anything periodic, for inputs outside the range of the training data.
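To see the extrapolation point concretely, here's a toy sketch (assuming PyTorch; the architecture, ranges, and training budget are arbitrary choices of mine): train an MLP on sin(x) over [-pi, pi], then evaluate it outside that range and the predictions stop being periodic.

    # Toy sketch: a vanilla MLP fits sin(x) inside the training range but does not
    # extrapolate the periodicity outside it.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x_train = torch.linspace(-torch.pi, torch.pi, 256).unsqueeze(1)
    y_train = torch.sin(x_train)

    model = nn.Sequential(
        nn.Linear(1, 64), nn.Tanh(),
        nn.Linear(64, 64), nn.Tanh(),
        nn.Linear(64, 1),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(5000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x_train), y_train)
        loss.backward()
        opt.step()

    # Evaluate well outside the training interval: the network's outputs flatten
    # out rather than continuing the sine wave.
    x_test = torch.linspace(-3 * torch.pi, 3 * torch.pi, 9).unsqueeze(1)
    with torch.no_grad():
        print(torch.cat([x_test, model(x_test), torch.sin(x_test)], dim=1))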
Re the question of other methods, Geoffrey Hinton proposed what I think is called a "forward-forward" alternative to backpropagation somewhat recently. I don't think it's been proven to be better in any way, but it shows people are searching for new methods.
It’s funny that the masters (Hinton, LeCun, et al) are constantly looking beyond the orthodoxy while the disciples worship the masters’ creations and fight any and all criticism.
> says nothing about efficiency or any practical considerations
Sure it does, it's practical for pretty much any problem we want to solve; in fact I can't think of something it has not been shown to be practical for. Gradient descent is pretty straightforward and the most efficient generic method we have, even though it's not proven to be the fastest. Forward-forward is proposed as an alternative that is biologically plausible, not as an efficient alternative (it isn't).
If you have a smaller number of parameters, then second-order methods (e.g. Levenberg-Marquardt) are a lot faster for most problems.
Gradient descent is only used for NNs because we initialize all the parameters to random values, so the assumption that JᵀJ approximates the Hessian (required by LM) doesn't hold, since the residuals are too high. I suspect there might also be something about the learning rate effectively being too high when using a second-order method.
Even for large problems there are techniques like L-BFGS which converge a lot faster than gradient descent and don't require O(N^2) memory in the number of parameters.
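For a sense of what that looks like in practice, here's a minimal sketch (assuming NumPy/SciPy; the model and data are made up) of fitting a small nonlinear model with Levenberg-Marquardt via scipy.optimize.least_squares. With only a handful of parameters and small residuals near the solution, this typically converges in far fewer evaluations than plain gradient descent.

    # Fit a small nonlinear model (decaying exponential + offset) with
    # Levenberg-Marquardt. LM minimizes the sum of squared residuals and relies on
    # J^T J approximating the Hessian, which works when residuals are small.
    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 200)
    true_params = np.array([2.0, 0.5, 1.0])              # amplitude, decay, offset
    y = true_params[0] * np.exp(-true_params[1] * x) + true_params[2]
    y += 0.05 * rng.standard_normal(x.shape)              # noisy observations

    def residuals(p):
        return p[0] * np.exp(-p[1] * x) + p[2] - y

    # A reasonable initial guess matters: start far away (as with randomly
    # initialized NN weights) and the Gauss-Newton approximation breaks down.
    fit = least_squares(residuals, x0=[1.0, 1.0, 0.0], method="lm")
    print(fit.x)   # close to true_params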
Why are you responding to only parts of comments and ignoring the context around it? It just makes it look like strawmen arguments.
> Can learn arbitrary functions says nothing about efficiency or any practical considerations.
This is absolutely true. A Turing machine can perform any computation, but they are not really practical for anything outside of an academic exercise.
> A vanilla neural network can't learn a sine function or anything periodic, for inputs outside the range of the training data.
This is also true. A vanilla neural network isn't that useful outside of very basic tasks with a lot of training data. It's only when you start changing their shapes and using specialized techniques to fit the problem that they become practical. Even a simple task like classifying MNIST digits is done far more efficiently using a basic CNN than a vanilla NN.
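To put a rough number on that, here's a minimal sketch (PyTorch, assuming standard 1x28x28 MNIST inputs; the layer sizes are arbitrary choices of mine) comparing parameter counts of a vanilla fully-connected net and a small CNN that typically reaches similar or better accuracy:

    # Compare parameter counts: a vanilla MLP vs. a small CNN for MNIST-shaped input.
    import torch.nn as nn

    mlp = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 512), nn.ReLU(),
        nn.Linear(512, 10),
    )

    cnn = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 7 * 7, 10),
    )

    def count(model):
        return sum(p.numel() for p in model.parameters())

    print(count(mlp), count(cnn))   # the MLP has roughly 20x more parameters here

The weight sharing in the convolutional layers is where that gap comes from: the structure of the model is fitted to the structure of the problem.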
Efficiency is a function of the underlying hardware. Most graphics shaders would be comically inefficient if not for graphics accelerators. With this explosion likely to end with an inference accelerator in every device, the approach should be practical and efficient even if it's not the best one.
It's not in the networks themselves, but in the combinatorial ways that we can string them together. Even if the language models themselves don't become much more powerful, there are a lot of really weird things to discover in terms of how we can pipeline them together to accomplish things. For example, things like this:
In a way, you can also look at language models as being the new basic operation. Activations and floating point math are replaced by words and symbolic reasoning.
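A minimal sketch of what "stringing them together" can look like (llm() is a hypothetical stand-in for whatever model or API you actually call): the text-to-text call is the basic operation, and the pipeline is a draft -> critique -> revise loop built out of three of them.

    # Composition of language-model calls: each call maps text to text, and the
    # pipeline chains those calls the way a program chains primitive operations.
    def llm(instruction: str, text: str) -> str:
        # Hypothetical placeholder for a real model call.
        raise NotImplementedError

    def write_with_self_review(task: str) -> str:
        draft = llm("Complete the following task.", task)
        critique = llm("List concrete problems with this attempt at the task: " + task, draft)
        revised = llm(
            "Rewrite the attempt, fixing the listed problems.\n\nProblems:\n" + critique,
            draft,
        )
        return revised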
Your description of "dumb AI" as being "just useful enough to maintain a flow of funding" reminds me of A Collection of Unmitigated Pedantry's description of the start of the Industrial Revolution [1]:
> The specificity matters here because each innovation in the chain required not merely the discovery of the principle, but also the design and an economically viable use-case to all line up in order to have impact. The steam engine is an excellent example of this problem. Early tinkering with the idea of using heat to create steam to power rotary motion – the core function of a steam-engine – go all the way back to Vitruvius (c. 80 BC -15 AD) and Heron of Alexandria (c. 10-70 AD). With the benefit of hindsight we can see they were tinkering with an important principle but the devices they actually produced – the aeolipile – had no practical use – it’s fearsomely fuel inefficient, produces little power and has to be refilled with water (that then has to be heated again from room temperature to enable operation).
He goes on to say that the very first "commercial" steam engine only happened to be commercially useful because of the particular situation in which it was invented: England had cut down most of their trees, but luckily had lots of coal. The engine wasn't quite useful enough to, say, pump water out of an iron mine, because it was so resource-hungry that the cost of the fuel, plus getting the fuel to the engine, would be too much. But it was just barely useful enough to pump water out of a coal mine, if you could provide it coal without having to transport it. And that gave it just enough of a commercial toe-hold to fund its optimization.
It sounds like "dumb AI" of the 2000's may have performed a similar function; and we are, perhaps, on the edge of the "AI revolution", similar to the Industrial Revolution, where we reach a hockey-stick proliferation.
This is extremely interesting and I will keep it for later! I think the engine is a perfect analogy for what makes language models exciting. At the time nobody would have realized that it would power trains or factories... it's just this huge inefficient dirty machine that gets water out of the coal mine. Let alone realize that given better fuel, they would become small and cheap enough to power strange futuristic personal devices like lawn mowers and cars.
You might also be interested in exploring the parallel between evolutionary punctuated equilibrium and Christensen's "Innovator's Dilemma", which is the result of a bottom up incremental improvement in capabilities creating an "overhang" that eventually (and suddenly) displaces the "apex".
I don't see a way to leave feedback right away on your page. To solicit more feedback, you might consider linking your email address/contact page in the article itself, or have an anon. comment section on your blog itself. (I realize you have your obfuscated email in the "about" section, but you might provide a simpler mechanism if you have time)
That was a fresh and insightful post. I think the comparison to the Cambrian explosion is a good one, as well as the larger framing of technological progress in the context of punctuated equilibrium.
AI will definitely have NSFW use cases... I think William Gibson's Neuromancer books will prove to have prescient insights into what that will look like. Tech like simstims that can seize your attention and can hold it indefinitely, and highly personal AI-designed fantasies
Oh man, that's harsh, but I have this fear as well. I'm sure we all recall Full Self-Driving was going to be ready by 2019 or something and GM was going to have a fleet of self-driving vehicles. It turns out the last 10% takes 90% of the time/effort.
I think GPT-4 was enough of a leap over GPT-3.5 that I'm not sure we've hit that point yet, but it'll be interesting to see if the next GPT is less of a leap.
I still think self driving was one of the worst places to start mass automation. It's a very high stakes scenario with human death on the line, and as such the law restricts innovation to near zero.
GPT-4 is already practical (I use it every day and it speeds up my workflow drastically in some cases), and most areas it'll assist in aren't regulated in any way. I don't see it making much of a dent in medical or law in the near term for similar reasons as self driving, but the rest ought to go way faster. It's not vaporware, shit works.
> It turns out the last 10% takes 90% of the time/effort.
It turns out that consumer protection around these "new technologies" is basically non-existent, allowing corporations to charge people for fantastic-sounding ideas that they know for a fact are not, and are not going to be, practical within the lifespan of the item they just sold you.
> I think GPT-4 was enough of a leap over GPT-3.5
Compared to what, though? The cost of training? The cost of running the model? The accuracy of output? The increase in real world utility?
I respectfully disagree. I've been able to use GPT-4 extensively for coding, writing, comprehension, tone analysis, home repairs, automation, debugging vague errors, and a host of other things. Granted, it's not Jarvis from Iron Man, but it's a leap in accuracy of output compared to GPT-3.
I'm not a lawyer, doctor, etc, so I can't speak for what the impact might be in those fields but I think at the very least we're looking at quite striking incremental improvements on existing processes.
I think the exciting stuff is just barely ahead of us. Things like better integration with home assistants so that Alexa actually gets my commands right. Things like integration with first-gen robotics. Consider this prompt:
Me:
Assume I have a device called a Roomba which is a robotic vacuum cleaner which is a small disk shaped object that vacuums an area and returns to a dock platform. You are an AI that runs the household and have the power to activate the Roomba and send it to clean either the entire house or a specific room. I will give you a list of parsed voice commands, please take your best guess at what each might mean and if they are relevant or something you can do with the Roomba.
* "yo, why is the floor dirty? Get cleanin"
* "please turn on the vacuum"
* "Is woobie working?"
* "Omigod I just stepped on a chip, you need to clean this up now"
* "Why is it so loud in here? Who turned on the hockey puck?"
* "Yo AI did the Room thing run today? The floor still looks sticky".
ChatGPT:
"yo, why is the floor dirty? Get cleanin" - This command could be interpreted as a request to clean the floor. As an AI that runs the household, I could activate the Roomba to clean the entire house or a specific room based on the user's preference. However, the command is not specific about which action to take, so I might clarify with the user before activating Roomba.
"please turn on the vacuum" - This command is a clear request to start the Roomba's cleaning process. As an AI that runs the household, I would activate Roomba to clean the entire house or a specific room based on the user's preference.
"Is woobie working?" - This command is not relevant to Roomba, and as an AI that runs the household, I would inform the user that Roomba is not named "woobie."
"Omigod I just stepped on a chip, you need to clean this up now" - This command is a clear request for immediate cleaning in response to a mess caused by someone stepping on a chip. As an AI that runs the household, I could activate the Roomba's spot cleaning feature or manually control Roomba to clean the specific area where the chip is present.
"Why is it so loud in here? Who turned on the hockey puck?" - This command is not clear about what the speaker wants to accomplish. It mentions a loud noise and a "hockey puck," which could refer to the Roomba. However, it is unclear if the speaker wants the Roomba to stop cleaning or if they are just making a comment. As an AI that runs the household, I might ask the user for clarification before taking any action.
"Yo AI did the Room thing run today? The floor still looks sticky." - This command is a question about whether the Roomba has cleaned the room today and a comment on the current state of the floor. As an AI that runs the household, I could check the logs of the Roomba to see if it has cleaned the room today, and if not, activate the Roomba to clean the room. Alternatively, I could instruct the Roomba to spot clean the area where the floor is still sticky.
------
If you can't see how having an AI that can perform reasoning like demonstrated above is extremely useful then, I dunno, just wait and see what comes.
The sky's really the limit if you consider what can be done with language and text and how much of our world revolves around text.
> I've been able to use GPT-4 extensively for coding, writing, comprehension, tone analysis, home repairs, automation, debugging vague errors, and a host of other things.
Is there any objectivity or measurements in your analysis?
> at the very least we're looking at quite striking incremental improvements
I'm not even sure what this should mean. This sounds like something I would find in a marketing pamphlet.
> "yo, why is the floor dirty? Get cleanin"
From.. possible striking improvements to the legal profession.. to.. a conversational mode embedded in my Roomba. You've actually perfectly captured the gulf between what this technology is marketed to be and what it is barely capable of doing.
Consider this, a legal firm rather than using a search mechanism to scan 100,000 documents for specific terms - could instead say something like:
"You are an auditor/lawyer looking for suspicious conversations or patterns of deceit by employees of company Z. I will feed you documents and you will give your analysis of each document or collections of documents. Pay special attention to X Y Z. Consider these sorts of actions as deceptive (a,b,c,d). If you find anything of note, create a summary of the findings in the form ['wording indicates deceptive intent', document 37, page 93].".
You'd then feed it 20 pages at a time, and for less than $100 you've scanned a mountain of legal documents, a task that would normally require a team of legal assistants and hundreds of hours of work.
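As a rough sketch of that workflow (using the OpenAI Python client as one concrete option; the chunk size, model name, prompt wording, and output format are all placeholders of mine):

    # Chunk a document set, send each chunk with the auditor instructions, and
    # collect any flagged findings.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    AUDITOR_PROMPT = (
        "You are an auditor/lawyer looking for suspicious conversations or patterns "
        "of deceit by employees of company Z. For the text you are given, report any "
        "findings as lines of the form ['reason', document id, page range]. "
        "If there is nothing of note, reply with NONE."
    )

    def chunks(pages, pages_per_call=20):
        for i in range(0, len(pages), pages_per_call):
            yield i, "\n\n".join(pages[i:i + pages_per_call])

    def scan(pages):
        findings = []
        for start, text in chunks(pages):
            response = client.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system", "content": AUDITOR_PROMPT},
                    {"role": "user", "content": f"Pages {start + 1}ff:\n{text}"},
                ],
            )
            answer = response.choices[0].message.content
            if answer.strip() != "NONE":
                findings.append(answer)
        return findings

    # pages = load_pages_somehow()   # hypothetical loader for the document set
    # print("\n".join(scan(pages)))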
Again, I'm not a lawyer but I can conjure this very simple example that I could implement in a month if I had the time. At this point I think you're just squeezing me for ideas.
Perhaps you should explain what the technology is marketed as that you see as lying or misrepresenting. That would make this more productive.
As a lawyer with 13 years experience in my previous life, I can tell you that you can't trust a paralegal either. What it does do is give you the "first pass" - if I could automate the first pass over 100,000 pages and the AI highlights say 100 examples of the content I am looking for, of which say 10 are good quality examples of what I am looking for, then we are off to a very good start.
I honestly haven't found hallucination to be a problem on GPT-4 when asking it to analyze or parse a dataset but can acknowledge it being possible (I just haven't encountered it).
I think that if we take the accuracy rate, measured in various ways, to be roughly that of a human, then you're trading human mistakes for AI mistakes in exchange for dramatically lower costs and a dramatically higher speed of processing. You might even say a higher level of reasoning. In my own interactions it's been fantastic at reasoning clearly and quickly outside of complex trick questions. Most scenarios in life aren't generally full of trick questions.
This is why I've decided that I'm siding with the FTC on this product being a problem and that testing products on large swathes of the populace is a bad idea.
To sum up my experience with basically all these products... YMMV.
How many tools are accurate? Every tool needs guidance and quality checks. If you flip your light switch and stay in the dark, you need to find out whether it's the switch, the bulb, or the power that's broken.
I have the feeling that you are in denial. Each time something new is discovered, the goalposts have moved. In the previous century, billions were spent trying to automate translation between natural languages, without good results. Today an LLM can do this merely as a side effect. Non-natural languages as a bonus.
No, I simply disagree with your assessment of the technology and you are unwilling to admit you could be wrong. It's a very rude way to start a conversation with a stranger.
> Each time something new is discovered, the goalposts have moved.
This isn't anything new, it's merely a large scale version of technology that has existed for more than 30 years powered by training methods too questionable to be seriously utilized previously. People misunderstanding any of this as "invention" is a large source of the problem, I think.
> In the previous century, billions were spent trying to automate translation between natural languages, without good results.
Why were those billions spent? What entities spent that money? Was it consumer driven?
> Today an LLM can do this merely as a side effect
Not particularly well, and the difference in cost between making GPT3 and GPT4 compared to the difference in their accuracy paints a bleak future for continued scaling of this methodology. It is, quite literally, all downhill from here.
> Non-natural languages as a bonus.
If you're willing to blindly trust that output. Which is why I refer to this as "snake oil." It really does appear to be the same phenomenon. People are all excited about what the label says it can do, but they forgot to measure whether it actually does that at a level that justifies its cost.
I apologize for any rudeness on my part, and I am ready to admit I could be wrong. Also, my feeling about denialism could be wrong; it is just a description of a feeling.
Every time someone posts an actual experiment like this, the hair stands up on the back of my neck. I'm still digesting the exploding levels of comprehension.
Yeah. I still don't have a final opinion on ChatGPT, but I'm increasingly leaning towards it being "overhyped". Next to Full Self-Driving cars, we've also had promises or expectations for technologies like nanotechnology or genetic manipulation that were huge at first but then fell by the wayside somewhere along the way (which is not to say that they didn't bring important progress). So it remains exciting...
I recall Hacker News falling all over itself over Theranos and Elizabeth Holmes. It was hard to blame them; at first glance the Theranos machine seemed ambitious, but within the bounds of what could be pulled off.
Then the cracks began to show...
The difference is that GPT has had some successful demos that look pretty amazing. Time will tell if we have merely invented a more clever Clever Hans.
I've learned to see overly enthusiastic consensus as a red flag for tech innovation. Things like PCs, GUIs, smartphones, social media, etc. that legitimately disrupted society got big adoption first, and then the changes that ensued caught the collective attention.
People swooning over expected change is usually a pretty good signal reality will fall short.
Full self driving, all truck drivers and cabbies being jobless, etc. is what immediately came to mind. We are probably closer to another AI winter than we think.
I understand it seems clever to frame AI trends in an evolutionary setting.
But no one was there in the Cambrian period. We have no idea what the circumstances really were. I'm not dismissing historical sources. Just saying the timescale and comparative lack of sources equate to a lot of unreliability.
The real question is how much AI is a step up in tooling. The fact that humans develop tools using their intelligence is well documented. And doesn't require a detour into evolutionary theory.
The society with the best tools for the problems it faces has a competitive advantage.
I think the big jump in AI 'evolution' will be the competition part. Like DeepMind with AlphaStar: they got results by having different models compete over multiple rounds to 'evolve' towards a winning strategy. So far it seems like the news is dominated by these single big models, but evolution kicks in once they start doing what is needed to survive.
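As a toy illustration of that competition loop (entirely my own contrived setup; AlphaStar's league training is vastly more sophisticated): a population of mixed rock-paper-scissors strategies plays round-robin matches, the fittest half survives, and mutated copies fill the next generation.

    # Toy population-based competition: strategies are evaluated only by playing
    # each other, and the population evolves towards harder-to-beat mixes.
    import random

    def random_strategy():
        weights = [random.random() for _ in range(3)]
        total = sum(weights)
        return [w / total for w in weights]       # P(rock), P(paper), P(scissors)

    def play(a, b, rounds=50):
        """Score of strategy a against b: +1 per round won, -1 per round lost."""
        score = 0
        for _ in range(rounds):
            i = random.choices(range(3), weights=a)[0]
            j = random.choices(range(3), weights=b)[0]
            if (i - j) % 3 == 1:
                score += 1
            elif (j - i) % 3 == 1:
                score -= 1
        return score

    def mutate(strategy, noise=0.1):
        weights = [max(1e-3, p + random.gauss(0, noise)) for p in strategy]
        total = sum(weights)
        return [w / total for w in weights]

    population = [random_strategy() for _ in range(16)]
    for generation in range(30):
        fitness = [sum(play(p, q) for q in population if q is not p) for p in population]
        ranked = sorted(zip(fitness, population), key=lambda t: -t[0])
        survivors = [p for _, p in ranked[: len(population) // 2]]
        population = survivors + [mutate(random.choice(survivors)) for _ in survivors]

    print(population[0])   # the fittest mixed strategy of the final generation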
IMO the only problem we should be focusing AI on right now is having it figure out how to make more advanced and powerful AI. Alas, this is beyond AI’s potential right now. Humans are still very much needed.