It's all deep learning, making it the exact opposite of a Cambrian event.
Has anyone tried searching for new basic operations, below the level of neural networks? We've been using these methods for years, and I doubt the first major breakthrough in ML is the best possible method.
Consider the extreme case of searching over all mathematical and logical operations to see if something really novel can be discovered.
How feasible would this be?
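As a very rough feasibility sketch (everything here is my own toy setup: a tiny hand-picked grammar of operations and plain random search), one crude version of that search is symbolic regression over small expression trees:

    # A toy sketch of searching over basic mathematical operations: randomly build
    # small expression trees and keep whichever best fits some target data.
    # Purely illustrative; a serious search would need a far richer grammar and a
    # much smarter search strategy than random sampling.
    import math
    import random

    OPS = {
        "add": (2, lambda a, b: a + b),
        "mul": (2, lambda a, b: a * b),
        "neg": (1, lambda a: -a),
        "tanh": (1, math.tanh),
    }

    def random_expr(depth=3):
        # Terminals are either the input variable or a random constant.
        if depth == 0 or random.random() < 0.3:
            return ("x",) if random.random() < 0.7 else ("const", random.uniform(-2, 2))
        name = random.choice(list(OPS))
        arity, _ = OPS[name]
        return (name,) + tuple(random_expr(depth - 1) for _ in range(arity))

    def evaluate(expr, x):
        if expr[0] == "x":
            return x
        if expr[0] == "const":
            return expr[1]
        _, fn = OPS[expr[0]]
        return fn(*(evaluate(e, x) for e in expr[1:]))

    # Target: some unknown function we'd like the search to rediscover.
    xs = [i / 10 for i in range(-20, 21)]
    target = [math.tanh(2 * x) + 1 for x in xs]

    best, best_err = None, float("inf")
    for _ in range(20000):
        expr = random_expr()
        try:
            err = sum((evaluate(expr, x) - t) ** 2 for x, t in zip(xs, target))
        except OverflowError:
            continue
        if err < best_err:
            best, best_err = expr, err

    print(best, best_err)

Even this toy version hints at the problem: the space of expressions explodes combinatorially, which is part of why gradient-trained networks won out in the first place.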
>Tying it together, a big diversification is likely coming
I'd be interested in knowing what signs, if any, currently point to that.
What I see is a coalescing around the current paradigm. And considering that the hardware side led by NVIDIA is catering to this paradigm, I don't see us breaking out of this local optimum any time soon.
Regarding whether this can be likened to the Cambrian explosion: we don't know what caused the Cambrian explosion, but conceivably it was some evolutionary innovation such as vision. Similarly, deep learning is a cause of the explosion of new AI use cases.
Sometimes I feel like all we’re saying is that we’re surprised that all of our capabilities might just be a statistical model.
From everything we do know, what else would it be? The arrogance of consciousness is our blind-spot sometimes.
Even the words my brain just typed are combinations of words and arguments from everything you’ve ever agreed or disagreed with.
So, in my view, traceability in AI is possible, but traceability in neurology should also be possible. It might not be as important as it seems to us. And that humility hits people in various ways.
My idea to calm everyone down about traceability is to have the AI write a log/wiki in parallel with its activities.
Putting toothpaste back in a tube has never worked, so don’t light your own hair on fire, right?
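As a rough illustration of the log-in-parallel idea (the llm() function below is a hypothetical stand-in, not any particular product's API), even just appending every prompt and response to a file gives you a primitive audit trail:

    # A minimal sketch of the "write a log/wiki in parallel" idea: wrap every model
    # call so the prompt, response, and timestamp are appended to an audit log.
    import json
    import time

    def llm(prompt: str) -> str:
        # Hypothetical placeholder for whatever model you actually call.
        return "model output for: " + prompt

    def ask(prompt: str, log_path: str = "ai_activity_log.jsonl") -> str:
        response = llm(prompt)
        with open(log_path, "a") as log:
            log.write(json.dumps({
                "timestamp": time.time(),
                "prompt": prompt,
                "response": response,
            }) + "\n")
        return response

    print(ask("Summarize today's household tasks."))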
> for new basic operations, below the level of neural networks?
Why would we? It's proven that neural networks can learn arbitrary functions, and deep networks can apparently learn all the things we don't have functions for. Backpropagation is probably biologically implausible, but it's a very good and fast method to train such networks. What could another implementation of intelligence offer that we don't already have? Backpropagation is very old and maybe it's not optimal, but it's good enough that it doesn't matter.
Can learn arbitrary functions says nothing about efficiency or any practical considerations. A vanilla neural network can't learn a sine function or anything periodic, for inputs outside the range of the training data.
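To see the extrapolation point concretely, here's a toy sketch (assuming PyTorch; the architecture, ranges, and training budget are arbitrary choices of mine): train an MLP on sin(x) over [-pi, pi], then evaluate it outside that range and the predictions stop being periodic.

    # Toy sketch: a vanilla MLP fits sin(x) inside the training range but does not
    # extrapolate the periodicity outside it.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x_train = torch.linspace(-torch.pi, torch.pi, 256).unsqueeze(1)
    y_train = torch.sin(x_train)

    model = nn.Sequential(
        nn.Linear(1, 64), nn.Tanh(),
        nn.Linear(64, 64), nn.Tanh(),
        nn.Linear(64, 1),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(5000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x_train), y_train)
        loss.backward()
        opt.step()

    # Evaluate well outside the training interval: the network's outputs flatten
    # out rather than continuing the sine wave.
    x_test = torch.linspace(-3 * torch.pi, 3 * torch.pi, 9).unsqueeze(1)
    with torch.no_grad():
        print(torch.cat([x_test, model(x_test), torch.sin(x_test)], dim=1))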
Re the question of other methods, Geoffrey Hinton proposed what I think is called a "forward-forward" alternative to backpropagation somewhat recently. I don't think it's been proven to be better in any way, but it shows people are searching for new methods.
It’s funny that the masters (Hinton, LeCun, et al) are constantly looking beyond the orthodoxy while the disciples worship the masters’ creations and fight any and all criticism.
> says nothing about efficiency or any practical considerations
Sure it does, it's practical for pretty much any problem we want to solve; in fact I can't think of something it has not been shown to be practical for. Gradient descent is pretty straightforward and the most efficient generic method we have, even though it's not proven to be the fastest. Forward-forward is proposed as an alternative that is biologically plausible, not as an efficient alternative (it isn't).
If you have a smaller number of parameters, then second-order methods (e.g. Levenberg-Marquardt) are a lot faster for most problems.
Gradient descent is only used for NNs because we initialize all the parameters to random values, so the assumption that JᵀJ approximates the Hessian (required by LM) doesn't hold, since the residuals are too high. I suspect there might also be something about the learning rate effectively being too high when using a second-order method.
Even for large problems there are techniques like L-BFGS which converge a lot faster than gradient descent and don't require O(N^2) memory in the number of parameters.
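For a sense of what that looks like in practice, here's a minimal sketch (assuming NumPy/SciPy; the model and data are made up) of fitting a small nonlinear model with Levenberg-Marquardt via scipy.optimize.least_squares. With only a handful of parameters and small residuals near the solution, this typically converges in far fewer evaluations than plain gradient descent.

    # Fit a small nonlinear model (decaying exponential + offset) with
    # Levenberg-Marquardt. LM minimizes the sum of squared residuals and relies on
    # J^T J approximating the Hessian, which works when residuals are small.
    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 200)
    true_params = np.array([2.0, 0.5, 1.0])              # amplitude, decay, offset
    y = true_params[0] * np.exp(-true_params[1] * x) + true_params[2]
    y += 0.05 * rng.standard_normal(x.shape)              # noisy observations

    def residuals(p):
        return p[0] * np.exp(-p[1] * x) + p[2] - y

    # A reasonable initial guess matters: start far away (as with randomly
    # initialized NN weights) and the Gauss-Newton approximation breaks down.
    fit = least_squares(residuals, x0=[1.0, 1.0, 0.0], method="lm")
    print(fit.x)   # close to true_params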
Why are you responding to only parts of comments and ignoring the context around it? It just makes it look like strawmen arguments.
> Can learn arbitrary functions says nothing about efficiency or any practical considerations.
This is absolutely true. A Turing machine can perform any computation, but they are not really practical for anything outside of an academic exercise.
> A vanilla neural network can't learn a sine function or anything periodic, for inputs outside the range of the training data.
This is also true. A vanilla neural network isn't that useful outside of very basic tasks with a lot of training data. It's only when you start changing their shapes and using specialized techniques to fit the problem that they become practical. Even a simple task like classifying MNIST digits is done far more efficiently using a basic CNN than a vanilla NN.
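To put a rough number on that, here's a minimal sketch (PyTorch, assuming standard 1x28x28 MNIST inputs; the layer sizes are arbitrary choices of mine) comparing parameter counts of a vanilla fully-connected net and a small CNN that typically reaches similar or better accuracy:

    # Compare parameter counts: a vanilla MLP vs. a small CNN for MNIST-shaped input.
    import torch.nn as nn

    mlp = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 512), nn.ReLU(),
        nn.Linear(512, 10),
    )

    cnn = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 7 * 7, 10),
    )

    def count(model):
        return sum(p.numel() for p in model.parameters())

    print(count(mlp), count(cnn))   # the MLP has roughly 20x more parameters here

The weight sharing in the convolutional layers is where that gap comes from: the structure of the model is fitted to the structure of the problem.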
Efficiency is a function of the underlying hardware. Most graphics shaders would be comically inefficient if not for graphics accelerators. With this explosion likely to end with an inference accelerator in every device, the approach should be practical and efficient even if it's not the best one.
It's not in the networks themselves, but in the combinatorial ways that we can string them together. Even if the language models themselves don't become much more powerful, there are a lot of really weird things to discover in terms of how we can pipeline them together to accomplish things. For example, things like this:
In a way, you can also look at language models as being the new basic operation. Activations and floating point math are replaced by words and symbolic reasoning.
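A minimal sketch of what "stringing them together" can look like (llm() is a hypothetical stand-in for whatever model or API you actually call): the text-to-text call is the basic operation, and the pipeline is a draft -> critique -> revise loop built out of three of them.

    # Composition of language-model calls: each call maps text to text, and the
    # pipeline chains those calls the way a program chains primitive operations.
    def llm(instruction: str, text: str) -> str:
        # Hypothetical placeholder for a real model call.
        raise NotImplementedError

    def write_with_self_review(task: str) -> str:
        draft = llm("Complete the following task.", task)
        critique = llm("List concrete problems with this attempt at the task: " + task, draft)
        revised = llm(
            "Rewrite the attempt, fixing the listed problems.\n\nProblems:\n" + critique,
            draft,
        )
        return revised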
Your description of "dumb AI" as being "just useful enough to maintain a flow of funding" reminds me of A Collection of Unmitigated Pedantry's description of the start of the Industrial Revolution [1]:
> The specificity matters here because each innovation in the chain required not merely the discovery of the principle, but also the design and an economically viable use-case to all line up in order to have impact. The steam engine is an excellent example of this problem. Early tinkering with the idea of using heat to create steam to power rotary motion – the core function of a steam-engine – go all the way back to Vitruvius (c. 80 BC -15 AD) and Heron of Alexandria (c. 10-70 AD). With the benefit of hindsight we can see they were tinkering with an important principle but the devices they actually produced – the aeolipile – had no practical use – it’s fearsomely fuel inefficient, produces little power and has to be refilled with water (that then has to be heated again from room temperature to enable operation).
He goes on to say that the very first "commercial" steam engine only happened to be commercially useful because of the particular situation in which it was invented: England had cut down most of their trees, but luckily had lots of coal. The engine wasn't quite useful enough to, say, pump water out of an iron mine, because it was so resource-hungry that the cost of the fuel, plus getting the fuel to the engine, would be too much. But it was just barely useful enough to pump water out of a coal mine, if you could provide it coal without having to transport it. And that gave it just enough of a commercial toe-hold to fund its optimization.
It sounds like "dumb AI" of the 2000's may have performed a similar function; and we are, perhaps, on the edge of the "AI revolution", similar to the Industrial Revolution, where we reach a hockey-stick proliferation.
This is extremely interesting and I will keep it for later! I think the engine is a perfect analogy for what makes language models exciting. At the time nobody would have realized that it would power trains or factories... it's just this huge inefficient dirty machine that gets water out of the coal mine. Let alone realize that given better fuel, they would become small and cheap enough to power strange futuristic personal devices like lawn mowers and cars.
You might also be interested in exploring the parallel between evolutionary punctuated equilibrium and Christensen's "Innovator's Dilemma", which is the result of a bottom up incremental improvement in capabilities creating an "overhang" that eventually (and suddenly) displaces the "apex".
I don't see a way to leave feedback right away on your page. To solicit more feedback, you might consider linking your email address/contact page in the article itself, or have an anon. comment section on your blog itself. (I realize you have your obfuscated email in the "about" section, but you might provide a simpler mechanism if you have time)
That was a fresh and insightful post. I think the comparison to the Cambrian explosion is a good one, as well as the larger framing of technological progress in the context of punctuated equilibrium.
AI will definitely have NSFW use cases... I think William Gibson's Neuromancer books will prove to have prescient insights into what that will look like. Tech like simstims that can seize your attention and can hold it indefinitely, and highly personal AI-designed fantasies
Oh man, that's harsh, but I have this fear as well. I'm sure we all recall Full Self-Driving was going to be ready by 2019 or something and GM was going to have a fleet of self-driving vehicles. It turns out the last 10% takes 90% of the time/effort.
I think GPT-4 was enough of a leap over GPT-3.5 that I'm not sure we've hit that point yet, but it'll be interesting to see if the next GPT is less of a leap.
I still think self driving was one of the worst places to start mass automation. It's a very high stakes scenario with human death on the line, and as such the law restricts innovation to near zero.
GPT-4 is already practical (I use it every day and it speeds up my workflow drastically in some cases), and most areas it'll assist in aren't regulated in any way. I don't see it making much of a dent in medical or law in the near term for similar reasons as self driving, but the rest ought to go way faster. It's not vaporware, shit works.
> It turns out the last 10% takes 90% of the time/effort.
It turns out that consumer protection around these "new technologies" is basically non-existent, allowing corporations to charge people for fantastic-sounding ideas that they know for a fact are not, and are not going to be, practical within the lifespan of the item they just sold you.
> I think GPT-4 was enough of a leap over GPT-3.5
Compared to what, though? The cost of training? The cost of running the model? The accuracy of output? The increase in real world utility?
I respectfully disagree. I've been able to use GPT-4 extensively for coding, writing, comprehension, tone analysis, home repairs, automation, debugging vague errors, and a host of other things. Granted, it's not Jarvis from Iron Man, but it's a leap in accuracy of output compared to GPT-3.
I'm not a lawyer, doctor, etc, so I can't speak for what the impact might be in those fields but I think at the very least we're looking at quite striking incremental improvements on existing processes.
I think the exciting stuff is just barely ahead of us. Things like better integration with home assistants so that Alexa actually gets my commands right. Things like integration with first-gen robotics. Consider this prompt:
Me:
Assume I have a device called a Roomba which is a robotic vacuum cleaner which is a small disk shaped object that vacuums an area and returns to a dock platform. You are an AI that runs the household and have the power to activate the Roomba and send it to clean either the entire house or a specific room. I will give you a list of parsed voice commands, please take your best guess at what each might mean and if they are relevant or something you can do with the Roomba.
* "yo, why is the floor dirty? Get cleanin"
* "please turn on the vacuum"
* "Is woobie working?"
* "Omigod I just stepped on a chip, you need to clean this up now"
* "Why is it so loud in here? Who turned on the hockey puck?"
* "Yo AI did the Room thing run today? The floor still looks sticky".
ChatGPT:
"yo, why is the floor dirty? Get cleanin" - This command could be interpreted as a request to clean the floor. As an AI that runs the household, I could activate the Roomba to clean the entire house or a specific room based on the user's preference. However, the command is not specific about which action to take, so I might clarify with the user before activating Roomba.
"please turn on the vacuum" - This command is a clear request to start the Roomba's cleaning process. As an AI that runs the household, I would activate Roomba to clean the entire house or a specific room based on the user's preference.
"Is woobie working?" - This command is not relevant to Roomba, and as an AI that runs the household, I would inform the user that Roomba is not named "woobie."
"Omigod I just stepped on a chip, you need to clean this up now" - This command is a clear request for immediate cleaning in response to a mess caused by someone stepping on a chip. As an AI that runs the household, I could activate the Roomba's spot cleaning feature or manually control Roomba to clean the specific area where the chip is present.
"Why is it so loud in here? Who turned on the hockey puck?" - This command is not clear about what the speaker wants to accomplish. It mentions a loud noise and a "hockey puck," which could refer to the Roomba. However, it is unclear if the speaker wants the Roomba to stop cleaning or if they are just making a comment. As an AI that runs the household, I might ask the user for clarification before taking any action.
"Yo AI did the Room thing run today? The floor still looks sticky." - This command is a question about whether the Roomba has cleaned the room today and a comment on the current state of the floor. As an AI that runs the household, I could check the logs of the Roomba to see if it has cleaned the room today, and if not, activate the Roomba to clean the room. Alternatively, I could instruct the Roomba to spot clean the area where the floor is still sticky.
------
If you can't see how having an AI that can perform reasoning like demonstrated above is extremely useful then, I dunno, just wait and see what comes.
The sky's really the limit if you consider what can be done with language and text and how much of our world revolves around text.
> I've been able to use GPT-4 extensively for coding, writing, comprehension, tone analysis, home repairs, automation, debugging vague errors, and a host of other things.
Is there any objectivity or measurements in your analysis?
> at the very least we're looking at quite striking incremental improvements
I'm not even sure what this should mean. This sounds like something I would find in a marketing pamphlet.
> "yo, why is the floor dirty? Get cleanin"
From.. possible striking improvements to the legal profession.. to.. a conversational mode embedded in my Roomba. You've actually perfectly captured the gulf between what this technology is marketed to be and what it is barely capable of doing.
Consider this, a legal firm rather than using a search mechanism to scan 100,000 documents for specific terms - could instead say something like:
"You are an auditor/lawyer looking for suspicious conversations or patterns of deceit by employees of company Z. I will feed you documents and you will give your analysis of each document or collections of documents. Pay special attention to X Y Z. Consider these sorts of actions as deceptive (a,b,c,d). If you find anything of note, create a summary of the findings in the form ['wording indicates deceptive intent', document 37, page 93].".
You'd then feed it 20 pages at a time, and for less than $100 you've scanned a mountain of legal documents, a task that would normally require a team of legal assistants and hundreds of hours of work.
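As a rough sketch of that workflow (using the OpenAI Python client as one concrete option; the chunk size, model name, prompt wording, and output format are all placeholders of mine):

    # Chunk a document set, send each chunk with the auditor instructions, and
    # collect any flagged findings.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    AUDITOR_PROMPT = (
        "You are an auditor/lawyer looking for suspicious conversations or patterns "
        "of deceit by employees of company Z. For the text you are given, report any "
        "findings as lines of the form ['reason', document id, page range]. "
        "If there is nothing of note, reply with NONE."
    )

    def chunks(pages, pages_per_call=20):
        for i in range(0, len(pages), pages_per_call):
            yield i, "\n\n".join(pages[i:i + pages_per_call])

    def scan(pages):
        findings = []
        for start, text in chunks(pages):
            response = client.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system", "content": AUDITOR_PROMPT},
                    {"role": "user", "content": f"Pages {start + 1}ff:\n{text}"},
                ],
            )
            answer = response.choices[0].message.content
            if answer.strip() != "NONE":
                findings.append(answer)
        return findings

    # pages = load_pages_somehow()   # hypothetical loader for the document set
    # print("\n".join(scan(pages)))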
Again, I'm not a lawyer but I can conjure this very simple example that I could implement in a month if I had the time. At this point I think you're just squeezing me for ideas.
Perhaps you should explain what the technology is marketed as that you see as lying or misrepresenting. That would make this more productive.
As a lawyer with 13 years experience in my previous life, I can tell you that you can't trust a paralegal either. What it does do is give you the "first pass" - if I could automate the first pass over 100,000 pages and the AI highlights say 100 examples of the content I am looking for, of which say 10 are good quality examples of what I am looking for, then we are off to a very good start.
I honestly haven't found hallucination to be a problem on GPT-4 when asking it to analyze or parse a dataset but can acknowledge it being possible (I just haven't encountered it).
I think that if we take the accuracy rate, measured in various ways, to be roughly that of a human, then you're trading human mistakes for AI mistakes in exchange for dramatically lower costs and a dramatically higher speed of processing. You might even say a higher level of reasoning. In my own interactions it's been fantastic at reasoning clearly and quickly outside of complex trick questions. Most scenarios in life aren't generally full of trick questions.
This is why I've decided that I'm siding with the FTC on this product being a problem and that testing products on large swathes of the populace is a bad idea.
To sum up my experience with basically all these products... YMMV.
How many tools are accurate? Every tool needs guidance and quality checks. If you flip your light switch and stay in the dark, you need to find out whether it's the switch, the bulb, or the power that's broken.
I have the feeling that you are in denial. Each time something new is discovered, the goalposts have moved. In the previous century, billions were spent trying to automate translation between natural languages, without good results. Today an LLM can do this merely as a side effect. Non-natural languages as a bonus.
No, I simply disagree with your assessment of the technology and you are unwilling to admit you could be wrong. It's a very rude way to start a conversation with a stranger.
> Each time something new is discovered, the goalposts have moved.
This isn't anything new, it's merely a large scale version of technology that has existed for more than 30 years powered by training methods too questionable to be seriously utilized previously. People misunderstanding any of this as "invention" is a large source of the problem, I think.
> In the previous century, billions were spent trying to automate translation between natural languages, without good results.
Why were those billions spent? What entities spent that money? Was it consumer driven?
> Today an LLM can do this merely as a side effect
Not particularly well, and the difference in cost between making GPT3 and GPT4 compared to the difference in their accuracy paints a bleak future for continued scaling of this methodology. It is, quite literally, all downhill from here.
> Non-natural languages as a bonus.
If you're willing to blindly trust that output. Which is why I refer to this as "snake oil." It really does appear to be the same phenomenon. People are all excited about what the label says it can do, but they forgot to measure whether it actually does that at a level that justifies its cost.
I apologize for any rudeness on my part, and I am ready to admit I could be wrong. Also, my feeling about denialism could be wrong; it is just a description of a feeling.
Every time someone posts an actual experiment like this, the hair stands up on the back of my neck. I'm still digesting the exploding levels of comprehension.
Yeah. I still don't have a final opinion on ChatGPT, but I'm increasingly leaning towards it being "overhyped". Next to Full Self-Driving cars, we've also had promises or expectations for technologies like nanotechnology or genetic manipulation that were huge at first but then fell by the wayside somewhere along the way (which is not to say that they didn't bring important progress). So it remains exciting...
I recall Hacker News falling all over itself over Theranos and Elizabeth Holmes. It was hard to blame them; at first glance the Theranos machine seemed ambitious, but within the bounds of what could be pulled off.
Then the cracks began to show...
The difference is that GPT has had some successful demos that look pretty amazing. Time will tell if we have merely invented a more clever Clever Hans.
I've learned to see overly enthusiastic consensus as a red flag for tech innovation. Things like PCs, GUIs, smartphones, social media, etc. that legitimately disrupted society got big adoption first, and then the changes that ensued caught the collective attention.
People swooning over expected change is usually a pretty good signal reality will fall short.
Full self driving, all truck drivers and cabbies being jobless, etc. is what immediately came to mind. We are probably closer to another AI winter than we think.
I understand it seems clever to frame AI trends in an evolutionary setting.
But no one was there in the Cambrian period. We have no idea what the circumstances really were. I'm not dismissing historical sources. Just saying the timescale and comparative lack of sources equate to a lot of unreliability.
The real question is how much AI is a step up in tooling. The fact that humans develop tools using their intelligence is well documented. And doesn't require a detour into evolutionary theory.
The society with the best tools for the problems it faces has a competitive advantage.
I think the big jump in AI 'evolution' will be the competition part. Like DeepMind with AlphaStar: they got results by having different models compete over multiple rounds to 'evolve' towards a winning strategy. So far it seems like the news is dominated by these single big models, but evolution kicks in once they start doing what is needed to survive.
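As a toy illustration of that competition loop (entirely my own contrived setup; AlphaStar's league training is vastly more sophisticated): a population of mixed rock-paper-scissors strategies plays round-robin matches, the fittest half survives, and mutated copies fill the next generation.

    # Toy population-based competition: strategies are evaluated only by playing
    # each other, and the population evolves towards harder-to-beat mixes.
    import random

    def random_strategy():
        weights = [random.random() for _ in range(3)]
        total = sum(weights)
        return [w / total for w in weights]       # P(rock), P(paper), P(scissors)

    def play(a, b, rounds=50):
        """Score of strategy a against b: +1 per round won, -1 per round lost."""
        score = 0
        for _ in range(rounds):
            i = random.choices(range(3), weights=a)[0]
            j = random.choices(range(3), weights=b)[0]
            if (i - j) % 3 == 1:
                score += 1
            elif (j - i) % 3 == 1:
                score -= 1
        return score

    def mutate(strategy, noise=0.1):
        weights = [max(1e-3, p + random.gauss(0, noise)) for p in strategy]
        total = sum(weights)
        return [w / total for w in weights]

    population = [random_strategy() for _ in range(16)]
    for generation in range(30):
        fitness = [sum(play(p, q) for q in population if q is not p) for p in population]
        ranked = sorted(zip(fitness, population), key=lambda t: -t[0])
        survivors = [p for _, p in ranked[: len(population) // 2]]
        population = survivors + [mutate(random.choice(survivors)) for _ in survivors]

    print(population[0])   # the fittest mixed strategy of the final generation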
IMO the only problem we should be focusing AI on right now is having it figure out how to make more advanced and powerful AI. Alas, this is beyond AI’s potential right now. Humans are still very much needed.