The Georgia Tech Knowledge-Based AI course involved building a program to answer Raven's Progressive Matrices questions. The course was offered in the online MS program, so thousands of students have taken it. The most impressive result I saw was one student who got nearly perfect results in about 25 lines of Python code.
This may be a case where humans do well on the test, but you can also do very well on the test without doing anything the way a human would. The fact that GPTs aren't very good at the test probably isn't evidence that they're not really very smart, but conversely, fixing them to do very well on the test wouldn't mean they'd gotten any smarter.
Obviously “Python” was used… and then it used numpy, because the image format in the assignment was numpy arrays. However, the 25 lines were basically “sum the rows and sum the columns, then compare those vectors”, or something like that. This wasn’t really a case of all the complexity being hidden in a dependency; it was a case of finding a very simple heuristic that made the problem trivial.
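If memory serves, the spirit of it was something like the sketch below (assuming, as in the assignment, that each cell and each candidate answer arrives as a 2D numpy array; this is my reconstruction, not the student's actual code):

```python
import numpy as np

def signature(img):
    # Compress an image into its row sums and column sums.
    return np.concatenate([img.sum(axis=1), img.sum(axis=0)])

def pick_answer(a, b, c, options):
    # For a 2x2 matrix [A B / C ?], predict the missing cell's signature
    # as C shifted by the A-to-B change, then pick the closest candidate.
    target = signature(c) + (signature(b) - signature(a))
    scores = [np.abs(signature(o) - target).sum() for o in options]
    return int(np.argmin(scores))
```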
At least one, obviously, and a rather large one at that.
The point of the comment you replied to is that conventional CV software can recognize the patterns in tests like Raven's Progressive Matrices just fine, and a simple logic tree can then solve them, while the LLM approach is still struggling to get the same result.
This is a commonplace shortcoming of the current generation of LLMs: ironically, they often fail at tasks which computers can do perfectly using conventional software.
There are an infinite number of algorithms to compute A from Q, given a set of (Q, A) pairs. Almost none, surely, are intelligent.
These proxy measures of intelligence are just arguments from ignorance: "I don't know how the machine computed A from Q, therefore...".
But of course some of us do know how the machine did it; we can quite easily describe the algorithm. It just turns out no one wants to, because it's really dumb.
Especially if the algorithm is, as in all ML: "start with billions/trillions of data points in the (Q, A) space; generate a compressed representation ZipQA; and for a novel Q', find the decompressed A located close to the Q most similar to Q'".
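Caricatured in code (everything here is hypothetical; `similarity` stands in for closeness in whatever compressed representation training actually produces):

```python
from difflib import SequenceMatcher

# A toy (Q, A) store standing in for "billions/trillions of data points".
pairs = [
    ("what is the boiling point of water", "100 C"),
    ("capital of france", "Paris"),
]

def similarity(a, b):
    # Stand-in for distance in the learned, compressed representation.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def answer(q_new):
    # For a novel Q', return the A whose stored Q is most similar.
    q, a = max(pairs, key=lambda p: similarity(p[0], q_new))
    return a

print(answer("what's the capital of France?"))  # -> "Paris"
```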
There are no theories of intelligence which would label that intelligence.
And let me say, most such "theories" are ad-hoc PR that are rigged to make whatever the latest gizmo "intelligent".
Any plausible theory begins from the initial intuition that intelligence is what you do when you don't know what you're doing.
What is more special about how human intelligence works? In the end we are all particles and it all could be trillions of data points very simplistically interacting with each other resulting in emergent behaviour and complex, intelligent results.
We know how common molecules can interact with each other. Does this mean that anything built on top of them is not intelligent?
No argument with the particles/neurons/matter approach to the subject.
It is sound and if you look at us compositionally there is nothing magic about whats going on.
There is, though, something about intuition or instinctual behavior which can constantly recombine/reapply itself to a task at hand.
I know many will balk at intuition, and maybe it's at very best a heuristic, but I think we need to at least unravel what it is and how it operates before we can understand what makes something classify as human-like intelligence. Is it merely executing a process which we can put our minds into with practice, or is it demonstrating something more general and higher-level?
Well look, compared to the electrified bits of sand in my laptop, I'd strongly defend pregnancy as something vastly more "magical", if those are the terms we must use.
Organic adaptation, sensory-motor adaptation, somatosensory representation building... i.e., all those things which ooze-and-grow so that a piano player can play, or so we can type here... are these magic?
Well, I think it's exactly the opposite. It's a very anti-intellectual nihilism to hold that all that need be known about the world is the electromagnetic properties of silicon-based transistors.
Those who use the word "magic" in this debate are really like atheists about the moon. It all sounds very smart to deny the moon exists, but in the end, it's actually just a lack of knowledge dressed up as enlightened cynicism.
There are more things to discover in a single cell of our body than we have ever known, and may ever know. All the theories of science needed to explain its operation would exhaust every page we have ever printed. We know a fraction of what we need to know.
And each bit of that fraction reveals an entire universe of "magical" processes unreplicated by copper wires or silicon switches.
You make good points. I think it's a typical trait of the way computer scientists and programmers tend to think. Computer science has made great strides over the decades through abstraction, as well as distillation of complex systems into simpler properties that can easily be computed.
As a result of the combination of this method of thinking and the Dunning-Kruger effect, people in our field tend to apply this to the entire world, even where it doesn't fit very well, like biology, geopolitics, sociology, psychology, etc.
You see a lot of this on HN. People who seem to think they've figured out some very deep truth about another field that can be explained in one hand-waving paragraph, when really there are lots of important details they're ignoring that make their ideas trivially wrong.
Economists have a similar thing going on, I feel. Though I'm not an economist.
As an aside, both my parents are prominent economists, I myself have a degree in economics, and I have spent much of my life with a bird's-eye view of the economics profession, and I can emphatically confirm that your feeling is correct.
Economics is zoology presented in the language of physics. Economists are monkeys who've broken into the uniform closet and are now dressed as zookeepers.
I aspire, at best, to be one of the children outside the zoo laughing. I fear I might be the monkey who stole the key...
Remember always, computer science is just discrete mathematics with some automatic whiteboards. It is not science.
And that's the heart of the problem. The CSci crowd have a somewhat well-motivated inclination to treat abstractions as real objects of study; but have been severely misdirected by learning statistics without the scientific method.
This has created a monster: the abstract objects of study are just the associations statistics makes available.
You mix those two together and you have flat-out pseudoscience.
Not sure I agree in this regard. We are, after all, aiming to create a mental model which describes reproducible steps for creating general intelligence.
That is, the product is ultimately going to be some set of abstractions or another.
I am not sure what more scientific method you could propose. And we can, in this field, produce actual reproducible experiments. Really, more so than any other field.
There's nothing to replicate. ML models are associative statistical models of historical data.
There are no experimental conditions, no causal properties, no modelled causal mechanisms, no theories at all. "Replication" means that you can reproduce an experiment designed to validate a causal hypothesis.
Fitting a function to data isn't an experiment; it's just a way of compressing the data into a more efficient representation. That's all ML is. There are no explanations here (of the data) to assess.
I do think there's an empirical study of ML models and that could be a science. Its output could include things like,
"the reason prompt Q generates A1..An is because documents D1..Dn were in the training data; these documents were created by people P1..Pn for reasons R1..Rn. The answer A1..An related to D1..Dn in so-and-so way. The quality of the answers is Q1..Qn, and derives from the properties of the documents generated by people with beliefs/knowledge/etc. K1..Kn"
This explains how the distribution of the weights produces useful output by giving the causal process that leads to training data distributions.
The relationship between the weights and the training data itself is *not* causal.
E.g., X = 0,1,2,3; Y = A,A,B,B; f(x; w) = A if x <= w else B.
w = 1 because the rule x <= 1 partitions Y such that the likelihood of the data under w is maximised. These are statistical and logical relationships ("partitions", "maximises").
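Spelled out as a sketch, the whole "fit" is just counting:

```python
# The toy model above: choose w so the rule x <= w best partitions Y.
# There is no causal content anywhere in this procedure.
X = [0, 1, 2, 3]
Y = ["A", "A", "B", "B"]

def f(x, w):
    return "A" if x <= w else "B"

def accuracy(w):
    return sum(f(x, w) == y for x, y in zip(X, Y)) / len(X)

w = max(X, key=accuracy)
print(w)  # 1 -- the rule x <= 1 classifies every point correctly
```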
A causal relationship relates a causal property of an object (extended in space and time) to another causal property via a physical mechanism that reliably and necessarily brings about some effect.
So, "the heat of the boiling water cooked the carrot because heat is... the energetic motion of molecules ... and cooking is .... and so heating brings about cooking necessarily because..."
heating, water, cooking, carrot, motion, molecules, etc. -- their relationships here are not abstract; they are concretely in space and time, causally affecting each other, etc.
So what do you call the process of discovering those causal properties?
Was physics not actually a science until we uncovered quarks, since we weren’t sure what caused the differences in subatomic particles?
(I’m not a physicist, but I hope that illustrates my point)
Keep in mind most ML papers on arxiv are just describing phenomena we find with these large statistical models.
Also there’s more to CS than ML.
You're conflating the need to use physical devices to find relationships, with the character of those relationships.
I need to use my hand, a pen and paper to draw a mathematical formula. That formula (say, 2+2=4) expresses no causal relationships.
The whole field of computer science is largely concerned with abstract (typically logical) relationships between mathematical objects; or in the case of ML, statistical ones.
Computer science has no scientific methodology for producing scientific explanations -- it isn't science. It is science only in the old German sense of "a systematic study".
Scientists conduct experiments in which they hold fixed some causal variables (i.e., causally efficacious physical properties) and vary others, according to an explanatory framework. They do this in order to explore the space of possible explanations.
I can think of no case in the whole field of CS in which causal variables are held fixed, since there is no study of them. Computer science does not study even voltage, or silicon, or anything as physical objects with causal properties (that is electrical engineering, physics, etc.).
Computer science ought just be called "applied discrete mathematics"
I see where you’re coming from, but I think there’s more to it than that, specifically with non-determinism.
So if I observe some phenomenon in a bit of software that was built to translate language, say the ability to summarize text.
Then I dig into that software and decide to change a specific portion of it, keeping all other aspects of the software and its runtime the same, and then I notice it’s no longer able to summarize text.
In that case I’ve discovered a causal relationship between the portion I changed and the phenomenon of text summarization.
Even though the program was constructed, there are unknown aspects.
How is that not the process of science?
Sorry if this is just my question from earlier, rephrased, but I still don’t see how this isn’t a scientific method.
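To make my example concrete, the intervention might look like the following (a toy, randomly initialized network; the point is only the method: hold everything fixed, silence one part, observe the effect):

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 1))

def forward(x, ablate=None):
    h = np.maximum(0, x @ W1)   # hidden layer (ReLU)
    if ablate is not None:
        h[:, ablate] = 0.0      # the intervention: silence one unit
    return h @ W2

x = rng.normal(size=(5, 4))
baseline = forward(x)
for unit in range(8):
    delta = np.abs(forward(x, ablate=unit) - baseline).mean()
    print(f"unit {unit}: mean output change {delta:.3f}")
```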
Intuition is a process of slight blind guesses in a system that was built/proposed by a similar organism, in a way that resembles previous systems. Once you get into things like advanced physics, advanced biology, etc., intuition evaporates. Remember those SR/GR things? How intuitive were they? I’d say current AI is pure intuition in this Q'-ZipQA-A' sense, because all it does is blindly guess the descent path.
It's wild how this is alwayyysss the argument. It's just "oh, so you think humans are special!" >:| and a gigantic "what-if".
It's a purely emotional provocation and a universe-sized leap, not an argument for LLMs having intelligence or sentience. Anything could be anything, wowww! This goes back to what the other person was saying: "I cannot reason about what is going on behind the curtain, therefore...".
Not arguing LLMs have sentience, but rather asking whether something that could be considered as "simplistic" as "statistics" could yield a more complex result.
Whether LLMs have intelligence depends on your definition of intelligence.
Could connections of artificial neurons, arranged in a certain way as a result of training on data, yield human-level intelligence?
Always remember that ML is starting from training data, ie., from a very very very large number of (Prompt,Answer) pairs.
Remember also that companies like OpenAI are tracking which prompts fail and adding them to their datasets. So their data is ever more just a record of questions and answers.
Given this, we should expect that the vast majority of questions we ask of ChatGPT will be answered very, very well.
What has this to do with intelligence? Nothing at all.
Intelligence is not answering questions correctly. It's what you use when you don't have the answers and aren't even clear on the question.
This is not how ML works generally nor LLMs like ChatGPT specifically.
What you're describing sounds like RLHF, which changes the style of responses and impacts things like refusals but does not add to a model's intelligence (in fact it reduces model intelligence).
An LLM's intelligence comes from pretraining in which there are no prompts, or answers, only corpus and perplexity.
You can map the weights to an uncompressed (Q,A) space, and then map them back again to weight-space -- all without loss of information. The actual domain space they were compressed from is irrelevant; their values are equivalent.
All knowledge/predictions are encoded as a chain of probabilities that something is true, otherwise, what else is it? My brain calculates 0.8 * 0.3 * 0.5 * 0.6 in order to make a 3-pointer, but Michael Jordan's brain ... well his mitochondria does a triple back flip and inverts the voltage into a tachyon particle.
Particles interacting (causally) through a physical mechanism that gives rise to, say, "wholes" with novel causal properties is not a statistical process. So your premise contradicts your conclusion.
Statistics is an analysis of association, not of causation. The frequency of (Q, A) pairs follows a distribution that is not constrained, or caused by, or explained by, how Q and A are actually related.
For example, recently there was some scandal at Microsoft where, if you used "pro choice" in prompts, you got "demonic cartoons". Why? Presumably because "pro choice" are symbols that accompany such political cartoons in the data set.
So does Q = "pro choice" and A = "cartoons of hell" occur at notable frequency because hell has caused anything? Or because there's a unique semantic mechanism whereby "pro choice" means "hell", and so on?
NO.
It is absolutely insane to suggest that we have rigged all our text output so as to align one set of symbols (Q) alongside another (A) such that Q is the necessary explanation of A. I doubt this is even possible, since most Qs don't have unique As -- so there is actually *no function* to approximate.
In any case, your whole comment is an argument from ignorance, as I complained in mine. What you don't know about life, about machines, about intelligence justifies no conclusions at all (esp. "everything in life could be").
And let's be clear: lots of people do know the answers to your questions; they aren't hard to answer. It's just not in any ad company's interest to lead their description of these systems by presenting good-faith research.
Everything printed in the media today is just a game of stock manipulation using the "prognosticator loophole", whereby the CEO of Nvidia can "prophesy the future" in which his hardware is "of course" essential -- without being held to account for his statements. So when that stock hits its ATH and crashes, no one can sue.
I think we should change this; remove this loophole and suddenly tech boards and propagandists will be much, much more reserved.
What could be "statistics" is our intelligence learning from past events, either by natural selection over the scope of generations or by our brains during our lifetime. If a certain outcome A has occurred enough times for input Q, that has produced a structure that is the best reachable given the available resources.
Suppose you touch a fireplace once, do you touch it again? No.
OK, here's something much stranger. Suppose you see your friend touch the fireplace, he recoils in pain. Do you touch it? No.
Hmm... whence statistics? There is no frequency association here, in either case. And in the second, even no experience of the fireplace.
The entire history of science is supposed to be about the failure of statistics to produce explanations. It is a great sin that we have allowed pseudosciences to flourish in which this lesson isn't even understood, and worse, allowed statistical showmen with their magic lanterns to preach on the scientific method -- to the point where it seems, almost, that science as an ideal has been completely lost.
The entire point was to throw away entirely our reliance on frequency and association -- this is ancient superstition. And instead, to explain the world by necessary mechanisms born of causal properties which interact in complex ways that can never uniquely reveal themselves by direct measurement.
> The entire point was to throw away entirely our reliance on frequency and association -- this is ancient superstition. And instead, to explain the world by necessary mechanisms born of causal properties which interact in complex ways that can never uniquely reveal themselves by direct measurement.
Who said that?
You make it sound like this was some important trend in the past that got derailed by the evil statisticians (spoiler: there never was such a trend big enough to have momentum).
Your rant against statistics is all nice and dandy, but when you have to translate that website from a foreign language into English automatically, when you ask ChatGPT to generate you some code for a project you work on, or when you are glad that Google Maps predicted your arrival time at the office correctly, you rely on the statistics you vilify in essential ways. You basically are a customer of statistics every day (unless you live under a rock, which I don't think you do).
Statistics is good because it works, and it works well in 90% of cases, which is enough. What you advocate for so zealously (whatever such a causally validated theory would be) currently doesn't.
Well, if you want something like the actual history: we have Francis Bacon getting us close to an abductive (i.e., explanatory) method, Descartes helped a bit -- then a great catastrophe befell science, called Hume.
Since Hume it became popular either to somehow rig measurement to make it necessarily informative (Kant), or to claim that measurement has no necessarily informative relation to reality at all (in the end, Russell, Ayer et al.).
It took a while to dig out of that nightmarish hole that philosophers largely created, back into the cold light of experimental reality.
It wasn't statisticians who made the mess; it was first philosophers, and today, people learning statistical methods without learning statistics at all.
Thankfully philosophers started actually reading science, and then they realised they'd got it all wrong. So today, professional research philosophy is allied against the forces of nonsense.
As for the success of causal explanations: you owe everything to that, including the very machine which runs ChatGPT. That we can make little trinkets out of association alone pales in comparison to what we have done by knowing how the world works.
I get the Chomskyan objection re. statistical machine learning, I am partial to it.
But consider these LLMs and such as extremely crude simulations of biological neural networks. They aren't just any statistics; these are biomimetic computations. Then we can in principle "do science" here. We can study skyscrapers and bridges; we can study LLMs and say some scientific things about them. That is quite different than maybe what is going on in industry, but AFAIK there are lots of academic computer scientists who are trying to do just that, bring the science back to the study of these artifacts so that we can have some theorems and proofs, etc. That is - hopefully - more sophisticated a science than trying to poke and prod at a black box and call that empirical science.
The only relationship between artificial neural networks and biology is the vague principle of an activation threshold. In all other ways, the way biological neural networks are organized is ignored. The majority of ANN characteristics are instead driven by statistics, mathematical arguments, and simple tinkering.
For some ways in which these differ:
- real neurons are vastly complex cells which seem to perform significant non-trivial computations of their own, including memory, and can't be abstracted as a single activation function
- real neural networks include "broadcast neurons" that affect other neurons based on their geometric organization, not on direct synapse connections
- there are various neurotransmitters in the brain with different effects, not just a single type of "signal"
- several glands affect thought and computation in the brain that are not made of neurons
- real neural networks are not organized in a simple set of layers, they form much more complex graphs
And these are just points about structure. The way we actually operate these networks has nothing in common with how real neural networks work. Real neural networks have no separate training and inference phases: at all times, they both learn and produce actionable results. Also, the way in which we train ANNs, backpropagation with stochastic gradient descent, is entirely unrelated to how real neural networks learn (which we don't really understand in any significant amount).
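For contrast, here is roughly what "backpropagation with stochastic gradient descent" amounts to at its smallest: a single artificial neuron nudged down a loss gradient one example at a time (purely illustrative, on toy data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)    # toy labels

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(20):                          # epochs
    for xi, yi in zip(X, y):                 # one example at a time
        p = 1 / (1 + np.exp(-(w @ xi + b)))  # forward pass (sigmoid)
        grad = p - yi                        # dLoss/dlogit for log-loss
        w -= lr * grad * xi                  # gradient step on weights
        b -= lr * grad                       # gradient step on bias
```

Whatever real neurons do when they learn, it is not this loop.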
So no, I don't think it makes any sense to say that ANNs are a form of biomimetic computation. They are at best as similar to brains as a nylon coat is to an animal's coat.
(P.S. Just to head off a possible diction issue - biomimetics just means taking something from nature as inspiration for doing something, it doesn't mean the reverse which is to "try to understand / emulate nature completely well". E.g. solar panels arranged on a stalk to maximize light is acceptably biomimetic and there is no issue about whether solar panels are faithful enough to model chloroplasts.)
I'm coming from the context of theoretical models of computation, of which there are only so many general ones - Turing machines, lambda calculus, finite state machines, neural networks, Petri nets, a bunch of other historical ones, ... etc. etc. Consider just two, the Turing machine model, versus the most abstract possible neural network. We know that the two are formally computationally equivalent.
Abstractly, the distinguishing feature of theoretical neural networks is that they do computations through graphs. Our brains are graphs (and graphs with spatial constraints as well as many other constraints and things). The actually-existing LLMs are graphs.
Consider, C++ code is not only better modeled by the not-graph Turing machine model, it is also easily an instance of a Turing machine. These man-made computers are instances as well as modeled by von-Neumann architectures, which can be thought of as a real implementation of the Turing machine model of computation proper.
I think this conceptual relationship could be the same for biological brains. They are doing some kind of computable computation. They are not only best modeled by some kind of extremely sophisticated - but computable - neural network model of computation that nobody knows how to define yet (well, Yann LeCun has some powerpoint slides about that apparently). They are also an instance of that highly abstract, theoretical model. It's a consequence of the Church-Turing thesis which I generally buy (because of aforementioned equivalence, etc.): if one thinks the lambda calculus is a better model than neural network for the human brain, I'd like to see it! (It turns out there are cellular models of computation as well, called membrane models.) But that's the granularity I'm interested in.
In different words, the fact that many neural network models (rather, metamodels like "the category of all LLMs") can be bad models or rudimentary models is not a dealbreaker in my opinion, since that is analogous to focusing on implementation details. The goal of scientific research (along the neural network paradigm) would be to try sort that out further (in the form of theory and proofs, in opposition to further "statistical tinkering"). Hope that argument wasn't too handwavy.
If we define biomimetic so broadly that merely some vague inspiration from nature is enough, then I would say the Turing machine is also a biomimetic model. After all, Turing very explicitly modeled it after the activity of a mathematician working with a notebook. The read head represents the eyes of the mathematician scanning the notebook for symbols, the write head is their pencil, and the tape is the notebook itself.
Now, whether CPUs are an instance of a Turing machine or not is again quite debatable, but it's ultimately moot.
I think what matters more for deciding whether it makes sense to call a model biomimetic or not is whether it draws more than a passing inspiration from biology. Do practitioners keep referring back to biology to advance their design (not exclusively, but at least occasionally) or is it studied using other tools? Computers are obviously not biomimetic by this definition, as, beyond the original inspiration, no one has really looked at how mathematicians do their computations on paper to help build a better computer - the field evolved entirely detached from the model that inspired it.
With ANNs, admittedly, the situation is slightly murkier. The majority of advancements happen on mathematical grounds (e.g. choosing nonlinear activation functions to be able to approximate non-linear functions; various enhancements for faster or more stable floating point computations) or from broader computer science/engineering (faster GPUs, the single biggest factor in the advancement of the field).
However, there have been occasional forays back into biology, like the inspiration behind CNNs, and perhaps attention in Transformers. So, perhaps even by my own token, there is some (small) amount of biomimetic feedback in the design of ANNs.
>After all, Turing very explicitly modeled it after the activity of a mathematician working with a notebook. The read head represents the eyes of the mathematician scanning the notebook for symbols, the write head is their pencil, and the tape is the notebook itself.
My feeling on this is the complete opposite of yours. To me, this is a completely valid mode of discovery, and possibly even what led to the thought of the Turing machine. We are, after all, interested in mimicking/reproducing the way we think. So it's perfectly sensible that one would "think about how we think" to try and come up with a model of computation.
I don't care at all about this argument of whether to call something biomimetic or not. That's just semantics. What you associate with the meaning of "biomimetic" is subject to interpretation, and one can only establish an objective criterion for it by asserting one's own mental model is the only correct one.
> My feeling on this is the complete opposite of yours. To me, this is a completely valid mode of discovery, and possibly even what led to the thought of the Turing machine. We are, after all, interested in mimicking/reproducing the way we think. So it's perfectly sensible that one would "think about how we think" to try and come up with a model of computation.
I'm not sure if you thought I was being sarcastic, but what I was describing there is literally how Turing came up with the idea, he describes this in the paper where he introduces the concept of computable numbers [0]. I just summarized the non-math bits of his paper.
If you haven't read it, I highly recommend it, it's really easily digestible if you ignore the more mathematical parts. This particular argument appears in section 9.
In the part where it says that everything in the universe perceivable at human levels of energy and scale is made out of particles in the Standard Model (dark matter doesn't interact in a way where it could influence us, and dark energy only has effects at extraordinarily large scales).
All measurements and all experiments ever done with matter and fields confirm that it behaves according to the laws of quantum mechanics. Those laws leave absolutely no room for a self that is not an emergent phenomenon of some kind. They also don't leave room for something like a free will that allows "you" to control "your body" by your "will", which is what I assume you might mean by a soul. That is, they clearly show that me writing this reply could have been (in principle) foretold, or at least had a calculated probability.
>There are no theories of intelligence which would label that intelligence.
Actually, compression-is-intelligence has been a relatively popular theory for a couple of decades. The Hutter Prize from 2006 is based around that premise.
The idea is that compression and abstraction are fundamentally the same operation. If you had a perfect compressor that could compress the digits of pi into the 10-line algorithm that created them, in a deep sense it has understood pi and could generate as many more digits as you want.
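And such an algorithm really does fit in about ten lines. Here's a sketch using Gibbons' unbounded spigot, one well-known construction for streaming the digits (not necessarily the algorithm the theory has in mind):

```python
def pi_digits():
    # Gibbons' unbounded spigot: yields the decimal digits of pi forever.
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q + r - t < n*t:
            yield n
            q, r, n = 10*q, 10*(r - n*t), 10*(3*q + r)//t - 10*n
        else:
            q, r, t, k, n, l = (q*k, (2*q + r)*l, t*l, k + 1,
                                (q*(7*k + 2) + r*l)//(t*l), l + 2)

gen = pi_digits()
print([next(gen) for _ in range(8)])  # [3, 1, 4, 1, 5, 9, 2, 6]
```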
Compression-as-intelligence is "popular" only with people who don't study intelligence. The problems with it are vastly too many to list.
Perhaps the most obvious is that it's a form of inductivism, as it supposes that you can build representations of the targets of the measured domain from the measurement domain itself.
This is just the same as drawing lines around patterns of stars, calling them "star signs" and then believing they are actual objects.
This has absolutely nothing to do with intelligence; it is simply a description of statistical AI. It is no coincidence that this "theory" is barely even known outside that community, let alone popular.
You cannot derive theories (explanations, concepts, understanding, representations) by induction (compression) across measurements. There's lots and lots of work on this, and it is beyond reproach -- for a popular introduction see the first three chapters of David Deutsch's The Fabric of Reality.
"Compressionists" might say that representations of the target domain are "compressed" with respect to their measures insofar as they take up less space. This is quite a funnily ridiculous thing to say: of course they do. This has nothing to do with the process of compression... it is a feature only of there being an infinite number of ways to measure any given causal property.
The role "compression" here plays, insofar as it is relevant at all, is a trivial an uninformative observation that "explanations are finite, and measures infinite" -- one is not a compression of the other.
I don't follow. What are "actual" objects? Everything we conceptualize is some abstraction of data. We see the world in trained concepts.
A "door" for example isn't a fundamental object of the universe. Its a collection of atoms or quarks or whatever you consider fundamental objects of reality, (or even a completely abstract object having some resemblance in space/time), but each part in itself means nothing. It is the collection of them in a certain configuration which we recognize as being a "door".
It is precisely the "compression" of a bunch of data points into a simpler concept.
The same goes for concepts like shapes, diagonals, alternation, etc. There are endless patterns which we learn to distinguish.
When you, e.g., take all the positions of stars (etc.) in the sky and compress them, you do not get Newton's universal theory of gravity -- indeed it is impossible to induce this theory via compression. It is impossible to induce any theory of physics (and indeed, any explanation) via compression of measurement data.
The only sense in which physics is "compressed" wrt measurement data is just that it's conducted using finite sequences of mathematical symbols given a causal semantics.
The only "data" which can be compressed into, eg., newton's universal law requires you already knowing that law in order to collect it. Almost all measurements of the sky need to be adjusted, or ignored, by already knowing the theory.
It's impossible to induce anything through compression alone, as it is purely reductive. We also need the ability to add and combine things.
The very nature of the inquiry of physics is to find the simplest mental conception which explains the greatest number of physical phenomena. To improve any system or model while retaining all its qualities, it is necessary to deconstruct it into simpler bits so that it can be reconstructed into something simpler.
Indeed, one can conceive of a physics model in which every observable phenomenon is a unique entity which has the property of doing exactly what is being observed. But by attempting to find common properties between phenomena, we can reduce them to simpler explanations, and this can happen in iterative steps. In the primitive studies of physics, this was reducing the elements to "fire", "water", "steam", "earth", etc. Then we broke this model down further, by saying all elements have something in common, in that they are atoms. And we then attempted to explain the commonalities in how atoms behave by breaking them down further.
I remember getting a lot of flak for saying a purely statistical framework is not going to achieve human-level intelligence, but I still firmly believe that.
I also believe the path forward is research in knowledge representation, and even now when I search for it, I can barely find anything interesting happening in the field. ML has gotten so much interest and hype because it’s produced fast practical results, but I think it’s going to reach a standstill without something fundamentally new.
I tend to agree, and it’s weird but there are probably lots of actual ML practitioners that have never even heard of the neat vs scruffy debate. Naturally most that have heard of it will feel that the issue is completely resolved already in their favor. On the whole not a very open minded climate.
Credit where it’s due for the wild success of fancy stats, but we should stay interested in hybrid systems with more emphasis on logic, symbols, graphs, interactions and whatever other data structures seem rich and expressive.
Call me old school, but frankly I prefer that the society-of-mind flavor of system ultimately be in charge of things like driving cars, running court proceedings, and optimizing cities or whole economies. Let it use fancy stats as components and subsystems, sure, but let it produce coherent arguments or critiques that can actually be understood, summarized, and debugged.
You make a very interesting point. Human understanding and logic can be rationally explained. A judge, for example, can give a very thorough response explaining exactly why they reached their verdict. I think that would be an excellent benchmark for AI.
This seems rather impossible when your understanding of the world is a connection of billions of messy and uncertain parameters. But perhaps this is the first step? Maybe we can take the neural nets trained by ML and create constructions on top of them.
I think this is effectively provable from extraordinarily plausible premises.
1. We want to infer A from Q.
2. Most A we don't know, or have no data for, or the data is *in the future*.
3. Most Q we cannot conceptualise accurately, since we have no explanatory theory in which to phrase it or to provide measures of it.
4. All statistical approaches require knowing frequencies of (Q, A) pairs (by def.)
5. In the cases where there is a unique objective frequency of (Q,A) we often cannot know it (2, 3)
6. In most cases there is no unique objective frequency (e.g., there is no single animal any given photograph corresponds to, nor any objective frequency of such association).
So, conclusion:
In most cases the statistical approach either necessarily fails (it's about future data; it's about non-objective associations; it's impossible to measure or obtain objective frequencies); OR, if it doesn't necessarily fail, it fails in practice (it is too expensive, or otherwise impossible, to obtain the authoritative QA-frequency).
Now, of course, if your grift is generating nice cartoons or stealing cheap copy from ebooks, you can convince the audience of the magical power of associating text tokens. This, of course, should be ignored when addressing the bigger methodological questions.
Bit of a tangent from the thread but what have been the most valuable advances in knowledge representation in the last 20 years? Any articles you could share would be lovely!
I'm no expert and I don't know anything, unfortunately. It is something I have spent countless hours walking around my room pondering, though, for the last 3-4 years. I think I have some interesting ideas and I would love to get a PhD studying it, if I ever get enough financial independence that I don't have to worry about money.
That's a great argument and way of reversing the argument-from-ignorance line.
That said, I think people who argue from ignorance suppose we don't know how AI works either. Since the admen selling it tell them that.
We know exactly and precisely how AI works; we can fully explain it. What we don't know are circumstantial parts of the explanation (e.g., which properties of the training data plus algorithm gave rise to the specific weight w1=0.01).
This is like knowing why the thermometer reads 21 deg C (since the motion of the molecules in the water, etc. etc.) -- but not knowing which molecules specifically bounced off it.
This confusion about "what we don't know" allows the prophetic tech grifter class to prognosticate in their own interest: "we don't know how AI works, so it might work such that I'll be very rich, so invest with me!"
What I find odd is the human obsession with making AI as smart as a human.
In the early periods of AI image generation, you would see AI generate images so uncanny no human could make something like them. It was genuinely unique and otherworldly, worthy of being called art.
We "fixed" that bug so we could make the world's most expensive face swap app.