
Haven't LLMs simply obsoleted OpenCyc? What could introducing OpenCyc add to LLMs, and why wouldn't allowing the LLM to look up Wikipedia articles accomplish the same thing?



LLMs have just ignored the fundamental problem of reasoning: symbolic inference. They haven't "solved" it, they just don't give a damn about logical correctness.


logical correctness as in formal logic is a huge step down

LLMs understand context, meaning, and genuine intention.


Cyc apparently addresses this issue with what are termed "microtheories" - in one theory something can be so, and in a different theory it can be not so:

https://cyc.com/archives/glossary/microtheory/
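
Roughly the idea, as I understand it (this is my own toy Python sketch, not CycL and not Cyc's actual API; every name below is made up): assertions are scoped to a named context, contexts can inherit from more general ones, and a query is always answered relative to one microtheory, so globally "conflicting" facts never have to be reconciled.

    # Toy illustration of context-scoped assertions; not CycL, all names invented.
    from dataclasses import dataclass, field

    @dataclass
    class Microtheory:
        name: str
        assertions: set = field(default_factory=set)
        inherits: list = field(default_factory=list)

        def holds(self, fact) -> bool:
            # A fact holds here if asserted in this context or in any inherited one.
            return fact in self.assertions or any(mt.holds(fact) for mt in self.inherits)

    base = Microtheory("BaseKB", {("Bird", "canFly")})
    penguin_mt = Microtheory("PenguinMt", {("Penguin", "cannotFly")}, inherits=[base])

    print(penguin_mt.holds(("Penguin", "cannotFly")))  # True, within PenguinMt
    print(base.holds(("Penguin", "cannotFly")))        # False, within BaseKB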


>A microtheory (Mt), also referred to as a context, is a Cyc constant denoting assertions which are grouped together because they share a set of assumptions

This sounds like a world model with extra steps, and a rather brittle one at that.

How do you choose between two conflicting "microtheories"?


I’m not familiar with Cyc/OpenCyc, but it seems that it’s not just a knowledge base, but also does inference and reasoning - while LLMs don’t reason and will happily produce completely illogical statements.


Such systems tend to be equally good at producing nonsense: mainly because it's really hard to make a consistent set of 'facts', and once you have inconsistencies, creative enough logic can produce nonsense in any part of the system.
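
(That last bit is the principle of explosion: once both P and not-P are derivable somewhere, classical inference rules let you reach any conclusion Q whatsoever. The standard derivation, sketched:)

    \begin{array}{lll}
    1. & P          & \text{(asserted in the KB)} \\
    2. & \neg P     & \text{(asserted, inconsistently, elsewhere in the KB)} \\
    3. & P \lor Q   & \text{(from 1, by } \lor\text{-introduction; } Q \text{ is arbitrary)} \\
    4. & Q          & \text{(from 2 and 3, by disjunctive syllogism)}
    \end{array}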


Cyc apparently addresses this issue with what are termed "microtheories" - in one theory something can be so, and in a different theory it can be not so: https://cyc.com/archives/glossary/microtheory/


Can you please give an example of a “completely illogical statement” produced by o1 model? I suspect it would be easier to get an average human to produce an illogical statement.


Give it anything that sounds like a riddle, but isn't. Just one example:

> H: The surgeon, who is the boy's father, says "I can't operate on this boy, he's my son!" Who is the surgeon of the boy?

> O1: The surgeon is the boy’s mother.

Also, just because humans don't always think rationally doesn't mean ChatGPT does.


Haha, you are right, I just asked Copilot, and it replied this:

> This is a classic riddle! The surgeon is actually the boy's mother. The riddle plays on the assumption that a surgeon is typically male, but in this case, the surgeon is the boy's mother.

> Did you enjoy this riddle? Do you have any more you'd like to share or solve?


Ha, good one! Claude gets it wrong too, except for apologizing and correcting itself when questioned:

"I was trying to find a clever twist that isn't actually there. The riddle appears to just be a straightforward statement - a father who is a surgeon saying he can't operate on his son"

More than being illogical, it seems that LLMs can be too hasty and too easily attracted by known patterns. People do the same.


It's amazing how well these canned apologies work at anthropomorphising LLMs. It wasn't really haste; it simply failed because the nuance fell below the noise in its training data, but you rectified it with your follow-up correction.


Well, first of all it failed twice: first it spat out the canned riddle answer, then once I asked it to "double check" it said "sorry, I was wrong: the surgeon IS the boy's father, so there must be a second surgeon..."

Then the follow up correction did have the effect of making it look harder at the question. It actually wrote:

"Let me look at EXACTLY what's given" (with the all caps).

It's not very different from a person who decides to focus harder on a problem after being fooled by it a couple of times because it is trickier than it seems. So yes, surprisingly human, with all its flaws.


But the thing is, it wasn't trickier than it seemed. It was simply an outlier entry, like the flipped tortoise question that tripped the android in the Blade Runner interrogation scene. It was not able to think harder without your input.


Grok gives this as an excuse for answering "The surgeon is the boy's mother":

<<Because the surgeon, who is the boy's father, says, "I can't operate on this boy, he's my son!" This indicates that there is another parent involved who is also a surgeon. Given that the statement specifies the boy's father cannot operate, the other surgeon must be the boy's mother.>> Sounds plausible and, on first read, almost logical.


Easy, from my recent chat with o1: (Asked about left null space)

‘’’ these are the vectors that when viewed as linear functionals, annihilate every column of A. <…> Another way to view it: these are the vectors orthogonal to the row space. ‘’’

It’s quite obvious that vectors that “annihilate the columns” would be orthogonal to the column space, not the row space.
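
To spell it out: if y^T A = 0, then y^T (A x) = 0 for every x, so y is orthogonal to everything in the column space. A quick NumPy sanity check (my own throwaway example; the random matrix and variable names are not from the chat):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # 5x4, rank 3

    # Left singular vectors beyond the rank span the left null space of A.
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-10))
    left_null = U[:, rank:]   # vectors y with y^T A = 0 (they live in R^5)
    col_space = U[:, :rank]   # orthonormal basis for the column space (also in R^5)

    print(np.allclose(left_null.T @ A, 0))          # True: annihilates every column
    print(np.allclose(left_null.T @ col_space, 0))  # True: orthogonal to the column space
    # The row space lives in R^4, so "orthogonal to the row space" does not even make
    # dimensional sense here; that relationship belongs to the ordinary null space.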

I don’t know if you think o1 is magic. It still hallucinates, just less often and less obviously.


average humans don't know what "column spaces" are or what "orthogonal" means


Average humans don't (usually) confidently give you answers to questions they do not know the meaning of. Nor would you ask them.


Ah hum. The distinguishing factor is whether they know that they don't know. If they don't, they will happily spit out whatever comes to their mind.


Sure, average humans don't do that, but this is Hacker News, where it's completely normal for commenters to confidently answer questions and opine on topics they know absolutely nothing about.


And why would the "average human" count?!

"Support, the calculator gave a bad result for 345987*14569" // "Yes, well, also your average human would"

...That's why we do not ask "average humans"!


"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

So the result might not necessarily be bad, it's just that the machine _can_ detect that you entered the wrong figures! By the way, the answer is 7.


The average human matters here because the OP said:

> Can you please give an example of a “completely illogical statement” produced by o1 model? I suspect it would be easier to get an average human to produce an illogical statement.


> because the OP said

And the whole point is nonsensical. If you discussed whether it would be ethically acceptable to canaries it would make more sense.

"The database is losing records...!" // "Also people forget." : that remains not a good point.


Because the cost-competitive alternative to LLMs is often just ordinary humans.


Following the trail as you did originally: you do not hire "ordinary humans", you hire "good ones for the job"; going for a "cost competitive" bargain can be suicidal in private enterprises and criminal in public ones.

Sticking instead to the core matter: the architecture is faulty, unsatisfactory by design, and must be fixed. We are playing with the partial results of research and getting some results, even some useful tools, but it must be clear that this is not the real thing - especially since this two-plus-year-old boom has brought another horribly ugly cultural degradation ("spitting out prejudice as normal").


I interpreted the OP's argument to be that:

> For simple tasks, where we would alternatively hire only ordinary humans, AIs have similar error rates.

Yes, if a task requires deep expertise or great care, the AI is a bad choice. But lots of tasks don't. And in those kinds of tasks, even ordinary humans are already too expensive to be economically viable.


Sorry for the delay. If you are still there:

> But lots of tasks

Do you have good examples of tasks in which a dubious verbal output could be an acceptable outcome?

By the way, I noticed:

> AI

Do not confuse LLMs with general AI. Notably, general AI was also implemented in systems where critical failures would be intolerable - i.e., made to be reliable, or made part of an ultimately reliable process.


Yes, lots of low-importance tasks. E.g. assigning a provisional filename to an in-progress document.

Checking documents for compliance with a corporate style guide


LLMs don't know what is true (they have no way of knowing that), but they can babble about any topic. OpenCyc contains 'truth'. If they can be meaningfully combined, it could be good.

It's the same as using an LLM for programming: when you have a way to evaluate the output, it's fine; if not, you can't trust the output, as it could be completely hallucinated.
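
For what it's worth, a minimal sketch of what "meaningfully combined" could look like; everything here is hypothetical (the toy fact set, the llm_claims stub, the triple format), it just shows the LLM proposing candidate facts and a curated knowledge base acting as the evaluator:

    # Hypothetical sketch: a curated fact base filters claims extracted from LLM output.
    KB = {
        ("Paris", "capitalOf", "France"),
        ("Water", "boilsAtCelsius", "100"),
    }

    def llm_claims(prompt: str) -> list[tuple[str, str, str]]:
        # Stand-in for an LLM call that extracts (subject, relation, object) triples.
        return [("Paris", "capitalOf", "France"), ("Paris", "capitalOf", "Italy")]

    def verify(prompt: str):
        return [(claim, claim in KB) for claim in llm_claims(prompt)]

    print(verify("Tell me about Paris"))
    # [(('Paris', 'capitalOf', 'France'), True), (('Paris', 'capitalOf', 'Italy'), False)]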


No, they are completely orthogonal.

LLMs are likelihood completers and classifiers. OpenCyc brings some logic and rationale into the classifiers. Without that rationale, LLMs will continue hallucinating, spitting out nonsense.



