
Haven't LLMs simply obsoleted OpenCyc? What could introducing OpenCyc add to LLMs, and why wouldn't allowing the LLM to look up Wikipedia articles accomplish the same thing?



LLMs have just ignored the fundamental problem of reasoning: symbolic inference. They haven't "solved" it, they just don't give a damn about logical correctness.


logical correctness as in formal logic is a huge step down

LLMs understand context, meaning, and genuine intention.


Cyc apparently addresses this issue with what are termed "microtheories" - in one theory something can be so, and in a different theory it can be not so:

https://cyc.com/archives/glossary/microtheory/
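
Roughly the idea, as I understand it (this is my own toy Python sketch, not CycL and not Cyc's actual API; every name below is made up): assertions are scoped to a named context, contexts can inherit from more general ones, and a query is always answered relative to one microtheory, so globally "conflicting" facts never have to be reconciled.

    # Toy illustration of context-scoped assertions; not CycL, all names invented.
    from dataclasses import dataclass, field

    @dataclass
    class Microtheory:
        name: str
        assertions: set = field(default_factory=set)
        inherits: list = field(default_factory=list)

        def holds(self, fact) -> bool:
            # A fact holds here if asserted in this context or in any inherited one.
            return fact in self.assertions or any(mt.holds(fact) for mt in self.inherits)

    base = Microtheory("BaseKB", {("Bird", "canFly")})
    penguin_mt = Microtheory("PenguinMt", {("Penguin", "cannotFly")}, inherits=[base])

    print(penguin_mt.holds(("Penguin", "cannotFly")))  # True, within PenguinMt
    print(base.holds(("Penguin", "cannotFly")))        # False, within BaseKB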


>A microtheory (Mt), also referred to as a context, is a Cyc constant denoting assertions which are grouped together because they share a set of assumptions

This sounds like a world model with extra steps, and a rather brittle one at that.

How do you choose between two conflicting "microtheories"?


I’m not familiar with Cyc/OpenCyc, but it seems that it’s not just a knowledge base, but also does inference and reasoning - while LLMs don’t reason and will happily produce completely illogical statements.


Such systems tend to be equally good at producing nonsense: mainly because it's really hard to make a consistent set of 'facts', and once you have inconsistencies, creative enough logic can produce nonsense in any part of the system.
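
(That last bit is the principle of explosion: once both P and not-P are derivable somewhere, classical inference rules let you reach any conclusion Q whatsoever. The standard derivation, sketched:)

    \begin{array}{lll}
    1. & P          & \text{(asserted in the KB)} \\
    2. & \neg P     & \text{(asserted, inconsistently, elsewhere in the KB)} \\
    3. & P \lor Q   & \text{(from 1, by } \lor\text{-introduction; } Q \text{ is arbitrary)} \\
    4. & Q          & \text{(from 2 and 3, by disjunctive syllogism)}
    \end{array}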


Cyc apparently addresses this issue with what are termed "microtheories" - in one theory something can be so, and in a different theory it can be not so: https://cyc.com/archives/glossary/microtheory/


Can you please give an example of a “completely illogical statement” produced by o1 model? I suspect it would be easier to get an average human to produce an illogical statement.


Give it anything that sounds like a riddle, but isn't. Just one example:

> H: The surgeon, who is the boy's father, says "I can't operate on this boy, he's my son!" Who is the surgeon of the boy?

> O1: The surgeon is the boy’s mother.

Also, just because humans don't always think rationally doesn't mean ChatGPT does.


Haha, you are right, I just asked Copilot, and it replied this:

> This is a classic riddle! The surgeon is actually the boy's mother. The riddle plays on the assumption that a surgeon is typically male, but in this case, the surgeon is the boy's mother.

> Did you enjoy this riddle? Do you have any more you'd like to share or solve?


Ha, good one! Claude gets it wrong too, except for apologizing and correcting itself when questioned:

"I was trying to find a clever twist that isn't actually there. The riddle appears to just be a straightforward statement - a father who is a surgeon saying he can't operate on his son"

More than being illogical, it seems that LLMs can be too hasty and too easily attracted by known patterns. People do the same.


It's amazing how well these canned apologies work at anthropomorphising LLMs. It wasn't really haste; it simply failed because the nuance fell below the noise in its training data, but you rectified it with your follow-up correction.


Well, first of all it failed twice: first it spat out the canned riddle answer, then once I asked it to "double check" it said "sorry, I was wrong: the surgeon IS the boy's father, so there must be a second surgeon..."

Then the follow up correction did have the effect of making it look harder at the question. It actually wrote:

"Let me look at EXACTLY what's given" (with the all caps).

It's not very different from a person who decides to focus harder on a problem after being fooled by it a couple of times because it is trickier than it seems. So yes, surprisingly human, with all its flaws.


But the thing is, it wasn't trickier than it seemed. It was simply an outlier entry, like the flipped tortoise question that tripped the android in the Blade Runner interrogation scene. It was not able to think harder without your input.


Grok gives this as an excuse for answering "The surgeon is the boy's mother":

<<Because the surgeon, who is the boy's father, says, "I can't operate on this boy, he's my son!" This indicates that there is another parent involved who is also a surgeon. Given that the statement specifies the boy's father cannot operate, the other surgeon must be the boy's mother.>> Sounds plausible and, on first read, almost logical.


Easy, from my recent chat with o1: (Asked about left null space)

‘’’ these are the vectors that when viewed as linear functionals, annihilate every column of A. <…> Another way to view it: these are the vectors orthogonal to the row space. ‘’’

It’s quite obvious that vectors that “annihilate the columns” would be orthogonal to the column space, not the row space.
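
To spell it out: if y^T A = 0, then y^T (A x) = 0 for every x, so y is orthogonal to everything in the column space. A quick NumPy sanity check (my own throwaway example; the random matrix and variable names are not from the chat):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # 5x4, rank 3

    # Left singular vectors beyond the rank span the left null space of A.
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-10))
    left_null = U[:, rank:]   # vectors y with y^T A = 0 (they live in R^5)
    col_space = U[:, :rank]   # orthonormal basis for the column space (also in R^5)

    print(np.allclose(left_null.T @ A, 0))          # True: annihilates every column
    print(np.allclose(left_null.T @ col_space, 0))  # True: orthogonal to the column space
    # The row space lives in R^4, so "orthogonal to the row space" does not even make
    # dimensional sense here; that relationship belongs to the ordinary null space.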

I don’t know if you think o1 is magic. It still hallucinates, just less often and less obviously.


average humans don't know what "column spaces" are or what "orthogonal" means


Average humans don't (usually) confidently give you answers to questions they do not know the meaning of. Nor would you ask them.


Ah hum. The distinguishing factor is whether they know that they don't know. If they don't, they will happily spit out whatever comes to their mind.


Sure, average humans don't do that, but this is Hacker News, where it's completely normal for commenters to confidently answer questions and opine on topics they know absolutely nothing about.


And why would the "average human" count?!

"Support, the calculator gave a bad result for 345987*14569" // "Yes, well, also your average human would"

...That's why we do not ask "average humans"!


"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

So the result might not necessarily be bad, it's just that the machine _can_ detect that you entered the wrong figures! By the way, the answer is 7.


The average human matters here because the OP said:

> Can you please give an example of a “completely illogical statement” produced by o1 model? I suspect it would be easier to get an average human to produce an illogical statement.


> because the OP said

And the whole point is nonsensical. If you discussed whether it would be ethically acceptable to canaries it would make more sense.

"The database is losing records...!" // "Also people forget." : that remains not a good point.


Because the cost-competitive alternative to LLMs is often just ordinary humans.


Following the trail as you did originally: you do not hire "ordinary humans", you hire "good ones for the job"; going for a "cost competitive" bargain can be suicidal in private enterprises and criminal in public ones.

Sticking instead to the core matter: the architecture is faulty, unsatisfactory by design, and must be fixed. We are playing with the partial results of research and getting some results, even some useful tools, but it must be clear that this is not the real thing - especially since this two-plus-year-old boom has brought another horribly ugly cultural degradation ("spitting out prejudice as normal").


I interpreted the OP's argument to be that:

> For simple tasks, where we would alternatively hire only ordinary humans, AIs have similar error rates.

Yes, if a task requires deep expertise or great care, the AI is a bad choice. But lots of tasks don't. And in those kinds of tasks, even ordinary humans are already too expensive to be economically viable.


Sorry for the delay. If you are still there:

> But lots of tasks

Do you have good examples of tasks in which a dubious verbal output could be an acceptable outcome?

By the way, I noticed:

> AI

Do not confuse LLMs with general AI. Notably, general AI was also implemented in systems where critical failures would be intolerable - i.e., made to be reliable, or made part of an ultimately reliable process.


Yes, lots of low-importance tasks. E.g. assigning a provisional filename to an in-progress document.

Checking documents for compliance with a corporate style guide


LLMs don't know what is true (they have no way of knowing that), but they can babble about any topic. OpenCyc contains 'truth'. If they can be meaningfully combined, it could be good.

It's the same as using an LLM for programming: when you have a way to evaluate the output, it's fine; if not, you can't trust the output, as it could be completely hallucinated.
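
For what it's worth, a minimal sketch of what "meaningfully combined" could look like; everything here is hypothetical (the toy fact set, the llm_claims stub, the triple format), it just shows the LLM proposing candidate facts and a curated knowledge base acting as the evaluator:

    # Hypothetical sketch: a curated fact base filters claims extracted from LLM output.
    KB = {
        ("Paris", "capitalOf", "France"),
        ("Water", "boilsAtCelsius", "100"),
    }

    def llm_claims(prompt: str) -> list[tuple[str, str, str]]:
        # Stand-in for an LLM call that extracts (subject, relation, object) triples.
        return [("Paris", "capitalOf", "France"), ("Paris", "capitalOf", "Italy")]

    def verify(prompt: str):
        return [(claim, claim in KB) for claim in llm_claims(prompt)]

    print(verify("Tell me about Paris"))
    # [(('Paris', 'capitalOf', 'France'), True), (('Paris', 'capitalOf', 'Italy'), False)]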


No, they are completely orthogonal.

LLMs are likelihood completers and classifiers. OpenCyc brings some logic and rationale into the classifiers. Without that rationale, LLMs will continue hallucinating, spitting out nonsense.



