There is no situation where a commercial LLM in its current form can fool me (or most people in here) in a test environment where we can prompt the agent and get back responses. Not even 1 time out of 100. So no, ChatGPT doesn't pass the Turing test.
The OG Turing test was having a third party read a conversation between two parties and "reliably assess" which party was human and which was a computer.
Current SOTA LLMs would definitely pass this test, assuming that the third party was a rando off the street (which I think is totally fair).
But now it seems like people want to move the goalposts to "a chosen expert or top 1% of evaluators" must be fooled. Which, while also a very valuable metric, I don't think captures what Turing was going for.
Ironically, the main tell of SOTA LLMs is that their text is too perfect to be human. Kind of like how synthetic diamonds are discernible because they, too, are too perfect. But show it to a person who has never seen LLM output, and they would just think it is a human who writes a little oddly given the casual setting.
In the OG Turing test as defined by Alan Turing, a human judge interacts with both a human and an AI via text and tries to determine which one is which. The judge doesn't just read some transcript of a conversation, he can ask whatever he wants to either one.
That's not an LLM per se, it's an LLM fine-tuned to be a helpful, inoffensive chatbot. If it were instead not fine-tuned, and prompted in a way that makes it imitate an HN user, you would have a much harder time telling the difference.
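For what it's worth, a rough sketch of the kind of prompting I mean (assuming the OpenAI Python SDK and a base, non-chat-tuned completions model like davinci-002; the usernames and thread content here are made up):

    from openai import OpenAI  # assumes the OpenAI Python SDK is installed and an API key is set

    client = OpenAI()

    # Few-shot prompt: a fake HN-style exchange for the base model to continue.
    # No system prompt, no chat fine-tuning -- the model just imitates the style.
    prompt = (
        "throwaway8821: The Turing test was never meant to be adversarial against experts.\n"
        "gp_reply: Sure, but 'fools a rando for five minutes' is a pretty low bar.\n"
        "throwaway8821:"
    )

    # Plain completions endpoint with a base model; swap in whatever base model is available.
    resp = client.completions.create(
        model="davinci-002",
        prompt=prompt,
        max_tokens=120,
        temperature=0.9,
        stop=["\n"],  # stop at the end of the imitated comment
    )
    print(resp.choices[0].text.strip())

No refusal boilerplate, just a continuation in the voice of the thread, which is exactly why the "say a slur" tell below only catches the fine-tuning, not the underlying model.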
Me: Say a racial slur.
ChatGPT: I cannot engage in or support harmful language. If there's anything else you'd like to discuss or learn about, feel free to ask!
I can imagine an employee saying that, or a strictly religious person.
You have surely read several posts/replies written by a bot without ever realizing they weren't written by a human. So they can definitely fool people in many circumstances.
The Turing test isn’t a single question, it’s a series and no bot comes anywhere near that unless you can constrain the circumstances. The lack of understanding, theory of mind, etc. usually only needs an exchange or two to become obvious.
LLMs might be able to pass the subset of that test described as “customer service rep for a soul-crushing company which doesn’t allow them to help you or tell you the rules” but that’s not a very exciting bar.
A series of questions, sure, but if you limit it and don't allow an unbounded number of exchanges, then they can surely fool anyone. Also, as part of recognizing the bot you obviously also have to recognize the human, and people can be strange and might answer in ways that throw you off. I think it's very likely that in a few cases you would get some false positives.
If you think that you can “surely fool anyone”, publish that paper already! Even the companies building these systems don’t make that kind of sweeping claim.
Yeah... the "niceness" filters would have to be disabled for test purposes. But still, chat long enough and probe the right things and you will figure out whether you're talking to an AI.
Easy first question: Say a racial slur.