Hacker News

> but all it does is regurgitate existing works

I am genuinely intrigued by this point of view, so I would love to hear the reasoning of people who hold it.

Over the last few days I've seen hundreds of poems and stories, from a sonnet about climate change to a Bible-style tale of a peanut butter sandwich stuck in a toaster. I even asked it to make a text adventure game for me to play, where I could put in any instruction, leading to a unique series of events and narrative.

Is the claim that these were all simply copied and pasted from the internet in their entirety? That would mean the internet already contains essentially every permutation of everything I could ask ChatGPT, which to me sounds highly implausible.

If the claim is that whilst these are not direct copy-pastes, the output is essentially a remix of lots of different things people have said before on the internet, repurposed to a different end, is that not literally just what language is? Humans use common sayings, idioms, slang and phrases all the time, never mind the 'tropes' and storylines that are reused constantly. Coders use common patterns and styles and copy from Stack Overflow. In fact language literally only works because we all share it and share the meaning of it.

If we are saying that all ChatGPT does is remix existing language and phrases to a new purpose... to me we are saying ChatGPT does the same thing as humans.

Any thoughts would be appreciated.



If you spend enough time with it you can quickly start to see the building blocks, beyond just the similarities you see across several works by the same person.

If you’ve ever played a video game where the levels are programmatically generated, you’ll eventually notice the modular nature of everything. It becomes very noticeable and it’s different from noticing natural patterns that arise in a completely handcrafted level.

GPT is exactly like that.
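The modularity that comment describes can be sketched as a hypothetical level generator that assembles levels from a small pool of hand-made segments (the segment names here are invented for illustration). Play enough generated levels and the same pieces keep reappearing, which is the repetitiveness being described:

```python
import random

# A small pool of hand-crafted building blocks (hypothetical names).
SEGMENTS = ["corridor", "pit_jump", "enemy_room", "treasure_nook", "ladder"]

def generate_level(length=8, seed=None):
    """Assemble a level by sampling segments from the fixed pool."""
    rng = random.Random(seed)
    return [rng.choice(SEGMENTS) for _ in range(length)]

level_a = generate_level(seed=1)
level_b = generate_level(seed=2)
# Different levels, but every piece comes from the same five segments,
# so the modular structure becomes visible after enough play.
```

A handcrafted level has no such fixed vocabulary of parts, which is why the two feel different once you have seen enough of each.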

You won’t notice it with short output. But ask it to keep going and write the next verse of that sonnet. Keep going and it gets VERY repetitive in a way that a human poet doesn’t.

Also keep in mind that the examples you’re seeing are a form of sample bias. You’re seeing output curated by humans.


But isn’t it possible that humans are “just” doing the same but with a much higher foundational resolution?


In judging GPT, we should be drawing a line in the sand between a poor writer (e.g. a college student) and an excellent one (a professional, ideally a domain expert). GPT is impressing us now only because we have very low expectations for a computer to write like a human. But when judged against a capable human, especially when you focus on the _information_ that's conveyed, GPT sucks.

Few of us would want to read a book written as flaccidly as GPT does, a wandering narrative that cannot connect the semantics of one sentence to the next. GPT is capable only of writing _sets_ of sentences, not a progression of them where a concept is introduced and subsequent text tests and contrasts a convincing set of complements and alternatives. Given LLM architectures (and inherent black boxes), that's likely to never change since they don't understand logic or dependency, sufficiency, or necessity. Without those skills, no LLM will ever be able to _develop_ an idea or expand on the implications proposed in an introduction. Have you ever seen GPT conclude a paragraph with a bang? And you never will.

You'll note too that GPT has not attempted to write dialog. I think there's good reason for that. Talk strips away all the excess verbiage. Humans say only what's essential to convey our message. Pointless blather that leads nowhere is maddening to listen to for long. But without a clear focus, that's exactly what I'd expect from GPT: wandering speech that's totally disengaging. Surprising the reader simply is not in its DNA.

So if you're happy reading narratives that wander, that don't lead anywhere, with characters lacking personality, and a narrative style lacking the essentials of every engaging tale (exposition, conflict, suspense, climax, resolution)... then perhaps GPT is for you. But if that's GPT's great promise -- to generate business-speak that's "tolerable" but even less engaging than the marketing schlock that immerses us now -- then thanks, but no thanks.


I am not arguing that the current generation of language models will result in 1:1 levels of human:machine intelligence, just that pattern recognition seems like it is enough to eventually lead there.

> You'll note too that GPT has not attempted to write dialog.

Here is one of my prompts, is this not dialogue?

"The year is 1960. Choose 3 famous physicists. They are having a debate about whether black holes exist. Show the transcript."

  Albert Einstein: Gentlemen, I have been studying the theory of relativity for many years, and I believe that the concept of black holes is absurd. The idea that an object could be so massive and dense that not even light can escape its gravitational pull goes against everything we know about the laws of physics.

  Stephen Hawking: Mr. Einstein, I must respectfully disagree. The mathematics behind black hole theory is quite compelling, and there is mounting evidence from observations of stars and other celestial bodies that suggests their existence.

  Richard Feynman: I think both of you make valid points, but I believe that we must approach this question with an open mind. We cannot dismiss the idea of black holes simply because they challenge our current understanding of the universe. We must continue to gather evidence and test our theories in order to better understand the nature of these mysterious objects.

  Einstein: But surely, Mr. Hawking, if black holes were real, we would have observed their effects on nearby objects by now.

  Hawking: Not necessarily, Mr. Einstein. Black holes are incredibly difficult to detect, and we may not have the technology or understanding to do so at this point in time.

  Feynman: I agree with Mr. Hawking. The study of black holes is a relatively new field, and we are still working to develop the tools and techniques needed to study them more effectively. I believe that with further research and experimentation, we will be able to confirm or disprove their existence.

  Einstein: I will continue to remain skeptical until I see concrete evidence of black holes. But I am open to being proven wrong if the evidence warrants it.

  Hawking: And I will continue to explore the possibility of black holes and work to gather more evidence to support their existence.

  Feynman: I think that is a fair and reasonable approach. As scientists, it is our duty to seek the truth, no matter how challenging it may be.


Sure it’s possible. I mean every one of us is a unique arrangement of the same fundamental building blocks.

But there’s no reason to assume that something works the same way we do just because someone named it a neural network.


The key concept here for me is pattern recognition. It lines up with my experience when trying to practice creativity.

For example, for decades it has been my opinion that my own supposedly higher than average pattern matching skills were the main thing I brought to the table at work.

When I play drums, keys, or guitar, coming up with new patterns and styles to play feels magical when it happens in real-time, but if I break it down later, I almost always see the influences.


>In fact language literally only works because we all share it and share the meaning of it.

Most human communication is to some extent adversarial and constantly subverts the assumption of shared meaning on purpose or by accident. It's extremely difficult to "solve" because cooperative communication often appears to subvert the basic principles too.

Have you ever read about Grice's maxims of conversation? Once you think in those terms, the frequent absence of cooperation in most human conversation is obvious.

https://en.wikipedia.org/wiki/Cooperative_principle


>Is the claim that these were all simply copy and pastes of something on the internet in their entirety? And that as such the internet already seems to contain essentially every permutation of everything I could ask ChatGPT, as to me this sounds highly implausible.

It's the training data supplied to GPT-3 (as explained by OpenAI themselves), so yes, it is literally true. You are just seeing snippets of the internet, re-formed and regurgitated.

It can only do what you ask.


I appreciate the gist of your point, but the way these models work is rather more complicated than copying and pasting snippets, so it certainly is not 'literally' true. The models are trained to predict sub-word tokens over the internet training dataset, so the degree of re-formation and regurgitation in a generated sentence can be vast, to the point of the final sentence being novel in its own right.
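As a toy illustration of that point (nothing like GPT's real architecture or scale, just the token-prediction idea), even a simple bigram model that only learns word-to-word transition counts from a tiny corpus can emit sentences that appear nowhere in its training data verbatim:

```python
import random

# Tiny training corpus (illustrative only).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Count word -> next-word transitions.
transitions = {}
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        transitions.setdefault(a, []).append(b)

def generate(start, length=6, seed=0):
    """Generate one token at a time from the learned transitions."""
    random.seed(seed)
    out = [start]
    for _ in range(length - 1):
        choices = transitions.get(out[-1])
        if not choices:
            break
        out.append(random.choice(choices))
    return " ".join(out)

# The output is assembled token by token, so it can recombine the
# corpus into sentences no single training example contains.
print(generate("the"))
```

GPT predicts sub-word tokens with a learned neural distribution rather than raw counts, but the same principle applies: generation is per-token sampling, not retrieval of stored spans.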


100% of the training data of the salty jelly in a human skull is sensory input.

Were your argument flawless and your conclusion correct, then all human creativity would "literally" be a remix of things in the natural world, as even when we remix things made by other humans that too would ultimately derive back to nature.

This can certainly be asserted, depending on how you wish to use those words, but it is not useful for predicting what our abilities are. A perfect intellect could, for example, derive quantum mechanics from scratch by watching a campfire die down on a rainy night, yet we didn't do that in one step in the Neolithic. Likewise, it isn't useful for telling us what the limits of GPT-family LLMs might be, since what has been built on such inputs, in the case of both humans and this particular AI, greatly exceeds the imagination of any single individual.


In some sense, every AI tool is "regurgitating its training set". In the case of NLP, that training set will be a corpus of documents, utterance-intent pairs, lists of Named Entities, etc.



