IMO this leaves out some salient details. For example, I'd say ChatGPT is a very, very good junior developer. The kind of junior developer that loves computer science, has been screwing around with miscellaneous algorithms and data structures its whole life, has a near-perfect memory, and is awake 24/7/365, but has never had to architect a data-intensive system, write future-proof code, or write code for other developers. Of course, these last three things are a big deal, but the rest of the list makes for a ridiculously useful teammate.
It also has a very broad knowledge of programming languages and frameworks. It's able to onboard you with ease and answer most of your questions. The trick is to recognize when it's confidently incorrect and hallucinating API calls.
What do you mean when you say this? Most people use "hallucinate" to mean "writes things that aren't true". It clearly and demonstrably writes at least some code that is valid and at least some things that are true.
These models don't have a frame of reference to ground themselves in reality with, so they don't really have a base "truth". Everything is equally valid if it is likely.
A human in a hallucinogenic state could hallucinate a lot of things that are true. The hallucination can feature real characters and places, and could happen to follow the normal rules of physics, but they are not guaranteed to do so. And since the individual has essentially become detached from reality, they have no way of knowing which is which.
It's not a perfect analogy, but it helps with understanding that the model "writing things that aren't true" is not some statistical quirk or bug that can be solved with a bandaid, but rather is fundamental to the models themselves. In fact, it might be more truthful to say that the models are always making things up, but that often the things they are making up happen to be true and/or useful.
Precisely, the model is just regurgitating and pattern matching over a training set large enough that the outputs happen to look factual or logical. Meaning that we're just anthropomorphizing these concepts onto statistical models, so it's not much different than Jesus Toast.
I think this is a great way to think about it. Hallucinations are the default and an LLM app is one that channels hallucinations rather than avoids them.
Yeah, the junior dev analogy misses on the core capabilities. The ability to spit out syntactically correct blocks of code in a second or two is a massive win, even if it requires careful review.
Yup, it's been a help for me. Had a buddy who asked me if I could automate his workflow in WordPress for post submissions he had to deal with. With a little prodding, I got ChatGPT to create a little workflow for me. I cleaned it up a bit and threw it in AWS Lambda for literally $0. He was super thankful and hooked me up with a bunch of meat (his dad is a butcher), and I spent maybe an hour on it.
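For anyone curious what that kind of glue code tends to look like: here's a minimal sketch of a Lambda handler that files a draft post through the WordPress REST API. The environment variables, the payload fields, and the draft-by-default choice are my assumptions for illustration, not the actual script from the anecdote.

```python
import base64
import json
import os
import urllib.request

# Assumed configuration: pulled from Lambda environment variables.
WP_URL = os.environ["WP_URL"]                    # e.g. https://example.com
WP_USER = os.environ["WP_USER"]
WP_APP_PASSWORD = os.environ["WP_APP_PASSWORD"]  # WordPress application password

def lambda_handler(event, context):
    # Assume the submission arrives as a JSON body with title/content fields
    # (e.g. via an API Gateway proxy integration).
    body = json.loads(event.get("body") or "{}")
    post = {
        "title": body.get("title", "Untitled"),
        "content": body.get("content", ""),
        "status": "draft",  # land as a draft for review rather than publishing directly
    }

    # Basic auth against the core WordPress REST API (wp/v2/posts).
    token = base64.b64encode(f"{WP_USER}:{WP_APP_PASSWORD}".encode()).decode()
    req = urllib.request.Request(
        f"{WP_URL}/wp-json/wp/v2/posts",
        data=json.dumps(post).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        created = json.loads(resp.read())

    return {"statusCode": 200, "body": json.dumps({"id": created["id"]})}
```

Twenty-odd lines of glue like this is exactly the kind of thing the model can rough out in seconds, with the human's hour going into cleanup, credentials, and deployment.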
The whole thing is a really accurate expansion of the analogy. It even extends further to explain how the model tends to forget requirements it was just given and tends to hallucinate at times.
Well, besides the prose, ChatGPT generates a perfectly valid-looking mashup of, e.g., Qt and wxWidgets, with its own hallucinations layered on top. Humans don't do that :)
I'm actually not so sure about that statement. For example, knowing whether the code will run on a Raspberry Pi, an HPC node with 10 TB of RAM and 512 CPUs, or a home desktop with 128 GB of RAM and an 8-core CPU greatly affects how the task should be done. The same goes for whether code aesthetics are important (given dependencies that allow it), whether fewer dependencies are required, whether performance matters most, whether saving disk space is paramount, etc.
All of these considerations (or the need to run easily on any of them) heavily change the direction of what should be written, even after the language and such have been chosen.
So, yeah - effectively you do need to specify quite a bit to a senior dev if you want specific properties in the output, so it's obvious that the same needs to be spelled out to a linguistic interface to coding like these LLMs.
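To make that concrete, here's a toy sketch of the same job (counting word frequencies in a big text file, a task I'm inventing purely for illustration) written for the two extremes above: a memory-starved Raspberry Pi versus a many-core box with RAM to spare.

```python
from collections import Counter
from multiprocessing import Pool

def word_counts_low_memory(path):
    # Raspberry Pi flavour: stream line by line, never hold the whole file in RAM.
    counts = Counter()
    with open(path) as f:
        for line in f:
            counts.update(line.split())
    return counts

def _count_chunk(lines):
    # Helper for the parallel version: count one slice of lines.
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def word_counts_big_box(path, workers=32):
    # Big-RAM / many-core flavour: read everything, fan out across processes.
    with open(path) as f:
        lines = f.readlines()
    chunks = [lines[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(_count_chunk, chunks)
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Neither version is wrong; the target hardware decides which one you'd actually want, which is exactly the kind of context you have to hand to a senior dev or an LLM.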
I guess it depends on how you'd define "senior" in this context: someone who knows a lot of tech stacks, or someone who has the ideas. Of course that doesn't map directly onto people's skills, because most people develop skills along several dimensions at once.