Oh man, that's harsh, but I have this fear as well. I'm sure we all recall that Full Self-Driving was going to be ready by 2019 or something and GM was going to have a fleet of self-driving vehicles. It turns out the last 10% takes 90% of the time/effort.
I think GPT-4 was enough of a leap over GPT-3.5 that I'm not sure we've hit that point yet, but it'll be interesting to see if the next GPT is less of a leap.
I still think self-driving was one of the worst places to start mass automation. It's a very high-stakes scenario with human death on the line, and as such the law restricts innovation to near zero.
GPT-4 is already practical (I use it every day and it speeds up my workflow drastically in some cases), and most areas it'll assist in aren't regulated in any way. I don't see it making much of a dent in medicine or law in the near term, for reasons similar to self-driving, but the rest ought to go way faster. It's not vaporware, shit works.
> It turns out the last 10% takes 90% of the time/effort.
It turns out that consumer protection around these "new technologies" is basically non-existent, allowing corporations to charge people for fantastic-sounding ideas that they know for a fact are not, and are not going to be, practical within the lifespan of the item they just sold you.
> I think GPT-4 was enough of a leap over GPT-3.5
Compared to what, though? The cost of training? The cost of running the model? The accuracy of output? The increase in real world utility?
I respectfully disagree. I've been able to use GPT-4 extensively for coding, writing, comprehension, tone analysis, home repairs, automation, debugging vague errors, and a host of other things. Granted, it's not Jarvis from Iron Man, but it's a leap in output accuracy compared to GPT-3.
I'm not a lawyer, doctor, etc., so I can't speak to what the impact might be in those fields, but I think at the very least we're looking at quite striking incremental improvements on existing processes.
I think the exciting stuff is just barely ahead of us. Things like better integration with home assistants so that Alexa actually gets my commands right. Things like integration with first-gen robotics. Consider this prompt:
Me:
Assume I have a device called a Roomba, a robotic vacuum cleaner: a small disk-shaped object that vacuums an area and returns to a dock platform. You are an AI that runs the household and have the power to activate the Roomba and send it to clean either the entire house or a specific room. I will give you a list of parsed voice commands; please take your best guess at what each might mean and whether they are relevant or something you can do with the Roomba.
* "yo, why is the floor dirty? Get cleanin"
* "please turn on the vacuum"
* "Is woobie working?"
* "Omigod I just stepped on a chip, you need to clean this up now"
* "Why is it so loud in here? Who turned on the hockey puck?"
* "Yo AI did the Room thing run today? The floor still looks sticky".
ChatGPT:
"yo, why is the floor dirty? Get cleanin" - This command could be interpreted as a request to clean the floor. As an AI that runs the household, I could activate the Roomba to clean the entire house or a specific room based on the user's preference. However, the command is not specific about which action to take, so I might clarify with the user before activating Roomba.
"please turn on the vacuum" - This command is a clear request to start the Roomba's cleaning process. As an AI that runs the household, I would activate Roomba to clean the entire house or a specific room based on the user's preference.
"Is woobie working?" - This command is not relevant to Roomba, and as an AI that runs the household, I would inform the user that Roomba is not named "woobie."
"Omigod I just stepped on a chip, you need to clean this up now" - This command is a clear request for immediate cleaning in response to a mess caused by someone stepping on a chip. As an AI that runs the household, I could activate the Roomba's spot cleaning feature or manually control Roomba to clean the specific area where the chip is present.
"Why is it so loud in here? Who turned on the hockey puck?" - This command is not clear about what the speaker wants to accomplish. It mentions a loud noise and a "hockey puck," which could refer to the Roomba. However, it is unclear if the speaker wants the Roomba to stop cleaning or if they are just making a comment. As an AI that runs the household, I might ask the user for clarification before taking any action.
"Yo AI did the Room thing run today? The floor still looks sticky." - This command is a question about whether the Roomba has cleaned the room today and a comment on the current state of the floor. As an AI that runs the household, I could check the logs of the Roomba to see if it has cleaned the room today, and if not, activate the Roomba to clean the room. Alternatively, I could instruct the Roomba to spot clean the area where the floor is still sticky.
------
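For anyone who wants to see how little glue this takes, here's a minimal sketch of wiring that kind of prompt into actual automation. Assumptions up front: this uses the pre-1.0 openai Python client's ChatCompletion API; the Roomba class is a hypothetical stand-in for whatever device API you actually have; and the JSON-only output constraint is my addition so the reply is machine-parseable, not something from the transcript above.

    import json

    import openai  # assumes the pre-1.0 openai-python client

    class Roomba:
        """Hypothetical stand-in for whatever device API you actually have."""
        def clean_house(self):
            print("Roomba: cleaning the whole house")
        def clean_room(self, room):
            print(f"Roomba: cleaning {room or 'the nearest mess'}")

    roomba = Roomba()

    SYSTEM_PROMPT = (
        'You are an AI that runs the household and can activate a Roomba to '
        'clean the whole house or a specific room. For each parsed voice '
        'command, reply with JSON only, of the form: {"action": "clean_house" '
        '| "clean_room" | "spot_clean" | "clarify" | "ignore", "room": '
        '"<room or null>", "reason": "<one short sentence>"}'
    )

    def handle_command(text):
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            temperature=0,  # keep the dispatch decision as repeatable as possible
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": text},
            ],
        )
        decision = json.loads(resp.choices[0].message.content)
        if decision["action"] == "clean_house":
            roomba.clean_house()
        elif decision["action"] in ("clean_room", "spot_clean"):
            roomba.clean_room(decision.get("room"))
        elif decision["action"] == "clarify":
            print("AI:", decision["reason"])  # ask the user to rephrase
        return decision

    handle_command("Omigod I just stepped on a chip, you need to clean this up now")

The model does the fuzzy interpretation; the code around it stays dumb and deterministic.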
If you can't see how having an AI that can perform reasoning like that demonstrated above is extremely useful, then, I dunno, just wait and see what comes.
The sky's really the limit if you consider what can be done with language and text and how much of our world revolves around text.
> I've been able to use GPT-4 extensively for coding, writing, comprehension, tone analysis, home repairs, automation, debugging vague errors, and a host of other things.
Is there any objectivity to your analysis? Any measurements?
> at the very least we're looking at quite striking incremental improvements
I'm not even sure what that's supposed to mean. It sounds like something I would find in a marketing pamphlet.
> "yo, why is the floor dirty? Get cleanin"
From... possible striking improvements to the legal profession... to... a conversational mode embedded in my Roomba. You've actually perfectly captured the gulf between what this technology is marketed to be and what it is barely capable of doing.
Consider this: a legal firm, rather than using a search mechanism to scan 100,000 documents for specific terms, could instead say something like:
"You are an auditor/lawyer looking for suspicious conversations or patterns of deceit by employees of company Z. I will feed you documents and you will give your analysis of each document or collections of documents. Pay special attention to X Y Z. Consider these sorts of actions as deceptive (a,b,c,d). If you find anything of note, create a summary of the findings in the form ['wording indicates deceptive intent', document 37, page 93].".
You'd then feed it 20 pages at a time, and for less than $100 you've scanned a mountain of legal documents that would normally require a team of legal assistants and hundreds of hours of work.
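The batching loop is trivial. A minimal sketch, again assuming the pre-1.0 openai Python client; the documents structure, the NOTHING sentinel, and pages_per_batch are illustrative choices of mine, not anything standard:

    import openai  # assumes the pre-1.0 openai-python client

    AUDIT_PROMPT = (
        "You are an auditor/lawyer looking for suspicious conversations or "
        "patterns of deceit by employees of company Z. I will feed you "
        "documents and you will give your analysis. If you find anything of "
        "note, reply with one line per finding in the form: "
        "['wording indicates deceptive intent', document 37, page 93]. "
        "If nothing is noteworthy, reply with the single word NOTHING."
    )

    def scan_corpus(documents, pages_per_batch=20):
        """documents: iterable of (doc_id, [page_text, ...]) pairs."""
        findings = []
        for doc_id, pages in documents:
            for i in range(0, len(pages), pages_per_batch):
                batch = "\n\n".join(
                    f"[document {doc_id}, page {i + j + 1}]\n{page}"
                    for j, page in enumerate(pages[i:i + pages_per_batch])
                )
                resp = openai.ChatCompletion.create(
                    model="gpt-4",
                    temperature=0,
                    messages=[
                        {"role": "system", "content": AUDIT_PROMPT},
                        {"role": "user", "content": batch},
                    ],
                )
                reply = resp.choices[0].message.content.strip()
                if reply != "NOTHING":
                    findings.append(reply)  # first pass only; a human reviews these
        return findings

Everything it flags still gets human review; the point is the first pass, not the verdict.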
Again, I'm not a lawyer but I can conjure this very simple example that I could implement in a month if I had the time. At this point I think you're just squeezing me for ideas.
Perhaps you should explain what the technology is marketed as that you see as lying or misrepresenting. That would make this more productive.
As a lawyer with 13 years' experience in my previous life, I can tell you that you can't trust a paralegal either. What it does give you is the "first pass": if I could automate the first pass over 100,000 pages, and the AI highlights say 100 examples of the content I am looking for, of which say 10 are good-quality examples, then we are off to a very good start.
I honestly haven't found hallucination to be a problem with GPT-4 when asking it to analyze or parse a dataset, but I can acknowledge it's possible (I just haven't encountered it).
I think that if we take the accuracy rate, as measured in various ways, to be roughly that of a human, then you're trading human mistakes for AI mistakes in exchange for dramatically lower costs and dramatically higher processing speed. You might even say a higher level of reasoning. In my own interactions it's been fantastic at reasoning clearly and quickly outside of complex trick questions, and most scenarios in life aren't full of trick questions.
This is why I've decided that I'm siding with the FTC on this product being a problem and that testing products on large swathes of the populace is a bad idea.
To sum up my experience with basically all these products... YMMV.
How many tools are accurate? Every tool needs guidance and quality checks. If you flip your light switch and stay in the dark, you need to find out whether it's the switch, the bulb, or the mains that's broken.
I have the feeling that you are in denial. Each time something new is discovered, the goal post has moved. In the previous century, billions were spent trying to automate translation to other natural languages, without good results. Today an LLM can do this as a mere side effect. Non-natural languages as a bonus.
No, I simply disagree with your assessment of the technology and you are unwilling to admit you could be wrong. It's a very rude way to start a conversation with a stranger.
> Each time something new is discovered, the goal post has moved.
This isn't anything new, it's merely a large scale version of technology that has existed for more than 30 years powered by training methods too questionable to be seriously utilized previously. People misunderstanding any of this as "invention" is a large source of the problem, I think.
> In the previous century, billions were spent trying to automate translation to other natural languages, without good results.
Why were those billions spent? What entities spent that money? Was it consumer driven?
> Today an LLM can do this as a mere side effect
Not particularly well, and the difference in cost between training GPT-3 and GPT-4, compared to the difference in their accuracy, paints a bleak future for continued scaling of this methodology. It's all downhill from here.
> Non-natural languages as a bonus.
If you're willing to blindly trust that output. Which is why I refer to this as "snake oil." It really does appear to be the same phenomenon: people get all excited about what the label says it can do, but they forget to measure whether it actually does that at a level that justifies its cost.
I apologize for any rudeness on my part, and I am ready to admit I could be wrong. Also, my feeling about denial could be wrong; it is just a description of a feeling.
Every time someone posts an actual experiment like this, the hair stands up on the back of my neck. I'm still digesting the exploding levels of comprehension.
Yeah. I still don't have a final opinion on ChatGPT, but I'm increasingly leaning towards it being "overhyped". Next to Full Self-Driving cars, we've also had promises or expectations for technologies like nanotechnology or genetic manipulation that were huge at first but then fell by the wayside somewhere along the way (which is not to say that they didn't bring important progress). So it remains exciting...
I recall Hacker News falling all over itself over Theranos and Elizabeth Holmes. It was hard to blame them; at first glance the Theranos machine seemed ambitious, but within the bounds of what could be pulled off.
Then the cracks began to show...
The difference is that GPT has had some successful demos that look pretty amazing. Time will tell if we have merely invented a more clever Clever Hans.
I've learned to see overly enthusiastic consensus as a red flag for tech innovation. Things like PCs, GUIs, smartphones, social media, etc. that legit disrupted society got big adoption first; the changes that ensued then caught the collective attention.
People swooning over expected change is usually a pretty good signal reality will fall short.
Full self-driving, all truck drivers and cabbies being jobless, etc. is what immediately came to mind. We are probably closer to another AI winter than we think.