Controversial take: LLMs are the first time in a while that I have felt like eme...

abixb · 2024-12-18T19:31:58 1734550318

I agree. A good chunk of the tech trends in the last decade were indeed rent seeking, but silent revolution was happening in the transformers and the neural network architecture domain, which made today's products possible.

And I'd wager that there are silent revolutions happening all across colossus that's the tech industry that will become apparent in the next decade.

Jeff Bezos put it best during his recent interview at the 2024 NYTimes Dealbook Summit, "We're living in multiple golden ages at the same time." There's never been a better time to be alive.

TRiG_Ireland · 2024-12-19T00:05:41 1734566741

That's easy for a billionaire to say, isn't it? Jeff Bezos is not exactly a reliable narrator here. His business practices are built on exploitation and externalising his costs (such as the massive environmental damage).

hypeatei · 2024-12-18T19:33:12 1734550392

I agree about an abundance of apps, but what type of value are LLMs adding?

It can sometimes be useful to input a more "human" search and have something get spit out but 60% of the time it completely lies to you. I'm talking about questions related to web specifications which are public documents. Section numbers, standards names, etc.. will be completely made up.

rtsil · 2024-12-18T20:14:37 1734552877

Off the top of my head, and just for the last couple of months, and only outside of work (where its value is even more immense), it has saved one of my indoor plants, told me how to handle a major boiler problem that would have left us without a working boiler during a weekend in the winder, with the next "emergency" repairman only available on Monday, advised me to use Kopia as backup solution for my personal files instead of Syncthing, helped me choose the right type of glass for a painting frame, answered a couple of questions about bikes and helped me when I was stuck in an harmonic analysis of a piece of music. All of that are extremely valuable to me (if only for the time not wasted googling answers), and in none of them its potential hallucinating would have been an issue. And I can't count the number of times where "specialists" in bike repairs or plumbing told me something incorrect or outright false, so I've learned to deal with hallucinations already!

vel0city · 2024-12-18T20:21:55 1734553315

> And I can't count the number of times where "specialists" in bike repairs or plumbing told me something incorrect or outright false, so I've learned to deal with hallucinations already!

So much this. So many times I've argued with hired experts saying "can't be done" just to see yes, it can be done.

swatcoder · 2024-12-18T20:35:27 1734554127

Yes, but which of those things would you not have resolved just as well 10 years ago? All those possibilities were added by the maturing web itself, as a genuinely novel change from having to source books or experts/friends in the days before.

I'm glad ChatGPT didn't lead you astray, but I'm not seeing what it's added here besides shuffling up the user interface in a way that you presently and subjectively prefer?

TeMPOraL · 2024-12-18T23:32:30 1734564750

> I'm not seeing what it's added here besides shuffling up the user interface in a way that you presently and subjectively prefer?

This. But in the same sense the past 50 years merely changed interface from dusty textbooks in libraries to Google Search, and the past 100 years gave us dusty textbooks over writing to Royal Society, and that just replaced the option of asking a local whisperer or hoping you'll find answers on the Sunday mass.

Do not underestimate the power of being able to get an answer to your problem described, visualized, and perhaps complete with interactive demo to explore it further, in time it would previously take you to formulate the right search query that finally gives you relevant information.

EDIT:

And that's on top of all the arbitrary data transformations prior tools couldn't do. E.g. I'm increasingly often using GPT and Claude models to turn photos of (possibly hand-written) notes or posters into iCAL files I can immediately import into our family shared calendar.

Another frequent use case, data normalization. Paste a whole dump of inconsistently structured data multiple people collected (say, addresses of various local businesses that helped a local NGO and now are supposed to get a thank-you card for Christmas). Like, you get 200 rows of addresses in a single column, with spelling mistakes, repetitions, junk at the end, arbitrary capitalization, wrong order of address segments, and such; you need to separate it out into 5+ columns (name line 1, name line 2, street address, zip code, city, etc.) and have it all normalized.

The fastest and most robust way to do it as a one-off job, today, is to paste the whole thing to GPT-4o or Claude 3.5 Sonnet, tell it how the output should look (give one-two examples, mention some mistakes you saw), then send the message and wait 30 seconds for the job to be done for you.

(Yes, it may make mistakes - it didn't for me in recent memory, but it can. But for that, I quickly add an extra verification column for each one in LLM output, and do a simple case-insensitive substring match with original, and eyeball any data row that shows an error. And guess what, the formulas don't take much time either, since LLMs are good at writing them for you, too!)

rtsil · 2024-12-18T21:07:31 1734556051

My plant would have been dead. As for the rest, sure, I would have resolved them eventually, after many frustrated hours of googling and trial and error.

Time is my most precious thing, I already don't have enough time to do all the things that I want to do, I don't want to waste that trying to find and test solutions when ChatGPT gives me instant answers. I'd rather spend time playing with my cats or riding a bike instead. It's not a matter of UI, it's a matter of preventing waste of time, energy and money, and less frustration. For that alone, €20/month is a very good value. And that's just for my personal life.

swatcoder · 2024-12-18T22:43:00 1734561780

"many hours of frustrated googling and trial and error" isn't a familiar experience to me, but I'll trust that it is for you. I'm glad you see that as behind you now with this. I suppose you must not be alone.

djeastm · 2024-12-19T01:22:44 1734571364

>besides shuffling up the user interface

I wouldn't discount this effect. As someone with sensory issues, one thing I like about ChatGPT as opposed to the "raw" internet is that I can see the answer to my questions in a nice and calm textual format without some website who created the article specifically to catch my search terms, but is trying to get me to deceptively click on ads or pull me into buying something through their affiliate links. That's absolutely increased my own enjoyment and productivity.

fragmede · 2024-12-18T20:56:31 1734555391

objectively, it takes less time to ask a question and get a direct answer than it does to search for some words, leaf through a couple of results, find one that has the information you want, and then read that page. If I want to know the height of the Eiffel tower, being told it's 1083 meters tall is faster than searching for its website, finding the stats section, then locating that information on the page. Google realizes that, so they pull that info out of the page and just put it on the results page for you.

dayjah · 2024-12-18T19:44:27 1734551067

This is a thin edge of the wedge issue, right? ChatGPT is pretty darn good for most things. I’ve used it extensively for the past 18 months and only in a few cases would I say it “completely lied to me”.

My general rubric is: “would I trust someone on Reddit to correctly guide me on this”. If the answer is “yes” then ChatGPT is likely going to do well. If the volume on a particular subject is low / susceptible to false information then it’ll lie.

Recently it lied hard about how to configure MikroTik routers. I lost many hours. But for a large construction project recently it completely balled out.

Are you doing cutting edge / complicated stuff? Have you examples of where it lies?

hypeatei · 2024-12-18T20:06:14 1734552374

> Have you examples of where it lies?

No specific prompts, but most were related to the XHR/Fetch specs and behaviors within. It would say "X.Y.Z sections defines this" but that section didn't exist at all and the answer provided was not accurate.

> My general rubric is: “would I trust someone on Reddit to correctly guide me on this”. If the answer is “yes” then ChatGPT is likely going to do well

I see. Well, I don't know if I find that very valuable but if others do, then so be it.

adamc · 2024-12-18T20:14:30 1734552870

I've asked it for things like book recommendations and gotten:

  - completely made up books
  - real books that were only marginally related
  - real books with really bad reviews

I'd estimate that only 30-40% of the time did I find the results at all useful.

airstrike · 2024-12-18T20:31:13 1734553873

stop using it like a database that you can query

bigfudge · 2024-12-18T22:14:32 1734560072

Agreed this is a bad idea in the case you are replying to, but I love ChatGPT as a way to recover the name of a book or film I’ve forgotten. I recently prompted for “a book about nuclear wasteland dominated by a church” and it gave me A canticle for Leibowitz (which is great). I’m not sure how easy that would be any other way.

airstrike · 2024-12-19T03:39:34 1734579574

absolutely, I've been using it for that at lot as well and it's remarkably well suited for it. there's really no better tool for the job.

it's just that every thread about LLMs in AI invariably has someone complaining about best results from a query best described as `SELECT * FROM...`

cruffle_duffle · 2024-12-18T22:06:14 1734559574

I wonder how many people are promoting it correctly. You can’t just query it like you might for google or something. It works best with lots of context and back and forth. And yeah, for many things you are going to get directional answers not exact ones (esp with “rote memory” like exact quotes from a book or something.)

therein · 2024-12-18T20:00:52 1734552052

I don't want to turn this into another Claude lies less than ChatGPT subthread but since you mentioned configuration of MikroTik routers I felt like I should.

ChatGPT lies a lot about RouterOS, I don't know why. Claude helped me a lot on the other hand with all things MikroTik.

dayjah · 2024-12-18T22:47:05 1734562025

Thanks! I’ll give it a shot when I get to the vlan stuff I’ve on deck

dghlsakjg · 2024-12-18T20:34:10 1734554050

I find it useful, and it brings value to me (literally: I exchange valuable money for API access), even if it doesn't for you. Many other people report the exact same thing. Just because you don't find value in a technology, doesn't mean that others don't.

In the past week I have used it for helping write a script in a framework I'm not super familiar with (OpenSCAD), I was able to finish a project in 5 minutes that otherwise would have taken me hours. I have used it to help make movie recommendations (none of them were hallucinated). I have used it to translate a conversation with a non-english speaker, etc. There are other tools that can help me do all of these things, but none quite as fast or painlessly.

It might not be useful for your use case of asking questions related to specific web specs, but that doesn't mean that the technology has no value. Horses for courses...

rajamaka · 2024-12-18T19:55:13 1734551713

The value to me is by having an on-demand junior developer working alongside me for the price of $20 a month

hypeatei · 2024-12-18T20:12:50 1734552770

My experience with code completion tools (i.e. single line/method snippets) has been positive. But, anything more complicated seems to fall apart rather quickly.

stonedge · 2024-12-18T21:39:50 1734557990

I have upgraded to the $200 Pro tier, and, with o1-pro, all of my tasks delegated to the "junior" have been so much better. It takes longer to complete, of course, but the overall duration is less because I'm not having to go back and correct it as much as I was with 4o. It's been able to figure out problems that 4o continually failed on.

ipaddr · 2024-12-18T20:09:58 1734552598

Mine is a senior developer with memory lapses.

airstrike · 2024-12-18T20:30:36 1734553836

And that sometimes you need to bully a bit to get coerced answers out of... feels bad

drusepth · 2024-12-18T20:11:26 1734552686

LLMs have been a personal tutor to me for the last year, able to explain anything and everything I've been curious about professionally and personally. I changed jobs to new technologies in large part because I effectively had an assistant able to help cover any gaps in knowledge I had, train me up quickly, and offer ongoing help on the job.

They can make stuff up, but saying "60% of the time they lie to you" hasn't been true for years.

krger · 2024-12-18T20:31:46 1734553906

>They can make stuff up, but saying "60% of the time they lie to you" hasn't been true for years.

If you're using them to fill knowledge gaps, what scaffolding have you set up to ensure that those gaps aren't being filled with incorrect-but-plausible-sounding information?

lxgr · 2024-12-18T20:42:04 1734554524

That's because we're currently largely not using them correctly, i.e. hooked up to RAG instead of hoping that they've memorized enough of the training data verbatim, which is arguably a waste of neurons in a foundational model.

Imaging being graded on your ability to quote exact line numbers of particular parts of your codebase as a senior software engineer without being able to look at it!

LLMs are not, in isolation, a search product.

jklinger410 · 2024-12-18T19:37:46 1734550666

> but 60% of the time it completely lies to you

This is such an exhausting conversation

hansonkd · 2024-12-18T20:28:51 1734553731

i think when people say things like this it indicates that they tried LLMs in 2022 and solidified their opinion there.

I had the same impression about the hallucinations 2 years ago. The reality is in at the end of 2024, you can get incredible value from LLMs.

I've used copilot to code almost exclusively now for the past few months. Anyone still comparing it to text completion I feel is operating on completely out of date information either intentionally or unintentionally.

beefnugs · 2024-12-18T21:36:40 1734557800

Wait do you expect people to retry every failed thing they have tried due to marketing lies every how often exactly?

jklinger410 · 2024-12-18T21:52:21 1734558741

Do you expect the first iteration of every product to be perfect?

zamadatix · 2024-12-18T19:39:36 1734550776

I'd (generally) agree. About 5 minutes of using Flux, Claude or Suno would have provided more net new value than I've yet to get out of blockchain, self driving, gig brokers, metaverse, 5G, AR/VR, quantum computing, hyperloop, and whatever people were trying to make web3 be combined over the years. Not that I don't think all of these things will always perpetually fail to deliver (hell, if I had a chance to try Waymo already then self driving probably wouldn't be on the list), just the hype cycles were unrelated to when that delivery occurred (if ever).

The hard part is, despite actually having some "real" value delivered, you still have to sort through the 99% of bullshit that comes along with it anyways.

becquerel · 2024-12-18T20:32:53 1734553973

I will personally say that if you ever get the chance, definitely try a Waymo. I did recently for the first time and it's a hell of an experience. You can very vividly imagine it being the future.

I'm also going to stand up for AR/VR here. I'm in a long-distance relationship and me and my partner spend an hour or so in VRChat around two to three times a week. The power that has to reduce the badness of an LDR is well well well well worth the three hundred bucks I paid for a Quest. That and some of the golf games on it are fun.

zamadatix · 2024-12-18T21:06:11 1734555971

I am super stoked to try a Waymo when I'm in a city with one. It's hype failures have more to do with 10 years of hype about its public availability yet not being available to 99% of the world's population 10 years later. Hype is useless without the result.

I've had an HTC Vive and an Oculus Rift 3 (Walkabout Mini Golf is one I tried!) and while I wouldn't try to argue NOBODY has found a use for it (somebody somewhere found uses for all of the things I mentioned, just not me and just not the majority of people like big new things are promised to) it never really ticked the "new value" box before they ended up in the closet for me.

becquerel · 2024-12-19T07:37:29 1734593849

That's totally fair. The tech is only barely coming out of the enthusiast adopter phase and there's not a critical mass of content on there to keep most people putting on the headset daily.

That and the ergonomics do still suck, even if I've mostly gotten used to them.

I do think VR will make it, though - starting with the kids. Apparently Gorilla Tag broke 1.5 million players recently, and those are mostly under-15s. The next generation is going to have a strange relationship with computers.

emptysea · 2024-12-18T19:35:55 1734550555

Have you tried a Waymo yet? Honestly the coolest tech I’ve seen/used in ages

Lots of engineering involved

dghlsakjg · 2024-12-18T20:17:47 1734553067

No. Never even seen one, since I don't live in the US/California.

TRiG_Ireland · 2024-12-19T00:03:22 1734566602

They're "adding" value they've stolen from artists and writers. It's an industry built on copyright infringement at a heroic scale.

smokel · 2024-12-18T19:52:18 1734551538

Don't forget the vast (and parallel) improvements in image processing.

itishappy · 2024-12-18T19:28:10 1734550090

> a bunch of apps that just aim to be mediocre middlemen/gig economy brokers with bad customer service

Isn't this the new LLM playbook?

dghlsakjg · 2024-12-18T20:23:33 1734553413

How so?

I pay Claude/ChatGPT trivial amounts of money for metered API access to their models, and they in turn provide it to me.

Middlemen/marketplace models like "Uber for x" or "Etsy for x" or "Betterhelp for x" is a totally different business model.

itishappy · 2024-12-18T21:01:47 1734555707

I had in mind the surge in LLM chat support and the surge in thin ChatGPT wrappers with a custom system prompt. Claude/ChatGPT do seem useful, "an AI companion for Microsoft Paint" less so.

vasco · 2024-12-18T19:30:19 1734550219

And now we will have mediocre middlemen/gig economy brokers with bad customer service performed by AI agents that you can summarize with chatgpt and automatically reply back to. Progress!!

dylan604 · 2024-12-18T21:09:34 1734556174

yeah, we were definitely stagnant when the focus was on crypto

otabdeveloper4 · 2024-12-18T19:31:26 1734550286

> doing something cool

Yes.

> and adding value.

No. The only breakthrough innovation LLMs gave us is the ability to speedrun the making of racist pictures. Not sure the world really benefited.

dghlsakjg · 2024-12-18T20:20:19 1734553219

LLMs don't generate images at all.

finnh · 2024-12-18T19:33:44 1734550424

I don't think this was your intent, but the only interpretation here is that you think the rapid creation of racist pictures is cool.

gowld · 2024-12-19T18:59:54 1734634794

Read the comment again, more slowly.