The Generative AI Con (wheresyoured.at)
364 points by nimbleplum40 3 days ago | 475 comments





I’m a little shocked at how much negativity there is around LLMs among developers. It’s a new tool that requires some learning, and it’s sometimes not so great, but if you’ve used an IDE with real coding assistance built in (e.g. VS Code in Edit with Copilot mode - NOT Chat mode, using Claude 3.5), it’s honestly not much worse than a junior dev and 100x faster. And if the code is bad you throw it away and try again 10 seconds later. The amount of speed-up I see as a very experienced dev is astronomical. And just 6 months ago it was awful. How great is it gonna be in a year or two? It doesn’t even have access to running unit tests or reading console errors or IDE hints, and it still generates mostly correct code. Once it gets more deeply embedded it’s just going to improve more and more.

The article is about how the economics of the LLM market is making all tech look bad.

They need trillions of dollars in returns. VCs won't finance tech startups for decades.

I use Cursor sometimes, and VSCode + Continue with llama.cpp, and it's great. That's not worth billions. It's definitely not worth trillions.


This is the crux. A cool thing has been invented, with real usages. Unfortunately, it's cost hundreds of billions of dollars and it has absolutely zero hope of making the trillions needed to justify that.

Now someone will respond about how it's just a stepping stone, and how the billions are justified by _something completely imaginary, and not invented yet, and maybe not ever_ e.g. agents.


>it's cost hundreds of billions of dollars and it has absolutely zero hope of making the trillions needed to justify that.

The Big Tech companies have been flush with liquidity and poured those hundreds of billions into the promising tech, and as a result we got a wonderful new technology. There is not much need for those trillions in return - just look at the liquidity positions of those companies; they are just fine. If those trillions come in eventually - even better.


>There is not much need for those trillions in return

Whilst you are correct that big tech cos do not need the return to survive, that's not how public markets work at all, and thus not how the incentives for those in charge of the companies work - so in practice you're actually wrong.


If I were wrong, those companies would be distributing that cash to shareholders instead of chasing every promise of the next big thing.

If the investment in AI doesn't pan out (I do think that it will, and those trillions will come), then those companies will just pour even more billions into whatever big thing/promise comes next. Rinse and repeat. Some of those things do generate tremendous returns, and thus not playing that game is what really constitutes the true loss of money.


Markets are funny things.

The US right now is run by someone whose explicit promises, if actually implemented, mean an obvious, immediate 13-14% reduction in GDP — literally; never mind side effects, I'm not counting any businesses losing confidence in the idea that America is a place to invest, this is just the direct impact.

DOGE + deportation by themselves do most of that percentage. The tariffs are a rounding error in comparison, but still bad on the kind of scale that gets normal politicians kicked out.

And yet, the markets are up.


What timeframe are you working with, as in, when do you expect to see this reduction in GDP?

I just want to know so that I can set a reminder and check back on your comment when the time arrives.


If you factor in the inflation and the worldwide trade crisis, trading dollars for shares that will lose 10% real value doesn't sound so bad.

Funny, I had been told we had to lay off all those workers because they weren’t flush with cash.

They're convinced they no longer need them.

Just as they were convinced after Covid that they needed to put hiring into overdrive.

Tech management has the collective IQ of a flock of sheep.


Nobody has ever been punished for choosing IBM. It’s the same story here. Nobody is going to blame them for following the zeitgeist, but you bet they’d be punished if they didn’t and it doesn’t pan out.

The whole thing is like bitcoin. There’s too many people that benefit from maintaining the collective illusion.


Cash on hand: GOOG - 100B, AMZN - 80B, FB - 70B, and their core businesses are basically printing money, so they pretty much do have to invest in new things. If somebody sees a multi-billion dollar sink better than AI right now ...

> If somebody sees a multi-billion dollar sink better than AI right now ...

I think if they could find a way to make their software good, instead of bad, like it increasingly is, that would be a good use of that money.


Workers, infrastructure, taxes…

They’ll be fine and will survive regardless, but their current astronomical valuations probably won’t be.

To train. Inference is much cheaper...and getting cheaper by the day

I see it a little differently. What was the direct economic return of the Manhattan Project?

It is thought to have shortened a very expensive war, and it may have prevented the USSR from taking over Europe by leveraging its unquestioned postwar conventional-forces advantage.

Well sure but how much cash did the MaPr corp. make selling their new and improved model implosion-type-u-235?

I don't know how to tell you this, but the government isn't a business and has completely different objectives and operating conditions

If more people understood this we might have avoided the carnage happening in the US right now.

I don't know why it is so hard to understand. I mean, money doesn't really exist without a government[0], and while government plays a role in the market and economy, this role is VERY different from that of a business. A government isn't trying to "make money", it isn't trying to make investors happy, and it certainly can't take existential risks that could make "the company" go bankrupt (or it shouldn't lol).

But I do think (and better understand) that there is a failure to understand this at a higher level of abstraction. One part is simply "money is a proxy." This is an incontestable fact. But one must ask "a proxy for what?" and I think people only accept the naive, simple answer. Unfortunately, this "is a proxy" concept generalizes extremely far. Everything is an estimation, everything is an approximation, and most things are realistically intractable. We use sibling problems or similar problems that are concrete to work with, but there are always assumptions made, and ignoring these can have disastrous consequences. Approximations are good (they're necessary even), but the more advanced a {topic,field,civilization,etc} gets, the more important it is to include higher order terms. Frankly, I don't think humans were built for that (though by some miracle we have the capacity to deal with it).

My partner and her dad are both economists, and one thing I've learned is that what many people think are "economics questions" are actually "business questions". A story from her dad makes this extremely clear. A government agency hired him to do a cost-benefit analysis of some stuff (like building a few hospitals and some other unambiguously beneficial institutions), and when he presented, everyone was happy but had a final question: "should we build them?" The answer? "That's not the role of an economist." The reason is that monetary value can't actually be accurately attributed to these things. You can project monetary costs for construction, staffing, and bills, and you can make projections about how many people this will benefit, how it can reduce burdens elsewhere, as well as some projections about potential cost savings. But you can't answer "should you." Because the weight of these values is not something that can be codified with any data. It is an importance determined by the public and, more realistically, their representatives. Only rarely can you give a strong answer to a question like "should we build a new hospital", and essentially only in the extreme cases.

I'll give another example. In my town there was an ER that was closed due to budget constraints. This ER was across the street from the local university, whose students represent ~15% of the population. The next nearest ER? A 15 minute ambulance ride away, in the next town over. Did the city save money? Yes. Did the sister city's ER become even busier? Also yes. Did people lose access to medicine? Yes. Did people die? Also yes. Have economists put a price on human life? Also yes, but they are very clear that this is not a real life and rests on very naive assumptions[1]. It is helpful in the same way drawing random squiggles on a board can help a conversation. Any squiggles could really be drawn, but the existence of _something_ gives you some point to start from.

[0] okay crypto bros, you're not wrong but low volatility is critical as well as some other aspects. Let's not get off topic

[1] https://www.npr.org/2020/04/23/843310123/how-government-agen...


The profit was made by the private sector in supplying goods to the program. Today, private companies do a lot and earn a lot of money from stockpile maintenance.

The Manhattan Project was driven by the U.S. Government, which doesn't need a VC-tier return. The entire business model of VCs is based on the idea that they'll have the occasional 100x return, and if none of the AI companies do that it would destroy the VC model.

About the GDP of the US and Europe over the past 80 years, so a few quadrillion dollars.

That's not the direct return on VC-invested cash that people in here are refusing to see past.

Doesn't matter. The Manhattan Project was a breakthrough in fundamental science that changed the world. Current generative AI is a solid improvement in degree over previous technology, not remotely as big a leap as the amount of money poured into it assumes it to be.

“people … in here” seeing “past it” or not is irrelevant, the VCs won't see past it once they realize that money is lost.

Wait, what? The Manhattan project produced something--multiple somethings in fact. What has this "project" produced?

Completely irrelevant. The Manhattan Project wasn't funded by VCs with an expectation of a return.

> I use Cursor sometimes, and VSCode + Continue with llama.cpp, and it's great. That's not worth billions. It's definitely not worth trillions.

That seems like a suspect claim. If you're saying that you, personally, cannot create billions of dollars in value with Cursor & friends, that is certainly true - but you are in no position to make a judgement call about where the cap on value creation for the LLM market is based on your personal use cases. LLMs don't just do code completion. We really can't estimate how much potential value is being created without doing some serious data diving and studying of cases.

A better argument would be that the DeepSeek experience suggests these companies have no moat and therefore no way to earn a return on capital. But LLMs are probably going to generate at least trillions of dollars in value, because they're on par with or ahead of Wikipedia and Google for answering many queries, and they also have hundreds of ancillary uses like answering medical questions at weird hours or creative/professional writing.


It's possible to grow an economy by trillions of real value without any actor being able to extract that as a profit or it even showing up in the books as money.

Consider that Wikipedia is much bigger than Encyclopedia Britanica, but because it is given away to everyone for free, it is not counted as E.B.'s max sale price ($2900 in 1989?) times the world's internet connected population (5.6e9?) — $16 trillion.

AI models, regardless of their value, are priced at the marginal cost to reproduce the weights or to run inference, depending on which you care about.

But I do mean "reproduce" not "invent" — it doesn't matter if DeepSeek's "a few million" was only possible because they benefited from published research, it just matters that they could.

And if the hardware is the bottleneck for inference, that profit goes to the hardware manufacturer, not to the top ten companies who made models.


> That's not worth billions. It's definitely not worth trillions.

That is a problem for the VCs that bet wrong, not for the world at large.

The models exist now and they’ll keep being used, regardless of whether a bunch of rich guys lost a bunch of money.


Their ongoing operation is quite expensive, so even that is not assured.

My ongoing operation is a MacBook Pro that costs pennies' worth of electricity.

Where are you getting this from? Outside of o3, every AI provider's API is super cheap, with most productive queries I do coming in under 2c. We have no reason to believe any of them are selling API requests at a loss. I think <2c per query hardly counts as "quite expensive".

The reasoning people have for them selling API requests at a loss is simply their financial statements. Anthropic burned $3B this year. ChatGPT lost $5B. Microsoft has spent $19B on AI and Google has spent close to $50B. Given that revenue for the market leader ChatGPT is $3.7B, it's safe to say that they're losing massive amounts of money.

These companies are heavily subsidized by investors and their cloud service providers (like Microsoft and Google) in an attempt to gain market share. It might actually work - but this situation, where a product is sold under cost to drum up usage and build market share, with the intent to gain a monopoly and raise prices later on - is sort of the definition of a bubble, and is exactly how the mobile app bubble, the dot-com bubble, and previous AI bubbles have played out.


Are the training costs (CapEx) and inference costs (OpEx) being lumped together?

Not sure if it matters at this point. There will need to be many more rounds of CapEx to realize the promises that have been put forth about these models.

The implication would be that those API requests are being sold at a loss. Amodei wrote in January that Claude 3.5 Sonnet was trained for only a few $10Ms, but Anthropic has been losing billions.

That would be a killer for the current and near future generations of LLM as a business. If they are having to pay many times in compute what they are able to get for the API use (due to open models being near comparable?), then you definitely can't "make up for it in volume".

> they’ll keep being used

How? I get that many devs like using them for writing code. Personally I don't, but maybe someday someone will invent a UX for this that I don't despise, and I could be convinced.

So what? That's a tiny market. Where in the landscape of b2b and b2c software do LLMs actually find market fit? Do you have even one example? All the ideas I've heard so far are either science fiction (just wait any day now we'll be able to...) or just garbage (natural language queries instead of SQL). What is this shit for?


Anecdotally, almost every day I’ll overhear conversations at my local coffee shop of non-developers gushing about how much ChatGPT has revolutionized their work: church workers for writing bulletins and sermons, small business owners for writing loan applications or questions about taxes, writers using it for proofreading, etc. And this is small town Colorado.

Not since the advent of Google have I heard people rave so much about the usefulness of a new technology.


These are not the sort of uses we need to make this thing valuable. To be worthwhile it needs to add value to existing products. Can it do that meaningfully well? If not it's nothing more than a curiosity.

Worthwhile is a hard measure.

To make money though it just needs to have a large or important audience and a means of convincing people to think, want, or do things that people with money will pay to make people think, want or do.

Ads, in other words


Can you get enough revenue from ads to pay the cost of serving LLM queries? Has anyone demonstrated this is a viable business yet?

A related question: has anyone figured out how to monetize LLM input? When a user issues a Google search query they're donating extremely valuable data to Google that can be used to target relevant ads to that user. Is anyone doing this successfully with LLM prompt text?


I bet Google is utilizing the value of the LLM input prompts with close to the same efficiency with which it monetizes search. In that case, there are two questions -- 1) will LLMs overtake search? and 2) can anyone beat Google at monetizing these inputs? I think the answer to both is no. Google already has a wide experience lead monetizing queries. And personally, I'd rather have a search engine that does a better job of excluding spam without having to worry whether or not it's making stuff up. Kagi has better search than any of the LLMs (except for local results like restaurants/maps).

> Do you have even one example?

My company uses them for a fuckton of things that were previously too intractable for static logic to handle (because humans are involved).

This is mostly in the realm of augmented customer support (e.g. customer says something, and the support agent immediately gets the summarized answer on their screen)

It’s nothing that can’t be done without, but when the whole problem can be simplified to “write a good prompt” a lot of use cases are suddenly within reach.

It’s a question whether they’ll keep it around when they realize it doesn’t always quite work, but at least right now MS is making good money off of it.


LLMs are incredible at editing my writing. Every email I write is improved by LLMs. My executive summaries are improved by LLMs. It won't be long until every single office worker is using LLMs as an integral part of their daily stack; people just have to try it and they'll see how useful it is for writing.

Microsoft turned itself into a trillion dollar company off the back of enterprise SAAS products and LLMs are among the most useful.


> What is this shit for?

Various minor things so far. For example, I heard about ChatGPT being evaluated as a tool for providing answers to patients in therapy. ChatGPT's answers were evaluated as more empathetic, more human, and more aligned with the guidelines of therapy than answers given by human therapists.

Providing companionship to lonely people is another potential market.

It's not as good as people at solving problems yet, but it's already better than humans at bullshitting them.


Are people actually satisfied by that? I personally find "chatting" with an LLM grating and dissatisfying because it often makes very obvious and incongruous errors, and it can't reason. It has no logical abilities at all, really. I think you're really underestimating what a therapist actually does, and what human communication actually is. It's more than word patterns.

I could see this being useful in a "dark pattern" sense, but only if it's incredibly cheap, to increase the cost to the user of engaging with customer support. If you have to argue with the LLM for an hour before being connected to an actual person who can help you, then very few calls will make it to the support staff and you can therefore have a much smaller team. But that only works if you hate your users.


Subjective evaluation of "humanity" and "empathy" in responses is much less important than clinical outcome. I don't think an online chat with a nebulous entity will ever be as beneficial as interactions that can, at least occasionally, be in-person. Especially as the trust of online conversations degrade. Erosion of trust online seems like a major negative consequence of all the generative AI slop (LLM or otherwise).

The clinical outcome of human-delivered therapy would only be better if, for some reason, doing therapy worse (less according to the taught guidelines) were better. But, sure, we can wait for more research or a follow-up. It might be true. Therapy has dismal outcomes anyway, and the outcomes are mostly independent of which theoretical framework the therapy is done according to. It might be the case that the only value in therapy is human connection that AI fails to simulate. But it seems that for some people it simulates connection pretty well.

> The article is about how the economics of the LLM market is making all tech look bad.

No, it's not. The first half of the article talks about how useless the actual product is, how the only reason we hear about it is because the media loves to talk about it.


Yeah whatever. VCs will keep backing entrepreneurs, that's their job. Until there's a better way to get 10-100x returns, we're fine.

LLMs are pretty good at the aspects of coding that I consider to be "the fun part". Using them has made me more productive, but also made my job less fun, because I can't justify spending time using my own brain to do "the fun part" on my employer's dime. And that was something I was particularly good at, which is why I was able to be paid well to do it.

So now my company makes more money, and the work gets done faster, but I can't say I feel appreciative. I'm sure it's great for founders though, for whom doing the work is merely an obstacle to having the finished product. For me, the work is the end goal, because I'm not hired to own the result.


Huh, for me it's the opposite. It does the boring bit, writing pedestrian method bodies. Writing import statements. Closing tags.

I do the fun bit: having creative ideas and trying them out.


Looks like you haven't used a decent IDE: these things have been standard for decades, locally and with minimal requirements. But wait, now it happens in the Cloud (meh, that's not gonna fly anymore, too last decade)...AND requires massive amounts of power AND cooling, PLUS it's FUBAR about 50/50.

For an incremental improvement...not great, not terrible.


I think LLMs are vastly overhyped and mostly useless, but I use Copilot as glorified autocomplete and like it.

It does what the other poster said: it automates the boring parts of "this db model has eight fields that are mostly what you expect" and it autocompletes them mostly accurately.

That's not something an IDE does.
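
For concreteness, here's a sketch of the kind of completion I mean (a hypothetical SQLAlchemy-style model; the field names are made up, but after you type the first one or two, Copilot will usually suggest the rest):

    # hypothetical model: type "id" and "email", the remaining fields get suggested
    from datetime import datetime
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    class Base(DeclarativeBase):
        pass

    class User(Base):
        __tablename__ = "users"

        id: Mapped[int] = mapped_column(primary_key=True)
        email: Mapped[str] = mapped_column(unique=True)
        full_name: Mapped[str]
        is_active: Mapped[bool] = mapped_column(default=True)
        created_at: Mapped[datetime] = mapped_column(default=datetime.utcnow)
        notes: Mapped[str] = mapped_column(default="")

Nothing clever, just the "mostly what you expect" boilerplate typed for you.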


You're really comparing an IDE's autocomplete with something that can, at minimum, write out entire functions for you?

You're either completely misremembering what IDEs have been able to do up until 3 years ago, or completely misunderstanding what is available now. Even the very basic "autocomplete" functionality of IDEs is meaningfully better now.


It's kind of analogous to the old taxi drivers who took pride in having a sixth sense knowing which route to take you, vs uber drivers who just blindly follow their navigation

Some of them might have had a really good mental map; but the majority would just take inefficient routes (and charge you some random price that they put into their counter) — plenty of reasons to dislike Uber but having a pre-set price, vetted/rated drivers, and clear routing for a taxi service is a massive plus in my opinion.

Bit of a boomer statement here but maybe this will encourage devs such as yourself to contribute more to open source passion projects that will help dethrone the monopolies. Looking at Valve's investment into Linux via Proton as a great example.

It would be so nice to have a productivity Linux OS that just works on all my devices without tinkering. I want to stop supporting the closed source monopolies, but the alternatives aren't up to par yet. I am extremely hopeful that they will be once mega corps inevitably decay and people tire of the boom-bust cycle.

As technologists, we all want beautifully designed tools, and I'm increasingly seeing that these are only created by passionate and talented people who truly care about tech, unlike megacorps that only care about enriching their board and elite shareholders.


That experience is heavily subsidized and is unprofitable for these companies providing it based on what we know. Even with all of the other developers who are also using the same work flow and espousing how great it is. Even with all of the monthly subscribers at various tiers. It has been unprofitable for several years and continues to be unprofitable and will likely remain unprofitable given current trends.

The author spends a good amount of bytes telling us that they don't want to hear this argument even though they expect it.


I think these types of arguments need to at the very least acknowledge the distribution of cost between training and inference.

Perhaps, and the externalities are often unaccounted for or hand-waved away.

Even the US Government is getting involved in subsidizing these companies and all of the infrastructure and resources needed to keep it expanding. We can look forward to even more methane power plants, more drilling, more fracking, more noisy data-centres sucking up fresh water from local reserves and increased damage to the environment that will come out of the pocket books of... ?

Update: And for what? "Deep Research"? Apparently it's not that great or world-changing for the costs involved. It seems that the author is tired of the yearly promise that everything is just a year or two away as long as we keep shovelling more money and resources into the furnace.


Inference isn’t that expensive. A single junior dev costs orders of magnitude more than the amount of inference I use. Companies in growth mode don’t have to make money, it’s a land grab right now. But the expense is largely in the R and D. You can build a rig to run full models for 10-20k right? That’s only a month or two of a junior dev’s time, and after that it’s just electricity. And you could have dozens of devs using the same rig as long as they could timeshare. I don’t see where the economics wouldn’t work, it’s just there’s no use in investing in the hardware until we know where AI is going.

Yeah, you can build a rig to run full models for 10-20k... That's a big reason OpenAI might not make it. The whole article is about LLMs not being a viable business.

It is unprofitable because they keep spending money developing new AI. Inference for existing AI is not unprofitable.

For now.

Unless closed models have a significant advantage, AI inference will be a commodity business - like server hosting.

I'm not sure that closed models will maintain an advantage.


Unreliable tools are utterly exhausting.

> not much worse than a junior dev and 100x faster.

Is there a greater hell than this!?


If the old metric is right, that it is ten times harder to debug code than to write it, having something that writes buggy code 100x faster than you can understand it is a problem.

Especially given that you can ask an LLM to optimise code and, across multiple runs, it cannot tell whether it is improving or degenerating.


At least with a junior dev, I can teach them how to do it better next time. Not so much with generative "AI".

Not totally. But you might be surprised at the things you can do. Cursor has some template-like files where you can basically teach the AI “when we do X, do it this way.” Or you can change the global prompt to add the things it should keep in mind when working with you.

If you actually take the time to tell it “hey, don’t do it this way,” it can definitely do it differently the next time.

On top of that, is anyone training models on their own codebase, and noting to AI which patterns are best practice and which aren’t?

There are a ton of ways to make it better than the baseline Copilot experience.
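
For instance, a minimal sketch of what such a rules file might contain (assuming a project-level `.cursorrules`-style file; the exact filename and format depend on the Cursor version, and the rules below are made up):

    # Hypothetical project rules (e.g. a .cursorrules file)
    - Always use the repo's Result/Either error helpers instead of raising exceptions.
    - New endpoints live in src/api/<resource>.py with a matching test in tests/api/.
    - Prefer the existing date utilities in src/lib/dates.py over ad-hoc date math.
    - Never edit existing migrations; add a new one instead.

It's just plain-language instructions that get prepended to every request, but in practice it's the closest thing to "teaching" the tool your team's conventions.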


> a junior dev and 100x faster.
> Is there a greater hell than this!?

Yes — junior management using LLMs and 100x more cocksure.


That's 100x more bugs to fix. Moreover, increasingly complex models produce bugs that are increasingly hard to spot and fix.

I am of the firm belief that unreliable help is worse than no help at all. LLMs are unreliable help, therefore they are useless to me.

Worse than an intern and 1000x faster?

"It makes lots of mistakes, but at least it makes them quickly!"

LLMs are the "move fast and break things" of AI.

> I’m a little shocked at how much negativity there is around LLMs among developers.

While the timeline is unclear, it seems likely that LLMs will obsolete precisely the skills that developers use to earn their income. I imagine a lot of them feel rather threatened by the rapid rate of progress.

Pointing out that it is already operating at junior dev quality and rapidly improving is unlikely to quiet the discontent.


I use LLMs in coding. There are Junior Devs in my team.

If you think LLMs operate at "junior dev" capacity, you either don't work with junior devs and are just bullshitting your way around here, or you just pick pretty awful junior devs.

LLMs are alright. An okay productivity tool, although their inconsistencies often nullify the productivity gains - by design they often spit out wrong results that look and sound very plausible. A productivity black hole. Their mistakes are sometimes hard to spot, but pervasive.

Beyond that, if you think that all a dev does is spit out code, and that since LLMs can spit out code they can replace devs in some imaginary timeline, you are sorely mistaken. The least part of my work is actually spitting out code, although it is the part I enjoy the most.

I honestly feel way more threatened by economic downturns and the looming threat of recession. The only way LLMs threaten me is by being a wasteful technology that may precipitate a downturn in tech companies, causing more layoffs, and so forth.


The value of developers is not the code they output. It's the mental models they develop of the problem domain and the systems they build. LLMs can output code without developing the mental models.

Code is liability. The knowledge inside developers' heads is the corresponding asset. If you just produce code without the mental models being developed and refined, you're just increasing liability without the counterpart increase in assets.


If you define "junior" based mostly on age, then LLMs aren't yet at the level of a good "junior".

If you base it on ability, then an LLM can be more useful to a good developer than 1 or more less competent "junior" team members (regardless of their age).

Not because it can do all the things like any "junior" can (like make coffee), but because the things it can do on top of what a "junior" can do, more than makes up for it.


>> If you think LLMs operate at "junior dev" capacity you either don't work with junior devs and is just bullshitting your way around here, or you just pick pretty awful junior devs.

I’ve hired lots of junior devs, some of them very capable. I’ve been in this industry for more than 15 years. LLMs operate at junior dev capacity, that’s pretty clear to me at this moment.


I sincerely doubt both your experience and your ability to hire decent devs.

I sincerely doubt your ability to use LLMs well.

I know, it's a highly unpopular opinion among devs. Let's revisit this comment in 5 years...

Yep. There are people who love programming; it's the best part of the work anyhow! And then there are people who come and tell you that whatever you do doesn't matter, and who are more content getting a new app by writing a prompt and deploying possibly buggy code. Two different crowds of people.

I'm in the middle. I enjoy Zed and its predictions, and I utilize R1 to help me reason. I do _not_ ever want to stop programming. And so often somebody less experienced than me shows me "look how Cursor did this with three prompts, can we merge?" - and the solution is just wrong and doesn't solve the hard issues.

For me the biggest issues are the people who want to see the craft of programming gone. But I do enjoy the tooling.


> it seems likely that LLMs will obsolete precisely the skills that developers use to earn their income

I’m not particularly worried. I think it’s obvious that software engineering is definitely an “intelligence complete” problem. Any system that can do software engineering can solve any problem that requires intelligence. So, either my job is safe or I get to live through the fall of almost all white collar disciplines. There’s not a huge middle ground.

Although perhaps this is just the programmer stereotype of thinking that if someone can code, they can do anything.


> Any system that can do software engineering can solve any problem that requires intelligence. So, either my job is safe or I get to live through the fall of almost all white collar disciplines. There's not a huge middle ground.

How about the middle ground where a human using AI replaces you?

The human job is (maybe) safe, but your job?


Of course that is exactly the middle ground that I’m not certain is so big.

Developer productivity has gone up immensely in the last 50 years and the industry is larger than ever.


Um. How are you measuring that productivity?

Any meaningful metric.

Nah. "AI" is just really, really lame and square. People have a visceral reaction to it even when it's actually not that bad.

These types of articles are just catching the next meme wave, which will be hating on and making fun of "AI" of all sorts.


I was thinking the same, but it's not really what the post is about. It acknowledges that there are use cases for LLMs and that devs can benefit.

What it goes into is how overhyped and overvalued these companies are. They've blown through $5bn of compute each in a year and their revenue is abysmal. Microsoft won't report on AI separately, probably because it's abysmal.

I'm positive on LLMs for coding. But I think I have to agree with their assessment. Coding seems like the best area for these tools and what we see now is great. It's probably even worth $10b to the IT industry, maybe, eventually. But they're not paying for it yet, clearly. And I also think it's just not going to have huge significance outside our industry. The people I rub shoulders with outside of work have not mentioned or asked about it once, which is not necessarily meaningful, but it does reveal the limits of the hype too.


I think the usefulness is just very domain specific. If you're writing some types of boilerplate or often-tutorialized code it can spit out something very reasonable. Other types of code, like say in game dev, it stumbles around and never produces anything usable.

But like you said, in a few more years we'll see! It does feel like there's some missing pieces yet to be figured out to truly "reason" and generalize.


> If you're writing some types of boilerplate or often-tutorialized code it can spit out something very reasonable. Other types of code, like say in game dev, it stumbles around and never produces anything usable.

This makes me think of a quirk I discovered recently which is that ChatGPT simply won't generate a picture of a 'full glass of wine'. It generates pictures with all sorts of crazy waves/splashes in the glass but the glass is always half full no matter how you prompt it.

I'm not enough of an expert to make any deductions from this, but I think it hints at what the limitations of the current models are.


For easy things, LLM assist has sped things up a lot for me.

For medium complexity things, I can get them done quickly without manual coding if I have a clear understanding in mind of what the implementation should look like. I supply the requirements, design and strategy and it's fairly easy to "keep things on the rails". The "write a PRD first" hack (https://www.aiagentshub.net/blog/how-to-10x-your-development...) works pretty well. Agent with YOLO mode and terminal access rips, particularly if you have good tests.
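
The shape of that hack, roughly (the prompt wording below is my own illustration, not taken from the linked post):

    1. "Write a PRD for <feature>: requirements, constraints, edge cases, and a test plan. No code yet."
    2. Review and edit the PRD by hand until it matches what you actually want.
    3. "Now implement the PRD step by step. After each step, run the tests and fix failures before moving on."

The point is that you do the thinking up front in step 2, so the agent has rails to run on.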

For tasks where I know the spec of the feature but don't clearly understand how I would design / implement the feature myself, it's hit-and-miss. Mostly miss for me.

I also haven't had much success with niche libraries, have to stick to the most popular library/tool/framework choices or it will quickly get stuck.


Whereas I've been disabling AI assist features because I find them actively disruptive to the development process. When it pops up ghost text suggesting what I should do, it's sometimes right...but it breaks flow. It forces me to read and parse apparently correct code, and decide whether it is correct or just a mirage that is valid but not actually what I'm doing at all.

> I’m a little shocked at how much negativity there is around LLMs among developers

There's a Quentin Tarantino quote where he says there are 2 kinds of film critics. There are those who love movies and there are those that love the movies they love.

A lot of developers really seem to love the technology they love.

These people are where most of the negativity is coming from. And my guess is that the people who are encouraged by LLMs and not negative (mostly) aren't taking time out of their days to write long blog posts or argue about it online.


I use Copilot for autocompleting the boring boilerplate. I like it. I also think LLMs are mostly useless.

It's not the technology, it's the stupid overhype. It really feels like all the HODL bitcoin cultists have finally gotten over their lost apes and found a new technogod.

So many people in these threads are convinced it's about to gain sentience. That's not going to happen. You get the people outing themselves by saying "it does my job better than me!"

If you say something honest and direct like "their output is mediocre and unreliable" or "the RNG text generator is not capable of thinking, you're just Clever Hans-ing yourself" or "if it does your job better than you, that says more about you than about it", you get people clutching their pearls like a Stanford parent whose kid got a D.

arXiv has turned into a paper mill of AI startups uploading marketing hype cosplaying as "research".


Conversely I'm shocked the negativity hasn't graduated to naked hostility among developers. A group that tends to pride itself on clarity of thought entranced by bullshit generators? A group that tends to pride itself on correctness of work cheerfully adding tools to their workflow that provably fuck up in unpredictable ways and that have to be monitored constantly for just such behavior? Why not hire a few junior devs instead if that's your jam, at least you can train a human towards competence.

As a developer I'm seeing less hate than apathy from my colleagues; rather, the hatred I do see is from people trying to push LLMs ON developers.

So it's from middle management levels riding the hype train, and possibly trying to save money and getting bonuses for it at the expense of other people.

Just like when offshoring was in its same point on Gartner's hype curve.

"Everyone has a model, but no one has a business".


No matter how widespread Copilot becomes, it won't make OpenAI profitable, nor will it enable Sam Altman or Jensen Huang to complete their apps.

I don't really get the hate; work is boring, if some tool can make it happen faster and with less effort I'm all for it. I don't hear the line cooks at McDonalds complaining that they have to use a semi-automated grill that beeps at them instead of an open fire.

I disagree. Tools which need to be babysat, the way LLMs do, slow you down rather than speed you up. It's like having to mentor a junior team member, except a human will eventually learn and you can just let him work. LLMs are incapable of learning, so you can't ever leave the phase where they are a drain.

So, in essence, it's now incrementally better than a templating script (except when worse), but Have Faith, it will be Better Soon. TBH, that's the same song that's been on repeat since the Dartmouth Workshop. In 1956. Jam yesterday and jam tomorrow, never any jam today.

If you want to see an LLM tackling unit testing, check https://www.codebeaver.ai

Disclaimer: I’m the developer behind CodeBeaver


Honestly, it has its drawbacks, but I am usually at 50x with a few different agents running side by side. What we need is better GPU competition with tons of RAM.

Doesn't that just scream "bad design" at you? Shouldn't we be aiming for agents that require less GPU? And agents that are good enough that we don't have to shop around for "competing prices" on answers?

I personally think IDEs make for worse programmers vs just using a text editor.

Writing the code isn't the hard part, and wrangling the computer is the part of the work I enjoy most. My problem with AI bros is that they explicitly want to automate all the shit people like to do, such that we can all finally be free to work service jobs.

I happen to value human creativity.


I feel like you missed the first third of this article, which was quite clear that they are not saying there are no use cases. They are saying there doesn't seem to be an economic model that makes sense.

> it’s honestly not much worse than a junior dev and 100x faster. And if the code is bad you throw it away and try again 10 seconds later. The amount of speed up I see as a very experienced dev is astronomical

Personally, I find that waiting for the code to generate, then reviewing the code carefully, then deciding if I need to rewrite it to be more painful, more error prone, and much slower than writing the code correctly.

Especially since this AI junior never learns from its mistakes.

I think it speaks to different approaches to how individuals write code.

> How great is it gonna be in a year or two?

I would bet that it’s about the same (not great code, generally), but the tools fail to generate responses less often and likely would have more context.

Hopefully they become fast enough to run offline or at least feel more instantaneous.


Yeah I'm surprised by all the negativity as well. I'm listening to the post right now (using xtts-v2 finetuned on a voice I like lol). Sounds like these companies are overvalued / over hyped. Maybe they are / some of these companies go the way of myspace, but LLMs are incredibly useful for me.

I'm able to do a so much more using LLMs (Mistral-Large, Qwen2.5 and R1 locally, Claude via API) than without them.

I have to get the IDE set up properly now.


Personally, I've found DeepSeek R1 to be a profoundly good model for thinking through problems across fields.

I had a complex finance situation that I was struggling with, both from a mathematical/taxation perspective and a personal psychological finance hangup. I spent a few good hours talking to it through everything and had a mental breakthrough. To get the same kind of insight, I would have to pay a financial advisor AND a psychologist for several hours.

That all of this was free while someone calls it a "con" seems completely wrong

(I got my CFA cousin to look over the numbers and he agreed with R1's advice, fwiw)


Yeah, I've had similar experiences. I still hesitate if it's a field I don't know too well, of course (never trust an LLM), but R1 has been able to solve things I've been stuck on. And watching its <think></think> process has been insightful. Only issue is that it ties up all my GPUs while I run it.

Hopefully Mistral can copy their technique and give us a 123b reasoning model.


Did you run this locally?

It is difficult to get a man to understand something, when his salary depends on his not understanding it.

On this site, the adage cuts both ways.

It's just status anxiety. Mid engineers go on and on claiming there's literally no value from LLMs even possible in principle, while top tier people are using them as force multipliers.

> top tier people

Who? How? This is not what I've seen where I work. There's a bunch of hubbub and generalized excitement, and lots of talk about what could be done, or what might be done, but not very much actual doing. I must just be a clueless "mid".


Yeah unironically.

Guido van Rossum - "I use it every day. My biggest adjustment with using Copilot was that instead of writing code, my posture shifted to reviewing code." https://www.youtube.com/watch?v=-DVyjdw4t9I

Here's Jeff Dean saying 25% of the characters in new PRs at Google are AI Generated. https://www.dwarkeshpatel.com/p/jeff-dean-and-noam-shazeer

Andrej Karpathy - "I basically can't imagine going back to "unassisted" coding at this point" https://www.reddit.com/r/singularity/comments/1ezssll/andrej...

Andrew Ng - "I run multiple models on my laptop — Mistral, Llama, Zefa. And I use ChatGPT quite often. " https://www.ft.com/content/2dc07f9e-d2a9-4d98-b746-b051f9352...

Simon Willison https://simonwillison.net/2023/Mar/27/ai-enhanced-developmen...

I mean I can keep going. I doubt you would compare yourself to these people.


These kinds of responses are my favorite dark pattern rhetorical device, because you can assert literally _anything_ in this format and almost nobody will refute you, because the cost of refuting bullshit is 100x the cost of producing it.

Anyways, here goes.

1. Guido uses Copilot like I do - as a StackOverflow replacement to write the dumb boilerplate code. A much less flattering quote is "It doesn't save me much thinking, but [it helps] because I'm a poor typist". Also it's literally a minute or two of a three hour podcast.

2. A lot of code is autogenerated lol. Again, it's all the boring boilerplate stuff.

3. The cofounder of OpenAI is a biased source lol

4. He's an AI researcher, of course he runs that stuff.

5. Again, similar to Guido. He's using it for the boilerplate. Nothing wrong with enjoying using it as a toy, as he is here. But he's not doing serious work with it.

There's no virtue in hyping this stuff like a HODL bitcoin cultist.


Look, in the context of the comment I'm actually replying to, your criticisms are just completely misplaced.

>There's a bunch of hubbub and generalized excitement, and lots of talk about what could be done, or what might be done, but not very much actual doing

I am showing that indeed many top people use the tools to make themselves more productive, in direct contradiction to the comment above.


And, as I stated, your own sources say it's a mild improvement at best. Despite what you're insinuating.

Listen: it's a fun toy. Engineers love toys and shiny distractions.

Don't confuse shiny rocks for gold.


> it's a fun toy.

Keep thinking that and don't feel too bad when 21 year old zoomers are 10x more impactful than you are at work.


I believe your examples are - unironically - misleading.

1- he states that the generated code is most likely wrong. He is appreciative of it, though, because he is a very poor typist, so he doesn't have to do that part as much

2- so that's not supporting your argument that the 'top' devs are using it. Besides it doesn't say how it's counted, nor how much time is spent reviewing and correcting it

3- actually okay. But is he using it for production code? Doesn't say

4- he definitely doesn't talk about coding, only brainstorming and writing text.

5- your best one. Still, the use case here is side projects not production

You might still be right, I definitely do not compare myself to these people, but trying to glue some sources together makes a poor argument.

And the subject on hand is more that just using LLMs, it's the role of LLMs in the dev work environment


>You might still be right, I definitely do not compare myself to these people, but trying to glue some sources together makes a poor argument.

It makes a better argument than the bold and plainly wrong claim that no one is using them and it's all just "a bunch of hubbub".


Yeah, you are trying to make a better argument, but picking out-of-context references from random sources actually goes against your point.

None of the 5 sources say that AI code generation is really making them more productive


If my Android (or iPhone) disappeared tomorrow, I would feel like I had time traveled back a century. If Google search was gone, I wouldn't be able to do my job anymore. If the cloud disappeared, I wouldn't be able to build apps anymore. There are no workarounds, unless you feel like going to a library...?

If ChatGPT disappeared tomorrow (or derivatives like Copilot, etc.), I would be mildly inconvenienced. Then I'd go back to reading docs, writing code slightly slower and carry on. In fact, I did this already, several times (Copilot with GPT-3.5, Cursor, Copilot with GPT-4, Zed with Claude, etc.)


I think that's an unfair comparison. If the IBM Simon disappeared in 1994, I'm pretty sure you wouldn't have cared. If search engines disappeared in 1992, you'd have felt the same. Also, (what later became) AWS probably didn't interface much with you in 2003.

It takes some time for technology to mature, usually at least a decade or two. Even once the iPhone was released it took a few years until it became indispensable.


If IBM had been spending hundreds of billions on its phone thing in 1994, though, well, there would probably have been no IBM by the early-noughties when smartphones that people wanted to use started to become practical.

"It may or may not produce something useful, in a few decades" cannot justify the present level of spending; that's just not going to fly. Without concrete results _soon_, the whole thing is in very big trouble.


Also, it takes time for people to forget what it was before so when their current status quo is taken away, they don’t know what to replace it with.

I remember when people commented here that the blockchain was the same as early google search or early aws or early iPhone.

Everyone thinks their new thing is the T-1.


That's always been true. We went through PDAs, 3D TVs, smart cities, cyberspace, but some of those actually did become our iPhones. "Nothing ever happens" is just hindsight/survivorship bias.

That doesn’t mean they were a good use of our energy and hype. A lot of those also didn't get in my way or cost billions.

I spent a moment with my brain frozen up, apparently trying to remember when the first Terminator was supposedly deployed in one of the timelines.

Then I realized the reference was to the "Model T" car. Somehow in my brain the token for "Model" is actually necessary for the correct lookup.


And yet many, many people - including me - were shouting from the rafters that this is nonsense, that Blockchain has no real use case (aside from cryptocurrencies), and that this was all a massive bubble.

Most of those same people - again, including me - are saying the opposite about AI.

The lesson to draw isn't "there's a lot of hype, means there's nothing there", nor is it "there's a lot of hype, means there something there". It's "we need to actually think about the technology we're discussing and make object-level decisions, not meta-level decisions".


But AI's place on the public Internet timeline is more like 1995. If all web search had disappeared in 1995, it would've been a massive loss, despite how primitive search engines were back then.

Reminds me that around 1995, for a couple of years, it was practical to print listings of web sites:

https://www.goodreads.com/book/show/2868341-the-internet-yel...


Chat right now is search in '99. Highly functional.

Did you read the article? The author spends a LONG time going over why it's not early days for LLMs.

Here: https://www.wheresyoured.at/longcon/#:~:text=Also%2C%20uhm%2...


If Google search disappeared, I would still remember the names of several sites like Stack Overflow that are always at the top of the results, and I’d just go directly there.

Maybe the original Yahoo! style curated list of categorized links would actually be more useful for me at this point than Google with all the SEO spam.

That kind of high-quality directory combined with ChatGPT would probably replace Google for me.


> If Google search disappeared, I would still remember the names of several sites like Stack Overflow that are always at the top of the results, and I’d just go directly there.

I kind of agree with this, but Google is still, IMO, the best way to search these sites even when you know they exist because most of them have terrible local search.

Google searches with the site: operator are one of the few ways in which I find Google search to be somewhat useful. It's pretty terrible these days at more generalized search due to their capitulation to SEO.


I think this was true for a long time and so it's burned into a lot of us as conventional wisdom, but have you actually tried lately?

I've made a habit to go directly to the website I want and use its search first. In almost all cases I find what I want, and I get to avoid touching Google.


Does anyone maintain a Yahoo-like directory anymore? I used to love browsing the internet that way, and I feel like it would be a really interesting way to find websites rather than content.

I agree, but I think such a project is just not realistic anymore. If you make it publicly editable (like a wiki) it'll just be full of spam, and if you don't, then you need swathes of human editors. And then you get into endless fruitless debates on exactly which websites should be listed and which should be excluded.

I've had some luck googling "awesome" "<topic>" "github" -- people make lists of projects and links and papers and such and I've found some gems on there when I'm looking for dataviz or csv cli tools or what have you.

Before 2015, I didn't have reliable internet access (as in, I could access it when I wanted to, not that I had access all the time) or electricity. But I managed to learn a lot about computers and programming. Why? Because a book is actually a very dense information repository. And the rest you discover by reading code and manuals. The reason I use the internet today is mostly because no one takes the time to polish their library and document it. Instead you have to wade through issue reports and forums. Compare that with C, where things move much more slowly and you can use man for function names.

The internet is much faster, but you can do stuff locally if someone took care to download the documentation (which I did, for the reasons above).


To play devil's advocate here, it takes time for society to orient around new technologies and make them feel indispensable.

If smartphones had disappeared in 2008, most users would be mildly inconvenienced. They'd go back to using a flip phone and a TomTom, or printing out map directions, and sending emails on their laptop with WiFi. No employer expected them to have a chat app on their phone* or use PagerDuty, no businesses required them to download an app to purchase services. People called taxis on the phone.

Perhaps in 5-10 years people will stop putting any effort into documentation or organizing information, (some companies are already ahead of the curve on this one) and our jobs will become that much harder without an LLM to sift through all the information.

* I'm ignoring the BlackBerry world here which was always pretty niche.


> Perhaps in 5-10 years people will stop putting any effort into documentation or organizing information

The LLM needs the documentation more than I do


LLMs (or other AIs) of the future may well get better than humans at reading badly written documentation, filling in gaps by cross-referencing multiple sources, or learning APIs entirely from reading uncommented source code...

I agree (mostly), but

> If the cloud disappeared, I wouldn't be able to build apps anymore.

Seems rather hyperbolic. Colos and shared hosting were a thing long before the cloud craze, and still continue to be a thing. I figure if nothing else a lot of people would go back to the still relatively low touch environment of uploading PHP or CGI scripts, which honestly seems like a pleasant change these days.


I don't think Google search is useful, as a developer. The top results are all garbage, unrelated, or ads. Going past the first page used to be useful, now it's not. Honestly, it's so bad, that I straight up do not believe you that you use Google to find new information.

DuckDuckGo/Kagi are where good search is at.


Agreed, I don't even bother with Google anymore. The only thing I use Google for is for local results, e.g. finding a restaurant. It's useless for finding actual information on the web.

>If ChatGPT disappeared tomorrow (or derivatives like Copilot, etc.), I would be mildly inconvenienced. Then I'd go back to reading docs, writing code slightly slower and carry on.

Except this is exactly what you would do if Google disappeared and did before Google existed. You're applying different standards.


While that might be true for you, it's not at all true for me. If generic chatbots went away, no big deal. If Windsurf IDE + Sonnet 3.5 went away tomorrow, my prospects would look very different.

The last production code I wrote was over 20 years ago. I don't know React and TypeScript. I recently created a SaaS MVP using Windsurf/Sonnet/React/Refine.dev/Supabase in 8 days. We already have live humans excitedly using the product.

The SaaS product is a recreation of an app that I tried as a startup a few years ago, which never got to even this level of traction and failed. We failed for many classic reasons, but one of the main reasons was that we had no truly technical co-founder, and could only afford an off-shore dev. Iteration took around 24 hours. Using Windsurf, a product shmoe like myself can iterate in 2 minutes.

Of course we will have to get a real React dev on board if we start to get real traction, but the LLM tools allowed us to explore an opportunity that would not have existed without them.

Disclaimer: I happened to use Windsurf, but there are other options like Cursor, which you might have better luck with. I am not on Mac OS, so Cursor was not an option for me.


What remains to be seen is whether your venture (and others like it) will translate into hundreds of billions in new economic value. That was the effect of the iPhone, the cloud, and Google Search (the analogies used in the article). That's the difference between a "new era" and a cool tool that could have (and should have) been built at a university with enough funding to produce it for the public good.

Yeah, I can agree with all of that comment.

My gut feeling is that this will play out like the dot-com bubble. There was definitely a bubble, it burst and many investors lost money, but the Internet did end up eating the world in the long run.


Congrats on the rapid MVP! Is it launched / publicly available? I'm in the same boat with modern front ends. Would like to see what the current LLMs can help with.

Thanks! It is not public yet. It's a b2b web app + browser extension. We currently have users in a friendly company testing it to replace their existing cumbersome Excel + Email/Teams workflow.

If you are in the same boat as me with modern front ends, come on in, the water's fine! I highly recommend using something like Refine.dev (YC S23) + Ant Design, or whatever is appropriate in your case. Giving a tool like Windsurf/Sonnet a much more narrow scope of options than just "React" is a huge win.

Even if you just started from a paid or free template for whatever language and framework you are targeting, it will greatly improve your chances of making something appealing, very quickly, using a tool like Windsurf.


It is sad that searching online is so much more convenient. Between downloading dumps of Stack Exchange sites and the full documentation for the programming languages and libraries I use, it would be easy to have 99% of all the answers saved locally, but it would still be faster and easier to look those same answers up in the cloud.

And generative AI made it even worse, probably worse than any single thing before it.

Now you're being fed incorrect answers by the search engine's built-in LLM; it's impossible to find legit reviews, since they're either sponsored or written by bots; image search is next to useless; and it's impossible to find a recipe you can trust, because you can't tell whether it's legit or written by an LLM and will completely fall apart. As it turns out, the best cookie recipe isn't the average of all known cookie recipes.


The Google search example is interesting: I wouldn't care at all. YouTube or Maps, on the other hand…

I've actually been wondering what platform(s) people would flock to if YouTube were to suddenly disappear. Vimeo? DailyMotion? Or even... PeerTube??...

And how much money per month are you willing to pay for each of these capabilities/services? I can't imagine paying $50/month for any use of AI.

I willingly pay $80/month for a smart phone and its ecosystem. Google search (for general queries) is worth a lot less to me, since it's fairly easily replicated by federating a search across a dozen sites where most answers arise now. So I might pay $5/mo for that, or maybe $20/mo for code queries (to be paid by my employer of course). Internet access for desktop/laptop computing or for media streaming might be worth $40/mo.

So what are the end uses of AI that I'm willing to pay generously for? If GenAI is supposed to revolutionize the infosphere, the question has to be: what service will it provide that can justify its cost -- the $trillion investment in infrastructure that is underway?

I have absolutely no idea what AI's killer app could be. IMO, not even a dedicated secretary/tutor/companion is worth more than $50/mo to the average bloke. And that drip of revenue per user isn't nearly enough to justify the development costs that AI is incurring now.


People lose their phones all the time. No one has a dramatic time travel moment. You open up your laptop, or walk to your desktop. It's not a necessity excepting actual phone calls.

If Google search disappeared and ChatGPT was still around, you could replace Google search with ChatGPT and still do your job.

That's why it's not a completely hype-driven bubble: there is a useful product there.


I think the above poster was comparing search engines to LLMs and just using Google as a generic example.

LLMs very obviously aren't a replacement for live human written information which is findable with a search engine.


For me, Google search (and YouTube) feels like it's been downgraded in recent years, even before the AI craze. Maybe people aren't using SO as much as they used to.

If electricity disappeared we would go back to lighting candles…

And clearly candles are definitely viable products.

If you've been in SV long enough, you've seen multiple hype cycles. AI is the latest one. It doesn't mean AI is a con, there's just a tendency to exaggerate and make hyperbolic claims to sell.

AI is primarily used to write text and code. Actually integrating it into workflows across markets will take time.

A great article that is neither overly pessimistic nor optimistic is Benedict Evans's The AI Summer [1]. He argues that there's a lot of excitement among big corporates but their actual adoption is low so far.

"an LLM by itself is not a product - it’s a technology that can enable a tool or a feature, and it needs to be unbundled or rebundled into new framings, UX and tools to be become useful. That takes even more time"

[1] https://www.ben-evans.com/benedictevans/2024/7/9/the-ai-summ...


In the securities industry, "hype" - i.e. making claims which can not be fulfilled - is known as "fraud" and it lands people in jail. And, even with a ton of regulation in the securities industry, we still get some kind of financial crisis every decade or so. So tech people: why not think ahead and stop making false claims before you (1) cause the next crisis and (2) get slapped with heavy regulations? Stop hyping and think.

>It doesn't mean AI is a con, there's just a tendency to exaggerate and make hyperbolic claims to sell.

"Con", in this context, is short for confidence. A con-man is a confidence man.

"a swindler who exploits the confidence of his victim"

It's a con.


Sure you'll find folks that perhaps literally believe that AI is bigger than fire and the wheel, or whatever. Or business leaders that more cynically try to ride the hype wave with their own customers.

I think the key here is actual usage and adoption. If early adopters (marketers and coders) keep using the new tools for real work over the long term then it's a positive signal.


You've got to distinguish the tech from the companies and salesmen. Like in the dot com bubble the internet was real and important but a lot of the companies were cons.

Likewise here AI is real and important but a lot of the companies are cons.


> you've seen multiple hype cycles. AI is the latest one. It doesn't mean AI is a con, there's just a tendency to exaggerate and make hyperbolic claims to sell.

The term "AI Winter" dates back from the 80s, and that should tell us something.

At every cycle, we get insane hype (remember when "expert systems" would replace doctors?) and a lot of investment. Reality fails to catch up to the hype, investors get spooked. Nobody talks about that flavor of "AI" for a few years, even though we usually get new and useful tools out of it.


Not remembering when expert systems were the thing myself I looked it up:

>Expert systems were formally introduced around 1965

and hyped in the 1970s, launched in the 1980s and found to be a bit rubbish https://en.wikipedia.org/wiki/Expert_system


I agree. This is a helpful take. The book Prediction Machines [1] is a longer, longer-view explanation of what Ben Evans shares in the post you link to.

[1] https://www.predictionmachines.ai/


The author betrays himself early with this line:

> a cynical bubble inflated by OpenAI CEO Sam Altman built to sell into an economy run by people that have no concept of labor other than their desperation to exploit or replace it.

He brings up the concept of labor and applies a moral judgement about "replacing" and "exploiting" labor.

And then he throws the kitchen sink at the technology. People use it, sure, but only because journalists write about it. How it's expensive to train. Throw in a bunch of expletives and call it a hot take.

It's the equivalent of a vegan trying to convince you that eating meat is morally wrong, and will give you cancer, and make you fat, and give you ED, and ...

This doesn't work primarily due to the fact that most people reading this got real value from an LLM. And I'm sure the author did as well. Claiming otherwise is dishonest. So what is his problem?


How are you able to make the inference that "most people reading this got real value from an LLM. And I'm sure the author did as well." How are you so sure of that?

Speaking for myself, I've never gotten any real value from an LLM and their disappearance would not affect me in the slightest.

It sounds more like you were upset about his assertion because YOU derive value from an LLM, and are projecting that as some sort of dishonesty on the author's part.

Also, it was his intention to throw "the kitchen sink at the technology" as a means of showing its lack of value. In the same way a vegan would do exactly as you mention to show all the arguments AGAINST eating meat. It is meant to strengthen the intended argument through overwhelming evidence.


ChatGPT was the 6th most visited site in the world (and climbing) in January. Cursor is the fastest-growing SaaS of all time.

So yeah, most people get real value from LLMs. It's pretty plain to see, for anyone actually interested in seeing it.


As the article mentions, the number of users is wholly irrelevant to the discussion of how much positive value a tool brings to society (also factoring in the costs and negative impacts of the tool). This is a weak argument.

How much value a user gets from a tool is the user's prerogative to decide.

The user above is talking specifically about how much value users are getting out of LLMs. The number of users who consistently return is in fact a very good argument for the plain real world value being generated by LLMs.


Incorrect, I'm afraid. The "plain real world value" does not necessarily have any correlation to the number of users, so your argument again fails to hold water.

Simply consider the users that use generative AI in order to perform some unimportant work (as I hope is the case, for generative AI cannot produce anything of novelty, by design). If such work held no value to begin with, then through simple deduction you can conclude that the generative AI contributed nothing of value.

The only outcome worthy of consideration was the time reclaimed by the user not performing such mundane tasks (so that they can move on to perform... more mundane work?), in which case one must question the larger process at hand.

This is just one counterexample to your underlying premise for your argument - that you know nothing of how the users use such tools, or whether their use even brings anything of value to the real world.

The point being, the number of users is a number with no meaning. It is a number used to inflate the faux excitement surrounding AI and nothing more. "Falling for the hype", if you like.

Extrapolating your faulty logic, I could say a pornography website is of extreme value to humanity. After all, literal billions of people visit such websites very frequently. This must bring real world value, no? Or cigarettes? Or TikTok?

If your definition of value is derived from self-indulgence, e.g., how much time one can spend away from work with AI, or how much one can smoke because it feels nice in the moment, i.e., the hedonistic evaluation, then by all means, spoil yourself. After all, I don't stand to revoke those liberties anyway.

But understand that this definition of value is not necessarily the same as "plain real world value", and as such, the "number of users" is no guarantee.


>Simply consider the users that use generative AI in order to perform some unimportant work

Says who ? I certainly don't use it for unimportant work.

>(as I hope is the case, for generative AI cannot produce anything of novelty, by design)

Another Unfounded Assertion

>then through simple deduction you can conclude that the generative AI contributed nothing of value.

Nonsensical. 'Unimportant work' that people keep doing is work that needs to be done. Getting it done is providing value.

>I could say a pornography website is of extreme value to humanity.

Pornography provides a lot of value yes.


You have once again conflated individualistic values with real world values, and shown bias to personal anecdotes. I am afraid you are simply incapable of understanding so there is no point to discussing this further.

>You have once again conflated individualistic values with real world values

I'm not. It's just nonsensical to think there is some 'real world value' independent of the people said product is targeting in the first place. You don't get to tell people what provides them value.

And by the way, this person I replied to in the first place is specifically commenting on this 'individualistic value' so I have no idea why you thought your comment was relevant if you think such distinctions exist.

If anything, you're the one who seems bent on personal anecdotes, given your assumption that LLMs are used for 'unimportant work'. I did not initially bring anecdotes or assumptions into the matter at all.

>I am afraid you are simply incapable of understanding

Whatever floats your boat I guess

>there is no point to discussing this further.

Sure


Very few people get real value from an LLM. They're useful for minor grammatical checks, and idiot-level summaries. Maybe for entry-level coding. Outside of that, they're basically just gimmicks. You can't trust the output at all; literally everything an LLM outputs must be verified before it can be used for anything important, so you end up doing all of the work you thought you would have avoided by using an LLM.

> literally everything an LLM outputs must be verified

Until you prove P=NP I’ll take that as a win


Honestly cannot understand this take. I put literally everything I write for work into an LLM. 80% of the time it has good suggestions to improve it. It is by far the best grammar checker I have found. Obviously don't ask it for facts, but it is incredible with language. Claude is far more useful to me than Google search ever was.

> So what is his problem?

My guess: He's just posting a hot-take to farm for engagement.

I'm not saying he's wrong about everything. I'm just pointing out that he has a small incentive to be engaging.


> When you put aside the hype and anecdotes, generative AI has languished in the same place, even in my kindest estimations, for several months, though it's really been years. The one "big thing" that they've been able to do is to use "reasoning" to make the Large Language Models "think" [...]

This is missing the most interesting changes in generative AI space over the last 18 months:

- Multi-modal: LLMs can consume images, audio and (to an extent) video now. This is a huge improvement on the text-only models of 2023 - it opens up so many new applications for this tech. I use both image and audio models (ChatGPT Advanced Voice) on a daily basis.

- Context lengths. GPT-4 could handle 8,000 tokens. Today's leading models are almost all 100,000+ and the largest handle 1 or 2 million tokens. Again, this makes them far more useful.

- Cost. The good models today are 100x cheaper than the GPT-3 era models and massively more capable.
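To make the multi-modal point above concrete, here's a minimal sketch of sending an image alongside a question. It assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment; the model name and image URL are placeholders:

    # Minimal sketch: one text question plus one image in a single chat request.
    # Assumes the `openai` Python package; model name and URL are placeholders.
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this photo and transcribe any text you can see."},
                {"type": "image_url", "image_url": {"url": "https://example.com/sign.jpg"}},
            ],
        }],
    )
    print(response.choices[0].message.content)

The same pattern - text plus one or more images in a single message - is what the manual-reading and sign-translating workflows are built on.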


The "iPhone moment" gets used a lot, but maybe it's more analogous to the early internet: we have the basics, but we're still learning what we can do with this new protocol and building the infrastructure around it to be truly useful. And as you've pointed out, our "bandwidth" is increasing exponentially at the same time.

If nothing else, my workflows as a software developer have changed significantly in these past two years with just what's available today, and there is so much work going into making that workflow far more productive.


But if this is like the internet, it’s not refuting the idea that this is a huge bubble. The internet did have a massive investment bubble.

And I’d argue it took decades to actually achieve some of the things we were promised in the early days of the internet. Some have still not come to fruition (the tech behind end to end encrypted emails was developed decades ago, yet email as most people use it is still ridiculously primitive and janky)


Yes. But this article argues two things at once - that the technology is itself not useful and not used, and that this won't change in the future. And it also argues that this is a bad investment, at least in the form of OpenAI.

I have very little idea of the second - it's totally possible OpenAI is a bad investment. I think this article is massively wrong about the first part though - this is an incredible technology, and this should be evident to everyone (I'm a little shocked we're still having an argument of the form "I'm a world-class developer and this increases my productivity" vs. "no, you're wrong!" on the other).


While there was certainly a software bubble during the early internet, it still took obscene amounts of investment in brand-new technologies in the late '90s. Entire datacenters full of hardware modems. In fact, 'datacenters' had to become a thing.

Then came DSL, then came cable, then came fiber. Countless billions of dollars invested into all these different systems.

This AI stuff is something else. Lots of hardware investment, sure, but also lots of software investment. It is becoming so good and so cheap that it's showing up in every single search engine result.

Anyway, my point is, while there may have been aspects of the early internet being a bubble, there were real dollars chasing real utility, and I think AI is quite similar in that regard.


Can it be an investment bubble but also a hugely promising technology? The FOMO-frothing herd will over-invest in whatever is new and shiny, regardless of its merits?

I recently compared the buildout of data centers for AI to the railway bubbles of the 1800s.

Nobody will deny the importance of railways to the Industrial Revolution, but they also lost a lot of people a lot of money: https://simonwillison.net/2024/Dec/31/llms-in-2024/#the-envi...


> If nothing else, my workflows as a software developer have changed significantly in these past two years with just what's available today, and there is so much work going into making that workflow far more productive.

this is exactly the problem

The more productivity AI brings to workers, the fewer employees employers need to hire, the less salary employers need to pay, and the less money workers have for consumption.

capitalist mode of production


What's your opinion on the productivity boost open source libraries have brought to developers?

Did all of that free code reduce demand for developers? If not, why not?


> Did all of that free code reduce demand for developers?

the answer is yes, but in the meantime the expansion of the industry offset the surplus of developers.


I think the answer is that open source made developers more valuable because they could build you a whole lot more functionality for the same amount of effort thanks to not having to constantly reinvent every wheel that they needed.

More effective developers results in more demand for custom software, resulting in more jobs for developers.

My hope is that AI-assisted programming will have similar results.


How likely do you think this is Simon?

I don't really know myself, but I think there's a decent chance that most developer jobs will actually disappear. Your argument isn't wrong, but we're nearing (though still far from) the state where all productive tech work can be handled by LLMs. Once they can effectively and correctly fix bugs and add new well-defined features to a real codebase, things start to look very different for most developers.


Less productivity seems like a worse path.

it depends on how you define better or worse

for humanity, the increase in productivity is progress

i'm not saying it's bad, i'm saying it has consequences


> LLMs can consume images,

Not very well, in my experience. Last time I checked, ChatGPT/DALL-E couldn't understand its own output well enough to know that what it had drawn was incorrect. Nor could it correct mistakes that were pointed out to it.

For example, when I asked it to draw an image of a bike with rim brakes it could not, nor could it "see" what was wrong with the brakes it had drawn. For all intents and purposes it was just remixing the images it had been trained on, without much understanding.


Generating images and consuming images are very different challenges, which for most models use entirely different systems (ChatGPT constructs prompts to DALL-E for example: https://simonwillison.net/2023/Oct/26/add-a-walrus/ )

Evaluating vision LLMs on their ability to improve their own generation of images doesn't make sense to me. That's why I enjoy torturing new models with my pelican on a bicycle SVG benchmark!


Cost as in, cost to you? Or cost to serve?

If the cost-to-serve is subsidized by VC money, they aren't getting cheaper, they're just leading you on.


I've heard from insiders that AWS Nova and Google Gemini - both incredibly cheap - are still charging more for inference than they spend on the server costs to run a query. Since those are among the cheapest models I expect this is true of OpenAI and Anthropic as well.

The subsidies are going to the training costs. I don't know if any model is running at a profit once training/research costs are included.


As a society we choose to let the excess wealth pile up into the hands of people that are investing to bring about their own utopia.

If we're stretching, we can talk about opportunity cost. But the people spending and creating the "bubble" don't have better opportunities. They're not nations that see a ROI on things like transportation infrastructure or literacy.

So unless the discussion is taken more broadly and higher taxes are on the table, there really isn't a cost or subsidy imo.


The cost to serve.

> Cost as in, cost to you? Or cost to serve?

This. IIUC to serve an LLM is to perform an O(n^2) computation on the model weights for every single character of user input. These models are 40+GB so that means I need to provision about 40GB RAM per concurrent user and perform hundreds of TB worth of computations per query.

How much would I have to charge for this? Are there any products where the users would actually get enough value out of it to pay what it costs?

Compare to the cost of a user session in a normal database backed web app. Even if that session fans out thousands of backend RPCs across a hundred services, each of those calls executes in milliseconds and requires only a fraction of the LLM's RAM. So I can support thousands of concurrent users per node instead of one.


> IIUC to serve an LLM is to perform an O(n^2) computation on the model weights for every single character of user input.

The computations are not O(n^2) in terms of model weights (parameters), but linear. If it were quadratic, the number would be ludicrously large. Like, "it'll take thousands of years to process a single token" large.

(The classic transformers are quadratic on the context length, but that's a much smaller number. And it seems pretty obvious from the increases in context lengths that this is no longer the case in frontier models.)

> These models are 40+GB so that means I need to provision about 40GB RAM per concurrent user

The parameters are static, not mutated during the query. That memory can be shared between the concurrent users. The non-shared per-query memory usage is vastly smaller.

> How much would I have to charge for this?

Empirically, as little as 0.00001 cents per token.

For context, the Bing search API costs 2.5 cents per query.
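To make that concrete, here's a rough back-of-the-envelope sketch; the parameter count and throughput figures below are illustrative assumptions, not measurements:

    # Rough arithmetic: per-token compute is roughly linear in parameter count
    # (~2 FLOPs per parameter per generated token), not quadratic.
    params = 20e9          # assume ~20B parameters (a 40 GB model at 2 bytes/param)
    flops_per_sec = 1e15   # assume ~1 PFLOP/s of usable accelerator throughput

    linear = 2 * params                        # ~4e10 FLOPs per token
    print(linear / flops_per_sec)              # ~4e-5 seconds per token: fast

    quadratic = params ** 2                    # ~4e20 FLOPs per token if it were quadratic
    print(quadratic / flops_per_sec / 86400)   # several days per token: unusable

    # And the price point quoted above: at 0.00001 cents per token,
    # a 1,000-token answer costs ~0.01 cents, versus 2.5 cents for one Bing API query.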


Ah got it, that's more sensible. So is anyone making money with these things yet?

The efficiency gains over the past 18 months have been incredible. Turns out there was a lot of low hanging fruit to make these things faster, cheaper and more resource efficient. https://simonwillison.net/2024/Dec/31/llms-in-2024/#llm-pric...

Interesting. There's obviously been a precipitous drop in the sticker price, but has there really been a concomitant efficiency increase? It's hard to believe the sticker price these companies are charging has anything to do with reality given how massively they're subsidized (free Azure compute, billions upon billions in cash, etc). Is this efficiency trend real? Do you know of any data demonstrating it?

I have personal anecdotal evidence that they're getting more efficient: I've had the same 64GB M2 laptop for three years now. Back in March 2023 it could just about run LLaMA 1, a rubbish model. Today I'm running Mistral Small 3 on the same hardware and it's giving me a March-2023-GPT-4-era experience and using just 12GB of RAM.

People who I trust in this space have consistently and credibly talked about these constant efficiency gains. I don't think this is a case of selling compute for less than it costs to run.
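For anyone curious what running a model like that locally looks like, here's a minimal sketch assuming the llama-cpp-python bindings and a quantized GGUF file downloaded separately; the path and settings are placeholders:

    # Minimal sketch: local inference with llama-cpp-python on a quantized model.
    # Assumes `pip install llama-cpp-python`; the model path is a placeholder.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/mistral-small-q4.gguf",
        n_ctx=8192,        # context window to allocate
        n_gpu_layers=-1,   # offload everything to the GPU / Apple Silicon if possible
    )
    result = llm(
        "Explain the difference between a process and a thread in two sentences.",
        max_tokens=128,
    )
    print(result["choices"][0]["text"])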


People are comparing the current rush to the investments made in the early days of the internet while (purposely?) forgetting how expensive access to it was back then. I'm not saying that AI companies should make a profit today, but I don't see or hear that AI usage is becoming essential in any way or form.

Yeah that's the big problem. The Internet (e-commerce, specifically) is an obviously good idea. Technologies which facilitate it are profitable because they participate in an ecosystem which is self sustaining. Brick and mortar businesses have something to gain by investing in an online presence. As far as I can tell, there's nothing similar with AI. The speech to text technology in my phone that I'm using right now to write this post is cool but it's not a killer app in the same way that an online shopping cart is.

There's a little grain of salt with respect to context lengths: the number has grown, but performance seems to degrade with larger context windows.

Anecdote:

I often front-load a bunch of package.jsons from a monorepo when making tooling / CI focused changes. Even 10 or 20k tokens in, Claude says things like "we should look at the contents of somepackage/package.json to check the specifics of the `dev` script."

But it's already in the context window! Given the reminder (not reloading it, just saying "it's in there"), Claude makes the inference it needs for the immediate problem.

This seems to approximate a 'working memory' for the assistant or models themselves. Curious whether the model is imposing this on the assistant as part of its schema for simulating a thoughtful (but fallible) agent, or if the model itself has the limitation.


> This is missing the most interesting changes in generative AI space over the last 18 months

I agree, though personally I'm liking the "big thing" as well. R1 is able to one-shot a lot of work for me, churning away in the background while I do other things.

> Multi-modal

IMO this is still early days and less reliable. What are some of your daily use cases?

> Context lengths

This is the biggest thing IMO (Models remaining coherent at > 32k contexts)

And whatever improvements have caused models like Qwen2.5 to be able to write valid code reliably vs the GPT-4 and earlier days.

There are also a whole lot of useful smaller niche projects on HF, like extracting vocals/drums/piano from music, etc.


Multi-modal audio is great. I talk to ChatGPT when I'm cooking or walking the dog.

For images I use it for things like helping draft initial alt text for images, extracting tables from screenshots, translating photos of signs in languages I don't speak - and then really fun stuff like "invent a recipe to recreate this plate of food" or "my CSS renders like this, what should I change?" or "How do you think I turn on this oven?" (in an Airbnb).

I've recently started using the share-screen feature provided for Gemini by https://aistudio.google.com/live when I'm reading academic papers and I want help understanding the math. I can say "What does this symbol with the squiggle above it mean?" out loud and Gemini will explain it for me - works really well.


Multi-modal was the absolute game-changer.

Just last night I was digging around in my basement, pulling apart my furnace, showing pics of the inside of it, having GPT explain how it works and what I needed to do to fix it.


I would never trust an LLM to do this unless it was pointing me to pages/sections in a real manual or reputable source I could reference.

I admire your optimism that good manuals and reputable sources exist for the average furnace in the average basement.

If there are no reputable sources to point to, then where exactly is GPT deriving its answer from? And how can we be assured GPT is correct about the furnace in question?

I mean.. I fed it all the photos of the unit and every diagram and instruction panels from the thing. I was confident in the information it was giving me about what parts did what and where to look and what to look for. You have to evaluate its output, certainly.

Getting it to fix a mower now. It's surfacing some good YouTube vids.


I use it like that all the time. There's so much information in the world which assumes you have a certain level of understanding already - you can decipher the jargon terms it uses, you can fill in the blanks when it doesn't provide enough detail.

I don't have 100% of the "common sense" knowledge about every field, but good LLMs probably have ~80% of that "common sense" baked in. Which makes them better at interpreting incomplete information than I am.

A couple of examples: a post on some investment forum mentions DCA. A cooking recipe tells me "boil the pasta until done".

I absolutely buy that feeding in a few photos of dusty half-complete manual pages found near my water heater would provide enough context for it to answer questions usefully.


I would accept a link to a YouTube video with a timestamp. Just something connected to the real world.

Oh right, yeah I've done things like this (phone calls to ChatGPT) or the openwebui Whisper -> LLM -> TTS setup. I thought there might be something more than this by now

This is one of the most tilted pieces I've ever read. For months Ed has predicted the AI bubble will burst "any day now", frequently citing AI companies' revenue as a sign the product is not viable and the valuations are too high. The valuations seem to be primarily based on R&D progress rather than on a theory that widespread adoption of the existing product will experience an uptick. The current landscape, IMHO, should be viewed as an R&D race amongst private actors.

> The valuations seem to be primarily based on the R&D progress

There hasn't been much R&D progress, though. Sure, as another commenter pointed out, context lengths have gotten longer and chat models can interpret images now, but the industry figureheads have been pushing agents, and we're not much closer to those than we were two years ago when GPT-4 came out. Current models simply are not consistent enough to do the kind of agentic stuff that AI valuations are predicated upon, nor is there any sign that a significantly smarter GPT-5 is just around the corner. Multi-modal chat is cute, but OpenAI is burning money. They're all burning money, and they don't have a product. They imply and imply that there's something big on the horizon, but it's been years, and there just isn't a killer app yet. Their platform isn't good enough, and it's not improving in the ways it would need to in order for Godot to arrive and for agents to be feasible.


Recent results are showing exponential improvement in reasoning and dramatic decreases in the time and cost to train models. o3 now ranks 50th on Codeforces, according to OpenAI staff. Are you aware of all of this and still say R&D hasn't progressed?

You can invest in building bigger and more complicated pipe structures, but until you show the field that is supposed to be irrigated, you can't say you're disrupting farming business.

In the context of Moore's law exponential growth was measured in the number of transistors per integrated circuit. This seems vigorous and straightforward.

With AI the improvements have certainly been impressive but it isn't straightforward how you can define "reasoning" to measure whether or not the reasoning is exponentially "improving".


I think you can simultaneously think that there is some real value being made with LLMs and also look at OpenAI losing $5B a year or thereabouts and really wonder how they're not going to run out of money.

That said, I'm learning a new SDK and I've moved 500-1k searches a month from Kagi and Google to LLMs.


By this I mean it’s a bet on what R&D might yield, current progress being some kind of signal. No one has certainty here. It’s an emergent technology and no one knows for certain how far it can be pushed.

The writer is a journalist, runs a media firm and podcasts. His job is to get attention. You get attention by being outrageous. The “AI will kill us all” take is covered by too many people, so he’s taking the “AI is doomed” path. No one is going to engage with a reasonable middle-of-the-road article. He’s got no credibility on this subject, but he knows how to turn attention into dollars. Everyone here keeps falling for it.

He's a good writer. Even if I don't agree with this opinion on it, I still enjoy reading it. Color me "fell" then?

> a theory that widespread adoption of the existing product will experience an uptick

because the profits from this cannot cover the investment in this industry

adoption of the iPhone/smartphone/internet brought new products, both for production and for consumption

but generative AI is totally different from the iPhone; consumers may be willing to buy a new AI-powered iPhone __just like they bought new iPhones every 2 years before__

> The current landscape imho should be viewed as an R&D race amongst private actors

in fact, it's a CapEx race; you don't need to R&D anything (of course, you must pretend you do)

that's why it's a con

> The AI Bubble will burst “any day now”

"The canary in the coal mine to look at is when Satya Nadella or Sundar or Zuckerberg say, ‘You know that $80bn of capex I said I was going to do? I think I’m going to cut that by two-thirds.’ That’s what you need to look for."

that's the day


I can’t help but be reminded of Greenspan’s remarks on the housing market in 2006 while reading this comment:

> While he was chairman of the central bank through January 2006, Greenspan always denied there was a bubble in the nationwide U.S. real estate market, saying only that a certain number of metropolitan real estate markets could see declines in home values because of a localized run-up in prices. That view of any real estate bubble as merely a local phenomenon is a condition he termed as "froth" in congressional testimony in 2005, as well as in subsequent comments.

A failure of imagination I suppose.

He's missing a critical insight, which is that this is being treated as an arms race now. The money will keep coming for a lot longer than it should.

> This is one of the most tilted pieces I’ve ever read.

He comes across as just a ludicrously unpleasant, spite-filled person.

> I'm fucking tired of having to write this sentence.

> I am so very bored of having this conversation

> I don't care about this number!

> Shut the fuck up!

> This isn't the early days of shit.

> Didn't we just talk about this? Fine, fine.

> $3.25 billion a quarter is absolutely pathetic.

> This isn’t real business! Sorry!

> He said in one of his stupid and boring blogs that

> This man is full of shit! Hey, tech media people reading this — your readers hate this shit! Stop printing it! Stop it!

> It's here where I'm going to choose to scream.

> Dario Amodei — much like Sam Altman — is a liar, a crook, a carnival barker and a charlatan, and the things he promises are equal parts ridiculous and offensive.

> Why are we humoring these oafs?

> Despite Newton's fawning praise

> Nobody talks like this! This isn’t how human beings sound! I don’t like reading it!

> Ewww.

> I'm sorry, I know I sound like a hater, and perhaps I am, but this shit doesn't impress me even a little.

> I know, I know, I'm a hater, I'm a pessimist, a cynic, but I need you to fucking listen to me: everything I am describing is unfathomably dangerous

> expensive, stupid, irksome, quasi-useless new product

> I know this has been a rant-filled newsletter, but I'm so tired of being told to be excited about this warmed-up dogshit.

> I refuse to sit here and pretend that any of this matters.

> I'm tired of the delusion. I'm tired of being forced to take these men seriously.

When I read this kind of thing, it’s very apparent that this is being driven entirely by spite not insight. He’s just so angry about everything. There are 57 exclamation marks in this article!


Indignation and fulmination are his shtick and get him shared across social media and the like.

I know! I Loved it!

There’s substance under the brashness, though. He’s just upset that his reason is contradicting what everyone around him is saying, struggling to cope with the dissonance. Like being gaslighted. It’s a natural reaction, but I agree, annoying to read.

He hasn't said anything about when the bubble will burst. Just says that it is a bubble and it will inevitably burst.

Spotting bubbles and predicting they will burst at some point is not a particularly useful skill. Housing in Amsterdam was in a bubble for 37 years in the 1700s; identifying the bubble early on would have been completely pointless.

So what should one do when one notices a bubble?

If I had a reliable / repeatable answer to that question, my net worth would be a very large multiple of what it currently is!

And a Nobel prize in economics to your name!

Try not to be caught up in it, even if you feel a lot of FOMO as it lasts longer than you expect.

Sure, but predicting when an economic bubble will burst with any accuracy is virtually impossible.

Directionally correct.

GenAI is - imo - an assistant. Copilot effectively does templating.

I can have ChatGPT read an email and check it for tone.

Claude can comment on camera kit.

Claude does a very nice image recognition for obscure things.

What I have become persuaded of is that the /completions API is simply not much more than +10% or a low key helper.

I do not need a dumber-than-intern agent going ape on my codebase at speed, which is, approximately, what the codegen tools seem to do.

I saw a self driving car startup using a GPT neural network to recognize images during driving. I would assess that class of use as plausibly very promising.

I would also hazard that Shirky's BS-jobs thesis is being proven true, because if a hallucinating AI can do it...

Anyway.

I don't think the fundamentals justify the spend. I think there's too much vitriol, but there's also too much hype, by a country mile.


> I can have ChatGPT read an email and check it for tone.

Maybe I'm crazy, but this alone is a trillion-dollar-market-cap industry IMO. MSFT is worth $3 trillion off the back of similar products. If LLMs are seen as indispensable by every office worker in the country, as I think they are, and every employee has a subscription for $20 a month, we're looking at many billions in revenue.


20-60 million office workers in the US * $20/month = $5-15 billion a year; but for a specialist AI, companies can charge more: $200/month * 10 million people = $24 billion a year.

> the /completions API

> I do not need a dumber-than-intern agent

two tells that you have not updated since 2023


Ed occasionally makes good points, but he's very very angry at Big Tech, and his anger often gets in the way of his message. Reading his latest rant reminds me of Karl Denninger railing against Google around the time of their IPO, claiming they would never make enough money to justify an $85 share price (a $1000 investment then would be worth around $375,000 today).

I think there's the same logical flaw of looking at how things are at the start - so-so - rather than how they may be in 20 years: Google getting an advertising cut of most of the world's commerce, AI replacing/doubling the ~$100tn/yr labour market.

Fortunately his message is still big enough to get through.

Maybe OpenAI can become an advertising company?

Maybe. Did you foresee Google becoming a massively profitable advertising company with a search engine attached in 2004? I certainly didn't.

By 2004 AdWords was already like 90% of their revenue. If anything, with their cloud business they are less of an advertising company now.

Google solved a real problem. They indexed the web and made search work, and they did it very cheaply. So cheaply, in fact, that they could give their service away to users and monetize it with ads. LLMs are not like this. They're both extremely expensive to run and they don't do anything truly valuable--there's no killer app. So how exactly is OpenAI or their ilk (or for that matter the rest of us) supposed to use these things to make money?

This is the only question, and the fact it's still an open question just screams "hype bubble". My bet is this AI stuff goes the way of the NFT.


I'd take that bet. Google offers a very expensive service for free, but is able to monetize it with ads. Sometimes connecting users to companies is what users actually want. But Google has this problem that since the service is free, its users feel entitled to everything for free. They can't just go and charge people what it costs to run a Google search.

OpenAI doesn't have this problem. ChatGPT has a free level to get you hooked, but it's restricted. So a lot of users pay them $20 or $200 or some other amount per month to use their service. So how OpenAI makes money is by selling access to their service. What you do with it is up to you, but their value proposition is simple. Pay us to get more/better access to our service.

How much it costs them to operate the service is a secret known only to them. There are a lot of very very educated guesses, but they're just guesses. After the VC money runs out they'll have to charge more than it costs to provide the service to stay afloat, and then we'll see. $20/month for ChatGPT plus is the $1 Uber that got people hooked. There's already a $200/month tier.

Whether OpenAI, specifically, will be standing in 20 years, only time will tell. But by this point it should be obvious that there's something to this LLM thing. Even if the product doesn't get any better than it is today, it'll still take 5-10 years for its effects to reverberate through society.

The killer app is LLM-accelerated programming. Sure, it doesn't work for all domains and it can't do everything, but even if the only thing it's good for is creating JavaScript react CRUD apps, well, there are a lot of those out there, and they're not actually limited to that. And since tool use means they can generate code and compile it and test that it works, it's possible to generate datasets for other languages and libraries, the only question is which ones is it worth it for.

It might not help at all in your line of work, but a friend who does contracting is able to use LLMs to cut the time it takes him to do a specific kind of job in half, if not more, enabling him to take on twice as many clients and make more money. For him it would still be worth it even at 100x the current price; thankfully, competition means it'll take a while before it's that expensive.


A huge part of our problems stems from the fact that it's possible to make "companies" whose business model is built on losing money hand over fist until they've brainwashed everyone into thinking their "product" is good. In a sane world these companies would fail and AI would continue to develop through small failures and small successes over a period of years or decades. Instead we get a firehose of nonsense just because a small number of wealthy people are willing to gamble.

On the plus side, if all the AI companies collapse there will be a lot of spare hardware.

Open source projects would have a lot more compute to work with.


That would be nice, but I'm more worried they won't collapse because they'll succeed at shoving their snake oil down the throats of enough big players to ensure their survival.

This is a great point. Now I have extra reason to cheer on the bubble bursting.

This is my thought too: sure, a lot of VCs will lose a lot of money as they write things off.

But it's not like people are going to throw out all the Nvidia hardware they bought.

And there are AI applications I can think of that would be viable at a 100x cheaper price.


> So, what exactly has generative AI actually done? Where are the products?

The product is ChatGPT, actually.

If LLMs are a bubble, then you should expect most of OpenAI's revenue to come from its API (which is used by startups that have raised money to do "magic AI stuff"; the bubble would pop when investors stop giving them money). But according to https://futuresearch.ai/openai-revenue-report, revenue from the API accounts for only 15%, the other 85% coming from the different subscription offers, including 55% from ChatGPT Plus subscriptions -- that is, _direct consumers_.

This doesn't prove that it isn't a bubble (the consumers could realize it's useless and then leave some time later), but it makes it less likely IMO.


What a clown analysis. Sorry, that is not worth the bits it is written in. OpenAI is a privately owned company and does not publish its financials. The sources your link brags about are pathetic beyond belief,

> This report draws on a variety of sources, including those not easily found by search engines, e.g.:

> * Personal anecdotes of pricing info from sales calls

> * A blog that used DNS records to infer which Fortune 500 companies pay for ChatGPT Enterprise

> * Transcripts of OpenAI exec interviews

> Just as important are the data points FutureSearch rejected when they didn’t corroborate more trustworthy sources. The core assumption made here is the $3.4 billion in ARR that Bloomberg and TheInformation reported Sam Altman said in a staff meeting.


> And even then, we still don't have a killer app! There is no product that everybody loves, and there is no iPhone moment!

I would strongly argue that coding assistants are AI’s first killer app. Copilot, Cursor, Windsurf etc.


Say you reduced their revenue to only that application. Would it be sustainable? Would it be worth the billions upon billions of dollars that have been shoveled at it? Would it add more than the billions upon billions of dollars in the end?

By your logic I could claim a quantum computer with qubits on the scale of the mass of the sun is a killer app for doing RSA encryption breaking. And I would be making an equally useless statement.


This is moving the goalpost on what "killer app" means. Code assistants are a compelling use of the tech that has quickly shown real-world value, which is the point I'm trying to make here.

Whether the companies that are leading the market today will end up being the ones who capture that value is anyone's bet.


I fail to see how a technology that is too expensive to maintain can have a killer app.

It’s only going to get cheaper over time. It’s already cheap enough that if these services disappeared overnight I’d switch to an open source alternative with a local model. The industry needed VC backing to pay the fixed cost of the research, but the cost of running inference is not insane compared to the volume it provides.

I think the argument this article makes though is that generative AI isn't generating enough value to justify the sky high valuations and investment. Sure, if all of these services disappeared overnight, the remaining users could self host or run a model locally. I feel like that speaks to the lack of value any one company provides though?

Maybe this era of AI will be remembered as a wealth transfer from VCs back to everyday consumers lol


Only insofar as $1 Uber rides were a transfer of VC funding to everyday consumers. When the funding dries up and there's a need for revenue, then we'll see what they charge. Hopefully for them, they've become inextricable from our lives, like Uber has, and we'll gladly pay the new price. Not everyone will be hooked though, but the VC bet is that enough people will be that their horse wins the race and they'll make money.

There are a lot of horses in this race and the literally billion dollar question is who's Amazon this time around, and who's Webvan. Who's Uber and who's Flywheel (which doesn't even have a Wikipedia page anymore, ouch). Not knowing which horse is going to win doesn't negate the fact that a horse is going to win.

Model available LLMs, like Llama and Deepseek and StableDiffusion are totally a wealth transfer to consumers. Better make use of them!


> I would strongly argue that coding assistants are AI’s first killer app. Copilot, Cursor, Windsurf etc.

These IMO are relatively useful things. But probably (in their current state) will not justify the valuation of the companies involved and the massive investment occurring right now.

I don't know how the future will unfold. I do think it is reasonable to be somewhat bearish on what has been promised vs. what has been released.


Are they? I find Agentic mode on most editors barely useful. Autocomplete and inline editing is great though.

To use these tools properly, you need to know how to build the same thing precisely.


Nah, you don't need to know how to build the same thing precisely. Just the other day I wanted to write a vanilla JS component that could let you select a picture from something like a carousel and blow up the selected picture when clicked. I know JS/HTML but am not used to working with vanilla JS. Copilot didn't write it all by itself, but it did teach me things I didn't know, like making a custom tag in vanilla JS by extending an HTMLElement.

The code isn't the most readable because I don't need it to be; however, if you made me write it from scratch in an interview-style setting, I'd have trouble doing it. If I read the code I can follow it and it makes sense, plus it's an easy component to manually test. So... no, I don't need to know how to precisely build the same thing.

And before you worry that I'm committing code I can't build from scratch: this is a simple component for a 5-page landing page built with Astro where I'm the "main" dev (wrote like 80% of the code). The web page won't even need maintenance once it's deployed.


It’s not a common use case though, dipping into an unfamiliar tech stack only to dip out after committing the code. Typically, when you learn a new stack (eg. for a job), you’ll be living in it for at least a few months, and at that point, you’d be better served by perusing the docs and getting deeply familiar with the API.

The copilots get you going quick at the expense of your learning, which is great for one-offs, but not lasting work quality.


I believe if you spend months / hundreds of hours using any framework / tool you will eventually read so much of it that it's easy to get "deeply familiar" and even then.. it's often times faster to have an LLM write 80-90% of the code for you and just refine / finish it.

> Copilot didn't write it all by itself but it did teach me things I didn't know like making a custom tag in vanilla JS by extending an HTMLElement.

> This is a simple component for a 5 page landing page build with astro

You're already in the over-engineered section there.


Not that over engineered, here's the code :

https://gist.github.com/lazarcf/c80ae6f9362aaf3aa92e21e3a0dd...

you can see the component (the "image gallery") here : https://pixico.roware.ro/apa-distilata/


But that still doesn't exclude Copilot, which is the most intuitive and genuinely useful version of AI after ChatGPT itself.

I don’t deny it.

I used claude + gemini to whip out a rewrite of a show HN project in less than half an hour to deployment yesterday.

https://news.ycombinator.com/item?id=43071381

But this sort of work is fairly low value and boilerplate-y.


I think autocomplete alone would be enough to make coding a killer app for AI.

I agree the tools are overhyped for allowing non-developers to write code. It’s not (today) a replacement for a dev agency that takes a set of requirements and runs with it, it’s a replacement for a junior developer who you need to micromanage a bit. But that’s still a boon to productivity!


I keep on trying them, but if they are useful they are useful for only a small fraction of engineers at the moment. I'm not sure if this is due to the nature of the work, or the nature of the user.

I have heard "top" engineers at various places say it makes them 2x faster, or whatever, but I would like to see this assessed by timed testing, as is sometimes done for evaluating software engineering.

Copilot may let me type less, but I have not seen the wall clock effects, which is a very hard thing to measure (time perception is very unreliable).


https://news.ycombinator.com/item?id=43071381

You can see this example where I timed myself to deployment using AI tools to rewrite a show HN project in half an hour. The code is open source.

My comment was posted 2 hours after show HN when I saw it on front page so you know I didn’t lose track of time I spent.


The vast majority of software work is not greenfielding a PoC or reimplementing an existing, small, well-specced project. We’ve had OpenAPI client generators for years after all.

The majority of software work is maintaining large, existing products: adding features, fixing bugs, improving performance, etc., or building new software in problem domains that aren’t so well-defined.


This is my experience too.

I think it also really accelerates learning of a new language or framework, when that language or framework is really well documented on the web. For novel programming frameworks, obviously it's a bit more challenging to get help from an LLM.

One of more recent attempts at using LLM code assist was to try to fix a bug in a Swift SSH Agent's connection handling that was causing hangs. I know zero Swift, much less the networking frameworks. So I pumped the output of `tree` on the git repo into the LLM, asked for which file likely handled connections, and it found it right away. That's probably 15 minutes saved. Before putting in the file I asked for likely reasons for deadhangs, got that list, then put in the Swift file that handled connections, and it pointed to what the likely problem was. That's probably 1 hour+ of reading documentation to try to figure out what the code was doing wrong with the networking framework, assuming the LLM was not hallucinating. And that "not hallucinating" probability is high enough in my experience that I spend >50% of my time trying to verify I'm not getting bullshitted.

The LLM proposed a fix (~10-20 minute savings), but even as somebody who doesn't use Swift it seemed like >99% chance that it had just introduced a bunch of race conditions in the data structures it used to track connection status. So I asked about it, and it said "Oh yeah of course how could I forget" and then significantly complicated the solution with something that I thought looked like it probably worked. But was the LLM just being obsequious or was it correct the first time? So hard to tell...

So in about 20 minutes I probably accomplished in a language I didn't know, in a code base I didn't know, about 2 hours+ of learning.

But if I knew the language, it would have saved me very little time, and may have cost me some time.


Searching is an area where LLM technology excels. It makes sense, given their structure.

Of course, you have to find a company willing to spend more money on worse (for them - less ads) search, and it won't be Google.

The results aren't always accurate, but neither is Google...


Most of my LLM experience is pulling off the mask with a label of "AI" only to find another mask underneath labeled "information retrieval".

In my opinion, AI only really helps you (a lot) if you are bottlenecked by the actual code-writing. I have not been in such a position since... I don't even know. Maybe in my 20s, 15 or so years ago? Even if AI wrote my code 100x faster it would not appreciably change my working days.

If it could test and verify things though... ideally physically, since I'm in embedded and pulling SD cards etc. is a thing.


Yeah I’m in the same boat with it. I do keep trying, but so far it has been far from earth-shattering.

I’d love it if I could get it to write decent unit tests given a basic description of what I’m testing but I at least cannot get it to output anything useful for our codebase. It’s too unaware of the broader context of the code, what objects need to be instantiated from some other internal library and passed in, etc. It can do a decent job if all I have is a totally isolated function that doesn’t touch the rest of the codebase or use any domain objects, but that’s a rare enough case as to be essentially useless to me.


I agree it's impressive and stuff, but I wouldn't consider a JS PoC a serious project. I have never done that in my whole life and would rather see results from a 10-year-old application with a million lines of C++. That would be realistic. What you did is refactor a pet project, and I don't know why we're wasting billions on that.

The killer app is entertainment. Since LLMs emerged people have consistently loved getting them to say whatever they want or roleplay with them. Once integrated into games, it will be very fun to have natural conversations with the inhabitants of game worlds. Imagine a goomba talking to Super Mario before he stomps its head or delivering your pithy one liner in response to some final boss’s villainous monologue.

And strangely, in two years no one has demonstrated anything like this that people found valuable. In fact, for all the "entertainment" sectors it has been injected into, we have gotten the poisoning of self-publishing and soulless generic pornography. Remember the Twitch channels that had AI-generated content? Where did those go? Surely, by your rationale, the market would have taken over by now. Surely there would be something.

It's almost like entertainment requires some humanity and thought and true creativity behind it.


The primary reason is because LLMs are expensive computationally and financially.

In an open world game, it’s trivial to assign memories and facts an AI learns about its world from interactions or in response to game events. All an LLM has to do is be fine tuned to take data from that internal knowledge base and express it as natural language text, in order to have intelligent and useful conversations with a player. It’s not difficult.
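
As a toy illustration of what that plumbing could look like (my own sketch, not anyone's actual game code; `generate_reply` stands in for whatever local or hosted model the game would call):

    # per-NPC memory kept by the game engine as plain data
    npc_memory = {
        "name": "Gorba the goomba",
        "knows": [
            "the player stomped my cousin in World 1-2",
            "the castle bridge is out",
        ],
        "mood": "nervous",
    }

    def build_prompt(memory, player_line):
        facts = "; ".join(memory["knows"])
        return (f"You are {memory['name']}, currently {memory['mood']}. "
                f"You know: {facts}. "
                f"Reply in one short line to the player, who just said: {player_line!r}")

    prompt = build_prompt(npc_memory, "Out of my way, mushroom.")
    # reply = generate_reply(prompt)   # hypothetical call into the game's LLM backend
    print(prompt)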


This only works if your game doesn't use the GPU. Then there's the whole problem with nondeterminism. But I'm sure when those small problems are solved people will use this technology in games /s

by giving the models durable memory, they become "agentic". in any case, they don't make as much of a mess when their output is getting written into git.

I think at this point it's an open question which comes first; the LLM bubble popping, or Ed Zitron exploding from pure indignation.

I largely agree with Zitron but I think the angry-newsletter bubble he's milking is also going to pop sooner or later, lol. And I can't say I'll be sad when it happens.

Software development to me has always been about the 80-20 rule. You build 80% of the functionality in 20% of the time. Next you spent 80% of your time to build the remaining 20%.

With LLMs it feels like we are getting near to 90-10. Finding the bug in those good-looking pieces of generated code is pretty hard. (After all, you did not pay a lot of attention to the generated code; it looked pretty solid.) Some will argue that the LLM should spot the bug. Indeed, it should ask for clarifications about the requirements. One day… but you need an expert to understand and answer the questions for that last 10%.


I feel like finding bugs in golang is a non-issue. The language is typed, so obvious problems are caught early in the editor. Unit tests mop up the rest (unit tests which are also written by Copilot in my case).

How I write unit tests: Open the chat menu, paste in function signature, describe the tests I want. Out pop the tests. Run, fix code as needed. Add more tests, etc. Super easy.


Or, way easier, build a harness, then just copy-paste the few tests you created at the beginning, because test code is the most repetitive code I've seen. No need to rely on external services, and it's much simpler to maintain and reason about.
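
For illustration, here's what that harness pattern can look like; this is a Python/pytest sketch rather than Go, and `slugify` is a made-up function under test:

    import pytest

    def slugify(title: str) -> str:
        return "-".join(title.lower().split())

    # the harness: one parametrized test; new cases are just new rows
    @pytest.mark.parametrize("title,expected", [
        ("Hello World", "hello-world"),
        ("  Extra   Spaces  ", "extra-spaces"),
        ("already-slugged", "already-slugged"),
    ])
    def test_slugify(title, expected):
        assert slugify(title) == expected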

A refreshing read. The takedown of OpenAI's 300m users is very weak, though, and the hand-waving "I know GenAI has use cases" could be fleshed out. It's obviously a bubble, and some useful software will get written, which has happened in every other bubble and will continue to happen.

It's just this bubble is so public, rapidly moving, and capital intensive.


The hard thing is that it's both a bubble and not.

It's a bubble in the respect that the hype around integrating into existing companies/software is likely often falling flat.

It may not be a bubble in that all of the best/useful/valuable use cases of AI are in new software, which has yet to prove itself in the enterprise. This makes sense: you can't just bolt it onto existing software/organizations and expect it to work, because they're built around the way things used to be, similar to when factories first tried to integrate electricity.

For example, I'm sure Palantir is doing some good stuff, but I just have doubts about how useful AI can be in the context of existing companies. And their valuation seems insane, which screams bubble, especially since they're an older company and less 'AI-native' than the newer ones; see the clunky ways Salesforce and Microsoft implement AI.

But do I expect startups to continue to emerge that approach problems in AI-native ways that help companies reorganize? Yes, it's just a question about how long it takes these companies to work their way into the enterprise and earn enough credibility to drive organizational change and restructuring.

The 'bubble' question is really about whether this latent/potential productivity will be enough to inflate the bubble before it bursts.

My money is on yes, but rather than picking a winner at the app layer and trying to win the lottery I'm heavily invested in the boring stuff like chips (NVDA) and those building data centers with low P/Es (back when I bought them), thus a lot of room to grow even conservatively.


>People like Marc Benioff claiming that "today's CEOs are the last to manage all-human workforces"

This is the real problem. Companies have ALREADY started laying off significant percentages of their workforce because they're buying into the AI "digital worker" hype without any idea of how exactly AI is going to do the jobs of 80% of THEIR employees across all departments in the next year or so.


No, they're using that and RTO as an excuse for layoffs. The layoffs were coming inevitably from ZIRP going away. If you were a CEO looking at that balance sheet, would you tell investors "we need to shed workforce because we can't be profitable" or "we can shed workforce because we're cutting edge"? Stop listening to CEOs, whose literal job is to make their company look good under any and all circumstances, and start looking at SEC filing numbers.

This part. Companies love a smokescreen when they need to tighten their belts.

> And when this all falls apart — and I believe it will — there will be a very public reckoning for the tech industry.

I'm shocked he really believes it in his closing thought.

Maybe his rant would be a bit more digestible if it contained sections like "here's what I tried and it did not work". But that would make it not a rant but actual research with value.


He's wrong, there's no other sector of the economy with growth potential and there's so much capital desperately seeking returns. Also anxious capital terrified of being disrupted. The tech industry will just move on to the next hype cycle when this one burns out.

I'll save him the trouble of writing it: "I asked ChatGPT 3.5 to write a large, underspecified chunk of code a couple of years ago, and it didn't work the first time, unlike the code that I write. This whole 'AI' business is an elaborate scam."

He's not a developer. He's really talking about consumer tech.

OK, let's try, "I have no idea how any of this works. However, I have come to the conclusion that it doesn't, and can't."

Closer? I'm just going by the headline here.


This reads like a straightforward rant to me, not something informative.

Which then invites the question: why is this person's opinion on the subject relevant? Do they have some credentials that make it more valuable than a random comment with a similar rant (of which there are plenty) on Reddit or HN?


They have 55,000 newsletter subscribers. https://www.wheresyoured.at/about/

I mean we insist on enduring Paul Graham's every brainfart on this website so why not other bloggers?

>iPhone fundamentally redefined what a cellphone and a portable computer could be, as did the iPad, creating entirely new consumer and business use cases almost immediately.

>So, what exactly has generative AI actually done? Where are the products? No, really, where are they? What's the product you use every day, or week, that uses generative AI, that truly changes your life? If generative AI disappeared tomorrow — assuming you are not somebody who actively builds using it — would your life materially change?

I think ChatGPT and similar generative AI did fundamentally redefine what software could be. Everyone rushed to implement generative AI into every software. Even MS Paint has AI now. Before this, the idea that you would have it was unthinkable.

If it doesn't make any money, that's a separate issue.

>If generative AI disappeared tomorrow — assuming you are not somebody who actively builds using it — would your life materially change?

To put it in another way, you could live without the ability to drag and drop, but that doesn't mean it hasn't redefined user interfaces.


My concern is when you've implemented it and rely on it and then the company providing the service pulls a "Google" and deprecates the project. I haven't seen that mentioned in any of the comments.

This is the same issue that would happen with any closed source software, and why the push for fully open source or at least "semi" open source models (which provide the weights but not the training data). If you are critically dependent on software that can just disappear for reasons out of your control, you're beholden to its developers.

On the flip side, I feel that this is the big problem with monetizing AI. AI is already bad enough in that its output practically always has to be treated as untrusted, so any customer-facing application of AI requires a second AI to make sure the first AI didn't output anything improper to consumers or even children (because parents are okay with an app that just tells their children randomly generated text, apparently).

It would be a little better if you had control over the model. But without intellectual property rights over the model, what is the AI company even selling? A GUI? So anyone can just copy and paste the model, skip all the training costs, and just sell a React frontend for the model trained for billions of dollars?

It feels like you can't make money from training the AI in a way that makes sense for customers, but you can make money from selling the AI that someone else trained by selling access to it for non-technical users.


I'm not sure what the rush is. Do we have to make it profitable now? Two years is not a long time. The point has been made elsewhere by other commenters about the dot-com bubble and how long it took for trillion-dollar industries to form. It sounds like his gripe with Altman's hype narrative has solely informed his somewhat negative view of LLMs as a whole.

I find it interesting that he almost equates OpenAI === LLMs and misses the fact that the hype is not purely industry driven. For instance, the number of machine learning papers in the last year has quite literally doubled.

It's also typical of an Americentric view of innovation that we don't report on the quiet revolution happening in education in underdeveloped countries, which is a direct result of the accessibility of this unprofitable technology.

I don't think we need a killer application right now

We also forget the internet bubble happened first

I think the author is looking at LLMs through the lens of Sam Altman's hype narrative and I wonder why we care so much that


Plenty of AI companies that are cons or extremely overvalued, but the technology is the real deal and delivering huge improvements over previous techniques in all kinds of domains: language translation, weather prediction, code completion, self driving, etc.

I swear, no matter how many times people say it, people will still conflate all ML with LLMs. No, ChatGPT is not driving advances in self-driving or weather prediction.

For better or worse, "LLM" or "generative ai" has become roughly synonymous with the current wave of ML.

I know very little about ChatGPT, but Waymo is using an LLM: "Powered by Gemini, a multimodal large language model developed by Google, EMMA employs a unified, end-to-end trained model to generate future trajectories for autonomous vehicles directly from sensor data." (https://waymo.com/blog/2024/10/introducing-emma)


Waymo uses reinforcement learning (what there was before LLMs); TD3+BC, according to one of their blogs.

EMMA is something they tried, but further down the article they explain why they don't use it as such yet.


Yep. It's an interesting experiment and really stretched my understanding of what an LLM is and can do.

Huh. I mean it makes sense to train end-to-end on all the interrelated tasks involved in driving but putting a whole-ass language model in the middle of that seems like a stunt. I wonder if it does better than like, any random transformer not trained on language first? Still, I hadn't heard that so I guess I was wrong about that one

No, you were right, this appears to be just research on how applicable LLMs could be to the space. They talk about the improvements their LLM makes, especially in being multimodal vs training multiple independent models, but also the limitations that appear to prevent it from being usable as it is. Maybe some form of it will be used some day (it does seem like it would be useful to have semantic understanding of the world integrated into the system), but at least as of when this was published, it's not actually used.

Vision Language Models are absolutely being trialed for self-driving

https://wayve.ai/thinking/lingo-2-driving-with-language/


Okay so because of the ambiguity of the other reply I'm just gonna say, I don't think we should be surprised that someone is trying to use LLMs to do basically anything. That's basically what prints funding money right now, so long as you're the kind of company or guy the VCs or whoever will believe in. The signal here is "does it do something to appreciably advance the state of the art over previous methods"?

Seems to be the case to me, reading this and Waymo's attempts. There's a paper on EMMA here - https://arxiv.org/abs/2410.23262

And there are state of the art weather prediction transformers.

https://arxiv.org/abs/2312.03876


Yeah so like, this is a cool result, and it uses a transformer architecture. I actually do think that it's fair to say that transformers have proven widely useful, especially in tasks that look like sequence modeling. It's a step change akin to the now-pervasive use of convolutional neural networks that started in the 2010s, and is deeply significant of course. This is also really different from "this is an LLM"

The reason I want to specifically harp on this is because a lot of people are selling this narrative where "AI is becoming superintelligent" or whatever by making an amorphous blob out of a bunch of separate advances that use machine learning techniques. This has been happening for a while, is a great thing for science, and it's clear that machine learning methods are here to stay in science. I'm a machine learning researcher. I've understood, celebrated, and tried to help with this as best I can manage over the last 9 years of my life. And it's been going on for a lot longer than the general public has been in this AI hype wave. The entire modern field of bioinformatics is arguably built on the backbone of machine learning, and has been since before I went to grad school.

This is really different from "We fed everything into a language model and now it's superintelligent and is making scientific advances all by itself" or even "scientists just ask chatGPT shit and it figures it out for them". The breathless tech press really makes it sound like anything that happens in AI research, which increasingly includes the entire usage of ML toolkits in the sciences (Which is pervasive, and expectedly so! ML is an extension of statistics and statistics has been the basis of science for like a century) is just some amorphous force called "AI" that's suddenly gained this aggregate body of competency. Imagine if we anthropomorphized statistics that way. Or Math for that matter. This kind of narrative gives me the overall impression that this is not being talked about honestly, and it's clear that this is profitable to do. I don't have to use charged words like "con" or "fraud" to think this deceptive framing is not a great thing


What you're saying does happen to some degree, and in this instance, if I had linked some advance with a diffusion model then I would get it, but about the only difference between this and ChatGPT is the data it's been trained on. If OpenAI cared, the next version of GPT could be a state-of-the-art weather predictor.

I mean by the same logic the only difference between a diffusion model and a VLM is that you put the spatial transformer on the other end.

Yes, one of the powerful things about every kind of neural network is that they're a very general class of function approximator. That we can use a similar toolkit of techniques to tackle a wide variety of problems is very cool and useful. Again, the analogy to statistical models is telling. You can model a lot of phenomena decently well with gaussian distributions. Should we report this as "Normal distribution makes yet another groundbreaking discovery!"? Probably this wouldn't have the same impact, because people aren't being sold sci-fi stories about an anthropomorphized bell curve. People who are using LLMs already think of "AI" as a thinking thing they're talking to, because they have been encouraged to do that by marketing. Attributing these discoveries made by scientists using this method to "AI" in the same way that we attribute answers produced by chatGPT to "AI" is a deliberate and misleading conflation


>I mean by the same logic the only difference between a diffusion model and a VLM is that you put the spatial transformer on the other end.

Maybe if that were the only difference, but it's not. There are diffusion models that have nothing to do with transformers or attention or anything like that, and where using them for arbitrary sequence prediction is either not possible or highly non-trivial.

Yes, all neural network architectures are function approximators, but that doesn't mean they excel equally at all tasks, or even that you can use them for anything other than a single task. This era of the transformer, where you can simply use a single architecture for NLP, computer vision, robotics, even reinforcement learning, is a very new one. Literally anything a bog-standard transformer can do, GPT could do if OpenAI wished.

Like I said, I don't disagree with your broader point. I just don't think this is an instance of it.


It's clear you're missing what point it is that I'm making from these responses, but I'm unsure how to explain it better and you're not really giving me much to work with in terms of seeming to engage with the substance of it, so I think we gotta leave this an impasse for now

LLMs can do weather prediction? There is no way that is true. Considering ChatGPT sometimes insists 2+2=5, I sincerely doubt it is solving PDEs used to model weather systems.

Transformers can do weather prediction yes. https://arxiv.org/abs/2312.03876

This thread is about large language models and generative AI. AFAIK LLMs perform pretty badly at time series prediction.

Maybe some specially trained weather prediction neural network could perform well, but that isn’t really what we’re talking about here.


A Large Language Model is just a text predicting Transformer (and sometimes image and/or audio predicting if they're multimodal). Transformers are general sequence to sequence predictors. The only difference between this and GPT is the data it's been trained on. They're the same kind of neural network.
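
A minimal sketch of that point, assuming PyTorch (sizes and data are made up): the same Transformer encoder machinery used for next-token text prediction can be pointed at any sequence, e.g. a toy temperature series. A real autoregressive setup would also add positional encodings and a causal mask.

    import torch
    import torch.nn as nn

    class TinySequenceTransformer(nn.Module):
        def __init__(self, d_model=32, nhead=4, num_layers=2):
            super().__init__()
            self.input_proj = nn.Linear(1, d_model)   # embed one scalar per timestep
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)
            self.head = nn.Linear(d_model, 1)         # predict the next scalar

        def forward(self, x):                          # x: (batch, seq_len, 1)
            h = self.encoder(self.input_proj(x))       # no causal mask: sketch only
            return self.head(h[:, -1])                 # next-step prediction

    model = TinySequenceTransformer()
    readings = torch.randn(8, 24, 1)                   # 8 samples of 24 hourly values
    print(model(readings).shape)                       # torch.Size([8, 1])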

Think base models. LLMs are extremely good at predicting the future when applied to human languages, as this is literally their only optimization goal. Why couldn't they also be good at predicting the future when applied to other complex forecasting tasks?

Of course, what can be mathematically calculated without inference is going to be. LLMs may, however, be able to interpret the results of these calculations better than humans or current stochastic evaluations.


Sounds like airlines need to start using ChatGPT to autopilot their airplanes.

> I am so very bored of having this conversation, so I am now going to write out some counterpoints so that I don't have to say them again.

It is not clear to me why the author feels the need to have the conversation.

Human consciousness gives us the ability to imagine future states in the universe and make them come true.

The results will speak for themselves.


He seems to think the hype will do great damage to society

"I need you to fucking listen to me: everything I am describing is unfathomably dangerous, even if you put aside the environmental and financial costs."

Personally I think he's lost it a bit. I mean say he's right in that LLMs plateau and investors lose some money. Life will go on.


Hey I have my gripes with the landscape, certainly, but this is just too much.

> It sure is! But it doesn't really prove anything other than that people are using the single-most-talked about product in the world. By comparison, billions of people use Facebook and Google. I don't care about this number!

> User numbers alone tell you nothing about the sustainability or profitability of a business, or how those people use the product. It doesn’t delineate between daily users, and those who occasionally (and shallowly) flirt with an app or a website. It doesn’t say how essential a product is for that person.

Both of these "arguments" could be applied to any of the big tech giants of the last 25 years - Google, Amazon, Facebook, Uber, whatever (and there'd be other incumbents used by billions of people before them). I don't believe these arguments discount ChatGPT from having the potential to continue growing like a Facebook. And who cares how many journalists Altman knows, you don't get a product written about that much unless it's truly a groundbreaking product.

> And even then, we still don't have a killer app! There is no product that everybody loves, and there is no iPhone moment!

There sure is, it's called programming. He called out quality earlier on, but the quantity and speed and direction the AI can take (as well as its rate of improvement) is breathtaking. My own output has 10x'd easily since GPT-4 came out (although some of that means I'm needing far less hours in certain places). And guess what? The code quality is generally fine.

> Where are the products? No, really, where are they? What's the product you use every day, or week, that uses generative AI, that truly changes your life? If generative AI disappeared tomorrow — assuming you are not somebody who actively builds using it — would your life materially change?

Ok, the product is called ChatGPT, or Claude, or DeepSeek or whatever, and if it disappeared overnight, my programming productivity would drop dramatically. I would not seek to take on as ambitious projects in as short of a time frame as I am doing now.

I don't know, as a user and developer both of AI/LLMs, this article isn't hitting the mark for me. There are legitimate criticisms of the field, but I'm not seeing them thus far.

Edit - I'll say I agree with the Deep Research criticisms. These products are very underwhelming. They're literally there to help people produce a research report which needs to be done but won't be used or read critically by anyone.


This is not a criticism he really mentions explicitly, but the issue I see with the valuation is that these products cannot be used in a production system by a responsible engineer. As in, I can't have LLMs autonomously plugged in as part of my product. The failure modes are not ever going to be predictable enough. Now, Microsoft will probably be able to charge a fortune at enterprise level, and managers will dream of a day that the LLMs can replace all those weirdo devs at their company, but that'll stay a dream.

All of the valuable uses are personal. It makes me personally feel more productive at work. It helps me personally understand some topic better. It gives me an idea in a personal project.

That's all really cool, but that is not what the valuation is about. The valuation is about a false science fiction and hype bubble about agentic this or that or AGI or whatever, and this is driving very questionable decisions for wasting possibly trillions of dollars and tons of energy.

The plus side is that there is some really cool personally useful tech here, and we will probably end up with very good open source implementations and cheaper used GPUs once the bubble bursts.


You also can't responsibly have a prolific apprentice (intern, first- or second-year, etc.) plugged directly into production.

On the other hand, today most money for SWEs goes to people who aren't "staff engineer" level. What if most money went to staff engineers who directed these interns and paired with new human apprentices to learn staff eng?

In the past few thousand years, apprentice/journeyman/mastery of trade was how trades worked, with an "up or out" model that kept the role pyramid the right shape for skill to shape the outcomes.

These days, far too many careers stay at apprentice skill level the entire time, mostly thanks to enterprises failing at engineering management as a skilled trade and thus being unable to value and raise staff-engineer-caliber contributors. The enterprise learning machine is broken by false "efficiency".

LLMs change this. Staff-engineer-caliber SWEs are able to direct these indefatigable assistants, as if each staff engineer has a team of 10 who never need mental breaks to remain productive. There will of course be some number of junior devs who have enough affinity for the role that they will want to stick with the apprentice model and work up to the staff engineer level. (And there will always be solo or boutique teams of app/SaaS SWEs.)

As for the enterprise engineering management that couldn't tell the difference between a staff engineer and an apprentice, the LLM multiplies the difference to the point the outcomes are evident even to a non technical observer.

So one possible timeline for this is a raising of the median human skill level by attrition of those unskilled enough or unable to think critically enough to leverage the machine assistants as force multipliers or unable to survive directly mentored skill-up training and observation from the staff engineers.

You talk about personal value. Roughly, I agree with you completely, and am adding who I think those persons could be (or have to be given the current level of "thinking" by these tools) for the hype to deliver on value. (At a higher level of machine, closer to "AGI", this scenario changes.)

As is evident by downsizing a mediocre team and observing output go up and work more reliably, these forces could, if playing out this way, make dollars per human go up, productivity go up, quality go up, and enable a return to the millennia-proven model of apprenticeship for the trade.


> Both of these "arguments" could be applied to any of the big tech giants of the last 25 years - Google, Amazon, Facebook, Uber, whatever (and there'd be other incumbents used by billions of people before them). I don't believe these arguments discount ChatGPT from having the potential to continue growing like a Facebook. And who cares how many journalists Altman knows, you don't get a product written about that much unless it's truly a groundbreaking product.

I don't get these statements. This line of thinking is so egregious that FTX made an ad about it. This is survivorship bias. If 'wrong' predictions about some companies can be discrediting, then why shouldn't right predictions establish credibility? The same people were right about the Segway, crypto, the metaverse, web3, that dog walking startup that SoftBank burned money on, and countless other harebrained endeavors.

Even in your example, Uber is a financial crime. It is still using accounting tricks to show profit.


"I'll say I agree with the Deep Research criticisms. These products are very underwhelming."

I haven't shelled out the $200/month for OpenAI's Deep Research offering, but similar products from Google and Perplexity are extremely useful (at least for my use case). I would never present the results unchecked / unedited, but the Deep Research products will dig much deeper for information than Perplexity could be persuaded to previously. The results can then be fed into another part of the process.


GitHub Copilot and similar tools make good developers more productive. This alone is a genuine use case with some associated value. Is it enough to justify the valuations of OpenAI etc? Probably not by itself. But I expect other industries have similar productivity boosts where people learn to use the tools appropriately.

What’s the total business opportunity of making all knowledge workers 10% more productive (to pick a more modest goal than outright replacement)?


Generative AI only really helps developers (a meaningful amount) where writing code is actually the bottleneck. I have not been in such a position for years and years.

Same experience here. Today I spent 5 hours debugging code and wondering how I could fit 3 contradicting specifications into it while negotiating some weird stuff that does not concern me, and then I spent 3 hours deleting 100 lines of code and writing 100 lines of code to please everyone. I fail to see how an LLM could have helped me, and I've been doing this for more than 10 years.

> GitHub Copilot and similar tools make good developers more productive.

Do we have any empirical evidence of this? It seems like it'd be an easy experiment to run - task a number of teams with building a particular product, some with Copilot and some without, and see what happens.

I've tried copilot myself, and at times it makes me feel more productive, but I can't tell if it's truly helping me overall.


As a 40yo software engineer (at Microsoft) with no specific domain expertise other than using genAI for fun and some code completion, this essay/blog post articulates my gut feelings about where we are at, very well.

I am a math PhD student and I already draw some value from recent reasoning models. I strongly believe that in 1-2 years LLMs will become an established tool for scientists to help with coding and math. I really don't think you can call this a con...

41 here, working in healthtech… and Devin has committed more code and closed more tickets on my behalf in the past week at my behest than I’ve done on my own in a month.

It’s basically functioning as a team of entry-level junior engineers at this point.

Previously I was having to spend a fair amount of time writing tickets and providing context, but lately I’ve fed all my meeting transcripts and such into an LLM and it interactively creates Jira tickets for me. Each one takes me maybe 30s to read before I confirm them and the assistant creates the actual tickets.


What kind of tickets are these? Even at non-complex tasks, I find agents struggle a lot.

Can you give some examples?


Sure. One task I gave it a couple of days ago was to upgrade the version of Python used in a project. In this case, that was a task suited for a junior engineer - it was simple enough to be described fully, but complex enough to require effort.

Devin was able to recognize that the project used Poetry, was Dockerized, and that the Python version was specified in multiple places (.python-version, pyproject.toml, Dockerfile). It saw that a couple of minor dependencies didn’t support the new version of Python, so it went back and upgraded those to the most recent matching version first.

Devin had never touched the repository in question before getting this task.

I’ve given it more and less complex tasks, and yeah, it struggles with some things. I’d estimate that it consumes about 5-10% of my time but multiples my overall output by ~3x.


I would be very curious about the size and complexity of this codebase. Every review of Devin I’ve seen has been very negative (burns a ton of money, gets stuck, doesn’t implement the changes you want).

For large codebases (greater than 15k or 20k LOC) the context size seems like a real problem right now.


I’ve used it for everything from “change this text on a webpage” to squashing complex migrations in multiple apps in a Django monolith where migrations in one app depends on migrations in other apps.

My apologies if anyone finds this offensive, but I sorta see Devin as a fresh junior SWE hire. It doesn’t do well with tasks that require deep knowledge sometimes, but it has shallow or better knowledge of everything. I would describe it as working with a brand new SWE with an IQ of about 85 who is also on the low end of being high-functioning autistic. By that I mean that it takes most things literally and sometimes has difficulty with nuance.

> burns a ton of money, gets stuck, doesn’t implement the changes you want

The first time you use it, I think that’s pretty fair. Every time it gets stuck or does the wrong thing, when you correct it, it gives you the option to add to its “knowledge base”. That’s a bunch of additional context that it applies in only certain situations. Within a week or so of using it regularly, it’s significantly more valuable. It “learns” much faster than a human.

Example:

About a dozen of our projects all rely on a shared repository (“Enki”) that contains a Composefile, configs, and some light automation. Tests are run in Docker, and you have to navigate to the other repo’s directory to bring up the service. Some of those projects have service names in the Composefile that differ from the project name. I was able to run the steps interactively on “Devin’s machine”, tell Devin what I had done, and then tell it that this is the correct approach for any project that depends on that repository. I didn’t tell it what projects those are, or how to find out.

The next time I used Devin on a project like that, it tried to run the tests directly in a local Python environment. That didn’t work, but it tried the correct approach next. That worked, so it added a line to its knowledge base “Project <foo> uses Enki.” From that point forward it did the right thing the first time.

> For large codebases (greater than 15k or 20k LOC) the context size seems like a real problem right now.

The primary project I’m working on is a Django app. I don’t have it in front of me right now, but it’s about five years old, has been under very active development the entire time, and is comprised of about twenty apps. It’s not the largest codebase I’ve worked on, but it’s far from the smallest. I can do a line count tomorrow if you’d like.


This terrifies me.

It excites me. The only way it would really terrify me is if I were a very junior engineer right now or in college to be one.

I think we’ll see a ton of complaints about how bad the job market is in the next couple of years. That will be true, but only for juniors or for seniors who don’t embrace the tech. For seniors who do embrace it and specialize in implementing these systems, it’ll be a gold mine.

Then, over 5-10 years, our seniors will start to retire or leave the field. No one will be there to replace them. At that point we’ll see a resurgence in the job market.

Things like autocompletion and “chat with your codebase” help juniors more than seniors; agents help seniors much more than juniors. As these systems improve, their failure cases get more and more complex/nuanced - you will always need senior people with the insight necessary to figure out what’s wrong when it breaks. For a while that will help seniors and hurt juniors… right up until businesses realize that they don’t have replacements for their existing senior engineers, at which point they’ll be desperate to hire again.


You have massively misunderstood what I’m terrified by. In fact you’ve described something I find the least terrifying of anything I’ve ever read because it’s all pure fantasy.

ChatGPT and LLMs have had a significant impact on my wife's life. She's a second language speaker, and having ChatGPT available to draft and proofread professional sounding emails and text messages has drastically increased her self-confidence and ability to communicate with colleagues. I think that's amazing.

That's also the only use of LLMs we've found.


Two uses for me (as a native English speaker who writes pretty well on my own):

1. Reformatting notes or bits of information into something more formal (something I consider actually counterproductive in a way, since formal is often more verbose, but that's expected in certain contexts...)

2. Sifting through the crap of the internet to answer obscure questions. The Google replacement that has been needed.


It's helped me incredibly to proofread a novel I wrote in Spanish and translated myself into English, to make it sound more native. I review every single suggestion an LLM provides (as I would do with a native proofreader!).

I think this type of job suits LLMs perfectly... At the end of the day it's just a statistical NLP tool.


For second/new language users, search enabled LLMs are great for finding information. You can instruct it to search in the target language and provide the results in your native language, helping greatly when you're not even sure what the relevant search words are.

I wouldn't trust the analysis on anything important, but that gives you the source links so you can still verify yourself.


The downside to doing this is that you'll sound like an LLM. LLM-generated text is very obvious to anyone with basic reading comprehension and once detected will cause some people to summarily dismiss the sender as a bot.

This is more than acceptable if it allows you to confidently send off an email in less than a minute that would otherwise take you 30 minutes of agony to write and still not be confident about.

Also, these aren't cold calls. The recipients aren't critical about how "botty" the email sounds.


I think this can be mitigated by proofreading and changing up a few things.

Protein folding is an application of generative AI that will probably produce trillions of dollars of value in the long term. It was probably impossible for Google to squeeze that sort of money from the researchers who use it, but it proves that the technology is definitely useful. Another highly underrated application is robots completing complicated tasks.

Can you lend me 100 billion? I can pay you in a year or two or maybe 10 but I will probably have trillions by then.

How is predicting protein folding GenAI? Seems like traditional machine learning?

I suppose I'm thinking about transformer architecture rather than strictly GenAI, but the computer science aspects of protein folding and GenAI seem like they overlap significantly.

People are using these models to generate candidate molecules for specific purposes. According to some estimates I've seen, their hit rate is about 50% instead of 25-33%, and doesn't take two years.

This is the kind of case, for example, I wish AI was targeting. However, it's more likely they will build more benchmarks and target devs/SWEs quickly and hard, because that's probably what their VCs want them to do. The domains that will generally benefit society? That's more of a "maybe they might do it later".

OpenAI actually just released a benchmark today attempting exactly that.


I could only skim to verify an automated summary, but if the takeaway is "these AI giants are doomed" then he's right.

The future, probably within 10 years, is most tasks being handled by small on-device models (7B parameters or so; see Apple's Intelligence thing), a middle ground of workhorse models (pushing closer to 30B and 70B parameters) running on more capable ML-focused chips in laptops and workstations, and dedicated servers in homes and offices for the biggest professional users.

And then there are the apps. Whoever makes the "Stripe for generative AI" (multiple models with different levels of data provenance, security, SLA, etc. for different use cases, tied together with support for custom fine-tuning) stands a good chance of sweeping the market post-collapse.
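
A toy sketch of what the routing part of such a product might look like (tier names, sizes, and the selection rule are all invented for illustration):

    TIERS = [
        {"name": "on-device",   "max_params_b": 7,   "data_leaves_device": False},
        {"name": "workstation", "max_params_b": 70,  "data_leaves_device": False},
        {"name": "hosted",      "max_params_b": 400, "data_leaves_device": True},
    ]

    def pick_tier(needs_big_model: bool, data_is_sensitive: bool) -> str:
        # route each request to the cheapest tier that satisfies its constraints
        for tier in TIERS:
            if data_is_sensitive and tier["data_leaves_device"]:
                continue
            if needs_big_model and tier["max_params_b"] < 30:
                continue
            return tier["name"]
        raise RuntimeError("no tier satisfies the constraints")

    print(pick_tier(needs_big_model=True, data_is_sensitive=True))  # -> workstation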


I'm not on iOS these days, can you elaborate how "Apple's Intelligence thing" has solved the 7B-sized on-portable-device model for everyday users?

My understanding of the zeitgeist on HN about Apple Intelligence was definitely not leaning towards "they nailed it". Not even in the ballpark of "promising", I'd say.


I didn't say they nailed it. I don't know since I can't run it, but ten years is a long time for any technology. I toyed with the 7B Mistral through MLC Chat and, while slow, the responses were good.

The Llama-3.2-3B-Instruct it comes with is fast but sometimes takes questioning to get accurate answers.

The older Phi variant it has was thorough and accurate, Phi's selling point, but made my phone run hot being too thorough.

I don't know much about Apple's model.


"We are in the midst of a group delusion — a consequence of an economy ruled by people that do not participate in labor of any kind outside of sending and receiving emails and going to lunches that last several hours — where the people with the money do not understand or care about human beings."

Regardless of the hostile tone of the article, this stuck out to me as an incredibly poignant description of the current tech/finance elites' mindset.

As most of us who have tried LLMs can attest, they are indeed stochastic parrots with no capacity for knowledge or understanding. This is best exemplified by their non-deterministic outputs, wherein they give different answers to the same question if asked enough times. This is not how a human brain works. Perhaps it is a small building block, but the systemic architecture required to reach brain level is currently not in sight based on what I'm seeing.


>non-deterministic outputs, wherein they give different answers to the same question if asked enough times

I think you may find some humans do that too.


Only because they're tired of you asking them the same thing over and over, or because they've picked up new knowledge/opinions since you last asked.

"How many r's in strawberry?"

"3."

"How many r's in strawberry?"

"...3?"

"How many r's in strawberry?"

"Piss off!"


LLMs write code. Quickly. That is a killer app.

Performance improves every year, and costs become 3x-10x lower every year for the same level of performance. The difficulty of the code we want it to write does not increase 3x-10x per year. So there is no cost problem in just a couple years.
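
The compounding here is easy to check with a back-of-the-envelope calculation (the $15 per million tokens starting price and the 3x-10x range are just illustrative numbers):

    # if price for a fixed capability drops 3x-10x per year, today's $15 per
    # million tokens shrinks fast
    price = 15.0
    for year in range(1, 4):
        low, high = price / (10 ** year), price / (3 ** year)
        print(f"year {year}: ${low:.3f} to ${high:.3f} per million tokens")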

The level of denial and anger about LLMs on HN is astounding to me. Is this just defensiveness from software engineers worried about losing their jobs? An inability to extrapolate cost or performance trends just a couple years forward? Personal criticism against Altman and Musk? What am I missing?


Extrapolating trends mostly doesn't work like that, it's very possible that the growth will slow down.

It's not totally useless. I have more productive conversations about my aquarium with copilot than I do with aquarium related subreddits, and the results are about as useful. I treat them as interesting anecdotes worthy of further research.

> How does this industry actually continue? Do OpenAI and Anthropic continue to raise tens of billions of dollars every six months until they work this out?

It's been made fairly clear that the insiders are setting the stage for governments to back them (bail them out).


The possibility that OpenAI, Anthropic or other companies in the space may lose their investors' money does not make the technology a "con". As long as they do not try to pass their losses on to the taxpayers, how is this a problem? If anything, the history of tech has abundant examples of the first movers in a space not capturing the eventual economic value.

It is ridiculous to say that generative AI is a con at this point when it is by far the best way to search the internet (in spite of hallucinations).


The capex build-out for the GPUs reminds me of the telecom build-out in the late '90s. So many of those companies went bust, but they laid a lot of high-speed fiber before they did, which we are all enjoying now. I suspect most of these gen-AI companies will go bust in the next couple of years, but all that compute power will be repurposed and used in the decade to come...

> And when this all falls apart — and I believe it will — there will be a very public reckoning for the tech industry.

and will trigger the collapse of the wider asset price bubble, with consequent economic turmoil - an unfortunately necessary reset, in my opinion.

I suppose the good news is that the current US administration will likely wear this one, though it won't be completely their fault (much like Covid), and a political reset will also ensue.


I'm worried that, with a slowing housing market (demand is at a 5-year low), there could be a perfect storm of a major market crash coupled with plummeting home values.

That might be a good thing. Current housing prices are insane.

I think ChatGPT is an iPod moment rather than an iPhone moment, and the development of AI should be benchmarked against hardware companies, not software companies. It's early days for AI, in the way that it was once early days for smartphones: some computing capacity in phones was prevalent and cheap, but better capabilities were super expensive and still not very good.

> The AI bubble means that effectively every single media outlet has been talking about artificial intelligence in the vaguest way, and there's really only been one "product" that they can try that "is AI" — and that product is ChatGPT.

What is this guy even talking about now? Zitron has gone so far off the rails.


The killer app for AI is going to be replacing frontline support staff. When you call, email or chat with support for companies that are screwing up, you won't be able to reach a human without going through several layers of "helpful" AI agents first.

To succeed here, AI doesn't have to be cheap or good, it only has to be cheaper than human staff.


Where's the product that everyone is using and loves? Um, ChatGPT. And Copilot.

I use them both. A lot. I'd hate to be without them.


These can both be true:

1. Generative AI is very, very useful.

2. Generative AI is an epic bubble and it will kill us all (financially) one day.

It's entirely appropriate to describe it as a 'con', as long as you look deep enough into OpenAI, SoftBank, MSFT, NVDA, SMCI, etc.

it's a con, but it's useful, and it will kill us all


Sadly, this comment still works without the word "(financially)"

It's not even worth responding to these "AI is a useless toy!" screeds anymore. Let 'em rant. I'll just keep using that useless toy to build cooler and cooler stuff faster and faster.

I had to write a similar post[1] a few months ago because I was so tired of everybody I know telling me we're all gonna be without jobs and we're about to enter a new epoch. I'm so sick of the hype and that will be the real thing that dooms us all.

[1] https://blog.curtii.com/blog/posts/the-laypersons-guide-to-a...


I wonder when people will begin to read books (or at least learn the documentation).

> OpenAI burned more than $5 billion last year.

Well, this is semi-true. When speaking about LLM technology, we must be honest and distinguish between training a base (or foundation) model and fine-tuning it for a purpose.

Sure, if you just use the base model, you could also gain some profit, but the real value of an LLM is achievable when you take an already-trained base model and fine-tune it on your target task.

What this means: a base LLM just learns language structure from a really huge dataset (for example, all of Wikipedia), and this is really expensive; but when you fine-tune an LLM on, for example, your corporation's product documentation, it becomes an AI consultant for your corporation. Or you could fine-tune an LLM on a children's story book, and it could indefinitely generate texts similar to that story. BTW, rumor has it that some orgs fine-tuned GPT-3 on their company codebase and had very interesting results on code generation (much better than with the base model).

Fact: base model training really costs millions (Llama 2 officially cost $5 million, and I believe the real figure is much higher; the same goes for the Chinese claims that DeepSeek R1 also cost $5 million).

But fine-tuning GPT-4o now costs about 20 bucks per million training tokens, and inference is $3.75 per million input tokens and $15 per million output tokens. For GPT-4o mini, training costs $3 per million tokens, and inference is $0.30 per million input tokens and $1.20 per million output tokens (from the official announcement on the OpenAI developer community).

If you consider fine-tuning a GPT-3-class model (or, for example, a similar open source model), official prices are just a few bucks per million tokens (running it on your own infrastructure will be slightly more expensive), which I think is very tolerable and already affordable for small companies.
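
For a sense of what that looks like in practice, here is a rough sketch using the openai Python SDK (the file name and model snapshot are placeholders; check the current docs for exact model names and prices):

    # docs_qa.jsonl holds chat-format examples, one JSON object per line, e.g.:
    # {"messages": [{"role": "user", "content": "How do I reset the device?"},
    #               {"role": "assistant", "content": "Hold the power button for 10s..."}]}
    from openai import OpenAI

    client = OpenAI()

    # upload the training file, then start a fine-tuning job on a small model
    training_file = client.files.create(file=open("docs_qa.jsonl", "rb"),
                                        purpose="fine-tune")
    job = client.fine_tuning.jobs.create(training_file=training_file.id,
                                         model="gpt-4o-mini-2024-07-18")
    print(job.id, job.status)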

And I admit, a market of just a few billion is not big-thing scale, but I think that is only because of conservative corporate leadership and the security problems of current implementations, and it will change in the next few years.


I suppose we're now entering the trough of disillusionment.

I would be highly amused if the OP revealed that the essay/rant had been written by o3-mini tomorrow.

While I don't really understand what fuels this person's Substack Forensic Journalist energy, I can only say that I am thrilled to pay $20 to OpenAI because it delivers outrageous value to me as a solo, self-taught "engineer"* designing reasonably complex physical devices intended for sale. Air quotes because here in Ontario, if you don't got the ring, you don't got no business using the title.

So my first hand gut reaction is that people who cannot fathom meaningful use of modern LLMs are by definition people who are not trying to solve complex problems in domains they aren't yet super confident in. No judgement, and this is intentionally reductive; an LLM skeptic is lots of other things, too. Just saying that if you want to build hard things, reasoning models are dramatic force multipliers.


> So my first hand gut reaction is that people who cannot fathom meaningful use of modern LLMs are by definition people who are not trying to solve complex problems in domains they aren't yet super confident in.

OTOH I would never trust anything built by someone not super confident while heavily relying on a LLM.


Nobody starts off super confident! I'm proud to be a life-long learner. I've just learned more, about more things, in the last three years than I did in the previous few decades of coding every day.

One person's imposter syndrome is often superior to another's blustery confidence.


Then you should probably stop using most applications today, specifically those that have been iterated upon in the past two years.

There's a reason why most modern software is disappointing

Title could be confused for advertising a conference on generative AI

The mismatch between AI’s actual utility and its hype reminded me of Prediction Machines[1], which frames technological change as progressing from point solutions → platform solutions → system solutions.

We’re still in the “what the heck is the point solution here” phase, with a lot of anticipation for platform and system-level shifts. There are some point solutions—like coding assistants—that make existing workflows more efficient and higher quality, but they haven’t translated easily to other domains. Platform solutions require completely rethinking workflows holistically, and system solutions demand restructuring everything that depends on those workflows. That’s going to be slow and messy. Including financially messy.

The book likens this to the introduction of electricity. Initially, electrification meant new individual machines in factories organized around steam power. Steam power was hard to turn on and off and not at all portable. Actually getting the full benefit of electricity meant redesigning factories around electricity use as needed (not just when the steam engine was running) and spatially organizing around task efficiency (not proximity to steam energy production). All that was not a quick shift.

I very much sympathize with the author's frustration over hype that fails to understand the underlying technology and puts unwarranted faith in a small collection of corporate leaders. But I do think that this technology has a high degree of systems-change potential and possibly the momentum to see it through this time. Not that we know how that will play out or which actors or forces will bring it to fruition. It really doesn't feel the same as the other tech crazes of the last two decades.

[1] https://www.predictionmachines.ai/


For people using these AI products for coding, what exactly are you doing?

General API plumbing? Call this API, combine the JSON results, and spit it out. Then slap a React / Next.js frontend on it?

Lately, I have been doing your classical business apps - due to the domain rules, AI is pretty useless - but I have found Perplexity and deep mind to be smarter Stack Overflows. That's it.


Personally, I use it as a glorified Stack Overflow. It really just saves me a few clicks in my browser.

For anything even remotely complex or involved, it's just not that useful. Interestingly, it seems to shine the most when doing boilerplate stuff in widely used languages such as Python or JavaScript, but it's extremely bad with Terraform.


It's very smart templates, imo.

It's basically more time-consuming for me to get it to where I want the code than to finish it myself. For some scaffolding in Python green-field development, sure, I save a few minutes.


I tell it the function name, input, output, and an LLM writes the function for me. I prefer to keep an eye on the architectural decisions myself.

The point you're missing is that people were making the same kind of comments about Amazon and Uber not too long ago

Don't rewrite history. Amazon had a million times more books than my shitty local library. ChatGPT is at best the equivalent of a junior that you have to supervise all the time and replaces all your thoughts. It's a very different scenario unless those LLMs can improve very fast which I doubt. And when they reach a senior level, the damage will already be done.

Nice write-up, but if you ask me, the author did fall for another con: calling web hosting that fancy C-word.

Article is a motte-and-bailey[0] argument.

Bailey (the clickbait): “Generative AI is a con!!”

Motte (narrow defendable argument): OpenAI and Anthropic have not shown that building a proprietary model and selling inference is a sustainable business.

0: https://en.m.wikipedia.org/wiki/Motte-and-bailey_fallacy


I see, so whenever anybody states their position tersely, I can accuse them of using a motte-and-bailey argument, and force them to be so pedantic and long-winded that their point gets lost in the weeds. Unless I like what they're saying, of course. I'll keep it in mind!

…what?

The author has a narrow defensible point (OpenAI and Anthropic have questionable business models) and rather than stating it tersely, he’s instead tried to use that to write an unfocused article dismissing all of Generative AI as a con.

It feels like you neither read the original article very carefully, nor took the time to understand what a motte-and-bailey argument is before writing this.


These two are completely different statements, though.

I thought they were fairly similar. I suppose there's the question of whether it's sincere misdirection.

I always wonder how LLMs will achieve superintelligence when they are, by definition, average.

This is incorrect. If you take the most basic interpretation of an LLM at temperature 0 as predicting the most likely token, and you run it on a test of, say, 1,000 "complete this Spanish sentence with the word for X" questions, then:

- maybe ALL humans would fail the test in some way; e.g., let's say everybody gets at least 10 of those wrong, and the average person gets 100 of those wrong.

- still, as long as most people get each individual word right, your LLM would get every single response correct (because for each item in the test, 900+ people out of a thousand gave the same correct answer in the training set).

In that sense, it's totally possible for a system trained on a vast vat of average-human input to generate super-human outputs.
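A minimal simulation of that majority-vote intuition (all numbers hypothetical, picked to mirror the comment's roughly 10% individual error rate; this is an illustration of the statistics, not a claim about how any real model is trained):

    import random

    random.seed(0)
    N_QUESTIONS, N_PEOPLE, P_CORRECT = 1_000, 1_000, 0.9

    individual_scores = [0] * N_PEOPLE
    majority_correct = 0

    for _ in range(N_QUESTIONS):
        # 1 = this person gave the correct word, 0 = they gave some wrong word.
        answers = [1 if random.random() < P_CORRECT else 0 for _ in range(N_PEOPLE)]
        for person, a in enumerate(answers):
            individual_scores[person] += a
        # A temperature-0 completion picks the most common answer in its data.
        # (Lumping all wrong answers together is the conservative case; spread over
        # many different words, the correct answer wins by plurality even more easily.)
        if sum(answers) > N_PEOPLE / 2:
            majority_correct += 1

    print(f"average person:  {sum(individual_scores) / N_PEOPLE:.0f}/{N_QUESTIONS} correct")  # ~900
    print(f"majority answer: {majority_correct}/{N_QUESTIONS} correct")                        # 1000

Every individual gets roughly 100 items wrong, yet the per-item majority answer is essentially always right, which is the sense in which average inputs can yield better-than-average outputs.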


But still, the questions in that test are "solved" in the sense of "I can take a dictionary and answer these questions with full certainty". Beyond established knowledge, LLMs are monkeys with typewriters, at best.

I’d like to see you ace even a middle-school level Spanish test with just a dictionary (sub Spanish with some other language if you happen to know Spanish).

It was a figure of speech. But there is nothing superintelligent about acing Spanish tests. Give me a Riemann hypothesis.

Let's define "zeta" as a mathematical function ζ(s) which takes a statement "s" as input, where s is a statement of a breakthrough in LLM capabilities achieved relative to the current date and time and ζ(s) is the probability that a given AI skeptic will honestly recognize "s" as a breakthrough,

then our Riemann-Goalpost hypothesis is that ζ has zeros for every "s" which is a negative integer (every breakthrough that happened in the past is null in value) and only has positive values where s is positive.

We can conclude from the above that given a far enough date in the future, any given breakthrough can be spectacular, but once achieved, will be derided as trivial.


Where is this defined? I'll wait for your response.

In some math books about Markov chains.

To this pedantic point: superhuman output is possible if the average written intelligence of all humans, alive and dead, is greater than the maximum intelligence of the living humans who are willing and positioned to do the same task at the same time and place.

But yeah, I don't think LLMs (the current core architecture) can provide superintelligence. I think it needs a bit more than next-token prediction, architecturally speaking.


Just because you want something to be true doesn't make it true

> If generative AI disappeared tomorrow — assuming you are not somebody who actively builds using it — would your life materially change?

Yes. My coding sessions would surely be different. I now have a very fast junior developer who has excellent knowledge of various libraries, though I have to check their code. Write me a function that accepts X and outputs Y. It works great!

Yes, the business model of OpenAI et al is probably unsustainable. I couldn't care less.

> I Feel Like I'm Going Insane

> Everywhere you look, the media is telling you that OpenAI and their ilk are the future, that they're building "advanced artificial intelligence" that can take "human-like actions," but when you look at any of this shit for more than two seconds it's abundantly clear that it absolutely isn't and absolutely can't.

Either I'm the dumb one or Ed Zitron is...


Article seems to have 3 main complaints:

1. LLMs are not very useful

2. Companies like OpenAI and Anthropic are losing tons of money

3. There is a lot of hype around them

The first seems objectively untrue: lots of people find them useful, especially for coding. Not to mention the fact that they get significantly better every year.

The second is completely true, but it's not clear how much that matters. The products we use are being subsidized by VC firms while costs fall 3x-10x every year. Seems great to me.

There is a lot of hype because hype helps capitalists get rich faster. Annoying, but a small price to pay for useful technology.


>So...yeah, of course ChatGPT has that many users. When you have hundreds of different reporters constantly spitting out stories

Oh sure, because you can just have hundreds of reporters constantly write about your product. It's so simple. Why aren't more people thinking of that?

>The weekly users number is really weird. Did it really go from 200 million to 300 million users in the space of three months?

According to SimilarWeb, monthly visits grew by over 1B in that timeframe, so yeah, it sounds possible.

>300 million monthly active users would mean a conversion rate of less than 4%, which is pretty piss-poor

A B2C SaaS whose lowest price point is $20 will be lucky to get anywhere near a 4% conversion rate.

>And even then, we still don't have a killer app!

The 6th (and climbing) most-visited site in January is not a killer app? Okay.


Look... I am cheap. I get rid of streaming services when my family isn't paying attention to see if they complain.

I keep paying $20/mo to OpenAI even though I think Altman is a frightening snake man.

The utility that ChatGPT and other LLMs provide is undeniable. Their revenue will tell us how much value people get because nobody's going to spend $20/mo (much less $200/mo) without getting something for it.


> So yeah, OpenAI “burned only $340 million last year” as long as you don’t consider billions of other costs for some reason.

Ah, yes, the WeWork "Community Adjusted EBITDA" model.


Cue Charlie Munger's "Just replace EBITDA with 'Bullshit Earnings'" https://youtu.be/7B_6AFG0lUU

It's very rare that I can't get through a blog post because I find the argument too disingenuous to tolerate, but that was the case here. There are usually some nuggets of insight even in a tirade, and that may very well be the case here too, but the bad faith arguments came so fast and furious that I couldn't keep going.

Well, he isn't wrong.

He might be. It’s an emergent technology and no one knows for certain how far it can be pushed.

Technology is largely a function of human imagination, tempered by the constraints of time on the one hand and amplified by new discoveries unfolding on the other. Imagination being a fuel, what does imagination project as possible in 100 years?

On LLM/ML itself: it seems a lot of cynical people start with the unreasonable idea that "AI" should already be able to do what it will perhaps be able to do in 10 or 100 years, and are subsequently upset that it is not capable of that yet. It may get there, it may not. But that's on you for starting with a wrong assumption.

Is the AI "business" or "market" overvalued for its current capabilities? Yeah, I do believe so. Welcome to the financial world, which is completely separated from reality. It's like that in all sectors where something new and exciting is happening, not just IT or AI. People pour money in hoping to be early enough to make a profit. Nothing more, nothing less. The rest is marketing. Some Sam Altman guy promoting the hell out of his own product? That is literally his job, regardless of whether or not he believes it all.

But articles like these are so bizarre to me. The author acts like he has millions at stake and his money manager just won't listen and pull all investments out of AI. Hurry up, the bubble is about to burst, I will lose all my money!

Except that... they don't. They are just an "old man yelling at cloud". If you believe AI is the next Metaverse or WeWork, then it will just die off by itself once the bubble pops. So why have so many conversations about it, desperately trying to convince people that AI is a bubble or a con? To the point that you're so sick of it that you write down your arguments so you can point the blinded there instead of having those tiresome arguments again.

Genuinely baffled. Spend your energy on something productive rather than destructive, perhaps?


Trillions of dollars, dude. They need to make trillions of dollars to satisfy their investors.

If you are correcting my "millions" to "trillions": I was referring to the author himself, who writes as if this giant AI bubble is pushing him toward the edge of a cliff as people keep believing in it, and he is desperately trying to get the bubble to shrink or he'll fall off and die. Metaphorically.

But why does he act or feel that way? Let the trillions be lost; it's just how hype, bubbles, and the stock market in general work.


> Let the trillions be lost

Haven’t read any news in the past 20 years? This is all going to be funded by the American taxpayer


> it seems a lot of cynical people start with some unreasonable idea that "AI" should be able to do what it will perhaps be able to in 10 or 100 years, and are subsequently upset that it is not capable of that yet

Because that's what AI was supposed to be in the first place. But the industry performed the swindle of renaming "AI" to "AGI", so that they could pretend the thing that exists now is "AI".


This is bullshit. Gen AI and related technologies are disruptive and will impact and/or create billion-dollar industries one way or another.

Here's a quote from the article:

Altman uses his digital baba yaga as a means to stoke the hearts of weak-handed and weak-hearted narcissists that would sooner shoot a man dead than lose a dollar, even if it means making their product that much worse.

The only correction I would offer for the entire article is that instead of saying "shoot a man dead," it would be more accurate to say "smother a baby with a pillow."


Love Ed!

"To say an LLM is intelligent is like saying a scanner has an eye for detail"* paulacannon

I have a 6 million+ word archive with ChatGPT.

It truly is like having an army of interns, each a confident undergrad in a different subject, who have paid attention to every lecture they ever went to, even the one they'd popped acid just before going in.

It's right more often than it is wrong, but some of its clangers are almost unbelievable.

Yet, having never written a line of code, I was able to use it to build a Python application that analyzed election data and applied the results to an interactive map with constituency-specific data on hover.

It invariably uses the word "clarify" instead of "correct" when challenged. Yet it knows that a clarification refines an answer within the set of answers previously proffered, while a correction revises an answer outside that set.

It believes this is so consistent that, on the balance of probabilities, the behaviour is coded in and not purely a result of training data.

When asked to write an article on this, and include the instances from that conversation where it had incorrectly used the word clarify, it edited the quotes to remove the evidence (probably the most egregious act I've witnessed it perform).

I still use ChatGPT, even more so now since DeepSeek got slow, but I watch it like a hawk.

I still call it out every time it prevaricates or flat-out lies; it still promises to do better; it still, on being challenged, acknowledges that these assurances are dangerous lies to anyone who doesn't know it's lying.

But, for me, it is still a highly useful tool.

It frequently makes assumptions that would be made by those in a field I am unfamiliar with in a way that allows me to refine arguments.

Sharing ChatGPT chats can be a very helpful means of sharing one's thought process.

I have it on strict instructions not to create unless specifically told to, to not regurgitate what it already has, to focus on critiquing instead of echoing or praising.

Yet it still reckons 70% of its output violates these instructions.

But the remaining 30% justifies the time I spend using this remarkable, next-generation automation machine.

Because that is what it is.

*To say it is intelligent is like saying a scanner has an eye for detail. Yes, a scanner identifies every pixel, but an LLM is no more a brain than a scanner is an eye. (And, yes, I know, but this is a line for people who don't know the neurological processing behind sight, which, to be fair, is frequently not very logical.)

So, is it a threat to people who earn money on Fiverr writing bits of code or designing logos? Hell yes.

Is it a threat to those who code complex systems or whose designs can add actual digits to market share? Hell no. Or at least not for the foreseeable future.

Just as the dot-com bubble funded the internet infrastructure that we still use today (just very inefficiently), it is unlikely these trillions will be completely wasted.


Why in the world would you want a car?? They are horrible to maintain, difficult to operate, expensive, smell bad, and are slower than horses! -somebody salty about automobiles, circa 1900.

Tbh they'd have been right. Cars are expensive to buy and to maintain. They are difficult to operate—people die every day. They do smell bad, and the particulate matter they emit is extremely bad for you. And they are slow: Traffic is an unsolvable problem in urban centres.

Redesigning our cities around cars was one of the big mistakes of the 20th century.


> They are difficult to operate—people die every day. They do smell bad, and the particulate matter they emit is extremely bad for you. And they are slow

Sarcastically, one might say that horses are even harder to operate (they have minds of their own), they smell worse than automobiles (especially EVs), and the particulate matter they excrete would be unhealthy to consume. They are also very slow.

More seriously, the trajectory our imagination pushes towards seems to be "overcoming" biological limitations. Perhaps a symbiosis of machine, biology, and consciousness will take us to the next level by opening up a vast new space of possible states.


>Redesigning our cities around cars was one of the big mistakes of the 20th century.

Cities have been designed around carriages for millennia. You can go and walk in Pompeii and observe pavements for pedestrians, roads for wheeled carriages, and crossing spots of elevated stones for pedestrians.

It turns out that cities require a lot of goods to be moved through them: more than a pedestrian can carry, and over inclines that human muscle power doesn't like.

The reason why cities are designed around cars is that cars were designed to fit in contemporary cities and they co-evolved over the 20th century. It was the slow kind of evolution, with each step being easier and cheaper than the big redesign.


Surely you can differentiate between horse-drawn carriage routes and six-lane stroads?

The more pertinent observation is that cars are a great tool for mobility, but going _all in_ on cars causes a whole bunch of issues at societal scale. If you zone cities and design infrastructure on the assumption that everyone drives, you force everybody into the least space-efficient mode of transport. You have to devote huge amounts of valuable real estate to keeping all those cars somewhere. People who _can't_ drive have a much harder time navigating life. When cars arrived, some parts of the world made sure their cities were still easy to navigate by foot, bike, and transit, and I'd argue they're more pleasant places to be.

The point isn't to say that new tech is bad, but that there can be adverse consequences to jumping in wholesale.


And yet, they’re better than what they replaced.

Horses weren’t cheap when they were a primary mode of transportation. Lots of people have died riding, driving, and breaking horses. They definitely smell bad, and their “particulate matter” was so bad that houses had to be set back and elevated from the street.

Cities designed around cars are far superior to cities designed around horses.


But cities designed around trains are superior, and trains were a contemporary technology that was ignored.

No, they were not. Trains are good for taking large numbers of people from one place where they don't want to be to another place where they don't want to be. In certain situations that can be a useful thing to do, but you can't design a city around it. In every case you have in mind, rest assured, the city was there first.

And yet we created trains that are a more efficient way of traversing cities built around horses as well.

Why can't you design a city around it? It would be different than Houston or LA, but would that be a bad thing?


> They definitely smell bad

??? No, they don't. They only smell bad when they're kept standing in their own urine, which is still not worse than a hairdresser's. Compared to dogs (or ICE cars), horses smell way less. They do sweat to regulate temperature, which has a distinct smell, but it's way less irritating to a human nose than the sweat of the rider.

There's a set of distinct smells associated with horses, but other than the piss, none of them are particularly "bad". In my experience, humans tend to smell way worse overall (from food to body odor to excrement) than horses.


It's our loss that you weren't around to set everyone straight, I'm sure.

They'd have been right, just too soon. The ICE has been a very useful technology, but car dependency has been a nightmare for society. Forget a bicycle for the mind; maybe LLMs are an SUV for the mind.

People like the horse joke, but it works with computers in general. The best argument in favor of computers in the '80s was saving paper; otherwise they were expensive and overcomplicated.

It's different.

A car is a consumer good; we buy cars for use, not for reproduction.


It's not a con.

It's the frontier. The only new really big one we know.

It has already solved real issues (see AlphaFold), and it will continue to until we hit a ceiling.

The money-throwing thing is a zeitgeist issue: we have all this money, and people don't know what to do with it.

But yes, ML is now the thing.

And no, we were very, very far away from speaking to a machine and having the machine feel smart. If you can't see this breakthrough for what it is, you will never be excited by any new invention.

Until, perhaps, aliens arrive on our doorstep.



