Good. Effective Altruism is trying to destroy democratic institutions, and it's likely a bigger threat to society than fascism. The pro-progress people need to organize themselves to stop types like SBF from giving AI to China.
Satya Nadella's Microsoft is such a weird company. It's like there's one side of it that is running with Zuckerberg's "move fast and break things" and the other side is saying "wait, we're the most important software company in the world! Things can't break!"
One side is open-sourcing .NET and VS Code and running GitHub well and making vcpkg. The other is crapping up Windows with embarrassing ad-ridden F2P games. It's really weird.
They didn't open-source the debugger, so you have to use VS or VS Code. VS Code also has a shitton of telemetry (same for the dotnet CLI), and when you use VSCodium you are (officially) not allowed to use their marketplace.
>running GitHub well
GitHub is down nearly every week and constantly has problems. I appreciate them making certain features free though.
This is a pretty insightful comment. That's exactly how it feels. The core of their technologies have never been more solid, including Windows. But then on top of that solid core is a bunch of "move fast and break things" and short-term profit choices that make the whole thing seem awful.
Don't forget the ones that can't get a simple chat app to work right (Microsoft Teams), or the ones redesigning Outlook, which introduced a shit ton of bugs.
It's amazing that humans as a collective have decided that private corporations are the best way to progress as a civilization.
Even before Nadella, MS took insane risks with Windows. Ballmer oversaw the disastrous Windows 8 with its fullscreen Start Menu, which was hated far more than Vista ever was. W8 didn't even last 3 years before being replaced by Win10.
And that's to say nothing of the decade-long attempt to compete with Google and Apple in mobile with Windows Phone/RT/Nokia, which Nadella mercifully unwound.
One side is targeting corporate business, the other is for end-consumer.
The eye-opener for me is the Surface Pro 10 existing only for businesses. They bothered to design and produce the whole device, but not to ship it to regular customers. That whole market is forced onto the more experimental Copilot line instead (which could arguably be great, but you don't get to choose in the first place).
“You want to know how to paint a perfect painting? It's easy. Make yourself perfect and then just paint naturally.” - Robert M. Pirsig
The Musk reasoning here is stupid, but smart. If he makes a superhuman intelligence, he only has to ask it "What is dark matter?" and it might figure it out.
I have some big problems with this idea, but it isn't 100% stupid. Just 98% stupid.
This thing continues to deepen my skepticism about AI scaling laws and the broad AI semiconductor capex spending.
1- OpenAI is still working on GPT-4-level models, more than 14 months after the launch of GPT-4 and after more than $10B in capital raised.
2- The rate at which token prices are collapsing is bizarre. Now a (slightly) better model for 50% of the price. How do people seriously expect these foundation-model companies to make substantial revenue? Token volume needs to double just for revenue to stand still. Since GPT-4 launch, token prices are falling 84% per year!! Good for mankind, but crazy for these companies.
3- Maybe I'm an asshole, but where are my agents? I mean, good for the consumer use case. Let's hope the rumors that Apple is deploying ChatGPT with Siri are true; these features will help a lot. But I wanted agents!
4- These drops in cost are good for the environment! No reason to expect them to stop here.
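As a back-of-envelope check on the "falling 84% per year" claim in point 2, here is a sketch of how such a figure can be annualized. The inputs ($30 to $5 per 1M input tokens over roughly 14 months) are assumptions for illustration; the exact annualized figure depends on which prices and time window you pick.

```python
# Annualize a price decline: $30 -> $5 per 1M input tokens over ~14 months
# (assumed figures; not an official price history).
p0, p1, months = 30.0, 5.0, 14

# Convert the total decline to a yearly rate via a compounding exponent.
annual_decline = 1 - (p1 / p0) ** (12 / months)
print(f"annualized price decline: {annual_decline:.0%}")
```

With these inputs the figure comes out somewhat below 84%, which shows how sensitive the headline number is to the chosen window.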
I'm ceaselessly amazed at people's capacity for impatience. I mean, when GPT-4 came out, I was like "holy f, this is magic!!" How quickly we get used to that magic and demand more.
Especially since this demo is extremely impressive given the voice capabilities, yet still the reaction is, essentially, "But what about AGI??!!" Seriously, take a breather. Never before in my entire career have I seen technology advance at such a breakneck speed - don't forget transformers were only invented 7 years ago. So yes, there will be some ups and downs, but I couldn't help but laugh at the thought that "14 months" is seen as a long time...
Over a year they have delivered order-of-magnitude improvements in latency, context length, and cost, while meaningfully improving performance and adding several input and output modalities.
Your order-of-magnitude claim is off by almost an order of magnitude. It's more like half again as good on a couple of items and the same on the rest. 10X improvement claims are a joke, and people making claims like that ought to be dismissed as jokes too.
$30 / million tokens to $5 / million tokens since GPT-4 original release = 6X improvement
4000 token context to 128k token context = 32X improvement
5.4 second voice mode latency to 320 milliseconds = 16X improvement.
I guess I got a bit excited by including cost, but that's close enough to an order of magnitude for me. And that's ignoring the fact that it's now literally free in ChatGPT.
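Spelling the arithmetic out, using the figures quoted above:

```python
# Improvement factors since GPT-4's original release, per the thread's figures.
price_factor = 30 / 5              # $30 -> $5 per 1M tokens
context_factor = 128_000 / 4_000   # 4k -> 128k token context
latency_factor = 5.4 / 0.320       # 5.4 s -> 320 ms voice latency

print(price_factor, context_factor, latency_factor)
```

So 6x on price, 32x on context, and roughly 17x on latency; only context clears a full order of magnitude, with latency close behind.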
Thanks so much for posting this. The increased token length alone (obviously not just with OpenAI's models but the other big ones as well) has opened up a huge number of new use cases that I've seen tons of people and other startups pounce on.
All while not addressing the rampant confabulation at all. Which is the main pain point, to me at least. Not being able to trust a single word that it says...
I am just talking about scaling laws and the level of capex the big tech companies are doing. One hundred billion dollars is being invested this year to pursue AI scaling laws.
You can be excited, as I am, while also being bearish, as I am.
If you look at the history of big technological breakthroughs, there is always an explosion of companies and money invested in the "new hotness" before things shake out and settle. Usually the vast majority of these companies go bankrupt, but that infrastructure spend sets up the ecosystem for growth going forward. Some examples:
1. Railroad companies in the second half of the 19th century.
2. Car companies in the early 20th century.
3. Telecom companies and investment in the 90s and early 2000s.
Comments like yours contribute to the negative perception of Hacker News as a place where launching anything, no matter how great, innovative, smart, informative, usable, or admirable, is met with unreasonable criticism. Finding an angle to voice your critique doesn't automatically make it insightful.
Well, I for one am excited about this update, and skeptical about the AI scaling, and agree with everything said in the top comment.
I saw the update, was a little like “meh,” and was relieved to see that some people had the same reaction as me.
OP raised some pretty good points without directly criticizing the update. It’s a good balance to the top comments (calling this *absolutely magic and stunning*) and all of Twitter.
People's "capacity for impatience" is literally the reason why these things move so quickly. These are not feelings at odds with each other; they're the same thing. It's magical; now it's boring; where's the magic; let's create more magic.
Be impatient. It's a positive feeling, not a negative one. Be disappointed with the current progress; it's the biggest thing keeping progress moving forward. It also, if nothing else, helps communicate to OpenAI whether they're moving in the right direction.
> Be disappointed with the current progress; it's the biggest thing keeping progress moving forward.
No it isn't - excitement for the future is the biggest thing keeping progress moving forward. We didn't go to the moon because people were frustrated by the lack of progress in getting off of our planet, nor did we get electric cars because people were disappointed with ICE vehicles.
Complacency regarding the current state of things can certainly slow or block progress, but impatience isn't what drives forward the things that matter.
Tesla's corporate motto is literally "accelerating the world's transition to sustainable energy". Unhappy with the world's previous progress and velocity, they aimed to move faster.
It's pretty bizarre how these demos bring out the keyboard warriors and cereal-bowl yellers like crazy. Huge breakthroughs in natural cadence, tone, and interaction, as well as realtime multimodal input and output, and all the people on HN can rant about is token price collapse.
It's like the people in this community all suffer from a complete disconnect from society and normal human needs/wants/demands.
Hah, was thinking of that exact bit when I wrote my comment. My version of "chair in the sky" is "But you are talking ... to a computer!!" Like remember stuff that was pure Star Trek fantasy until very recently? I'm sitting here with my mind blown, while at the same time reading comments along the lines of "How lame, I asked it some insanely esoteric question about one of the characters in Dwarf Fortress and it totally got it wrong!!"
There are well-discussed cons to shipping so fast, but on the bright side, when everyone is demanding more, more, more, it pushes costs down and demands innovation, right?
IMO, at the risk of being labeled a hype boy, this is absolutely a sign of the impending singularity. We are taking an ever-accelerating frame of cultural reference as a given, and our expectation is that exponential improvement is not just here but that you're already behind once you've released.
I spent the last two years dismayed by the reaction, but I've just recently begun to realize this is a feature, not a flaw. This is latent demand for the next iteration, expressed as impatient dissatisfaction with the current rate of change, inducing a faster rate of change. Welcome to the future you were promised.
> Token volume needs to double just for revenue to stand still
I'm pretty skeptical about the whole LLM/AI hype, but I also believe that the market is still relatively untapped. I'm sure Apple switching Siri to an LLM would ~double token usage.
A few products rushed out thin wrappers on top of ChatGPT, producing pretty uninspiring chatbots of limited use. I think there's still huge potential for this LLM technology to be 'just' an implementation detail of other features, running in the background doing its thing.
That said, I don't think OpenAI has much of a moat here. They were first, but there's plenty of others with closed or open models.
This is why I think Meta has been so shrewd with their “open” model approach. I can run Llama3-70B on my local workstation with an A6000, which, after the up-front cost of the card, is just my electricity bill.
So despite all the effort and cost that goes into these models, you still have to compete against a “free” offering.
Meta doesn’t sell an API, but they can make it harder for everybody else to make money on it.
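For scale, here is a rough sketch of what "just my electricity bill" works out to. Every number here is an assumption for illustration (card power draw, electricity price, and throughput all vary widely), not a measurement from the thread.

```python
# Back-of-envelope marginal cost of local inference on a workstation GPU.
# Assumed: ~300 W sustained draw, $0.15/kWh, ~20 tokens/s for a quantized
# 70B model. None of these figures are from the thread.
watts = 300
usd_per_kwh = 0.15
tokens_per_s = 20

usd_per_hour = watts / 1000 * usd_per_kwh     # energy cost per hour
tokens_per_hour = tokens_per_s * 3600         # throughput per hour
usd_per_million_tokens = usd_per_hour / tokens_per_hour * 1_000_000
print(f"~${usd_per_million_tokens:.2f} per 1M tokens in electricity")
```

Under these assumptions the marginal cost lands well under a dollar per million tokens, which is the sense in which the "free" local offering pressures API pricing.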
LLaMA still has an "IP hook" - the license for LLaMA forbids usage on applications with large numbers of daily active users, so presumably at that point Facebook can start asking for money to use the model.
Whether or not that's actually enforceable[0], and whether or not other companies will actually challenge Facebook legal over it, is a different question.
[0] AI might not be copyrightable. Under US law, copyright only accrues in creative works. The weights of an AI model are a compressed representation of training data. Compressing something isn't a creative process so it creates no additional copyright; so the only way one can gain ownership of the model weights is to own the training data that gets put into them. And most if not all AI companies are not making their own training data...
> LLaMA still has an "IP hook" - the license for LLaMA forbids usage on applications with large numbers of daily active users, so presumably at that point Facebook can start asking for money to use the model.
No, the license prohibits usage by Licensees who already had >700m MAUs on the day of Llama 3's release [0]. There's no hook to stop a company from growing into that size using Llama 3 as a base.
The whole point is that the license specifically targets their competitors while allowing everyone else so that their model gets a bunch of free contributions from the open source community. They gave a set date so that they knew exactly who the license was going to affect indefinitely. They don't care about future companies because by the time the next generation releases, they can adjust the license again.
Yes, I agree with everything you just said. That also contradicts what OP said:
> LLaMA still has an "IP hook" - the license for LLaMA forbids usage on applications with large numbers of daily active users, so presumably at that point Facebook can start asking for money to use the model.
The license does not forbid usage on applications with large numbers of daily active users. It forbids usage by companies that were operating at a scale to compete with Facebook at the time of the model's release.
> They don't care about future companies because by the time the next generation releases, they can adjust the license again.
Yes, but I'm skeptical that that's something a regular business needs to worry about. If you use Llama 3/4/5 to get to that scale then you are in a place where you can train your own instead of using Llama 4/5/6. Not a bad deal given that 700 million users per month is completely unachievable for most companies.
> How do people seriously expect these foundation-model companies to make substantial revenue?
My take on this common question is that we haven't even begun to realize the immense scale of which we will need AI in all sorts of products, from consumer to enterprise. We will look back on the cost of tokens now (even at 50% of price a year or so ago) and look at it with the same bewilderment of "having a computer in your pocket" compared to mainframes from 50 years ago.
For AI to be truly useful at the consumer level, we'll need specialized mobile hardware that operates on a far greater scale of tokens and speed than anything we're seeing/trying now.
Sam Altman gave the impression that foundation models would be a commodity in his appearance on the All-In Podcast, at least in my read of what he said.
The revenue will likely come from application layer and platform services. ChatGPT is still much better tuned for conversation than anything else in my subjective experience and I’m paying premium because of that.
Alternatively it could be like search - where between having a slightly better model and getting Apple to make you the default, there’s an ad market to be tapped.
> This thing continues to deepen my skepticism about AI scaling laws and the broad AI semiconductor capex spending.
Imagine you're in the 1970s saying computers suck, they're expensive, there aren't that many use cases... fast forward to the 90s and you're using Windows 95, with a GUI and chips astronomically more powerful than what we had in the 70s, and you can use productivity apps, play video games, and surf the Internet.
Give AI time; it will fulfill its true potential sooner or later.
>It's more like you are in 1999, people are spending $100B in fiber, while a lot of computer scientists are working in compression, multiplexing, etc.
But nobody knows what's around the corner or what the future brings... for example, back in the day Excite didn't want to buy Google for $1M because they thought that was a lot of money. You need to spend money to make money, and yes, sometimes you need to spend a lot of money on "crazy" projects because it can pay off big time.
All of them, without exception. Just recently, Sprint sold their fiber business for $1 lmfao. Or WorldCom. Or NetRail, Allied Riser, PSINet, FNSI, Firstmark, Carrier 1, UFO Group, Global Access, Aleron Broadband, Verio...
All the fiber companies went bust because, despite the internet's huge increase in traffic, the number of packets you could push through a single fiber increased by several orders of magnitude.
Where I work, in the hoary fringes of high-end tech, we can’t secure enough token processing for our use cases. Token price decreases mean capacity opens up, but we immediately hit the boundaries of what we can acquire. We can’t keep up with the use cases; more than that, we can’t develop tooling to harness things fast enough, and the tooling we are creating is a quick hack.

I don’t fear for the revenue of base-model providers. But I think in the end the person selling the tools makes the most, and in this case I think it will continue to be the cloud providers. I think in a very real way OpenAI and Anthropic are commercialized charities, driving change and rapidly commoditizing their own products, and it’ll be the infrastructure providers who win the high-end model game. I don’t think this is a problem; I think this is in fact in line with their original charters, just a different path than most people expect from nonprofit work. A much more capitalist and accelerated take.
Where they might build future businesses is in the tooling. My understanding from friends within these companies is that their tooling is remarkably advanced vs. generally available tech. But base models aren’t the future of their revenues (to be clear, they make considerable revenue today, but at some point their efficiency will cannibalize demand and the residual business will be tools).
Yes, it’s limited by human attention. It has humans in the loop, but a lot of LLM use cases come from complex, language-oriented information-space challenges. It’s a lot of classification challenges, as well as summarization and agent-based dispatch / choose-your-own-adventure with humans in the loop, in complex decision spaces at a major finserv.
Tbf, GPT-4 level seems useful and better than almost everything else (or close if not). The more important barriers to use in applications have been cost, throughput, and latency. Oh, and modalities, which have expanded hugely.
> Since GPT-4 launch, token prices are falling 84% per year!! Good for mankind, but crazy for these companies
The message to competitors' investors is that they will not make their money back.
OpenAI has the lead, in market and mindshare; it just has to keep it.
Competitors should realize they're better served by working with OpenAI than by trying to replace it - Hence the Apple deal.
Soon model construction itself will not be about public architectures or access to GPUs, but a kind of proprietary black magic. No one will pay for an upstart's 97% when they can get a reliable 98% at the same price, so OpenAI's position will be secure.
Ask something like "Check whether there's some correlation between the major economies' fiscal primary deficits and GDP growth in the post-pandemic era" and get an answer.
It doesn't make any sense to look at it that way. Apparently the GPT-4 base model finished training in late summer 2022, which is before the release of GPT-3.5. I am pretty sure that GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era.
The advancement from GPT-3 to GPT-4 is what counts and it took 3 years.
> I am pretty sure that GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era
Compute of the "GPT-3 era" vs. the "GPT-3.5 era" is identical; this is not a distinguishing factor. The architecture is also roughly identical: both are dense transformers. The only significant differences between 3.5 and 3 are the size of the model and whether it uses RLHF.
Yes, you're right about the compute. Let me try to make my point differently: GPT-3 and GPT-4 were models that, when released, represented the best OpenAI could do, while GPT-3.5 was an intentionally smaller (than they could train) model. I'm seeing it as GPT-3.5 = GPT-4-70b.
So to estimate when the next "best we can do" model might be released, we should look at the gap between the releases of GPT-3 and GPT-4, not between GPT-4-70b and GPT-4. That's my understanding, dunno.
This may or may not be true: just because we haven't seen GPT-5-level capabilities does not mean they don't yet exist. It is highly unlikely that what they ship is actually the full capability of what they have access to.
Yeah, I'm also getting suspicious. Also, all of the models (Opus, Llama 3, GPT-4, Gemini Pro) are converging to similar levels of performance. If the scaling hypothesis were true, we would see a greater divergence in model performance.
1- The mania only started post-Nov 2022. And the huge investments since then haven't meant substantial progress since GPT-4's launch in March 2023.
2- We are running out of high quality tokens in 2024. (per Epoch AI)
GPT-4 launch was barely 1 year ago. Give the investments a few years to pay off.
I've heard multiple reports that training runs costing ~$1 billion are in the works at the major labs, and that the results will come in the next year or so. Let's see what that brings.
As for the tokens, they will find more quality tokens. It's like oil or other raw resources. There are more sources out there if you keep searching.
imho gpt4 is definitely [proto-]agi and the reason i cancelled my openai sub and am sad to miss out on talking to gpt4o is, openai thinks it's illegal, harmful, or abusive to use their model output to develop models that compete with openai. which means if you use openai then whatever comes out of it is toxic waste due to an arguably illegal smidgen of legal bullshit.
for another adjacent example, every piece of code github copilot ever wrote, for example, is microsoft ai output, which you "can't use to develop / otherwise improve ai," some nonsense like that.
the sum total of these various prohibitions is a data provenance nightmare of extreme proportion we cannot afford to ignore because you could say something to an AI and they parrot it right back to you and suddenly the megacorporation can say that's AI output you can't use in competition with them, and they do everything, so what can you do?
answer: cancel your openai sub and shred everything you ever got from them, even if it was awesome or revolutionary, that's the truth here, you don't want their stuff and you don't want them to have your stuff. think about the multi-decade economics of it all and realize "customer noncompete" is never gonna be OK in the long run (highway to corpo hell imho)
Simons is one of the greatest people and a true inspiration as a mathematician, even though my career drifted from academia. He and Andrew Wiles are the reason why I always say I am a mathematician, even though I work elsewhere.
Beware Goodhart's Law: "when a measure becomes a target, it ceases to be a good measure". If your goal is to stop wasting time fixing bugs, I'm sure you're going to be able to do that.
You should have an important counter-metric to check that you're not messing up the software. It could be the number of reported bugs, crashes in production, etc.
Then it becomes the Challenger scenario. Various pieces are failing, but the whole mission succeeds, so everyone ignores the known risks because management is interested in their return on investment. That works right up until the rocket explodes and suddenly there are lots of external people asking serious questions. Boeing is going through the same thing, having optimized for ROI as well, and its planes are now falling apart on a daily basis.
Who always gets in trouble for this? More often than not, the developers and operators who, in a high-pressure environment, optimized what they were told to optimize and gamed the metrics a little so they weren't fired or held back in their careers.
Naming it "muda" helps push it that way, too: If any of those higher-ups decide to look up the word, they'll see that you're calling bugfixing "pointless work".
Professional athletes have a lot of telemetry on them. But some of that telemetry makes sense during training, and maybe makes more sense for a brief period of time while they work on technique.
You focus on something intensely for a little while, get used to how it feels, then you work on something else. If your other numbers look wrong, or if it's been too long, we look at it again.
It seems hard for people to take the nuanced view that GPT-4-level models have, in the present, the potential to improve many people's lives and corporations' bottom lines, while still being cautiously pessimistic about the next generation of models.
GPT-4, Claude 3 & Co. are simply too useful for certain coding tasks or for reviewing a contract. Obviously, you need to understand you're dealing with a probabilistic being, and for many tasks it isn't the correct tool, but I use ChatGPT ~5 times a day and the $20 is super well spent. Now, there's an ocean between my liking ChatGPT and the promises from Silicon Valley and Microsoft.
OpenAI is not the worst off: ChatGPT is used by 100M people weekly, which sort of insulates it from benchmarks. The best of the rest, Anthropic, should be really scared.