jules · 2025-10-24T11:06:27 1761303987

For another comparison: this is about 4 years' worth of UK App Store net revenue.

jules · 2025-09-09T19:28:48 1757446128

What they are buying is support of the French.

ph4evers · 2025-09-09T20:23:32 1757449412

With a new French CEO, such a coincidence

jules · 2025-09-03T21:24:41 1756934681

What does this predict about LLMs' ability to win gold at the International Mathematical Olympiad?

measurablefunc · 2025-09-03T21:37:51 1756935471

Same thing it does about their ability to drive cars.

jules · 2025-09-04T03:24:23 1756956263

So, nothing.

measurablefunc · 2025-09-04T03:48:43 1756957723

It's definitely something but it might not be apparent to those who do not understand the distinctions between intensionality & extensionality.

godelski · 2025-09-04T02:44:36 1756953876

Depends which question you're asking.

Ability to win a gold medal as if they were scored similarly to how humans are scored?

or

Ability to win a gold medal as determined by getting the "correct answer" to all the questions?

These are subtly two very different questions. In these kinds of math exams how you get to the answer matters more than the answer itself. i.e. You could not get high marks through divination. To add some clarity, the latter would be like testing someone's ability to code by only looking at their results to some test functions (oh wait... that's how we evaluate LLMs...). It's a good signal but it is far from a complete answer. It very much matters how the code generates the answer. Certainly you wouldn't accept code if it does a bunch of random computations before divining an answer.

The paper's answer to your question (assuming scored similarly to humans) is "Don’t count on it". Not a definitive "no" but they strongly suspect not.

jules · 2025-09-04T03:10:14 1756955414

The type of reasoning by the OP and the linked paper obviously does not work. The observable reality is that LLMs can do mathematical reasoning. A cursory interaction with state-of-the-art LLMs makes this evident, as does their IMO gold medal, scored the way humans are. You cannot counter observable reality with generic theoretical considerations about Markov chains or pretraining scaling laws or floating point precision. The irony is that LLMs can explain why that type of reasoning is faulty:

> Any discrete-time computation (including backtracking search) becomes Markov if you define the state as the full machine configuration. Thus “Markov ⇒ no reasoning/backtracking” is a non sequitur. Moreover, LLMs can simulate backtracking in their reasoning chains. -- GPT-5
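
To make the quoted point concrete, here is a minimal sketch (not taken from the paper or from the OP) of a backtracking search written as a pure transition function. The toy problem (n-queens) and the names `step`, `advance`, `safe` and `run` are hypothetical choices for illustration. The next state depends only on the current state, so the process is Markov, yet it clearly backtracks.

```python
from typing import Optional, Tuple

# Hypothetical toy problem: place n queens, one per row, by depth-first search.
# The full "machine configuration" is one immutable value:
# (columns chosen for each row so far, done flag).
State = Tuple[Tuple[int, ...], bool]

def safe(cols: Tuple[int, ...], col: int) -> bool:
    """True if a queen in the next row at `col` attacks none already placed."""
    row = len(cols)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(cols))

def advance(cols: Tuple[int, ...], start: int, n: int) -> Optional[Tuple[int, ...]]:
    """Try columns start..n-1 in the next row; return the extended placement or None."""
    for col in range(start, n):
        if safe(cols, col):
            return cols + (col,)
    return None

def step(state: State, n: int) -> State:
    """One transition. The result is a function of the current state only."""
    cols, done = state
    if done:
        return state
    if len(cols) == n:
        return (cols, True)  # all rows filled: solution found
    nxt = advance(cols, 0, n)
    # Backtrack: remove the last queen and retry that row at the next column.
    while nxt is None:
        if not cols:
            return ((), True)  # search space exhausted, no solution
        last = cols[-1]
        cols = cols[:-1]
        nxt = advance(cols, last + 1, n)
    return (nxt, False)

def run(n: int) -> Tuple[int, ...]:
    """Iterate the transition from the empty board until it halts."""
    state: State = ((), False)
    while not state[1]:
        state = step(state, n)
    return state[0]
```

For example, `step(((0, 2), False), 4)` finds no legal square in row 2, undoes the queen in row 1 and returns `((0, 3), False)`: a backtracking move computed from the current state alone.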

godelski · 2025-09-04T09:35:40 1756978540

  > The observable reality is that LLMs can do mathematical reasoning

I still can't get these machines to reliably perform basic subtraction[0]. The result is stochastic, so I can get the right answer, but have yet to reproduce one where the actual logic is correct[1,2]. Both [1,2] perform the same mistake and in [2] you see it just say "fuck it, skip to the answer"

  > You cannot counter observable reality

I'd call [0,1,2] "observable". These types of errors are quite common, so maybe I'm not the one with lying eyes.

[0] https://chatgpt.com/share/68b95bf5-562c-8013-8535-b61a80bada...

[1] https://chatgpt.com/share/68b95c95-808c-8013-b4ae-87a3a5a42b...

[2] https://chatgpt.com/share/68b95cae-0414-8013-aaf0-11acd0edeb...

FergusArgyll · 2025-09-04T09:40:21 1756978821

Why don't you use a state-of-the-art model? Are you scared it will get it right? Or are you just not aware of reasoning models, in which case you should get to know the field.

godelski · 2025-09-04T09:47:49 1756979269

Careful there, without a /s people might think you're being serious.

FergusArgyll · 2025-09-04T10:15:47 1756980947

I am being serious, why don't you use a SOTA model?

godelski · 2025-09-04T16:50:30 1757004630

Sorry, I've just been hearing this response for years now... GPT-5 not SOTA enough for you all now? I remember when people told me to just use 3.5

  - Gemini 2.5 Pro[0], the top model on LLM Arena. This SOTA enough for you? It even hallucinated Python code!

  - Claude Opus 4.1, sharing that chat shares my name, so here's a screenshot[1]. I'll leave that one for you to check. 

  - Grok4 getting the right answer but using bad logic[2]

  - Kimi K2[3]

  - Mistral[4]

I'm sorry, but you can fuck off with your goal post moving. They all do it. Check yourself.

  > I am being serious

Don't lie to yourself, you never were

People like you have been using that copy-paste piss-poor logic since the GPT-3 days. The same exact error existed since those days on all those models just as it does today. You all were highly disingenuous then, and still are now. I know this comment isn't going to change your mind because you never cared about the evidence. You could have checked yourself! So you and your paperclip cult can just fuck off

[0] https://g.co/gemini/share/259b33fb64cc

[1] https://0x0.st/KXWf.png

[2] https://grok.com/s/c2hhcmQtNA%3D%3D_e15bb008-d252-4b4d-8233-...

[3] http://0x0.st/KXWv.png

[4] https://chat.mistral.ai/chat/8e94be15-61f4-4f74-be26-3a4289d...

FergusArgyll · 2025-09-04T18:23:20 1757010200

That's very weird, before I wrote my comment I asked gpt5-thinking (yes, once) and it nailed it. I just assumed the rest would get it as well, gemini-2.5 is shocking (the code!) I hereby give you leave to be a curmudgeon for another year...

godelski · 2025-09-04T19:25:09 1757013909

Try a few times and it'll happen. I don't think it took me more than 3 tries on any platform.

To convince me it is "reasoning", it needs to get the answer right consistently. Most attempts were actually about getting it to show its results. But pay close attention. GPT got the answer right several times but through incorrect calculations. Go check the "thinking" and see if it does an 11-9=2 calculation somewhere; I saw this in >50% of the attempts. You should be able to reproduce my results in <5 minutes.

Forgive my annoyance, but we've been hearing the same argument you've made for years[0,1,2,3,4]. We're talking about models that have been reported as operating at "PhD Level" since the previous generation. People have constantly been saying "But I get the right answer" or "if you use X model it'll get it right" while missing the entire point. It never mattered if it got the answer right once; it matters that it can do it consistently. It matters how it gets the answer if you want to claim reasoning. There is still no evidence that LLMs can perform even simple math consistently, despite years of such claims[5].

[0] https://news.ycombinator.com/item?id=34113657

[1] https://news.ycombinator.com/item?id=36288834

[2] https://news.ycombinator.com/item?id=36089362

[3] https://news.ycombinator.com/item?id=37825219

[4] https://news.ycombinator.com/item?id=37825059

[5] Don't let your eyes trick you, not all those green squares are 100%... You'll also see many "look X model got it right!" in response to something tested multiple times... https://x.com/yuntiandeng/status/1889704768135905332

pfortuny · 2025-09-04T14:29:51 1756996191

Have you tried to get google ai studio (nano-banana) to draw a 9-sided polygon? Just that.

https://ibb.co/Qj8hv76h

jules · 2025-08-24T03:38:59 1756006739

There should be no need whatsoever to convince your competitors and/or bureaucrats that allowing your new connector to be produced is in their interest. Only one should be convinced: the person buying the device.

skylurk · 2025-08-24T05:59:43 1756015183

If Apple made both USB-C and Lightning variants and let people choose: then sure, let the market decide.

In reality an oligopoly was stuck in a crappy stalemate and people had only compromised options. Carrying two sets of wires everywhere sucked.

Epa095 · 2025-08-24T07:46:46 1756021606

We tried that for 40 years. The result is drawers full of chargers.

But clearly there is a price for the standardisation: it makes progress slower. On the other hand it makes everyone's lives easier. Just as with e.g. electrical outlets in the house, there is a time for exploration and innovation, and there is a time for standardisation. And we are ready for standardisation now; USB-C is good enough.

jules · 2025-08-24T12:41:43 1756039303

USB-c is absolutely not good enough. The connectors are often incompatible due to tiny manufacturing tolerances, cables from different manufacturers often fall out of the port after longer term use, don't make good connection so you have flaky charging, the cables and connectors look the same but are actually incompatible due to supporting only USB 2/3/4 or thunderbolt, whether displayport/hdmi alt mode is supported, etc. This small short-term gain at the cost of locking in USB-c forever was a terrible idea, brought to you by the same hypercompetent group that mandated cookie banners.

eliaspro · 2025-08-24T15:42:19 1756050139

Cookie-banners were never mandated. It's just a fucking stupid way by the website operators, trying to circumvent data privacy regulation.

And when it comes to USB-C: sure, it's far from perfect, but it's a great foundation to build upon and improve.

jules · 2025-09-04T21:24:31 1757021071

That's the point, the regulation effectively locks in USB-C as it is.

qcnguy · 2025-08-24T20:51:53 1756068713

They were mandated by the EU. You don't get to pass crap laws of the form "show a banner or do {vague/impossible/unacceptable thing}" and then complain when 100% of people show a banner. That kind of inane immaturity is why the EU is so far behind and falling further.

tomhow · 2025-08-25T01:26:32 1756085192

Please don't fulminate on HN. You may not owe cookie banners better, but we're trying for a better style of conversation here. Please make an effort to observe the guidelines, which seek to make HN a place for curious conversation, not rage.

https://news.ycombinator.com/newsguidelines.html

wqaatwt · 2025-08-24T08:20:29 1756023629

> We tried that for 40 years. The result is drawers full of chargers.

Which is fine? The industry eventually converged to just a handful of common standards on its own.

You can’t innovate without being able to experiment. Which is only possible if there are actual people using your product. Thinking that a committee of bureaucrats can replace that is silly.

saubeidl · 2025-08-24T08:25:07 1756023907

A handful of common standards is useless.

One standard for chargers is the only acceptable outcome and it wouldn't have gotten there without regulation.

What need is there to experiment with chargers? Wire go in, power go through - it's really not that complicated, the only important thing is standardization.

wqaatwt · 2025-08-24T08:47:16 1756025236

> What need is there to experiment with chargers?

That’s the point, I have no clue. But we might still be stuck with floppy drives with a mindset like that.

Although as a physical connector USB-C is far from perfect. IMHO Lightning seemed nicer in some ways.

saubeidl · 2025-08-24T08:57:07 1756025827

> But we might still be stuck with floppy drives with a mindset like that.

That seems like a false equivalency to me. It seems quite obvious that storage media have more potential for development than charging wires.

Wire go in - power go through, is literally all they need to do and USB-C does that pretty well.

3836293648 · 2025-08-25T01:58:03 1756087083

No, it's cable go in, power go through, *cable doesn't fall out* and usb-c does that terribly after a few months.

I'm extremely pro standardisation, but the next revision needs to do a lot better.

qcnguy · 2025-08-24T20:52:29 1756068749

MagSafe is a superior power connector in every way.

saubeidl · 2025-08-24T09:34:38 1756028078

The "bureaucrats" are a proxy for the person buying the device. That's literally the point of representative democracy. The average person doesn't want to make a million decisions on technical standards, so they elect somebody they trust to make them for them.

jules · 2025-08-04T23:30:19 1754350219

This visualization is wildly inaccurate. The supposed 1000 pixels are actually 100x100 pixels, which is 10,000 not 1000. Secondly, on many screens they are not actually pixels. For example, on a macbook pro you're likely seeing 40,000 pixels in actuality.
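
The arithmetic, as a small sketch. The function name `physical_pixels` is made up for illustration, and the device pixel ratio of 2 is an assumption about a typical MacBook Pro display at default scaling, not something stated by the page being discussed.

```python
def physical_pixels(css_width: int, css_height: int, device_pixel_ratio: float = 1.0) -> int:
    """Count the device pixels covered by a css_width x css_height box."""
    return round(css_width * device_pixel_ratio) * round(css_height * device_pixel_ratio)

print(physical_pixels(100, 100))       # 10000, not 1000
print(physical_pixels(100, 100, 2.0))  # 40000 with the assumed 2x ratio
```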

jules · 2025-08-04T11:07:34 1754305654

Look at Singapore to understand the benefits of attracting wealth.

vannevar · 2025-08-04T14:04:23 1754316263

Singapore is a city-state. You may as well compare the Vatican.

jules · 2025-06-29T10:01:42 1751191302

A minimum tax is a bad idea. Taxes tend to creep up, and the main pressure against that is for companies or people to leave.

delusional · 2025-06-29T10:54:25 1751194465

> and the main pressure against that is for companies or people to leave.

Has there been any serious research in this area that supports that conclusion? My impression, which is completely uninformed I admit, is that we often talk about companies leaving due to high tax burdens, but that it rarely happens. It's a political signal, more than a factual systemic driver.

Sure, a bunch of companies have relocated to tax havens, but we're not going to solve that by regressing to a 2% universal tax rate.

fastball · 2025-06-29T10:17:13 1751192233

That mechanism would still exist, no? Just that the entity leaving is countries from an agreement, not companies from a country.

jules · 2025-07-02T22:36:18 1751495778

How would the incentives of that work?

fastball · 2025-07-04T03:17:40 1751599060

A country recognizes that the rate of company creation has gone down (or some similar metric). They identify the tax rate as a reason for this. They want the tax rate to be lower to ameliorate this. They leave the agreement.

Now presumably there are penalties or such in place for this type of agreement, so it would need to be weighed as onerous enough to accept any such penalties. If it is just one country that feels this way then it might be a non-starter, but if the global minimum tax gets to a point where many countries feel this way, it would probably be viable to coordinate to leave the agreement all at once, with the remainers having little power at that point.

thrance · 2025-06-29T10:36:20 1751193380

> Taxes tend to creep up

Citation needed, corporate taxes have been going down for decades.

> companies or people to leave.

"We can't ever tax anyone because else they would just leave; ergo nothing can or should be done about rampant inequality" is not only false, it is extremely dangerous and accelerates the fall of our democracies.

mattmaroon · 2025-06-29T10:42:43 1751193763

Also, the whole point of the agreement is that the tax is global so there’s nowhere to leave to.

jules · 2025-07-02T23:03:08 1751497388

Yes and that is bad, because cartels are bad. Competition between political systems is good, for much the same reason that competition between companies is good.

mattmaroon · 2025-07-03T14:13:09 1751551989

I disagree that that is always the case and think that viewpoint is overly dogmatic and impractical. Government is nothing if not a legal cartel.

Companies competing to make the best product is good.

Tiny nations stealing corporations' domiciles by offering low tax rates hurts investment in first world economies, the kind we want everyone to have.

jules · 2025-07-03T18:49:56 1751568596

How does it hurt investment? Those tiny nations are only helping eliminate an inefficient form of taxation. The main problem is that only multinationals can make use of it.

thrance · 2025-07-03T19:08:04 1751569684

Hiding huge amounts of money in tax havens is actively detrimental to the economy. I believe the goal of any economy should be to better our lives, not hoard wealth and sit on it.

Without taxation, the infrastructures needed to maintain a healthy economy are unsustainable. We need to ensure that what companies benefit from public services is taken back so it can be reinvested.

jules · 2025-07-03T22:23:46 1751581426

Money is not "hidden" or "hoarded" in tax havens, nor does hoarding money affect the economy negatively. Taxation is necessary for infrastructure (though that is actually a small fraction: about 3% of US federal taxes goes to infrastructure), and I did not say or imply that taxation is unnecessary. The question of what the best level of taxation is, and what the best place to levy those taxes is, cannot be decided based on high level slogans. One thing is clear though: standardizing tax rates is bad because it removes the competitive aspect between countries. It is good if people and companies are able to move to the places that give them the best public services for the least cost, for the same reason that it is good that salaries are not standardized between companies, and the same reason why it is good that airline ticket prices are no longer standardized by the IATA and CAB.

ruicraveiro · 2025-06-29T11:56:27 1751198187

Leave to where to avoid a global minimum tax? To Mars?

throwawa14223 · 2025-06-29T14:45:47 1751208347

Beyond that it is actually evil.

watwut · 2025-06-29T10:59:30 1751194770

Oh please, as far as the USA goes, taxes went down, especially for companies and the rich. And the country is in the process of creating a new massive deficit just by a massive tax cut.

jules · 2025-05-04T00:35:56 1746318956

I don't understand how east-west arrays differ much from just a flat area. At the end of the day, don't they capture all sunlight in some large square? The east-west array only captures a bit differently around the outer edges. Can somebody explain? Is solar panel efficiency that dependent on incidence angle?

Havoc · 2025-05-04T01:09:59 1746320999

The panel isn’t a perfectly flat two-dimensional structure. So light hitting at an angle isn’t equally effective.
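
A purely geometric sketch of the question, under loud assumptions: direct beam only, sun moving in the east-west vertical plane, irradiance normalised to 1, and no reflection losses, diffuse light or row-to-row shading. The function names are hypothetical. With only the cosine projection, an east-west "tent" over a given footprint intercepts the same beam as a flat panel on that footprint, as long as the sun is above both panel planes. Any real difference would then come from the effects left out here, such as the angle-dependent losses mentioned in the reply.

```python
import math

def beam_power(width: float, tilt_deg: float, sun_zenith_deg: float) -> float:
    """Direct-beam power per unit length on a strip of the given width.

    tilt_deg: panel tilt toward east (+) or west (-); 0 is flat.
    sun_zenith_deg: sun angle from vertical in the east-west plane, east positive.
    Geometry only: power is width times the cosine of the incidence angle.
    """
    incidence = math.radians(sun_zenith_deg - tilt_deg)
    return width * max(0.0, math.cos(incidence))

def flat(footprint: float, sun_zenith_deg: float) -> float:
    """Flat panel covering the whole footprint."""
    return beam_power(footprint, 0.0, sun_zenith_deg)

def east_west(footprint: float, tilt_deg: float, sun_zenith_deg: float) -> float:
    """Two panels tilted east and west that together cover the same footprint."""
    panel_width = (footprint / 2) / math.cos(math.radians(tilt_deg))
    return (beam_power(panel_width, tilt_deg, sun_zenith_deg)
            + beam_power(panel_width, -tilt_deg, sun_zenith_deg))
```

The equality follows from cos(z - t) + cos(z + t) = 2 cos(z) cos(t), which cancels the 1/cos(t) in the panel width. It stops holding once |z| + t exceeds 90 degrees, where one panel is in its own shadow and the model above no longer applies.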

jules · 2025-01-15T13:56:54 1736949414

The universe is already modeled that way. Differential equations are a kind of continuous time and space version of cellular automata, where the next state at a point is determined by the infinitesimally neighboring states.
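
As a minimal sketch of the analogy: discretising the 1D heat equation u_t = u_xx with an explicit finite-difference scheme gives an update rule in which each cell's next value depends only on itself and its two neighbours, which is the shape of a cellular automaton rule with real-valued states. The function name `heat_step`, the periodic boundary and the value of `alpha` are choices made for illustration.

```python
def heat_step(u, alpha=0.25):
    """One explicit step of u_t = u_xx on a ring of cells.

    alpha = dt / dx**2; this scheme is stable for alpha <= 0.5.
    Each new cell value uses only the cell and its two neighbours.
    """
    n = len(u)
    return [u[i] + alpha * (u[i - 1] - 2 * u[i] + u[(i + 1) % n]) for i in range(n)]

# A single hot cell spreads to its neighbours, one cell per step.
print(heat_step([0.0, 0.0, 1.0, 0.0, 0.0]))  # [0.0, 0.25, 0.5, 0.25, 0.0]
```

Shrinking the cell size and time step recovers the differential equation, where "neighbouring" becomes "infinitesimally neighbouring".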

mannykannot · 2025-01-15T14:53:02 1736952782

My first thought was 'ah, yes.' My second thought was 'but what about nonlocality?'

jules · 2025-01-01T18:50:07 1735757407

Nice post. Would the larger amount of code result in different performance in a scenario where other code is being run as well, or would the instruction cache be large enough to make this a non-issue?