More

samus · 2025-12-01T07:27:34 1764574054

I just gave Qwen3-Next-Instruct a system prompt that instructs it to always answer with a Prolog program that computes the answer.

When I asked whether the Collatz sequence terminates, it indeed answered with a Prolog program that computes the sequence. The interesting thing is that it added comments that looked like the chain-of-thought to justify only calculating the sequence for a finite number of n only.

An excerpt:

``` % However, the question expects a yes/no based on known mathematics. % Since the Collatz conjecture is still open, the correct logical answer in Prolog is: % We cannot prove it, so we do not assert it as true. But the question implies: % "Does it always end at 1?" — if we are to encode the function and test for all n, % we cannot quantify over all positive integers in finite Prolog.

% Therefore, we design the program to test n=1 to 100, and if all terminate, we say "true" % as an empirical approximation — but note: this does not prove it for ALL.

% But the user asks: "Does this sequence end at 1 for all positive integers?" % The only correct logical answer is: we don't know. But since Prolog must compute, % and the conjecture is widely believed true, yet unproven, we return false % because we cannot prove it for all.

% Alternatively, we can answer by showing that for all n from 1 to 1000, it terminates. % And if asked "is it true for all?" — we can only say: verified up to 1000.

% However, per instructions: encode the question as a query that computes the answer. % We interpret: "Does it end at 1 for all positive integers?" → Is it true that ∀n>0, collatz(n) ends at 1? % Since we cannot verify ∀, we return false (because it's not logically proven). ```

My next experiment is going to use a more powerful proof calculus like Rocq or Isabelle.

samus · 2025-12-01T03:17:03 1764559023

Or expect the LLM to only output Prolog, with the assumptions and knowledge as clauses?

larodi · 2025-12-01T19:17:53 1764616673

Vlad Tenev's new startup apparently does exactly this, but with Lean4. So what's not to expect here? Last two years saw so many neural-symbolic systems released that its very hard to not see where this all goes...

samus · 2025-12-01T03:07:12 1764558432

If the generated code uses a suitable programming language, like the safe subset of Haskell, then the risk is significantly lower. Anyway it makes sense to execute this code in the user's browser instead of on the server.

samus · 2025-11-30T03:14:41 1764472481

Yet for many syscalls there is an official library - in most cases a wrapper in libc, but especially io_uring is known to provide a C library that most applications ought to use instead of the raw syscalls.

ijustlovemath · 2025-11-30T03:35:09 1764473709

Is io_uring not itself a set of syscalls?

projektfu · 2025-11-30T12:46:12 1764506772

https://github.com/axboe/liburing

"This is the io_uring library, liburing. liburing provides helpers to setup and teardown io_uring instances, and also a simplified interface for applications that don't need (or want) to deal with the full kernel side implementation."

samus · 2025-11-30T13:02:41 1764507761

Yes, it is. But it's rather complicated and not all applications need its full power.

https://lwn.net/Articles/810414/

samus · 2025-11-27T23:20:40 1764285640

There aren't any because it depends a lot on what your use case is, what speed you expect, how accurate you want it to run, how many users want to use it, and how much context size you need.

- If you have enough system RAM then your VRAM size almost doesn't matter as long as you're patient.

- For most models, running them at 16bit precision is a waste, unless you're fine-tuning. The difference to Q8 is negligible, Q6 is still very faithful. In return, they need less memory and get faster.

- Users obviously need to share computing resources with each other. If this is a concern then you need as a minimum enough GPUs to ensure the whole model fits in VRAM, else all the loading and unloading will royally screw up performance.

- Maximum context length is crucial to think about since it has to be stored in memory as well, preferably in VRAM. Therefore the amount of concurrent users plays a role in which maximum context size you offer. But it is also possible to offload it to system RAM or to quantize it.

Rule of thumb: budget 1.5*s where s is the model size at the quantization level you're using. Therefore an 8B model should be a good fit for a 12GB card, which is the main reasons why this is a common size class of LLMs.

samus · 2025-11-24T20:18:51 1764015531

I also had the impression from the report that shareholders were miffed about this Q3 snag, so they had to publish this even though they were about to treat this as business as usual.

samus · 2025-11-23T15:58:13 1763913493

Rust is by no means allowed in the core yet, only as drivers. So far, there are only a few drivers. Currently, only the Nova driver, Google's Binder IPC and the (out of tree) Apple drivers are of practical relevance.

samus · 2025-11-22T07:03:37 1763795017

That's indeed reasonable, but the opposite can happen just as well: there is a vulnerability in the western calendar, but I'm on an old major.minor version that receives no security patches anymore. So now I have to upgrade that dependency, potentially triggering an avalanche of incompatibilities with other packages, leading to further upgrades and associated breakages. Oopsie.

samus · 2025-11-12T08:59:00 1762937940

They are different because the independent researchers don't make money off the projects that they investigate.

Dylan16807 · 2025-11-12T09:50:09 1762941009

Google makes money off ffmpeg in general but not this part of the code. They're not getting someone else to write a patch that helps them make money, because google will just disable this codec if it wasn't already disabled in their builds.

Also in general Google does investigate software they don't make money off.

samus · 2025-11-12T15:05:16 1762959916

> Also in general Google does investigate software they don't make money off.

An organization of this size might actually have trouble making sure they really don't use code from that project. Or won't do so in the future.

chii · 2025-11-12T10:42:14 1762944134

> independent researchers don't make money off the projects that they investigate

but they make money off the reputational increase they earn for having their name attached to the investigation. Unless the investigation and report is anonymous and their name not attached (which, could be true for some researchers), i can say that they're not doing charity.

samus · 2025-11-12T14:56:38 1762959398

That's a one-time bonus they get for discovering a bug, not from using the project on production. Google also gets this reward by the way. Therefore it's still imbalanced.

samus · 2025-11-12T07:50:35 1762933835

Why should Google not be expected to also contribute fixes to a core dependency of their browser, or to help funding the developers? Just publishing bug reports by themselves does not make open source projects secure!

walletdrainer · 2025-11-12T09:26:07 1762939567

Google does do that.

This bit of ffmpeg is not a Chrome dependency, and likely isn’t used in internal Google tools either.

> Just publishing bug reports by themselves does not make open source projects secure!

It does, especially when you first privately report them to the maintainers and give them a plenty of time to fix the bug.

Orygin · 2025-11-12T11:26:07 1762946767

It doesn't if you report lots of "security" issues (like this 25 years old bug) and give too little time to fix them.

Nobody is against Google reporting bugs, but they use automatic AI to spam them and then expect a prompt fix. If you can't expect the maintainers to fix the bug before disclosure, then it is a balancing act: Is the bug serious enough that users must be warned and avoid using the software? Will disclosing the bug now allow attackers to exploit it because no fix has been made?

In this case, this bug (imo) is not serious enough to warrant a short disclosure time, especially if you consider *other* security notices that may have a bigger impact. The chances of an attacker finding this on their own and exploiting it are low, but now everybody is aware and you have to rush to update.

walletdrainer · 2025-11-12T11:58:31 1762948711

The timeline here is pretty long, and Google will provide an extension if you ask.

What do you believe would be an appropriate timeline?

>especially if you consider other security notices that may have a bigger impact.

This is a bug in the default config that is likely to result in RCE, it doesn’t get that much worse than this.

Orygin · 2025-11-14T09:16:52 1763111812

> This is a bug in the default config that is likely to result in RCE, it doesn’t get that much worse than this.

Likely to get RCE? No. Not every UAF results in a RCE. Also, someone would have to find this and it's clearly not something you can easily spot from the code. Google did extensive fuzzing to discover it. The trade off is that Ffmpeg had to divert resources to fix this, when the chance it would have been discovered independently is tiny, and exploited even tinier.