More

segmondy · 2025-09-03T15:48:32 1756914512

And 3, people who are about to use AI to build a mental model quickly and understand the concept deeper

segmondy · 2025-08-31T16:29:08 1756657748

How do you stop forced assisted suicide or do you think it will never happen?

segmondy · 2025-08-22T20:19:56 1755893996

be careful what you wish for, you are giving up your freedom to movement in the name of security. you might make the argument that you can hail a cab. that's more expensive than owning your own car and with self driving cabs you will lose your privacy when you use them. any movement between 2 points will always be recorded with at least video and as you are moving, someone else other than you can pinpoint your exact location. with your own vehicle, you could unplug your phone and car GPS/tracking device and have some privacy.

JumpCrisscross · 2025-08-23T01:53:29 1755914009

> you are giving up your freedom to movement in the name of security

Driving in American cities is the opposite of freedom. The necessity of regulating apes piloting heavy machinery in close proximity to each other and society is a major source of our modern police state.

Klonoar · 2025-08-23T06:19:18 1755929958

No, it is still freedom.

It is inconvenient freedom, but it’s freedom.

afcool83 · 2025-08-23T10:01:30 1755943290

We do not have a freedom to movement _by motor vehicle_ in the US.

It is a privilege licensed by the State and regularly revoked through due process or expiry.

While your concern about mobility and privacy are valid, I would contend that public safety is what it’s to be weighed against. Some people really are better riders than drivers.

potato3732842 · 2025-08-23T18:49:26 1755974966

We do not have a "right", in practice, we have some degree of practical freedom.

Of course people who really don't like that freedom will play up the privilige vs right angle as they see fit to advance their goals.

segmondy · 2025-08-22T13:04:24 1755867864

if you are running a 2bit quant, you are not giving up performance but gaining 100% performance since the alternative is usually 0%. Smaller quants are for folks who won't be able to run anything at all, so you run the largest you can run relative to your hardware. I for instance often ran Q3_K_L, I don't think of how much performance I'm giving up, but rather how without Q3, I won't be able to run it at all. With that said, for R1, I did some tests against 2 public interfaces and my local Q3 crushed them. The problem with a lot of model providers is we can never be sure what they are serving up and could take shortcuts to maximize profit.

linuxftw · 2025-08-22T15:05:13 1755875113

That's true only in a vacuum. For example, should I run gpt-oss-20b unquantized or gpt-oss-120b quantaized? Some models have a 70b/30b spread, and that's only across a single base model, where many different models exist at different quants could be compared for different tasks.

jkingsman · 2025-08-22T16:38:11 1755880691

Definitely. As a hobbyist, I have yet to put together a good heuristic for better-quant-lower-params vs. smaller-quant-high-params. I've mentally been drawing the line at around q4, but now with IQ quants and improvements in the space I'm not so sure anymore.

linuxftw · 2025-08-22T17:56:13 1755885373

Yeah, I've kinda quickly thrown in the towel trying to figure out what's 'best' for smaller memory systems. As things are just moving so quickly, whatever time I invest into that is likely to be for nil.

danielhanchen · 2025-08-22T18:35:02 1755887702

For GPT OSS in particular, OpenAI only released the MoEs in MXFP4 (4bit), so the "unquantized" version is 4bit MoE + 16bit attention - I uploaded "16bit" versions to https://huggingface.co/unsloth/gpt-oss-120b-GGUF, and they use 65.6GB whilst MXFP4 uses 63GB, so it's not that much difference - same with GPT OSS 20B

llama.cpp also unfortunately cannot quantize matrices that are not a multiple of 256 (2880)

danielhanchen · 2025-08-22T18:30:34 1755887434

Oh Q3_K_L as in upcasted embed_tokens + lm_head to Q8_0? I normally do Q4 embed Q6 lm_head - would a Q8_0 be interesting?

segmondy · 2025-08-22T11:53:45 1755863625

Don't listen to this crowd, these are "technical folks". Most of your audience will fail to figure it out. You can provide an option that llama.cpp is missing and give them an option where you auto install it or they can install it themselves and do manual configuration. I personally won't tho.

Computer0 · 2025-08-22T18:13:43 1755886423

Who do you think the audience is here if not technical. We are in a discussion about a model that requires over 250gb of ram to run. I don't know a non-technical person with more than 32gb.

pxc · 2025-08-22T23:38:10 1755905890

I think most of the people like this in the ML world are extreme specialists (e.g.: bioinformaticians, statisticians, linguists, data scientists) who are "technical" in some ways but aren't really "computer people". They're power users in a sense but they're also prone to strange bouts of computing insanity and/or helplessness.

danielhanchen · 2025-08-22T12:56:26 1755867386

I think for a compromise solution I'll allow the permission asking to install. I'll definitely try investigating pre built binaries though

segmondy · 2025-08-21T21:37:42 1755812262

garbage benchmark, inconsistent mix of "agent tools" and models. if you wanted to present a meaningful benchmark, the agent tools will stay the same and then we can really compare the models.

there are plenty of other benchmarks that disagree with these, with that said. from my experience most of these benchmarks are trash. use the model yourself, apply your own set of problems and see how well it fairs.

paradite · 2025-08-22T05:00:28 1755838828

Hey. I like your roast on benchmarks.

I also publish my own evals on new models (using coding tasks that I curated myself, without tools, rated by human with rubrics). Would love you to check out and give your thoughts:

Example recent one on GPT-5:

https://eval.16x.engineer/blog/gpt-5-coding-evaluation-under...

All results:

https://eval.16x.engineer/evals/coding

jstummbillig · 2025-08-22T18:56:11 1755888971

Which benchmarks are not garbage?

I don't consider myself super special. I think it should be doable to create a benchmark that beats me having to test every single new model.

segmondy · 2025-08-21T14:36:32 1755786992

AWS CEO says what he has to say to push his own agenda and obviously to align himself with the most currently popular view.

segmondy · 2025-08-20T17:39:00 1755711540

zed is just on the hype train, obviously very talented people, they are thinking hard about LLMs but I'm not really sure where they are going, their pivot is probably going to be more interesting...

burnt-resistor · 2025-08-20T23:06:04 1755731164

Data harvesting and non-ownership.

segmondy · 2025-08-19T16:29:44 1755620984

By your argument, once anything makes it in, then it can't be removed. Billions of people are going to use the web every day and it won't stop. Even the most obscure feature will end up being used by 0.1% of users. Can you name a feature that's supported by all browsers that's not being used by anyone?

kg · 2025-08-19T16:31:53 1755621113

Yes. That is exactly how web standards work historically. If something will break 0.1% of the web it isn't done unless there are really really strong reasons to do it anyway. I personally watched lots of things get bounced due to their impact on a very small % of all websites.

This is part of why web standards processes need to be very conservative about what's added to the web, and part of why a small vocal contingent of web people are angry that Google keeps adding all sorts of weird stuff to the platform. Useful weird stuff, but regardless.

WhitneyLand · 2025-08-19T17:49:04 1755625744

“That is exactly how web standards work…”

Says who? You keep mentioning this 0.1% threshold yet…

1. I can’t find any reference to that do you have examples / citations?

2. On the contrary here’s a paper that proposes a 3x higher heuristic: https://arianamirian.com/docs/icse2019_deprecation.pdf

3. It seems there are plenty of examples of features being removed above that threshold NPAPI/SPDY/WebSQL/etc.

4. Resources are finite. It’s not a simple matter of who would be impacted. It’s also opportunity cost and people who could be helped as resources are applied to other efforts.

troupo · 2025-08-19T21:13:31 1755638011

E.g. Google said in their document https://docs.google.com/document/d/1RC-pBBvsazYfCNNUSkPqAVpS...

--- start quote ---

As a general rule of thumb, 0.1% of PageVisits (1 in 1000) is large, while 0.001% is considered small but non-trivial. Anything below about 0.00001% (1 in 10 million) is generally considered trivial. There are around 771 billion web pages viewed in Chrome every month (not counting other Chromium-based browsers). So seriously breaking even 0.0001% still results in someone being frustrated every 3 seconds, and so not to be taken lightly!

--- end quote ---

Read the full doc. They even give examples when they couldn't remove a feature impacting just 0.0000008% of web views.

WhitneyLand · 2025-08-21T20:23:34 1755807814

Thank you for the citation. Up voted.

segmondy · 2025-08-18T22:43:16 1755556996

Mark the accounts as kids account, they will not collect data till the birthday on the account turns 18. Pick any recent month and 2025 and you get 17+yrs of minimal data collection. They won't turn it on for new data sharing options.

sizzle · 2025-08-19T04:29:17 1755577757

Will this limit your phone features in any way or ability to pay for stuff?