Having worked in many languages and debuggers across many kinds of backend and front end systems, I think what some folks miss here is that some debuggers are great and fast, and some suck and are extremely slow. For example, using LLDB with Swift is hot garbage. It lies to you and frequently takes 30 seconds or more to evaluate statements or show you local variable values. But e.g. JavaScript debuggers tend to be fantastic and very fast. In addition, some kinds of systems are very easy to exercise in a debugger, and some are very difficult. Some bugs resist debugging, and must be printf’d.
In short, which is better? It depends, and varies wildly by domain.
What's your prompt processing speed? That's more important in this situation than output TPS. If you have to wait minutes to start getting an answer, that makes it much worse than a cloud-hosted version.
Prompt eval time varies a lot with context, but it feels real-time for short prompts: roughly 20 tokens per second, though I haven't done much benchmarking of this. When there is a lot of re-prompting in a long back-and-forth it is still quite fast; I do use the KV cache, which I assume helps, and I also quantize the KV cache to Q8 if I am running contexts above 16k. However, if I want it to summarize a document of, say, 15,000 words, it does take a long time: I walk away, come back in about 20 minutes, and it will be complete.
If he is doing multi-turn conversations, he can reuse the KV cache from the last turn and skip prompt processing on the history (which is what would make time to first token too slow), only running prompt processing on his actual prompt for the current turn. This turns a quadratic number of tokens processed into a linear one. I am not sure if this is what he is doing, but that is what I would do if I had his hardware.
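To make the quadratic-vs-linear point concrete, here is a small sketch that just counts tokens processed over a conversation. The turn counts and token sizes are made-up numbers for illustration, not measurements from any particular setup:

```typescript
// Hypothetical conversation: each turn adds a 50-token user prompt
// and a 200-token model reply to the history.
const turns = 10;
const promptTokens = 50;
const replyTokens = 200;

let withoutCache = 0; // re-process the entire history every turn
let withCache = 0;    // reuse the KV cache; process only the new prompt
let history = 0;

for (let t = 0; t < turns; t++) {
  withoutCache += history + promptTokens; // grows quadratically with turns
  withCache += promptTokens;              // grows linearly with turns
  history += promptTokens + replyTokens;
}

console.log(withoutCache); // 11750 tokens processed in total
console.log(withCache);    // 500 tokens processed in total
```

Over just ten turns, cache reuse cuts prompt processing by more than 20x here, and the gap keeps widening as the conversation grows.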
Usually, when people think about the prompt tokens for a chat model, the initial system prompt is the vast majority of the tokens, and it's the same across many usage modes. You might have a slightly different system prompt for code than for English or for chatting, but that is three prompts, which you can permanently put in some sort of persistent KV cache. After that, only your specific request in that mode is uncached.
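The persistent per-mode cache described above is essentially memoization keyed by the system prompt. A minimal sketch, assuming the runtime exposes some expensive prefix-processing step that yields a reusable handle (all names here are hypothetical, not a real inference API):

```typescript
// Opaque stand-in for a reusable KV-cache handle.
type KvHandle = { prefix: string };

// Persistent cache: one entry per system prompt ("code", "chat", ...).
const prefixCache = new Map<string, KvHandle>();

function processPrefix(systemPrompt: string): KvHandle {
  // Stand-in for the expensive prompt-processing pass over the prefix.
  return { prefix: systemPrompt };
}

function getCachedPrefix(systemPrompt: string): KvHandle {
  let handle = prefixCache.get(systemPrompt);
  if (!handle) {
    handle = processPrefix(systemPrompt); // paid once per mode
    prefixCache.set(systemPrompt, handle);
  }
  return handle; // later requests in this mode skip the prefix pass
}
```

With three modes, only three prefix passes are ever paid; every subsequent request processes just the user's own (short) message.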
Maybe possible; I did not look into that much for Coqui XTTS. What I know is that the quantized versions of Orpheus sound noticeably worse. I feel audio models are quite sensitive to quantization.
Seems like they would've recorded both in the field, no? If they were recording sap freezing in the field, presumably the mics would pick up on other parts of the tree undergoing stresses and making audible sounds.
For that not to have been the case, either they would've had to freeze sap in the lab, or they would've had to go way out of their way to isolate recordings of just the sap in the field without the rest of the tree (is that even possible with normal recording tech?)
Many seem a bit confused, but I have only skimmed the comments.
I don't understand the point of the thing described in the OP (I have not watched the talk, just skimmed the notes), myself. Linux kernels can EFI-load themselves; if you want more flexibility than a precompiled kernel command line, or to load from ext4 or other non-FAT filesystems, rEFInd exists, fits on the ESP, and is very high quality. (Kernel + initramfs can get big; I keep mine on the ESP, but wanting to keep them on a larger ext4 filesystem is very understandable.)
Bootloaders are obsolete in this sense: every OS provides an EFI stub loader (except Linux, where kernels are their own EFI stub); nevertheless, distros continue to install GRUB alongside themselves on UEFI systems out of inertia. If Red Hat wants to supplant it... okay, but it can be supplanted today with very good components, even if they weren't invented there.
If I had a nickel for every time the Red Hat ecosystem overengineered itself into a corner and decided the only possible solution was more overengineering, I could probably buy IBM.
Many people in this thread are extremely cynical and also ignorant of the actual security guarantees. If you don’t think Apple is doing what they say they’re doing, you can go audit the code and prove it doesn’t work. Apple is open sourcing all of it to prove it’s secure and private. If you don’t believe them, the code is right there.
This is what the “attestation” bit is supposed to take care of—if it works, which I’m assuming it will, because they’re open sourcing it for security auditing.
I would agree, except the seller seems to have made a new forgery of their receipt on the fly in response to Cabel's inquiry, which leads me to believe they probably made the original forgery as well.
It means you compile-in a direct reference to the node that needs to be updated when some property changes, so instead of searching the tree of n nodes to find it, you already have the reference.
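A minimal sketch of that idea, using plain objects in place of real DOM nodes (the structure and names here are illustrative, not any particular framework's compiled output):

```typescript
// A "compiled" component holds a direct reference to the one node a
// property affects, so an update is O(1) instead of an O(n) tree search.
type VNode = { id: string; textContent: string; children: VNode[] };

function makeCounter() {
  const countNode: VNode = { id: "count", textContent: "0", children: [] };
  const root: VNode = {
    id: "root",
    textContent: "",
    children: [
      { id: "label", textContent: "Count:", children: [] },
      countNode, // the compiler already knows exactly where this lives
    ],
  };
  // The compiled setter writes straight through the captured reference:
  // no tree walk, no selector lookup, no diffing pass.
  const setCount = (value: number) => {
    countNode.textContent = String(value);
  };
  return { root, setCount };
}

const { root, setCount } = makeCounter();
setCount(42);
console.log(root.children[1].textContent); // "42"
```

The tradeoff is that the set of nodes and their bindings must be known at compile time; dynamic structure still needs some form of lookup or list reconciliation.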
The fact that the data exists somewhere is small comfort if users cannot read the privacy implications in the app store itself when they're deciding whether to download an app.