Hacker News | atty's comments

JAX is, as far as an outsider can tell, Google Research/DeepMind's primary computation library now, not TensorFlow. So it's safer than most Google projects, unless you think they're secretly developing their third tensor computation library in 10 years (I suppose if anyone was going to do it, it would be Google).


Perhaps these groups could be regulated and insured by an even larger entity, perhaps even one with the authority to punish individuals and organizations for wrongdoing?


Rather cynical comments so far. I personally am very interested to see how this line of chips does, both in terms of performance (really efficiency, for this sort of chip) and in the market. Hopefully Lunar Lake, Arrow Lake, etc., and their 18A node all turn out to be as good as some of the early leaks and press releases would indicate, because Intel needs some big wins to get back on track.


We spent decades taking leaps and bounds with every chip release. We've now seemingly settled into the incremental improvement phase. The chip makers have responded by burning tons of transistors on extra crap that spends most of its life powered down.

It's hard not to be cynical.


? Since 2016, Ryzen has forced Intel to compete again, and Apple's 2020 M-series made day-long battery life a reality.

CPUs have been very interesting the past 8 years or so.


As someone who writes assembly, they really haven't been interesting for a long while.


I think it's all about being realistic.

We made leaps and bounds before because clock speeds were going up 50% or more between generations. Add in architecture improvements and it was easy to see actual performance double from one generation to the next.

But we're struggling to get clocks faster now, and I've always imagined it's because electricity simply isn't fast enough. At 6 GHz, in one clock cycle, light travels only about 5 cm (2 inches), and electrical signals move slower than light, depending on the medium they're traveling through. At the frequencies we're operating at, I figure transistor switching speed and clock skew just within the CPU start to become an issue.
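The back-of-the-envelope numbers are easy to check (pure arithmetic, nothing CPU-specific assumed):

```python
# Distance light travels during one clock cycle, in a vacuum.
C = 299_792_458  # speed of light, m/s

def distance_per_cycle_cm(freq_hz: float) -> float:
    """Centimeters light covers in one clock period."""
    return C / freq_hz * 100

# At 6 GHz a cycle is ~167 ps: light covers only about 5 cm, and signals
# in on-chip wiring propagate at a noticeably lower fraction of c.
print(round(distance_per_cycle_cm(6e9), 1))  # ~5.0 cm
```

(The often-quoted "15 cm" figure corresponds to a 2 GHz clock, not 6 GHz.)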

We already have tons of CPU optimizations. Out-of-order execution, branch prediction, register renaming, I could go on. There's probably not much more we can do to improve single-threaded performance. Every avenue for optimizing x86 has been taken.

And so we go multi-core, but that ends up making heat a primary concern. It also relies on your task being parallel.

Or we go ARM, but now some of your software that has had x86-specific optimizations like using AVX-512 has to be rewritten.


Being realistic doesn’t mean a need to be cynical. Leaps and bounds of progress never lasts forever. Incremental improvement is still worth celebrating.

I totally disagree that specialized processing units are wasteful because they spend most of their life powered down. Your iPhone uses the Neural Engine every time you open the camera app. The announced AI features for the next iOS version will use on-device AI much of the time you use Siri, which is used a lot by a lot of people.

The old-school version of this would be dissing multimedia hardware like dedicated encoders/decoders. How do you think your laptop so effortlessly plays back 4K video and somehow gets better battery life than when you're working on a Word document? It's that part of your processor that usually "sits there doing nothing."

You just don’t realize how much these segments of the chip are accelerating your experience.


You want a chip that never powers down? Boy, have I got a deal for you: zero transistor waste, zero extra crap, just like you asked for. It's a 286. Limited availability, so I'm gonna have to ask $5000 per chip.


There's nothing wrong with the state of things; I'm merely pointing out that it's significantly different than it used to be, and a change in expectations might be warranted.

CPUs used to be purely about computing power; now they're about computing accessories, which is a different type of market and purchase altogether.

If you can't acknowledge the differences without becoming irrationally aggressive as if I've insulted you personally then this is not going to be a great conversation.


> CPUs used to be purely about computing power,

Yes, and now they are about saving power. If the employer pays the wasted hours, why not.


Does it boot Xenix?


It’s very reasonable to want Google and Bing to index your page for search but not have your data collected for training models, I think. I’m not familiar enough with robots.txt to know if it has a whitelisting mechanism.


I see a lot of robots.txt files that use non-wildcards for Allow, but what would be the use case for using a non-wildcard for Disallow?
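For reference, robots.txt rules are per-crawler, so a site can let search crawlers in while opting out of the publicly documented training crawlers (a sketch; the list of user-agent tokens would need to be kept current):

```
# Allow search engine crawlers
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# Opt out of AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```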


Apple's memory is on package, not on die.


To be fair, defense is an existential risk for the US and its allies. NATO can’t really afford to not have a reasonably up-to-date combat jet. They also need to continually feed money into the military industrial complex so that suppliers don’t go under/downsize too much/etc.

Not disagreeing with your sentiment, just think that certain fields like defense, healthcare, etc have slightly different priority lists.


> defense is an existential risk for the US

This is not a true statement for a huge country with oceans on two sides and nukes. It is a true statement about people relying on the US military to make money, tho.


Oceans are only a defense if you can float a navy on them. The British Isles got invaded a couple of times... until they built a big navy.

Having two oceans is great, but now you need at least two fleets.

Nukes are only worth something if you have a lot of them and can credibly deliver them in multiple ways. Now you need subs, long-range bombers and the fighters to protect them, and missile silos.

Now add in reliance on a global supply chain (many types of oil and minerals, grades of steel not made in the US, TSMC), and all of a sudden you need to be able to help protect your partners on the other side of the world.

Now sprinkle in a couple of crazy dictators with nuclear arsenals and huge armies of their own, and it's starting to make sense why the US military needs constant re-investment.


Don't forget satellites and SIGINT (which just might involve crazy submarines and big "scientific" radio astronomy dishes). Or cover stories about ships and manganese extraction worthy of James Bond.


> Glomar Explorer

That was the CIA's plan, which the Navy vehemently objected to. The Navy said it was farcically complicated, too large an operation to keep secret, and likely to fail. Both proved true. The Navy offered to recover intel from the Soviet submarine using DSVs and ROVs, low-risk operations they could easily have kept secret. But the CIA won this dispute, fumbled the submarine, and got outed by the press.


> Now you need subs, long-range bombers and the fighters to protect them, and missile silos.

The role of the long-range bombers in the nuclear triad is heavily questioned. It is not quite certain that you "need" them.


You need them anyway because almost 100% of the time you’ll be dropping conventional ordnance / paratroopers / perhaps drones in the future.


this is hyperbolic. there are various coastal defenses and naval deployments that are not nearly as intensive as you describe.

the "crazy dictator" theory is a conjuring of the govt and media in service of empire. they are acquiring nukes because they are afraid of being invaded by us!


Not all crazy dictators are dangerous purely because they have nukes.

Some sponsor anti-US terrorism.

Some are chomping at the bit to invade Europe (Russia) or Taiwan (China).

Some are just nuts (North Korea).

But now that these nutjobs have nukes, they’re all the more dangerous.


I think this is using the OpenAI Whisper repo? If they want a real comparison, they should be comparing MLX to faster-whisper or insanely-fast-whisper on the 4090. Faster whisper runs sequentially, insanely fast whisper batches the audio in 30 second intervals.

We use whisper in production, and these are our findings: we use faster-whisper because we find the quality is better when you include the previous segment text. Just for comparison, we find that faster-whisper is generally 4-5x faster than openai/whisper, and insanely-fast-whisper can be another 3-4x faster than faster-whisper.
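As a sketch of what "including the previous segment text" looks like with faster-whisper (the model name and audio path below are placeholders; `beam_size` and `condition_on_previous_text` are the actual parameter names on its `transcribe()` API):

```python
from faster_whisper import WhisperModel

# Placeholder setup: any CTranslate2-converted Whisper model works here.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# condition_on_previous_text=True conditions each segment's decode on the
# text of the segment before it, which is what improves quality on long audio.
segments, info = model.transcribe(
    "meeting.wav",  # placeholder path
    beam_size=5,
    condition_on_previous_text=True,
)
for seg in segments:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
```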


Is insanely-fast-whisper fast enough to actually run on the CPU and still transcribe in realtime? I see that none of these are running quantized models; it's still fp16. Seems like there's more speed left to be found.

Edit: I see it doesn't yet support CPU inference, should be interesting once it's added.


Insanely fast whisper is mainly taking advantage of a GPU’s parallelization capabilities by increasing the batch size from 1 to N. I doubt it would meaningfully improve CPU performance unless you’re finding that running whisper sequentially is leaving a lot of your CPU cores idle/underutilized. It may be more complicated if you have a matrix co-processor available, I’m really not sure.
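The batching win is generic: a GPU (or wide SIMD unit) does one big matrix multiply far more efficiently than N small ones. A toy NumPy illustration, treating a model layer as a single weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))          # stand-in for a model layer
segments = rng.standard_normal((24, 64))   # 24 "audio segments" as feature vectors

# Batch size 1: one matrix-vector product per segment, 24 small kernels.
sequential = np.stack([W @ s for s in segments])

# Batch size 24: one matrix-matrix product over the whole batch.
batched = segments @ W.T

# Identical math; the batched form is one large, well-utilized kernel,
# which is where the GPU speedup comes from.
assert np.allclose(sequential, batched)
```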


Does insanely-fast-whisper use beam size of 5 or 1? And what is the speed comparison when set to 5?

Ideally it also exposes that parameter to the user.

Speed comparisons seem moot when quality is sacrificed for me, I'm working with very poor audio quality so transcription quality matters.


It's beam size 1. From my quick tests on a Colab T4, CTranslate2 (faster-whisper's backend) is about 30% faster with like for like settings. I decoded the audio, got mel features, split into 30s segments, and ran it batched (beam size 1, batch size 24, no temperature fallback passes). Takes a bit more effort than a cli utility but isn't too hard.

Side note: the insanely-fast-whisper readme gives benchmarks on an A100, but only the FA2 lines were actually run on one. The rest were on a T4, judging by the notebooks/history. Turing doesn't support FA2, so the gap should be smaller with it, but based on the distil-whisper paper, CTranslate2 is probably still faster.

TensorRT-LLM might be faster but I haven't looked into it yet.


Hugging Face Whisper (the backend to insanely-fast-whisper) now supports PyTorch SDPA attention with PyTorch>=2.1.1

It's enabled by default with the latest Transformers version, so just make sure you have:

* torch>=2.1.1

* transformers>=4.36.0


Nice, thanks for your work on everything Whisper related. I tested it a couple weeks ago which largely matched the results in the insanely fast whisper notebook. Comparison was with BetterTransformers.

I just reran the notebook with 4.36.1 (minus the to_bettertransformer line) but it was slower (the batch size 24 section took 8 vs 5 min). Is there something I need to change? Going back to 4.35.2 gives the old numbers so the T4 instance seems fine.


Our comparisons were a little while ago so I apologize I can’t remember if we used BS 1 or 5 - whichever we picked, we were consistent across models.

Insanely fast whisper (god I hate the name) is really a CLI around Transformers’ whisper pipeline, so you can just use that and use any of the settings Transformers exposes, which includes beam size.

We also deal with very poor audio, which is one of the reasons we went with faster whisper. However, we have identified failure modes in faster whisper that are only present because of the conditioning on the previous segment, so everything is really a trade off.


Indeed, insanely-fast-whisper supports beam-search with a small code modification to this code snippet: https://huggingface.co/openai/whisper-large-v3

Just call the pipeline with:

result = pipe(sample, generate_kwargs={"num_beams": 5})
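For completeness, a self-contained version of that call (this mirrors the snippet on the model card; the audio path and device are placeholders, and `chunk_length_s`/`batch_size` are the same knobs insanely-fast-whisper sets under the hood):

```python
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",
    chunk_length_s=30,   # 30-second chunks, as in the thread above
    batch_size=24,       # decode chunks in parallel on the GPU
    device="cuda:0",     # placeholder device
)

# num_beams is forwarded to model.generate(), so beam search works through
# the same pipeline that insanely-fast-whisper wraps.
result = pipe("audio.wav", generate_kwargs={"num_beams": 5})
print(result["text"])
```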


yeah well, I find that super-duper-insanely-fast-whisper is 3-4x faster than insanely-fast-whisper.

/s


Yes I am not a fan of the naming either :)


As another commenter pointed out, you can give context for the decoder. So you can feed previous chunks into the model as the context. This is how we do it for streaming, at least.
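A minimal sketch of that streaming pattern (pure Python; `transcribe_fn` stands in for whatever Whisper call you use, and `initial_prompt` mirrors Whisper's parameter of the same name):

```python
def stream_transcribe(chunks, transcribe_fn, max_context_chars=224):
    """Transcribe chunks in order, conditioning each decode on the tail
    of the transcript produced so far."""
    context = ""
    pieces = []
    for chunk in chunks:
        text = transcribe_fn(chunk, initial_prompt=context or None)
        pieces.append(text)
        # Keep only the most recent text as context for the next chunk.
        context = (context + " " + text).strip()[-max_context_chars:]
    return " ".join(pieces)

# Fake "model" for demonstration: echoes the chunk it was given.
demo = stream_transcribe(["hello", "world"],
                         lambda c, initial_prompt=None: c)
print(demo)  # hello world
```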


I think the Ars writer here is very uninformed. FLoC is completely different from Topics, and if you actually read the Topics spec, it seems to be significantly better than third-party cookies? At least to me. Maybe I’m missing something.


Topics is a refinement of FLoC… no third-party cookies and no Topics would be significantly better, but Google is an ad tech company, and there are anti-competitive concerns from other ad tech providers.


Is this something that would cause a noticeable difference in quality, or more of an academic one?


There are two aspects of the "difference":

1) The timber that was once a locally (or nearby) grown tree might nowadays, while the same or a very similar species, originate from some other country (with a different climate, humidity, etc.) and thus be different at a structural level.

2) The timber was once seasoned over several months or even years (again, locally), while most nowadays is artificially seasoned very quickly (usually in a kiln or autoclave over a few days).

While you cannot do much about #1, except of course choosing the best batch you can find, #2 (which is actually what makes the larger difference in stability and also in strength over time) can sometimes be avoided by buying the wood before it is seasoned and seasoning it yourself naturally. Of course you need the space and time: for furniture or for making windows/doors, seasoning for a couple of years is not uncommon.

The difference between artificially seasoned wood and naturally seasoned is essentially that the latter is more "stable", i.e. it will tend to crack and/or bend/deform much less.


Awesome, thank you for the info!

Edit: not awesome that wood is lower quality of course, the awesome was for the explanation.

