
Would love to hear how you went about doing things at Framer!


Quite shady indeed: https://www.saashub.com/theres-an-ai-alternatives -- see the icon; it's almost the same as https://theresanaiforthat.com/


The name is a prefix of the most popular website in the niche: thereisanaiforthat.com. Maybe some shady SEO tactic?

There are tons of these. Many of them are quite fishy -- they often reach out to you offering to list your tool for a fee -- and there are only a few directories where it's worth doing so.

Btw, was this supposed to be a Show HN?


For reference, that's at least 40% more than what an H100 SXM would cost if you are willing to reserve for a month (so not apples to apples).

The H100 will also be much faster, especially if you are willing to use fp8. Maybe 3-4x.
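Rough back-of-the-envelope, with hypothetical hourly rates (the real numbers depend on the provider and the deal), showing how a higher hourly price can still lose on cost per token once throughput is factored in:

    # Hypothetical numbers for illustration only; plug in real quotes.
    quoted_price_per_hour = 3.0                          # assumed $/hr for the card in question
    h100_price_per_hour = quoted_price_per_hour / 1.4    # "at least 40% less" for a reserved H100 SXM
    h100_speedup = 3.5                                    # assumed 3-4x throughput with fp8

    # Cost per token scales with price / throughput (lower is better).
    quoted_cost = quoted_price_per_hour / 1.0
    h100_cost = h100_price_per_hour / h100_speedup
    print(f"relative cost per token: quoted={quoted_cost:.2f}, H100={h100_cost:.2f}")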


I don't think these analogies work.

Meta provides open source code to modify the weights (fine-tune the model). In this context, fine-tuning the model maps better to being able to modify the code of the game.


So do video game developers (they provide source code to modify their games), so the analogy absolutely works. I can list a huge amount of actually open source software where I can see the source code and data, which is very different from Llama etc.


In the LLM world there are many open source solutions for fine-tuning, maybe the best one being from Meta: https://github.com/pytorch/torchtune

In terms of inference and interface (since you mentioned Comfy), there are many truly open source options such as vLLM (though there isn't a single really performant open source solution for inference yet).
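For context, a minimal vLLM usage sketch (offline batch generation; the model name and sampling settings are just placeholders):

    from vllm import LLM, SamplingParams

    # Any HF-compatible checkpoint works; this one is just an example.
    llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
    params = SamplingParams(temperature=0.7, max_tokens=256)

    outputs = llm.generate(["Explain what fine-tuning does to model weights."], params)
    print(outputs[0].outputs[0].text)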


Thanks! Good to know.


It's much better than sharing a binary artifact of regular software, since the weights can be and are easily and frequently modified by fine-tuning the model. This means you can modify the "binary artifact" to your needs, similar to how you might change the code of open source software to add features etc.
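To make the point concrete, a minimal LoRA fine-tuning sketch using Hugging Face transformers + peft (one of several fine-tuning toolkits; the model name and hyperparameters are only examples):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    # Example model; any causal LM checkpoint works the same way.
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

    # Attach small trainable adapters; the base weights stay frozen,
    # so the "binary artifact" is modified cheaply and reversibly.
    lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # typically well under 1% of total parameters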


They specifically call out fp8-aware training, and TensorRT-LLM is really good (efficient) with fp8 inference on H100 and other Hopper cards. It's possible that they run the 7B natively in fp16, as smaller models suffer more from even "modest" quantization like this.
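A small sketch of what per-tensor fp8 (E4M3) casting does to a weight matrix, to get a feel for the round-trip error quantization introduces (requires a recent PyTorch with float8 dtypes; purely illustrative, not how any particular model was quantized):

    import torch

    torch.manual_seed(0)
    w = torch.randn(4096, 4096) * 0.02           # stand-in for a weight matrix

    # Per-tensor scaling into the E4M3 representable range (max ~448),
    # then cast to fp8 and back to measure the round-trip error.
    scale = w.abs().max() / 448.0
    w_fp8 = (w / scale).to(torch.float8_e4m3fn)
    w_deq = w_fp8.to(torch.float32) * scale

    rel_err = (w - w_deq).norm() / w.norm()
    print(f"relative quantization error: {rel_err:.4f}")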


People arguing that this is "just an extension" are ignoring the fact that extensions have special privileges compared to websites, and you would not want all websites to have the full power of an arbitrary extension.

If it's "just an extension", make it available to all domains.


The primary "special privilege" is that the extension is shipped with the browser and hidden. The API itself is available to any extension developer.

https://developer.chrome.com/docs/extensions/reference/api/s...


It has fp8 support. Not sure whether fp8 on MI300X is supported by vLLM yet.

Also, many of these comparisons use vLLM for both setups, but for Nvidia you can and should use TensorRT-LLM, which tends to do quite a bit better than vLLM at high loads.


Elio, the person who did the testing, confirmed to me that he has fp8 working.

