WebGPU GPT Model Demo (kmeans.org)
154 points by b_mc2 on April 21, 2023 | 29 comments



It indeed works and loads quickly. I'm currently more interested in the Vicuna 7B example from https://mlc.ai/web-llm/

Also instead of just "Update Chrome to v113" the domain owner could sign up for an origin trial https://developer.chrome.com/origintrials/#/view_trial/11821...
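
If I remember the mechanism right: after registering you get a per-origin token, which you serve either via an Origin-Trial response header or a meta tag (YOUR_TOKEN_HERE below is just a placeholder for the token Chrome issues you):

    <meta http-equiv="origin-trial" content="YOUR_TOKEN_HERE">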


Thanks for the tip!


Np! Although there may still be some special WebGPU features in v113 that make v94-112 less ideal.


I actually just tried this on my own domain, and although it enabled WebGPU, it hit another error:

"Find an error initializing the WebGPU device Error: Cannot initialize runtime because of requested maxBufferSize exceeds limit. requested=1024MB, limit=256MB. This error may be caused by an older version of the browser (e.g. Chrome 112). You can try to upgrade your browser to Chrome 113 or later."


My 250M parameter model runs at 50ms/token ;)

Releasing April 26th when Chrome 113 hits stable. Open source NPM library you can add to any project.

Preview here: https://twitter.com/fleetwood___/status/1646608499126816799?...


That's pretty impressive. What model are you using?


FLAN-T5 Base currently, 780M parameter variant coming shortly!


What kind of model? Question-answering? I imagine it must be quite specialised at <1B params when many are 7B, 13B, or more?


   > WebGPU is supported in your browser!

   > Uncaught (in promise) DOMException: WebGPU is not yet available in Release or Beta builds.
Anyone using Chromium care to chime in?

If no one chimes in I might set a Chromium browser up just to take a look at this; seems pretty cool.


I'm on the latest Brave (Chromium 112.0.5615.165) and it tells me it is not supported.


It needs to be Chromium 113+; otherwise you'll have to enable WebGPU in the browser flags, like I have for Firefox (IIRC it's dom.webgpu.enabled in about:config).


Try Chrome Canary.


Chrome Beta works as well.


Thorium 111 is working with the WebGPU flags enabled.


Question: I can see in the code the WGSL needed to implement inference on the GPU. https://github.com/0hq/WebGPT/blob/main/kernels.js

Could this code also be used to train models or only for inference?

What I'm getting at is: could I take the WGSL and, using Rust's wgpu, create a mini ChatGPT that runs on all GPUs?


This repo only does inference but it should be possible to write training code that runs on WebGPU.


No, that's not how that works; sadly, you cannot.


Why not? You could absolutely write the training code in WGSL.
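
To make it concrete, here's a toy kernel in the same style as the repo's kernels.js (names made up, not from WebGPT) doing an elementwise SGD weight update. The shader part is the easy bit; the real work is writing backward-pass kernels for every op and a training loop around them:

    const sgdUpdateShader = `
      @group(0) @binding(0) var<storage, read_write> weights: array<f32>;
      @group(0) @binding(1) var<storage, read> grads: array<f32>;

      @compute @workgroup_size(256)
      fn main(@builtin(global_invocation_id) id: vec3<u32>) {
        let i = id.x;
        if (i < arrayLength(&weights)) {
          // Plain SGD step: w -= lr * dw (learning rate hard-coded for the sketch).
          weights[i] = weights[i] - 0.001 * grads[i];
        }
      }
    `;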


> At the moment, WebGPT averages ~300ms per token on GPT-2 124M running on a 2020 M1 Mac with Chrome Canary.

How does ChatGPT on GPT-3.5 / GPT-4 compare?


I can do 60 tokens in 24s (~400ms/token, or 2.5 tokens/sec) on my 2020 ASUS G14, Thorium 111 + Windows 10. nvidia-smi says my RTX 2060 is 24% loaded and has ~1.1GB eaten up.

That's slower than Vicuna 7B (aka LLaMA 7000M, a GPT-3-ish? model) on Linux on the same machine, where I get about 3.5 tokens/sec and 97% usage. So... yeah, performance is not so great yet.


I don't have exact numbers, but having tried all three: on a local 4090 this seems mildly slower than GPT-4 and ridiculously slower than GPT-3.5 (both via cloud GPUs, of course). That said, all are within the realm of usability in terms of speed, though GPT-3.5 is in a whole different class: it can be used nearly interactively, without delay.


Interesting; I'm getting 100ms/token on plain old WASM with 4 threads via ggml, using a 1.7B quantized Cerebras model.


It's really a shame that there is no 8-bit float support in the WebGPU spec. Even though few cards support it natively, it'd still massively benefit ML workloads.

Another annoying constraint, specific to wgpu (Rust's implementation of WebGPU), is that it doesn't support f16 yet (which IS in the spec) except through SPIR-V passthrough...
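
At least in the browser you can feature-detect it: "shader-f16" is the standard optional feature name, and the shader itself then needs `enable f16;` at the top. A minimal sketch:

    const adapter = await navigator.gpu.requestAdapter();
    // f16 in WGSL is an optional feature; request it only if the adapter has it.
    const hasF16 = adapter.features.has("shader-f16");
    const device = await adapter.requestDevice({
      requiredFeatures: hasF16 ? ["shader-f16"] : [],
    });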


Any way to run this kind of thing outside the browser? Chrome hasn't enabled WebGPU on Linux yet.


I've got it running on Chrome v113 beta on Ubuntu with an older AMD RX 580. The feature flags don't seem to take effect for me in the Chrome GUI, but if you start Chrome from the terminal like this, it works:

    google-chrome --enable-unsafe-webgpu --enable-features=Vulkan,UseSkiaRenderer --enable-dawn-features=disable_robustness

GPU doesn't work in --headless though.



Is there any plan to support larger models than GPT-2?


Omg, no PyTorch/WSL/Conda hiccups... This could save me some sleepless nights.


I didn't understand why I need WebGPU to use WebGPT...



