BryceSchroeder's comments

You might want to look at the sampler you're using. If the output keeps repeating itself, try adding a repetition penalty.
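For anyone unfamiliar with the idea, here is a minimal sketch of one common form of repetition penalty, applied to raw logits before sampling. It is not taken from any particular sampler; the function name, the plain-list representation, and the default penalty of 1.2 are all assumptions made for illustration.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of `logits` with already-generated tokens made less likely.

    logits: list of floats, one score per vocabulary entry (hypothetical layout).
    generated_ids: token ids emitted so far.
    penalty: values above 1.0 discourage repeats; 1.0 leaves logits unchanged.
    """
    penalized = list(logits)
    for token_id in set(generated_ids):
        score = penalized[token_id]
        # Dividing a positive logit, or multiplying a negative one,
        # both push the score down, so the token becomes less probable.
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized


logits = [2.0, -1.0, 0.5, 3.0]
history = [0, 1, 0]  # tokens 0 and 1 have already been generated
print(apply_repetition_penalty(logits, history, penalty=2.0))
# -> [1.0, -2.0, 0.5, 3.0]
```

Tokens 2 and 3 have not appeared yet, so their scores are untouched. In practice the penalty is usually kept close to 1.0, since large values can make the model avoid words it legitimately needs to reuse.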


Would also like to know.


I am running fp16 LLaMA 30B (via vanilla-llama) on six AMD MI25s. The machine has 384 GB of RAM, but the model fits entirely in VRAM: it takes up about 87 GB of the 96 GB available across the six cards. Performance is about 1.6 words per second in an IRC chat log continuation task, and the machine draws about 400 W of additional power while "thinking."
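A rough back-of-the-envelope check of those numbers, as a sketch. The 32.5 billion parameter count is an assumption (the published size of the "30B" model, not something measured here), and GB is treated loosely as 10^9 bytes, so the results are only approximate.

```python
GPUS = 6
VRAM_PER_GPU_GB = 16     # 6 x 16 GB gives the 96 GB total quoted above
PARAMS_BILLION = 32.5    # assumption: nominal size of the "30B" model
BYTES_PER_PARAM = 2      # fp16 stores each weight in 2 bytes

OBSERVED_VRAM_GB = 87    # from the comment above
WORDS_PER_SECOND = 1.6   # from the comment above
EXTRA_POWER_W = 400      # from the comment above

total_vram_gb = GPUS * VRAM_PER_GPU_GB
weights_gb = PARAMS_BILLION * BYTES_PER_PARAM   # billions of bytes, i.e. GB
overhead_gb = OBSERVED_VRAM_GB - weights_gb     # activations, caches, runtime
headroom_gb = total_vram_gb - OBSERVED_VRAM_GB  # VRAM left unused
joules_per_word = EXTRA_POWER_W / WORDS_PER_SECOND

print(f"total VRAM:       {total_vram_gb} GB")
print(f"fp16 weights:     {weights_gb:.0f} GB")
print(f"other VRAM usage: {overhead_gb:.0f} GB")
print(f"free VRAM:        {headroom_gb} GB")
print(f"energy per word:  {joules_per_word:.0f} J")
```

Under these assumptions the weights alone account for roughly three quarters of the observed usage, with the remainder going to everything else the runtime keeps on the cards.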


I wanted to use my inexpensive Chinese fiber laser engraver without the buggy, Windows-only EzCAD2 software, so I reverse-engineered the protocol and wrote some simple tools for interfacing with it. GPL3.

