
Probably in the hundreds of dollars for a 7B model, and maybe a thousand or two for the 13B at worst



Far, far less. Alpaca-7B's compute cost was around $60-$70 for Stanford, and around $0.60 (yes, 60 cents) for equivalent fine-tunes using the Parameter-Efficient Fine-Tuning (PEFT) strategy of Low-Rank Adapters (LoRA).

The repo above can be replicated for similar costs. Easily less than $10 for models up to 30B using LoRA (which requires only 24GB of VRAM for 30B/33B and smaller).
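A rough sketch of why LoRA is so cheap: instead of updating a full d x k weight matrix, it trains two small low-rank factors, so the trainable parameter count collapses. The dimensions below (4096-wide projection, rank 8) are hypothetical, typical of a 7B-class attention layer, not figures from the repo above.

```python
# Illustrative sketch: parameter savings from LoRA (Low-Rank Adapters).
# Instead of updating a full d x k weight matrix W, LoRA freezes W and
# trains two small factors B (d x r) and A (r x k), using W + B @ A.
# Dimensions are hypothetical, typical of a 7B-class attention layer.

def lora_savings(d: int, k: int, r: int) -> tuple[int, int, float]:
    """Return (full params, LoRA params, trainable fraction) for one matrix."""
    full = d * k            # parameters in the frozen base matrix
    lora = r * (d + k)      # parameters in the trainable adapters
    return full, lora, lora / full

full, lora, frac = lora_savings(d=4096, k=4096, r=8)
print(full, lora, f"{frac:.2%}")   # 16777216 65536 0.39%
```

Since optimizer state is only kept for the adapter parameters, memory and compute shrink roughly in proportion, which is where the sub-$10 fine-tuning figures come from.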


I thought so too, but newcomers should expect to train the model a dozen times or so :-)


I am interested in this. What would the cost be for the best model the public could train?



