What kind of workstation would you build/buy for local GPT development with a budget of $3000? Is remote dev a viable alternative to local workstations?
A local workstation is much cheaper in the long run.
Even ignoring that, most of development is running experiments. You're gonna be hesitant to run lots of experiments if each one costs money, whereas when you've paid upfront for the hardware, you have an incentive to fully utilize it with lots of experiments.
I'd go with an RTX 4090 and deal with the 24GB memory limit through software tricks like gradient checkpointing and quantization. It's an underrated card that's as performant as cards that are an order of magnitude pricier. It's a great way to get started with that budget.
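For concreteness, here's a minimal training-side sketch of those tricks (gradient checkpointing, bf16 autocast, gradient accumulation) using Hugging Face Transformers. The model choice and the dummy batch are just illustrative placeholders:

    # Sketch: stretching 24GB for training via gradient checkpointing,
    # bf16 autocast, and gradient accumulation. Model and data are placeholders.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2-large").cuda()
    model.gradient_checkpointing_enable()  # recompute activations: less memory, more compute
    model.config.use_cache = False         # KV cache is useless during training
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    # Dummy data so the sketch runs end to end; replace with a real dataloader.
    ids = torch.randint(0, model.config.vocab_size, (1, 512), device="cuda")
    train_batches = [{"input_ids": ids, "labels": ids}] * 16

    accum_steps = 8  # simulate a batch 8x larger without the memory cost
    for step, batch in enumerate(train_batches):
        with torch.autocast("cuda", dtype=torch.bfloat16):
            loss = model(**batch).loss / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()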
I agree with you, but right now RTX 4090 cards are pushing $2000, which doesn't leave much budget left. I'd suggest picking up a used 3090 from eBay; they're currently around $800 and still give you the same 24GB of VRAM as the 4090.
I've seen some blog posts saying that if you buy a used 3090 that was used for Bitcoin mining, there's a risk of thermal throttling: the thermal paste on the VRAM isn't great to begin with, and it's worse if the card was run hot for a long time.
Any recommendations on how to buy one? E.g., a 24GB model, any particular model for running LLMs? What's the biggest, baddest LLM you can run on a single card?
I've been thinking about it but have stuck with cloud/Colab for my experiments so far.
I remember videos (likely on YouTube) of thermal paste replacement that was actually an upgrade over the stock card, so the average person should be able to do it. It'll cost a few dollars for the paste. I would go with a local workstation; then you don't have to think much about cost while running Stable Diffusion. Plus, if it's bought used from eBay, prices can't go much lower, so you'll get something back when you sell it at the end. Also, for image work, training datasets can be quite big for network transfers.
Strong endorse here. I pick up used RTX 3090s from Facebook Marketplace and eBay at $800 maximum. You can usually find them locally for $700-750, and buying locally means you can typically test them first, which is reassuring (though I've had no issues yet).
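If you want to sanity-check a used card before (or right after) buying, a minimal PyTorch sketch like this exercises the GPU under sustained load; watch clocks and temperatures with nvidia-smi in another terminal. The matrix size and iteration count are arbitrary choices:

    # Smoke test for a used GPU: verify usable VRAM and hold a sustained load.
    # Watch `nvidia-smi` alongside; a card with dried-out thermal paste will
    # throttle (clocks drop, temps pin near 90C+). A rough sketch, not a burn-in tool.
    import torch

    assert torch.cuda.is_available(), "CUDA not visible -- check drivers"
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB")

    a = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")
    b = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")
    for _ in range(2000):  # a sustained run of big FP16 matmuls
        c = a @ b
    torch.cuda.synchronize()
    print("done -- if clocks stayed stable, thermals are probably OK")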
Depending on what you're doing, 2x used 3090s cost about the same as one 4090 and offer you more total VRAM. That's what I'm planning on doing, in any case - being able to run 70B LLMs entirely on the GPUs is more useful to me than being able to run 34B faster.
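As a sketch of how that two-card setup gets used in practice: Transformers/Accelerate can shard a model across both GPUs via device_map. The model ID and per-card memory caps below are illustrative assumptions, and a 70B model still needs 4-bit quantization to fit in 2x24GB:

    # Sketch: shard a quantized 70B model across two 3090s with device_map.
    # Model ID is an example (gated on the Hub); memory caps are assumptions.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "meta-llama/Llama-2-70b-hf"  # illustrative; any 70B causal LM
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # ~35GB of weights
        device_map="auto",                    # split layers across both cards
        max_memory={0: "22GiB", 1: "22GiB"},  # leave headroom for the KV cache
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)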
Yeah, multiple 3090s are the best budget way to go for sure. Also look at older server boards with tons of PCIe lanes, if you can swing rack-mounted hardware and have some technical skills.
Not OP, but I asked myself that same question two years ago. Then I looked at the energy prices in Germany and knew I had no chance against cloud GPUs. Maybe you live in a country with lower energy prices, like Bermuda (or any other country on earth), in which case this may not be as important to you. A side benefit of going cloud is that you can pick and choose the right GPU for whatever project you're working on, and you're really only paying while you're running them. Also, no hardware or CUDA driver issues to divert your attention.
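For anyone wanting to run the numbers themselves, here's a back-of-the-envelope break-even calculation. Every figure in it is a rough assumption (used-3090 price, ~350W draw, a German household electricity rate, a vast.ai-style rental rate); plug in your own:

    # Break-even: buying a used 3090 vs renting one. All numbers are rough
    # assumptions -- substitute your local prices.
    card_price = 800.0   # used 3090, USD (assumption)
    power_kw   = 0.35    # ~350W under load
    kwh_price  = 0.45    # German household rate, USD/kWh (assumption)
    cloud_rate = 0.30    # rented 3090, USD/hour (assumption)

    local_per_hour = power_kw * kwh_price            # electricity only, ~$0.16/h
    saving_per_hour = cloud_rate - local_per_hour    # what each local hour saves
    print(f"break-even: {card_price / saving_per_hour:.0f} GPU-hours")  # ~5600

At those assumed rates, buying only pays off after thousands of GPU-hours, which matches my experience at German prices; at cheaper electricity the math flips much sooner.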
I'd go with a remote dev solution. Training/finetuning large models requires far more resources than a local machine offers anyway, so the GPUs in a local box would sit unused most of the time.
I got a 13900K + 4090 workstation for ~$3500. But I hear what people are doing is getting 2x (or more) 3090s instead, because they're cheap used, and more VRAM and VRAM bandwidth are the important thing at the moment, even if split between cards.
I'm happy with my 4090, though. Dealing with splitting work between GPUs sounds like a chore, and I also like the 4090's gaming abilities.
I would do remote dev using vast.ai and other cheap cloud compute first, to make sure you actually want to do this and will get use out of it, then build your own. 3090s are typically the most budget-friendly, and if you have any IT chops (and tolerance for noise), then rack-mounted server hardware, PSUs, and riser cables tend to be the most efficient, with tons of PCIe lanes (a hidden issue people hit with consumer-grade gaming PCs as they scale up).
I built a custom PC in May with a 4090 for playing around.
It's good and fast, but for production workloads or serious training you want more RAM.
It was less than ~$3k in components (~$2k after reclaiming the VAT and expensing/depreciating it through a limited company).
If you also want a Mac, consider an M3 MacBook with maxed-out RAM.
A regular gaming PC will do, about a grand, then slot in a 3090 or something off eBay. Congratulations: you're a grand under budget and already have a viable development machine.