Assuming you are asking this because of the deep learning/ChatGPT hype, the first question to ask yourself is: do you really need to? The skills needed for CUDA are completely unrelated to building machine learning models. It's like learning to write a TLS library so you can get a full-stack web development job: the skills are completely orthogonal. CUDA belongs to the domain of game developers, graphics programmers, high-performance computing, and computer (hardware) engineers. From the point of view of machine learning development and research, it's nothing more than an implementation detail.
Make sure you are very clear on what you want. Most HR departments cast a wide net (the same way every junior role requires "3-5 years of experience" when in reality they don't really care). When hiring, most companies pray for the unicorn developer who understands the entire stack from the GPU to the end-user product domain, even though the day-to-day work is mostly in Python.
I'm going with Pangolin on a small hosted VPS on Hetzner to front my homelab. It takes away much of the complication of serving securely straight from the home LAN.
My laptop from 2011 idles at 8W with two SATA SSDs. I have an Intel 10th-gen mini PC that idles at 5W with one SSD. 3W is not groundbreaking, but for a computer that cost you $0, it would take many years of power savings to offset the $180 paid for a mini PC.
That is the key. The RPi works fine at idle, but anything else gets throttled pretty badly. I used to self-host on the RPi, but it was just not enough[1]. Laptops/mini PCs have a much better burst-to-idle power ratio (35W/8W on the laptop vs 6W/3W on the Pi).
> That is the key. The RPi works fine at idle, but anything else gets throttled pretty badly.
I don't have a dog in this race, but as I recall, the RPi's throttling under high load was specifically thermal throttling. Meaning, people picked up a bare board with no heatsink or fan and blasted benchmarks at it until it overheated.
You can't make sweeping statements about the RPi's throttling while leaving out the root cause.
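For anyone who wants to check the root cause rather than guess: the Pi firmware reports exactly why it throttled. A sketch of decoding the flags; the `0xC000C` value here is a made-up example, and on a real Pi you'd read the live value with `vcgencmd get_throttled`:

```shell
# On a real Pi: val=$(vcgencmd get_throttled | cut -d= -f2)
val=0xC000C   # hypothetical example; 0x0 means no throttling at all

# Bit meanings (from the Raspberry Pi firmware docs):
#   bit 0: under-voltage   bit 2: currently throttled   bit 3: soft temp limit
#   bits 16-19: the same conditions have occurred at some point since boot
[ $(( val & 0x1 ))     -ne 0 ] && echo "under-voltage detected (power supply problem)"
[ $(( val & 0x4 ))     -ne 0 ] && echo "currently throttled"
[ $(( val & 0x8 ))     -ne 0 ] && echo "soft temperature limit active (thermal)"
[ $(( val & 0x80000 )) -ne 0 ] && echo "temperature limit has been hit since boot"
```

If the thermal bits are clear but the under-voltage bit is set, the throttling was a power-supply problem rather than heat.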
amd64 processors have lots of hardware acceleration built in (notably AES-NI; the Pi 4's Cortex-A72 lacks the ARMv8 crypto extensions, so AES runs in software). I couldn't get past 20MB/s over SSH on the Pi 4, vs 80MB/s on my i3. So while they can show similar Geekbench results, actually using the Pi is a bit more frustrating than it looks on paper.
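You can see the cipher gap directly without involving SSH at all; a rough sketch (requires `openssl` installed, and the absolute numbers vary wildly by CPU and build):

```shell
# SSH bulk transfer is usually bottlenecked by the symmetric cipher.
# Compare AES (fast with hardware support) against ChaCha20 (fast in software):
openssl speed -evp aes-128-ctr -seconds 1
openssl speed -evp chacha20-poly1305 -seconds 1
# On x86 with AES-NI, AES wins big. On a Pi 4 with no crypto extensions,
# ChaCha20 is typically the faster pick, e.g. ssh -c chacha20-poly1305@openssh.com
```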
The RPi is amazing for IoT tasks because it's pretty portable, but not for running general-purpose server tasks; you'd get better performance per watt with used gear.
Go also statically links all dependencies and reinvents the wheels usually provided by the system's userland. Cross-compilation is trivial. It is unrivaled when it comes to deployment simplicity.
Rust can cross-compile, yes, but it is not as seamless. For example, Rust cannot cross-compile Windows binaries from Linux without an external toolchain such as MinGW (the `x86_64-pc-windows-gnu` target needs its linker).
Go can cross-compile from Linux to Windows, Darwin, and FreeBSD without requiring any external tooling.
Both languages have enormous cargo-culting issues when you try to do anything that isn't fizzbuzz. The bigger difference that I'd expect people to identify is that Rust generates freestanding binaries where Go software requires a carefully-set runtime. There are pros and cons to each approach.