
He's got 8x 3090s, are you fucking kidding? Like, is this some kind of AI reply?

"Wow great post! I enjoy your valuable contributions. Can you tell me more about graphics cards and how they compare to other different types of computers? I am interested and eager to learn! :)"




It's one thing to be an asshole, but you're also hilariously clueless.


Yeah, because an M2 is in the same ballpark as 8 GPUs. Yes, you can run on CPU now, but it's not even close to this setup. This is Hacker News; I know we're supposed to be nice and this isn't Reddit, but comments like the parent's are ridiculous and for sure don't add to the discussion any more than mine do.


I simply didn’t know the answer and the response “it would be much slower” is a perfectly acceptable reply. I disagree that it was ridiculous to ask. I was curious and I wanted to know the answer and now I do. What is obvious to you is not obvious to other people, and you will get nowhere in life insulting people who are asking questions in good faith.


What? No, I just don’t know the difference, sorry. I am interested in learning more about running 405B-parameter models, which I believe you can do on a 192 GB M-series Mac.

The answer here is that the Nvidia system has much better performance. I’ve been focused on “can I even run the model?” and didn’t think about the actual performance of the system.
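For what it’s worth, the “can I even run it” part really is just napkin math; here’s a rough sketch of my own (the 1.2x overhead factor for KV cache and runtime buffers is an assumption, not a measurement):

    # Rough memory footprint of a dense model at a given quantization level.
    def model_gib(params_billion, bits_per_weight, overhead=1.2):
        # overhead: assumed fudge factor for KV cache and runtime buffers
        return params_billion * 1e9 * bits_per_weight / 8 / 2**30 * overhead

    for bits in (16, 8, 4):
        print(f"405B @ {bits}-bit: ~{model_gib(405, bits):.0f} GiB")

    # 405B @ 16-bit: ~905 GiB -> out of reach for any single box discussed here
    # 405B @ 8-bit:  ~453 GiB -> still too big for 192 GB or 8x24 GB
    # 405B @ 4-bit:  ~226 GiB -> only ~3-bit or lower squeezes under 192 GB

So even the “does it fit” question is marginal at 405B; the tokens-per-second question is a separate one I hadn’t gotten to.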


It's kinda hard to believe that someone would stumble onto the landmine of AI performance comparison between Apple Silicon and Nvidia hardware. People are going to be rude because this kinda behavior is genuinely indistinguishable from bad-faith trolling. From benchmarks alone, you can easily tell that the performance-per-watt of any Mac Studio gets annihilated by a 4090: https://browser.geekbench.com/opencl-benchmarks

If Apple Silicon were in any way a more scalable, better-supported, or more ubiquitous solution, then OpenAI and the rest of the research community would use Apple's hardware instead of Nvidia's. Given Apple's very public denouncement of OpenCL and the consequences of its refusal to sign Nvidia drivers, Apple's falling behind in AI is like the #1 topic in the tech sector right now. Apple Silicon for AI training is a waste of time and a headache beyond the capacity of professional, productive teams. Apple Silicon for AI inference is too slow to compete against the datacenter incumbents fielded by Nvidia and even AMD. Until Apple changes things and takes the datacenter market seriously (and doesn't just advertise that it does), this status quo will remain. Datacenters don't want to pay the Apple premium just to be treated like a traitorous sideshow.


> It's kinda hard to believe that someone would stumble onto the landmine of AI performance comparison between Apple Silicon and Nvidia hardware.

I encourage you to update your beliefs about other people. I’m a very technical person, but I work in robotics closer to the hardware level: I design motor controllers and Linux motherboards and write firmware and platform-level robotics stacks, but I’ve never done any work that required running inference in a professional capacity. I’ve played with machine learning, even collecting and hand-labeling my own dataset and training a semantic segmentation network. But I’ve only ever had my little desktop with one Nvidia card to run it all. Back in the day, the performance of CNNs was very important and I might have looked at benchmarks, but since the dawn of LLMs, my ability to run networks has been limited entirely by RAM constraints, not other factors like tokens per second. So when I heard that MacBooks have shared memory and can run large models with it, I started to think that could be a (relatively) accessible way to run larger models. I can’t even remotely afford a $6k Mac any more than I could afford a $12k Nvidia cluster machine, so I never really got to the practical considerations of whether there would be any serious performance concerns. It has been idle thinking like “hmm, I wonder how well that would work”.

So I asked the question. I said roughly “hey can someone explain why OP didn’t go with this cheaper solution”. The very simple answer is that it would be much slower and the performance per dollar would be 10x worse. Great! Question answered. All this rude incredulousness coming from people who cannot fathom that another person might not know the answer is really odd to me. I simply never even thought to check benchmarks because it was never a real consideration for me to buy a system.
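For anyone else doing the same napkin math, the perf-per-dollar comparison boils down to the sketch below; every number in it is a placeholder I picked to illustrate the ~10x figure quoted to me, not a benchmark I’m citing:

    # Crude perf-per-dollar metric; ignores power, depreciation, and resale value.
    def tokens_per_dollar(tokens_per_sec, system_cost_usd):
        return tokens_per_sec / system_cost_usd

    # Placeholder inputs chosen to match the ~10x claim above, not measurements:
    rig_8x3090 = tokens_per_dollar(tokens_per_sec=40.0, system_cost_usd=12_000)
    mac_192gb = tokens_per_dollar(tokens_per_sec=2.0, system_cost_usd=6_000)
    print(f"ratio: {rig_8x3090 / mac_192gb:.0f}x")  # 10x with these made-up inputs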

Also, about the “#1 topic in the tech sector right now”: funny, in my circles people are talking about unions, AI compute exacerbating climate change, and AI being used to disenfranchise the tech working class and make it more precarious. We all live in bubbles.


It's simply bizarre that you would ask that question when the research to figure it all out is trivially accessible. Everyone thought that "unified memory" would be a boon when it was advertised, but Apple never delivered on a CUDA alternative. They killed OpenCL in the cradle and pushed developers to use Metal Compute Shaders instead of a proper GPGPU layer. If you are an Apple dev, the mere existence of CoreML ought to be the white flag that makes you realize Apple hardware was never made for GPU compute.

Again, I'm not accusing you of bad faith. I'm just saying that asking such a bald-faced and easily Googled question is indistinguishable from flamebait. There is so much signalling that should suggest to you that Apple hardware is far from optimized for AI workloads. You can look at it from the software angle, where Apple has no accessible GPGPU primitives. You can look at it from a hardware perspective, where Apple cannot beat the performance-per-watt of desktop or datacenter Nvidia hardware. You can look at it from a practical perspective, where literally nobody is using Apple Silicon for cost-effective inference or training. Every single scrap of salient evidence suggests that Apple just doesn't care about AI and the industry cannot be bothered to do Apple's dirty work for them. Hell, even a passing familiarity with the existence of Xserve should tell you everything you need to know about Apple competing in markets they can't manipulate.

> Funny, in my circles people are talking about unions, AI compute exacerbating climate change, and AI being used to disenfranchise the tech working class and make it more precarious.

Sounds like your circles aren't focused on technology but on popular culture and Twitter topics. Unionization, the "cost" of cloud, and fictional AI-dominated futures were barely cutting-edge in the '90s, let alone today.


> From benchmarks alone, you can easily tell that the performance-per-watt of any Mac Studio gets annihilated by a 4090: https://browser.geekbench.com/opencl-benchmarks

The Geekbench GPU compute benchmarks are nearly worthless in any context, and most certainly are useless for evaluating suitability for running LLMs, or anything involving multiple GPUs.
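What actually dominates single-stream token generation is memory bandwidth, because every generated token has to stream roughly all of the active weights once. A back-of-envelope sketch of my own (spec-sheet bandwidth figures; the 0.6 efficiency factor is a guess, and ideal 8-way scaling is optimistic once interconnect overhead is counted):

    # Bandwidth-bound estimate of single-stream decode speed:
    # each new token reads roughly all active weights from memory once.
    def decode_tps(weights_gib, bandwidth_gbs, efficiency=0.6):
        # efficiency: assumed fraction of peak bandwidth actually achieved
        return bandwidth_gbs * efficiency / (weights_gib * 1.0737)  # GiB -> GB

    weights = 40  # ~40 GiB, e.g. a 70B model quantized to 4-bit
    print(f"M2 Ultra (~800 GB/s):    ~{decode_tps(weights, 800):.0f} tok/s")
    print(f"1x RTX 3090 (~936 GB/s): ~{decode_tps(weights, 936):.0f} tok/s")
    print(f"8x 3090 tensor-parallel (ideal): ~{decode_tps(weights, 8 * 936):.0f} tok/s")

None of that shows up in an OpenCL compute score, which is exactly why those charts tell you nothing about multi-GPU LLM serving.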


You might enjoy The Register's article from last fortnight:

Buying a PC for local AI? These are the specs that matter

https://www.theregister.com/2024/08/25/ai_pc_buying_guide/

It died without interest here: https://news.ycombinator.com/item?id=41347785 (likely the time of day; I’m mostly active on HN during off-peak hours).


Nice, thank you!


You're interested in the difference between a single CPU and 8 GPUs? A Ford Fiesta vs. a freight train.


One can be interested in the differences between a Ford Fiesta and a freight train…


Are you fucking kidding me? A single train car can weigh 130 tons, a Fiesta can carry maybe 500 kg, it's not even close. /s


How is that even sarcasm?


A single SoC, which includes a GPU (two GPUs, kinda).


Yeah. I can’t afford a freight train.


Keep an eye on the SV going-out-of-business fire sales. Not all the AI kites will fly.

As for actual trains, they can be surprisingly affordable (to live in):

https://atrservices.com.au/product/sa-red-hen-416/ https://en.wikipedia.org/wiki/South_Australian_Railways_Redh...

and the freight rolling stock flatcars make great bridges (single or sectioned) with concrete pylons at either end for farms; once the axles are shot they can go pretty damn cheap, and the beds are good enough to roll a car or small truck over.

Addendum, in case you missed a fresh reply to an old comment: https://news.ycombinator.com/item?id=41484529


While this reply might be a bit too harsh, I fully agree with the well-warranted criticism of Apple fans chiming in on every AI/Nvidia discussion with “but M chips have large amounts of RAM and Apple says they’re amazing for AI”.


Kind of a harsh approach, but I agree on the bottom line. I don't see why somebody should be enthusiastic about this. Someone was just able to spend 8× what a random teen can spend on his gaming rig, so he iterated the teen's rig 8 times, installed Ubuntu+CUDA, and called it a day.

Something that is actually interesting and trying to bring something to the table: check out tinygrad/tinycorp.



