This is meant to run on GPUs with 16GB of RAM. Most M1/M2 users have at least 32GB (unified memory), and you can configure an MBP or Mac Studio with up to 96/128GB.
The Mac Pro is still Intel, but it can be configured with up to 1.5TB of RAM; you can imagine the M* replacement will have equally gigantic options when it comes out.
If you look closely, there's 16GB of GPU memory and over 200GB of CPU memory. So none of the currently available M* machines has the same kind of capacity. Let's hope this changes in the future!
Apple silicon has unified memory: the GPU has access to the entire 32/64/96/128GB of RAM. It's part of the appeal.
I would really like to see how stuff performs on a Mac Studio with 128GB of memory and an 8TB SSD (at 6GB/s), not to mention the extra 32 "neural engine" cores. It seems the performance of these machines has barely been explored so far.
I think the main bottleneck here is data movement. If you are streaming weight data from a 6GB/s SSD you'll get under 10% of the performance shown for the 3090 (which will be moving data at PCIe 4 speeds of 64GB/s).
Once in unified memory, the weights are accessible at about half the rate they are on the 3090 (400GB/s on the M2 Max vs 936GB/s on the 3090).
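To put rough numbers on this, here's a back-of-the-envelope sketch: if inference is memory-bandwidth bound and every token requires touching all the weights, the time per pass over the weights scales inversely with bandwidth. The 30GB model size below is a hypothetical figure just for illustration; the bandwidths are the ones cited above.

```python
# Back-of-the-envelope: seconds to stream a full set of model weights once,
# assuming inference is memory-bandwidth bound.
model_size_gb = 30  # hypothetical example size for a quantized model

bandwidths_gb_s = {
    "SSD at 6GB/s": 6,
    "PCIe 4.0 at 64GB/s": 64,
    "M2 Max unified memory at 400GB/s": 400,
    "RTX 3090 GDDR6X at 936GB/s": 936,
}

for name, bw in bandwidths_gb_s.items():
    seconds_per_pass = model_size_gb / bw
    print(f"{name}: {seconds_per_pass:.3f} s per full pass over the weights")
```

Under those assumptions the SSD path is roughly 10x slower than the PCIe path (matching the "under 10%" estimate), and keeping the weights resident in fast memory is another order of magnitude better, with the M2 Max at a bit under half the 3090's rate.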