
"The recent change also means you can run multiple LLaMA ./main processes at the same time, and they'll all share the same memory resources." So could this have one main process and multiple sub-worker LLM processes collaborating while sharing the same memory footprint?



Yes, if the model is mmap'ed read-only (as I'm sure it is).

There are other bottlenecks besides CPU cores, though, so running multiple processes in parallel might not be very useful.



