> Have you considered a limited LLM that could run locally?
I think there are two main issues here. LLMs are large (the name even hints at it ;) ) and the smaller ones (still multiple GB) are really, really bad.
Edit: and they use a ton of memory, either RAM if running on CPU or VRAM if on GPU.
Compared to GPT-4, most of them are not super great, yeah. I've tested out most of the ones released over the last few weeks and nothing has matched the same quality of results, not even the medium-sized (30 GB and up) models that require >24 GB of VRAM to run on GPU. I have yet to acquire hardware to run the absolute biggest models, but I haven't seen any reports that they are much better for general workloads either.
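To put rough numbers on the memory point, here's a back-of-the-envelope sketch (my own figures, weights only, ignoring KV cache and activation memory) of how parameter count and load precision translate into RAM/VRAM:

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights; KV cache and activations add more on top."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

for params in (7, 13, 30, 65):
    fp16 = weight_memory_gb(params, 2.0)  # 16-bit floats
    q4 = weight_memory_gb(params, 0.5)    # ~4-bit quantized
    print(f"{params}B params: ~{fp16:.0f} GB at fp16, ~{q4:.0f} GB at 4-bit")
```

Which is roughly why a ~30B-parameter model won't fit on a single 24 GB card at fp16 and only squeezes in once you quantize it down.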