Hacker News
Ask HN: Cheapest GPU provider to host fine-tuned models?
1 point by DidISayTooMuch on Dec 15, 2023 | 3 comments
Who provides the cheapest GPU inference and hosting for fine-tuned models (7B size)? I already have the fine-tuned model ready and am just looking for a cheap place to host it and run inference.

I've looked at Replicate and Together.ai. Both offer the best tools in this space, but hosting is expensive. Together costs about 1.4/hr to host a 7B model, and Replicate is more expensive.

Ideally, I'd be charged only for active time, not idle time (Replicate already does this, but your fine-tuned model has to be based on one of a limited set of base models).
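
To make the trade-off concrete, here is a rough sketch of the arithmetic, not anyone's actual pricing. The 1.4/hr figure is the dedicated rate quoted above; the pay-per-use rate of 3.0 per active hour and the 30-day month are made-up assumptions purely for illustration.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def dedicated_monthly_cost(rate_per_hour: float) -> float:
    """Always-on instance: every hour is billed, busy or idle."""
    return rate_per_hour * HOURS_PER_MONTH


def usage_monthly_cost(active_rate_per_hour: float, active_hours: float) -> float:
    """Pay-per-use hosting: only hours spent serving requests are billed."""
    return active_rate_per_hour * active_hours


def break_even_active_hours(dedicated_rate: float, active_rate: float) -> float:
    """Active hours per month at which both billing models cost the same."""
    return dedicated_monthly_cost(dedicated_rate) / active_rate


dedicated_rate = 1.4  # per hour, the dedicated figure quoted in the post
active_rate = 3.0     # per active hour, HYPOTHETICAL pay-per-use rate

hours = break_even_active_hours(dedicated_rate, active_rate)
print(f"Dedicated, always on: {dedicated_monthly_cost(dedicated_rate):.2f} per month")
print(f"Break-even: {hours:.0f} active hours per month "
      f"({hours / HOURS_PER_MONTH:.0%} utilization)")
```

Under these assumed numbers, pay-per-use only wins if the model is busy less than roughly half the time, so the answer depends heavily on expected traffic.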

Any recommendations?




Following - we host our own models for a variety of architectures in vocal synthesis, and have tried using Replicate and Mystic as well.

Roll your own k8s? Predibase?


Thanks for the tip. Predibase supports Zephyr-7B, but I wonder whether they offer the same price per 1k tokens for a fine-tuned version of Zephyr-7B. Most likely they will ask me to get a dedicated instance for that, which would be the same as together.ai.


Just checked out mystic.ai. It looks like you pay only for usage on any model, not for idle time. It might actually fit my requirements.



