We’ve just launched Gradient, an API that helps you build private LLMs that you own. We simplify inference and fine-tuning on open-source LLMs such as Llama2, and you pay only by the token.
Our API platform lets you create private models with a single API call. Run inference on your fine-tuned model instantly, with no cold boot and no compute costs to pay.
The product is truly on demand: when you run fine-tuning and inference on our platform, there's nearly zero startup latency for these API calls. And you're not paying for the compute; you just pay for the tokens you consume.
This makes it possible for developers to build and serve nearly unlimited fine-tuned models without incurring ridiculous infrastructure fees.
We currently support Llama2 7B and Nous Hermes2 (an unlocked Llama2 13B variant), and we are releasing LlamaCoder and Llama2 70B in a few weeks.
Gradient is also SOC 2 and HIPAA compliant.
When you sign up, we give you $10 in free credits to start experimenting. We'd love to get your feedback, so come tell us what you're building on our Discord!
Sign up (https://gradient.ai/)
Twitter (https://twitter.com/Gradient_AI_)
Discord (https://discord.gg/yvgVKEgkmd)