Litellm proxy is a pretty good project on its own. I am obviously biased because we are competitors. Here are my thoughts.
* Litellm is declarative and lets you define everything in YAML (see the config sketch after this list)
* Bricks is not declarative and you control everything via its API
* Litellm does not have a UI
* Bricks has a non-open-source UI
* Litellm is written in Python
* Bricks is written in Golang
* Litellm does not persist rate limits, so it can't accurately rate limit across distributed instances
* BricksLLM lets you create API keys with accurate rate limits and spend limits that work across distributed instances (see the key-creation sketch after this list)
* Litellm provides high-level spend metrics on API keys
* Bricks provides granular spend, request, and latency metrics broken down by model and custom ID
* Litellm is not compatible with the OpenAI SDK. You have to adopt the Litellm Python client
* Bricks is designed to be compatible with the OpenAI SDK
* Litellm only supports OpenAI's completion and embedding endpoints
* Bricks supports almost all OpenAI endpoints except image and audio
* Litellm has exact request caching
* Bricks does not have caching for now
* Litellm has OpenTelemetry integration
* Bricks has StatsD integration
* Litellm supports orchestration of API calls: if a call fails, it can fall back to another model or API endpoint
* Bricks does not support orchestration of API calls, since I believe that's something the client should handle (a client-side fallback sketch follows this list)
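For context on the declarative point: LiteLLM's proxy is configured through a YAML file. A minimal sketch, roughly following the format in LiteLLM's docs (the exact keys may have changed since this was written):

```
model_list:
  - model_name: gpt-3.5-turbo       # the name clients request
    litellm_params:
      model: openai/gpt-3.5-turbo   # the upstream provider/model to route to
      api_key: os.environ/OPENAI_API_KEY
```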
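By contrast, Bricks is driven by its admin API. Here's a minimal sketch of creating a rate- and spend-limited key; the endpoint path and field names follow the BricksLLM README at the time of writing, so treat them as assumptions and check the current docs:

```
import requests

# Create a key capped at 2 requests/minute and $0.25 total spend.
# Limits are persisted, so they hold across distributed proxy instances.
resp = requests.put(
    "http://localhost:8001/api/key-management/keys",  # admin server
    json={
        "name": "team-a-key",
        "key": "my-secret-key",   # value clients send as their API key
        "tags": ["team-a"],
        "rateLimitOverTime": 2,   # 2 requests...
        "rateLimitUnit": "m",     # ...per minute
        "costLimitInUsd": 0.25,   # hard spend cap
    },
)
resp.raise_for_status()
print(resp.json())
```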
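And on the orchestration point, a minimal sketch of what handling fallbacks client-side could look like with the OpenAI SDK; the model names and proxy URL are placeholders:

```
import openai

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

def chat_with_fallback(messages, models=("gpt-4", "gpt-3.5-turbo")):
    """Try each model in order; return the first successful response."""
    last_error = None
    for model in models:
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except openai.APIError as e:  # covers rate-limit and server errors
            last_error = e
    raise last_error

response = chat_with_fallback([{"role": "user", "content": "hello"}])
```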
LiteLLM proxy (100+ LLMs in OpenAI format) is exactly compatible with the OpenAI endpoint. Here's how to call it with the OpenAI SDK:
```
import openai

client = openai.OpenAI(
    api_key="anything",             # proxy key - if set
    base_url="http://0.0.0.0:8000"  # proxy url
)

# request sent to model set on litellm proxy
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ]
)
```