Show HN: I built an OSS alternative to Azure OpenAI services

kristjansson · 2023-12-11T20:09:01.000000Z

This looks like a useful value-add, but I'd hesitate to call it a replacement for Azure OpenAI. Governance and observability features of Azure OpenAI are really secondary to the stability and reliability guarantees of Azure owning and operating the model instead of OpenAI...

0xDEF · 2023-12-11T20:49:30.000000Z

Azure OpenAI services also makes it possible to choose data centers outside the US.

3abiton · 2023-12-11T23:04:58.000000Z

But aeguably, being OSS, you can integrate additional OSS that performs thebafromentiones tasks of Obserbability. Just more hacking things together, but in theory should be doable.

HellsMaddy · 2023-12-11T21:27:51.000000Z

This reminds me of an idea I had for an OpenAI proxy that transparently handles batching of requests. The use case is that OpenAI has rate limits not only on tokens but also requests per minute. By batching multiple requests together you can avoid hitting the requests limit.

This isn’t really feasible to implement if your app runs on lambda or edge functions, you’d need a persistent server.

Here’s a diagram I drew of a simple approach that came to mind: https://gist.github.com/b0o/a73af0c1b63fccf3669fa4b00ac4be52

It would be awesome to see this functionality built into BricksLLM.

heyn05tradamu5 · 2023-12-11T21:52:58.000000Z

They’ve recently added this functionality to AWS Bedrock thankfully. Doesn’t support OpenAI models, but does support Anthropic.

https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-be...

te_chris · 2023-12-12T07:02:32.000000Z

If you can get Claude approved.

swyx · 2023-12-11T21:48:36.000000Z

how exactly are you intending to batch different prompts together in the openai api? its not like they accept an array of parallel inputs

computerex · 2023-12-11T21:43:32.000000Z

OpenAI API doesn't support batching afaik.

HellsMaddy · 2023-12-11T21:53:25.000000Z

They do: https://platform.openai.com/docs/guides/rate-limits/batching...

treprinum · 2023-12-11T21:55:29.000000Z

Embeddings can be batched.

mdeeks · 2023-12-11T21:17:50.000000Z

Looks interesting! We're in the middle of building something similar right now for ourselves. We may look at this as an alternative for ourselves.

By the way, I saw this after a quick glance poking through the code. This isn't encryption, it is hashing. Not sure where or how it is used but it is worth a rename at least: https://github.com/bricks-cloud/BricksLLM/blob/main/internal...

sharathr · 2023-12-12T06:45:38.000000Z

You might want to also look at: https://www.getjavelin.io

bottled_poe · 2023-12-12T13:42:22.000000Z

No commits to the repo for 3 months, doesn’t invoke confidence..

luyuanxin1995 · 2023-12-11T21:56:47.000000Z

you are right. will update the language

ashishb · 2023-12-12T05:08:50.000000Z

I ended up building (closed-source) product that not only tracks surprise bills in real-time but minimizes them via semantic caching - https://observeapi.ashishb.net/

Demo: https://gptcache.ashishb.net/

gigel82 · 2023-12-11T19:12:33.000000Z

No you didn't.

The whole point of using Azure OpenAI over plain OpenAI is the fact that your data doesn't get donated to OpenAI for training (this solves "compliance" for enterprise customers that need it), which you're not solving (because you obviously can't run the OpenAI models in your own data center).

ajcp · 2023-12-11T20:01:49.000000Z

Many here think that I'm afraid of my companies data being used to train models, but that is only half-correct.

The value proposition of Azure OpenAI to the enterprise is that it's bound by the Enterprise Agreement for not just data-use, but data access writ-large. I don't want my data to be accessible for *any reason* by a vendor that isn't covered explicitly in the EA.

While OpenAI says they delete your data after 30-days I have no contract or agreement in place that ensures that, or that within those 30-days they don't have carte blanche to my data for whatever else they may want to do with it.

The Azure OpenAI agreement *explicitly* states they only store my data for 30-days and can *only* access it if they suspect abuse of the service according to the ToC and our EA, which is no different than for other Azure services on my tenant. IIRC that's only after notifying me as well, and cannot exfiltrate that data even if they access it.

gigel82 · 2023-12-11T20:09:20.000000Z

I suspect a lot of folks never worked with Enterprise / Gov customers and don't understand the restrictions and compliance requirements (like data residence, access control, FedRAMP, TISAX, reliability SLAs, etc. which you get with Azure but not with some "move fast and break things" startup like OpenAI).

My comment is not dissing on the author, I'm just pointing out what most folks get from Azure is compliance (and maybe safety), and an OSS cannot solve that (unless they're running some other models in owned infrastructure or on-prem).

ajcp · 2023-12-11T21:30:28.000000Z

Yup, and I agree with you.

Even beyond enterprise thought people seem to think that the only thing their data is good for as far as OpenAI is concerned is for training-datasets, but that's just not the case.

The authors of this have built something great, but it doesn't protect you from any of those non-training use-cases.

quickthrower2 · 2023-12-11T20:40:42.000000Z

Even just doing SOC2 we prefer it

Let alone if you are a SaaS your customers may demand it.

robertlagrant · 2023-12-11T21:25:32.000000Z

> most folks get from Azure is compliance

To be fair, this doesn't prevent problems. Paperwork doesn't plug any security gaps in any cloud provider.

nonameiguess · 2023-12-12T00:20:13.000000Z

When people say stuff like this on Hacker News, it makes me think even more they haven't done a lot of work with government, or at least not the parts of the government I'm familiar with. Obviously, there are a lot of governments out there. But the FedRamp private enclaves with IL5-certification for CUI handling offered by the major cloud providers are a hell of a lot more secure than OpenAI's servers, and for workloads that require it, the classified enclaves are probably close to impossible to breach if you're not Mossad. Data centers on military installations, no connection to the Internet, private DX hardware encrypted on the installation with point-to-point tunneling through national fiber backbone only, and if you get anywhere near the cables, men in black SUVs suddenly show up out of of nowhere to bring you in and figure out why. I'm not even just saying that as a hypothetical. I've literally seen it happen when AT&T dug too close to the wrong line they didn't even know about because it was used for a testing facility the Navy doesn't publicly acknowledge. And the data they really cared about didn't even use that. It was hand-carried by armed couriers who kept hard drives in Pelican cases.

They may be tedious as fuck to implement and make what should be simple work take forever, but there are plenty of compliance checklists out there that really do give you security.

robertlagrant · 2023-12-12T12:02:15.000000Z

> When people say stuff like this on Hacker News, it makes me think even more they haven't

I have done work with government and defence. Ad hominem stuff is really pointless.

toomuchtodo · 2023-12-11T21:30:36.000000Z

It potentially plugs the contractual and liability risks, which might be more important (talk to your legal and compliance folks). None of your data is going to launch nuclear missiles, if it leaks it would be unfortunate, but not as much as the litigation and regulatory costs you could potentially incur.

Everyone gets popped eventually. It's your job to show you operated from a commercially reasonable security posture (and potentially your third party dependency graph, depending on regulatory and cyber insurance requirements).

(i report to a CISO, and we report to a board, thoughts and opinions are my own)

cebert · 2023-12-12T02:20:49.000000Z

> (i report to a CISO, and we report to a board, thoughts and opinions are my own)

That sounds like an interesting role. How did you get there? Did you start as a security analyst and work your way up?

toomuchtodo · 2023-12-12T03:38:08.000000Z

Word of mouth referral into the org, last ~5 years as a security architect/cybersecurity subject matter expert, before that DevOps/infra engineer. 20+ years in tech. I rely solely on network and reputation.

Be interesting to people who can provide you opportunity, and ask whenever an opportunity presents itself. If you don’t ask, the answer is default no. Being genuinely curious and desiring to help doesn't hurt either.

robertlagrant · 2023-12-11T23:50:49.000000Z

I'm not disagreeing, but I'm making a separate point. I'm familiar with CYA, and need to use it myself, but that doesn't affect my previous point.

ralph84 · 2023-12-11T21:39:30.000000Z

Compliance isn’t about preventing problems. It’s about identifying risks and determining who is responsible for mitigating those risks and who is on the hook for damages if the risks aren’t sufficiently mitigated.

droidno9 · 2023-12-11T19:24:01.000000Z

OpenAI also does not use API data to train its models.

Source: https://openai.com/enterprise-privacy

kordlessagain · 2023-12-12T14:45:32.000000Z

Needs more attention. Anything that replicates ChatGPT using OpenAI APIs is valuable because of this fact for enterprise use cases.

https://mitta.ai uses these API calls, has an Open Source code base for inspection, a strong user privacy policy, and doesn't store data in transit, other than documents that are uploaded. I'm working on TTLs for the files stored, so there is no data left behind (other than what might be stored by the user in the DB).

j45 · 2023-12-11T19:21:14.000000Z

The Azure OpenAI API to me was more about reliability and minimizing downtime.

The OpenAI API also allows opting out of training the OpenAI for training. If you are referring to something else with data, happy to learn.

https://help.openai.com/en/articles/5722486-how-your-data-is...

"API

OpenAI does not use data submitted to and generated by our API to train OpenAI models or improve OpenAI’s service offering. In order to support the continuous improvement of our models, you can fill out this form to opt-in to share your data with us. "

danenania · 2023-12-11T19:25:11.000000Z

According to OpenAI, they don't use data sent to the API for training and delete it after 30 days. So your concern is misplaced unless you think OpenAI is outright lying about their policies.

happytiger · 2023-12-11T19:54:25.000000Z

Whoa there. That’s only half the story.

https://techcrunch.com/2023/03/01/addressing-criticism-opena...

They said only if they opt in not that they wouldn’t do it.

And they used to and could change those terms again after the market matures and developers have deeply integrated application investments.

Plenty of companies have followed this strategy in the past.

Let’s not oversimplify the situation as it was a recent change as well…

kristjansson · 2023-12-11T20:04:57.000000Z

That's only for ChatGPT. The APIs are governed by [0]

> We will only use Customer Content as necessary to provide you with the Services, comply with applicable law, and enforce OpenAI Policies. We will not use Customer Content to develop or improve the Services.

[0]: https://openai.com/policies/business-terms

voiceblue · 2023-12-12T02:59:11.000000Z

That language seems to be doing a lot of heavy lifting. A more straightforward phrasing would've been "we will never process or otherwise consume customer content once it has been delivered to you". As written, they could use it to train GPT-5 or what-have-you and refuse to share it with you, making that exempt as it is apart from "services". Or all manner of other shenanigans, if they have competent lawyers.

kristjansson · 2023-12-12T03:57:27.000000Z

You can't really read the language at face value. 'Services' is defined term in the contract:

> “Services” means any services for businesses and developers we make available for purchase or use, along with any of our associated software, tools, developer services, documentation, and websites, but excluding any Third Party Offering.

Of course it's still language, and one can quibble with any language, but it's reasonably restrictive.

voiceblue · 2023-12-13T17:10:21.000000Z

That’s my point though, they can clearly use your data to train as long as they don’t sell/share those models themselves, as per this contract.

At the end of the day you can’t blindly trust your data is safe, even with a solid contract (bad actors exist, after all). So whatever you do, do it at your own risk.

danenania · 2023-12-11T20:30:24.000000Z

Microsoft can also change their terms so I don’t see much difference there.

I can understand preferring the Azure endpoints, but let’s be accurate about what the policies are.

linkenaxx · 2023-12-11T22:03:28.000000Z

+1 i dont think Azure provides that much more in terms of data privacy unless youre saying we should believe Azure's policies but not OpenAI's...

Jayakumark · 2023-12-11T19:17:51.000000Z

saberience · 2023-12-12T00:42:07.000000Z

This is a few basic tools put into an api, not a replacement for Azure OpenAI services. I know because I built similar tools to help me run the ChatGPT apis locally and it was a day or two of coding at max, even including calculating accurate token counts and cost.

weekay · 2023-12-12T03:38:53.000000Z

If I were to self host an open source model like mistral or llama , are there options similar to this as an api gateway to proxy and authenticate , create api keys , monitor spends by api etc .,? How are people running open source LLM”s in production ? Thanks

donovan-so · 2023-12-12T04:20:29.000000Z

We do support self-hosted models, as long as they're exposed as an API. Would this endpoint work for you? https://github.com/bricks-cloud/BricksLLM?tab=readme-ov-file...

aaronharnly · 2023-12-12T01:09:34.000000Z

Congratulations on shipping! We are currently evaluating replacing our homegrown version of an LLM proxy with this project:

https://github.com/BerriAI/litellm

Any comparison or contrast you would point out?

luyuanxin1995 · 2023-12-12T02:22:32.000000Z

Litellm proxy is a pretty good project on its own. I am obviously biased because we are competitors. Here are my thoughts.

* Litellm is declarative and it let you define everything in yaml * Bricks is not declarative and you control everything via API

* Litellm does not have an UI * Bricks has a non open source UI

* Litellm is written in python * Bricks is written in Golang

* Litellm does not persist rate limits. Therefore can't accurately rate limit across distributed instances * Bricksllm let you create API keys with accurate rate limits and spend limits that work across distributed instances

* Litellm provides high level spend metrics on API keys * Bricks provides granular spend, request and latency metrics breakdown by model and custom id

* Litellm is not compatible with OpenAI SDK. You have to adopt Litellm python client * Bricks is designed to be compatible with OpenAI SDK

* Litellm only supports OpenAI completion and embedding * Bricks supports almost all OpenAI endpoints except image and audio

* Litellm has exact request caching * Bricks does not have caching as for now

* Litellm has OpenTelemetry integration * Bricks has statsd integration

* Litellm supports orchestration of API calls. When this API call fails, use this model or call this API endpoint instead * Bricks does not support orchestration of API calls since I believe that it is something that the client should handle

detente18 · 2023-12-14T17:09:37.000000Z

LiteLLM proxy (100+ LLMs in OpenAI format) is exactly compatible with the OpenAI endpoint. Here's how to call it with the openai sdk:

``` import openai

client = openai.OpenAI( api_key="anything", # proxy key - if set base_url="http://0.0.0.0:8000" # proxy url )

# request sent to model set on litellm proxy,

response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [ { "role": "user", "content": "this is a test request, write a short poem" } ])

print(response)

``` Docs - https://docs.litellm.ai/docs/proxy/quick_start

ij23 · 2023-12-14T12:20:52.000000Z

hi i'm the maintainer of litellm - we persist rate limits, they're written to a DB: https://docs.litellm.ai/docs/proxy/virtual_keys

- LiteLLM Proxy IS Exactly Compatible with the OpenAI SDK

luyuanxin1995 · 2023-12-14T19:16:02.000000Z

Update: Litellm is compatible with OpenAI SDK

aaronharnly · 2023-12-12T03:37:19.000000Z

Thank you, I appreciate the earnest and thoughtful reply!

sharathr · 2023-12-12T06:43:43.000000Z

Have you taken a look at: https://www.getjavelin.io

aaronharnly · 2023-12-12T12:16:08.000000Z

I haven’t, thanks! Signed up. Not wild about another layer of SaaS, but perhaps they’ll have a self hosted option.

marvz · 2023-12-13T05:06:51.000000Z

have you had a look at https://www.pulze.ai/ ? LLM API that provides the highest-quality LLM responses through intelligent routing, beating GPT-4 and any other single LLM across all prompt categories.

linkenaxx · 2023-12-11T21:59:23.000000Z

great idea for simple use cases, but not sure if it can load handle as well as Azure i like that its open sourced too since most solutions out there are black box and we can only rely on the "good will" of the company that they will do as they say

101008 · 2023-12-11T19:28:02.000000Z

How do you calculate usage limits if you don't tokenize inputs and outputs?

luyuanxin1995 · 2023-12-11T19:30:18.000000Z

we do tokenize inputs and outputs. sorry if that is not clear in the post

LouisvilleGeek · 2023-12-12T13:59:01.000000Z

Tried this out last night and kept getting a key is not authorized error even though the key absolutely exists. (I had just created it via the steps on the Github page)

js4ever · 2023-12-11T19:31:55.000000Z

Congradulations (from the readme) :)

luyuanxin1995 · 2023-12-11T19:49:15.000000Z

fixed. embarrassing on my part

hluska · 2023-12-11T22:47:34.000000Z

You put in a lot of effort and built something cool - a spelling error is absolutely nothing to be embarrassed about.

cvhashim04 · 2023-12-15T04:42:30.000000Z

Think your website is down? Would love to contribute to this.

te_chris · 2023-12-12T07:03:32.000000Z

Worth checking out Helicone and others