Show HN: An open-source AI Gateway with integrated guardrails (github.com/portkey-ai)
21 points by roh26it 5 months ago | 5 comments
Hi HN,

I've been developing Portkey Gateway, an open-source AI gateway that's now processing billions of tokens daily across 200+ LLMs. Today, we're launching a significant update: integrated Guardrails at the gateway level.

Key technical features:

1. Guardrails as middleware: We've implemented a hooks architecture that allows guardrails to act as middleware in the request/response flow. This enables real-time LLM output evaluation and transformation.

2. Flexible orchestration: The gateway can now route requests based on guardrail verdicts. This allows for complex logic like fallbacks to different models or prompts based on output quality.

3. Plugin system: We've designed a modular plugin system that allows integration of various guardrail implementations (e.g., anthropic/constrained-llm, microsoft/guidance).

4. Stateless design: The guardrails implementation maintains the gateway's stateless nature, allowing easy horizontal scaling.

5. Unified API: Despite the added complexity, we've maintained our unified API across different LLM providers, now extended to include guardrail configurations.
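To make the hooks architecture concrete, here's a minimal sketch of a guardrail acting as middleware with verdict-based routing. The type and function names here are illustrative assumptions, not Portkey's actual API:

    // Illustrative sketch only: names and shapes are assumptions, not Portkey's real API.
    type GuardrailVerdict = { pass: boolean; score: number; reason?: string };

    interface GuardrailHook {
      id: string;
      // "beforeRequest" hooks can modify the prompt; "afterResponse" hooks inspect the completion.
      stage: "beforeRequest" | "afterResponse";
      check: (text: string) => Promise<GuardrailVerdict>;
    }

    // A deterministic, regex-based guardrail: flag outputs that leak email addresses.
    const noEmailLeak: GuardrailHook = {
      id: "no-email-leak",
      stage: "afterResponse",
      async check(text) {
        const leaked = /[\w.+-]+@[\w-]+\.[\w.]+/.test(text);
        return { pass: !leaked, score: leaked ? 0 : 1, reason: leaked ? "email found" : undefined };
      },
    };

    // Orchestration: run the post-processing hooks and route on the verdict,
    // e.g. deliver the response as-is or fall back to a different model/prompt.
    async function routeOnVerdicts(output: string, hooks: GuardrailHook[]): Promise<"deliver" | "fallback"> {
      for (const hook of hooks.filter((h) => h.stage === "afterResponse")) {
        const verdict = await hook.check(output);
        if (!verdict.pass) return "fallback";
      }
      return "deliver";
    }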

Implementation details:

* The guardrails are implemented as async functions in the request pipeline.

* We use a combination of regex and LLM-based evaluation for output validation.

* The system supports both pre-processing (input modification) and post-processing (output filtering/transformation) guardrails.
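For the LLM-based evaluations, a post-processing guardrail might look roughly like the following sketch. `callLLM` and the 0.7 pass threshold are placeholders, not what the gateway actually ships:

    // Illustrative only: an LLM-based guardrail that asks a judge model to score the output.
    // `callLLM` is a hypothetical helper standing in for whatever completion client you use.
    declare function callLLM(prompt: string): Promise<string>;

    async function llmJudge(output: string): Promise<{ pass: boolean; score: number }> {
      const prompt =
        "Rate the following response for policy compliance on a scale of 0 to 1. " +
        "Reply with only the number.\n\nResponse:\n" + output;
      const raw = await callLLM(prompt);
      const score = Number.parseFloat(raw.trim());
      // Treat unparseable scores as failures so they fall through to a stricter path.
      return { pass: Number.isFinite(score) && score >= 0.7, score: Number.isFinite(score) ? score : 0 };
    }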

Performance impact:

* Latency increase is minimal (<50ms) for most deterministic guardrails.

* We've implemented caching mechanisms to reduce repeated evaluations.

* Since the gateway runs on the edge, it avoids longer round trips.
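As a sketch of the caching idea: deterministic guardrail verdicts can be keyed by a hash of the guardrail id plus the evaluated text. The in-memory Map below stands in for whatever distributed store a real deployment would use; this is an illustration, not our exact implementation:

    import { createHash } from "node:crypto";

    // Sketch of verdict caching: key deterministic guardrail results by a hash of the
    // guardrail id plus the evaluated text, so repeated evaluations are skipped.
    // The in-memory Map stands in for a distributed KV store.
    type Verdict = { pass: boolean; score: number };

    const verdictCache = new Map<string, Verdict>();

    async function cachedCheck(
      guardrailId: string,
      text: string,
      check: (text: string) => Promise<Verdict>,
    ): Promise<Verdict> {
      const key = createHash("sha256").update(guardrailId + ":" + text).digest("hex");
      const cached = verdictCache.get(key);
      if (cached) return cached;
      const verdict = await check(text);
      verdictCache.set(key, verdict);
      return verdict;
    }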

Challenges we're still tackling:

* Balancing strict guardrails with maintaining model creativity

* Standardizing evaluation metrics across different types of guardrails

* Handling guardrail false positives/negatives effectively

We believe this approach of integrating guardrails at the gateway level provides a powerful tool for managing LLM behavior in production environments.

The code is open-source, and we welcome contributions and feedback. We're particularly interested in hearing about specific use cases or challenges you've faced in implementing reliable LLM systems.

Detailed documentation: https://portkey.wiki/guardrails

What are your thoughts on this approach? Are there specific guardrail implementations or orchestration patterns you'd like to see added?




Love this!


Coming over from Twitter/X (@iamrobotbear) -- congrats on the launch! Will dive into the docs, thanks for this!


thanks for the support!


Saw your tweet on X, nice work and congrats on launching!

I'm curious about the caching mechanisms you've implemented to reduce repeated evaluations - are you using a traditional cache store like Redis or something more bespoke?


We use a bunch of caching mechanisms on the LLM requests themselves and extend the same to guardrails now.

So there are two levels of cache: the LLM request itself might be cached (simple and semantic), and the guardrail response can be cached as well.

We use a mix of a distributed KV store and a vector DB to actually store the data.
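Roughly, the lookup order works like this (sketch only; the helper functions are placeholders, not our actual KV or vector DB clients):

    // Sketch of the two cache levels; kvGet/kvSet, embed, and vectorSearch are
    // placeholders, not a specific KV store or vector DB client.
    declare function kvGet(key: string): Promise<string | null>;
    declare function kvSet(key: string, value: string): Promise<void>;
    declare function embed(text: string): Promise<number[]>;
    declare function vectorSearch(vector: number[], minSimilarity: number): Promise<string | null>;

    async function cachedCompletion(prompt: string, callModel: (p: string) => Promise<string>): Promise<string> {
      // Level 1 (simple cache): exact-match lookup in the distributed KV store.
      const exact = await kvGet("llm:" + prompt);
      if (exact !== null) return exact;

      // Level 2 (semantic cache): nearest-neighbour lookup in the vector DB.
      const vector = await embed(prompt);
      const similar = await vectorSearch(vector, 0.95);
      if (similar !== null) return similar;

      // Miss on both levels: call the model and populate the exact-match cache.
      const completion = await callModel(prompt);
      await kvSet("llm:" + prompt, completion);
      return completion;
    }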



