Show HN: An open-source AI Gateway with integrated guardrails

by roh26it on 8/14/2024, 2:40 PM with 5 comments

Hi HN,

I've been developing Portkey Gateway, an open-source AI gateway that's now processing billions of tokens daily across 200+ LLMs. Today, we're launching a significant update: integrated Guardrails at the gateway level.

Key technical features:

1. Guardrails as middleware: We've implemented a hooks architecture that allows guardrails to act as middleware in the request/response flow. This enables real-time LLM output evaluation and transformation.
2. Flexible orchestration: The gateway can now route requests based on guardrail verdicts. This allows for complex logic like fallbacks to different models or prompts based on output quality.
3. Plugin system: We've designed a modular plugin system that allows integration of various guardrail implementations (e.g., anthropic/constrained-llm, microsoft/guidance).
4. Stateless design: The guardrails implementation maintains the gateway's stateless nature, ensuring scalability and allowing for easy horizontal scaling.
5. Unified API: Despite the added complexity, we've maintained our unified API across different LLM providers, now extended to include guardrail configurations.
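To make the middleware-plus-orchestration idea concrete, here's a minimal sketch in TypeScript. All names (`GuardrailHook`, `denyList`, `runWithGuardrails`) are illustrative, not Portkey's actual API; it just shows a hook returning a verdict and the caller routing to a fallback when the verdict fails:

```typescript
// A guardrail hook inspects text and returns a verdict.
type Verdict = { pass: boolean; reason?: string };
type GuardrailHook = (text: string) => Verdict;

// Example deterministic guardrail: reject outputs containing banned terms.
const denyList = (terms: string[]): GuardrailHook => (text) => {
  const hit = terms.find((t) => text.toLowerCase().includes(t));
  return hit ? { pass: false, reason: `matched "${hit}"` } : { pass: true };
};

// Run hooks as middleware over an LLM output; on any failing verdict,
// route to a fallback (e.g., a different model or prompt).
function runWithGuardrails(
  output: string,
  hooks: GuardrailHook[],
  fallback: () => string
): string {
  for (const hook of hooks) {
    const verdict = hook(output);
    if (!verdict.pass) return fallback();
  }
  return output;
}
```

The verdict-based routing is what enables the fallback orchestration described above: the gateway doesn't just block a response, it can re-route the request.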

Implementation details:

* The guardrails are implemented as async functions in the request pipeline.
* We use a combination of regex and LLM-based evaluation for output validation.
* The system supports both pre-processing (input modification) and post-processing (output filtering/transformation) guardrails.
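A rough sketch of what that pipeline shape looks like, with assumed names (`Guardrail`, `callWithGuardrails`) and a regex-based validator standing in for the real checks: pre-processing guardrails transform the prompt, post-processing guardrails validate or transform the output.

```typescript
// A guardrail is an async function over text: it may transform it or throw.
type Guardrail = (text: string) => Promise<string>;

// Pre-processing example: redact email addresses from the prompt.
const redactEmails: Guardrail = async (text) =>
  text.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[email]");

// Post-processing example: regex validation that blocks a leaked marker.
const blockMarker: Guardrail = async (text) => {
  if (/INTERNAL-ONLY/.test(text)) throw new Error("guardrail: blocked output");
  return text;
};

// Run pre-guardrails, call the model, then run post-guardrails.
async function callWithGuardrails(
  prompt: string,
  llm: (p: string) => Promise<string>,
  pre: Guardrail[],
  post: Guardrail[]
): Promise<string> {
  for (const g of pre) prompt = await g(prompt);
  let output = await llm(prompt);
  for (const g of post) output = await g(output);
  return output;
}
```

An LLM-based evaluator would slot in as just another async `Guardrail` alongside the regex ones, which is presumably why the async-function design was chosen.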

Performance impact:

* Latency increase is minimal (<50ms) for most deterministic guardrails.
* We've implemented caching mechanisms to reduce repeated evaluations.
* Since the gateway lives on the edge, it avoids longer roundtrips.

Challenges we're still tackling:

* Balancing strict guardrails with maintaining model creativity
* Standardizing evaluation metrics across different types of guardrails
* Handling guardrail false positives/negatives effectively

We believe this approach of integrating guardrails at the gateway level provides a powerful tool for managing LLM behavior in production environments.

The code is open-source, and we welcome contributions and feedback. We're particularly interested in hearing about specific use cases or challenges you've faced in implementing reliable LLM systems.

Detailed documentation: https://portkey.wiki/guardrails

What are your thoughts on this approach? Are there specific guardrail implementations or orchestration patterns you'd like to see added?

by hrishi on 8/14/2024, 8:43 PM

Love this!

by brianjking on 8/14/2024, 3:13 PM

Coming over from Twitter/X (@iamrobotbear) -- congrats on the launch! Will dive into the docs, thanks for this!

by namanyayg on 8/14/2024, 2:57 PM

saw your tweet on X, nice work and congrats on launching!

i'm curious about the caching mechanisms you've implemented to reduce repeated evaluations - are you using a traditional cache store like redis or something more bespoke?