Hey HN,
I built AgentBudget after an AI agent loop cost me $187 in 10 minutes — GPT-4o retrying a failed analysis over and over. Existing tools (LangSmith, Langfuse) track costs after execution but don't prevent overspend.
AgentBudget is a Python SDK that gives each agent session a hard dollar budget with real-time enforcement. Integration is two lines:
import agentbudget
agentbudget.init("$5.00")
It monkey-patches the OpenAI and Anthropic SDKs (same pattern as Sentry/Datadog), so existing code works without changes. When the budget is hit, it raises BudgetExhausted before the next API call goes out.
How it works:
- Two-phase enforcement: estimates cost pre-call (input tokens + average completion), reconciles post-call with actual usage. Worst-case overshoot is bounded to one call.
- Loop detection: sliding window over (tool_name, argument_hash, timestamp) tuples. Catches infinite retries even if budget remains.
- Cost engine: pricing table for 50+ models across OpenAI, Anthropic, Google, Mistral, Cohere. Fuzzy matching for dated model variants.
- Unified ledger: tracks both LLM calls and external tool costs (via track() or @track_tool decorator) in a single session.
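To make the first two bullets concrete, here is a minimal sketch of the idea. It is not AgentBudget's actual source: SessionBudget, reserve, reconcile, LoopDetector, and the max_repeats/window_seconds parameters are hypothetical names chosen for illustration, and only the BudgetExhausted name comes from the description above.

```python
import hashlib
import json
import time
from collections import deque


class BudgetExhausted(Exception):
    """Raised when the next call's estimated cost would exceed the budget."""


class SessionBudget:
    """Hypothetical two-phase ledger: reserve before the call, reconcile after."""

    def __init__(self, limit_usd):
        self.limit = limit_usd
        self.spent = 0.0

    def reserve(self, estimated_cost):
        # Phase 1 (pre-call): block the request if the estimate does not fit.
        if self.spent + estimated_cost > self.limit:
            raise BudgetExhausted(
                f"spent ${self.spent:.4f} of ${self.limit:.2f}"
            )

    def reconcile(self, actual_cost):
        # Phase 2 (post-call): record what was actually billed. If the
        # estimate was low, the overshoot is limited to this single call.
        self.spent += actual_cost


class LoopDetector:
    """Hypothetical sliding window over (tool_name, argument_hash, timestamp)."""

    def __init__(self, max_repeats=3, window_seconds=60.0):
        self.max_repeats = max_repeats
        self.window = window_seconds
        self.calls = deque()

    def record(self, tool_name, arguments, now=None):
        """Record a call; return True once it repeats more than max_repeats times."""
        now = time.monotonic() if now is None else now
        # Hash a canonical JSON form so equal arguments give equal hashes.
        arg_hash = hashlib.sha256(
            json.dumps(arguments, sort_keys=True).encode()
        ).hexdigest()
        # Evict entries that have fallen out of the time window.
        while self.calls and now - self.calls[0][2] > self.window:
            self.calls.popleft()
        self.calls.append((tool_name, arg_hash, now))
        repeats = sum(
            1 for name, h, _ in self.calls
            if name == tool_name and h == arg_hash
        )
        return repeats > self.max_repeats
```

In this sketch, a wrapper around the SDK call would run reserve() with the estimate, make the request, then call reconcile() with the billed amount. With max_repeats=3, the fourth identical call inside the window is the one that gets flagged, which is one way to read "exactly N+1 calls".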
Benchmarks: 3.5μs median overhead per enforcement check. Zero budget overshoot across all tested scenarios. Loop detection: 0 false positives on diverse workloads, catches pathological loops at exactly N+1 calls.
No infrastructure needed — it's a library, not a platform. No Redis, no cloud services, no accounts.
I also wrote a whitepaper covering the architecture and integration with Coinbase's x402 payment protocol (where agents make autonomous stablecoin payments): https://doi.org/10.5281/zenodo.18720464
1,300+ PyPI installs in the first 4 days, all organic. Apache 2.0.
Happy to answer questions about the design.
Real-time budget enforcement is a smart approach, especially for agentic loops where costs can spiral from retries. We've tackled the cost side by building an AI gateway at https://simplio.dev that automatically routes requests to the most affordable provider that meets your quality threshold, which has cut our own API bills substantially.
This is exactly the pain point with agents: spend isn’t linear because fanout + retries compound. One thing that helped us debug/contain spikes is tracking cost per “user-action/outcome” (not just per call) plus a retry ratio trend (429/timeouts). Do you support budgets per step/tool in the chain, or only per overall run?
I found this from your Twitter post. Crazy that I ran into your post here too, hahaha. I'm trying to implement it for my side project to keep the agents from taking over my side project's budget.
Looks really promising so far!
Curious about the granularity: does it support per-token budgeting or integration with providers like OpenAI's API for predictive alerts?
Interesting to see budget enforcement paired with x402. We've been building in the same space — Apiosk (https://apiosk.com) approaches it from the server side: a gateway that enforces per-request x402 payments so API providers can monetize without accounts or keys.
Your budget SDK + Apiosk would be a natural combo — the agent has a spending ceiling (AgentBudget) and the APIs it calls use x402 for micropayments (Apiosk handles gateway/verification). Have you thought about hooks for x402-aware budget tracking where the ledger automatically records on-chain settlements?