AI API Proxy Usage Limits

Implement robust usage limits, rate limiting, and quota management for AI API proxies. Protect your infrastructure while ensuring fair access.

Monthly Quota Usage: 75,000 / 100,000 requests used (75%)

Rate Limiting

Control request rates per user, API key, or IP address. Prevent abuse and ensure fair resource allocation.

Quota Management

Set daily, weekly, or monthly usage quotas. Track consumption and notify users approaching limits.
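A minimal sketch of quota tracking with a warning threshold, assuming an in-memory counter keyed by user and calendar month. The `QuotaTracker` name, the 80% warning ratio, and the returned status strings are illustrative assumptions, not part of any particular proxy.

```python
from collections import defaultdict
from datetime import date


class QuotaTracker:
    """Hypothetical in-memory monthly quota counter (illustrative only)."""

    def __init__(self, monthly_limit, warn_ratio=0.8):
        self.monthly_limit = monthly_limit
        self.warn_ratio = warn_ratio  # assumed threshold for "approaching limit"
        self.usage = defaultdict(int)  # (user, "YYYY-MM") -> requests used

    def _period(self, today):
        return today.strftime("%Y-%m")

    def record(self, user, today=None, count=1):
        """Count requests and return 'ok', 'warning', or 'exceeded'."""
        today = today or date.today()
        key = (user, self._period(today))
        if self.usage[key] + count > self.monthly_limit:
            return "exceeded"  # reject without counting the request
        self.usage[key] += count
        if self.usage[key] >= self.warn_ratio * self.monthly_limit:
            return "warning"
        return "ok"

    def percent_used(self, user, today=None):
        today = today or date.today()
        used = self.usage[(user, self._period(today))]
        return 100 * used / self.monthly_limit


tracker = QuotaTracker(monthly_limit=100_000)
day = date(2024, 1, 15)
tracker.record("alice", today=day, count=75_000)
print(tracker.percent_used("alice", today=day))  # 75.0
```

Because the period is part of the key, counters reset implicitly when a new month starts. A real deployment would likely persist the counts in shared storage and send the notification when `record` first returns `"warning"`.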

Rate Limiter Implementation

rate_limiter.py
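# Token Bucket Rate Limiter
import time
from collections import defaultdict


class TokenBucketLimiter:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens a bucket can hold (burst size)
        # Each key (user, API key, or IP) starts with a full bucket.
        # time.monotonic() is used so system clock changes cannot skew refills.
        self.buckets = defaultdict(
            lambda: {"tokens": capacity, "last": time.monotonic()}
        )

    def _refill(self, key):
        # Add the tokens earned since the last update, capped at capacity
        bucket = self.buckets[key]
        now = time.monotonic()
        elapsed = now - bucket["last"]
        bucket["tokens"] = min(self.capacity, bucket["tokens"] + elapsed * self.rate)
        bucket["last"] = now
        return bucket

    def allow_request(self, key):
        bucket = self._refill(key)
        # Spend one token per request if one is available
        if bucket["tokens"] >= 1:
            bucket["tokens"] -= 1
            return True
        return False

    def get_remaining(self, key):
        # Refill first so the count reflects the current time
        return int(self._refill(key)["tokens"])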
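When `allow_request` returns `False`, a proxy usually answers with HTTP 429 (Too Many Requests) and may include a `Retry-After` header. The sketch below shows how the wait could be derived from the bucket state. `seconds_until_token` and `retry_after_header` are hypothetical helpers, not part of the class above.

```python
import math


def seconds_until_token(tokens, rate):
    """Seconds until a bucket holding `tokens` has at least one token.

    Hypothetical helper: `tokens` is the current (possibly fractional)
    token count and `rate` is the refill rate in tokens per second.
    """
    if tokens >= 1:
        return 0.0
    return (1 - tokens) / rate


def retry_after_header(tokens, rate):
    # Retry-After takes whole seconds, so round up to avoid retrying too early
    return {"Retry-After": str(math.ceil(seconds_until_token(tokens, rate)))}


print(retry_after_header(tokens=0.25, rate=0.5))  # {'Retry-After': '2'}
```

With 0.25 tokens left and a refill rate of 0.5 tokens per second, the missing 0.75 tokens take 1.5 seconds to accrue, which rounds up to 2.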

Common Rate Limit Strategies

rate_limits.yaml
rate_limits:
  # Per-user rate limits
  user:
    requests_per_minute: 60
    requests_per_hour: 1000
    requests_per_day: 10000

  # Per-API-key limits
  api_key:
    requests_per_minute: 100
    tokens_per_minute: 150000

  # Per-IP limits
  ip:
    requests_per_second: 10
    requests_per_minute: 300

  # Burst handling
  burst:
    max_burst_size: 20
    burst_window: 5  # seconds
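One way to enforce several windows at once, such as the per-user minute, hour, and day limits in the config above, is to keep one counter per window and admit a request only if every window has room. The sketch below uses fixed-window counters for brevity. `MultiWindowLimiter` and the `WINDOWS` mapping are assumed names, and the limits are copied from the `user` section of the config.

```python
import time
from collections import defaultdict

# Window lengths in seconds, keyed by the config field names
WINDOWS = {
    "requests_per_minute": 60,
    "requests_per_hour": 3600,
    "requests_per_day": 86400,
}


class MultiWindowLimiter:
    """Hypothetical fixed-window limiter that checks several windows at once."""

    def __init__(self, limits):
        self.limits = limits  # e.g. {"requests_per_minute": 60, ...}
        self.counts = defaultdict(int)  # (key, field, window index) -> count

    def allow_request(self, key, now=None):
        now = time.time() if now is None else now
        slots = [(key, field, int(now // WINDOWS[field])) for field in self.limits]
        # Reject if any window is already full; count nothing in that case
        if any(self.counts[s] >= self.limits[s[1]] for s in slots):
            return False
        for s in slots:
            self.counts[s] += 1
        return True


user_limits = {
    "requests_per_minute": 60,
    "requests_per_hour": 1000,
    "requests_per_day": 10000,
}
limiter = MultiWindowLimiter(user_limits)
allowed = sum(limiter.allow_request("user-1", now=0) for _ in range(100))
print(allowed)  # 60
```

Fixed windows are simple but allow up to twice the limit across a window boundary, and this sketch never evicts old counters. A real deployment would expire them and would likely keep the counts in storage shared by all proxy instances.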
