Kubernetes Native

AI API Proxy
Kubernetes Operator

Deploy and manage AI API proxies with a GitOps-ready Kubernetes operator. Automate scaling, self-healing, and lifecycle management for production AI workloads.

Operator Architecture

Declarative configuration with automatic reconciliation and self-healing

📝 CRD (Custom Resource) → 🔄 Controller (Watch & Reconcile) → 🐳 Proxy Pods (Managed Workloads) → 🌐 Service (Load Balancer)
🎯 Declarative Configuration: Define the desired state in YAML; the operator handles the rest automatically.
🔧 Self-Healing: Automatic recovery from failures with configurable retry policies.
📈 Auto-Scaling: HPA integration for dynamic scaling based on API traffic.
🔒 Secret Management: Secure API key handling via Kubernetes Secrets integration.
📊 Observability: Built-in Prometheus metrics and structured logging.
🔄 GitOps Ready: Compatible with ArgoCD and Flux for declarative deployments.
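For the GitOps workflow, the operator's custom resources can be synced from Git like any other manifest. A minimal sketch of an ArgoCD Application that deploys AIProxy manifests from a repository; the repository URL and path are hypothetical placeholders, not part of this project:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ai-proxy
  namespace: argocd
spec:
  project: default
  source:
    # Hypothetical Git repo holding AIProxyConfig/AIProxyDeployment manifests
    repoURL: https://github.com/example/ai-proxy-manifests
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: ai-proxy
  syncPolicy:
    automated:
      prune: true      # remove resources deleted from Git
      selfHeal: true   # revert manual drift back to the Git state
```

With `automated.selfHeal` enabled, ArgoCD's reconciliation complements the operator's own: Git drives the custom resources, and the operator drives the workloads they describe.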

Custom Resource Definitions

Kubernetes-native API for managing AI proxy configurations

AIProxyConfig
  apiEndpoint    string
  provider       enum
  rateLimit      RateLimitSpec
  cacheConfig    CacheSpec
  authSecretRef  SecretReference

AIProxyDeployment
  replicas   integer
  image      string
  resources  ResourceRequirements
  configRef  AIProxyConfig
  hpa        HPASpec

AIProxyRoute
  pathPrefix    string
  targetModels  []string
  loadBalance   LoadBalancePolicy
  timeout       Duration
  retryPolicy   RetryPolicy
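The deployment example below covers AIProxyConfig and AIProxyDeployment but not AIProxyRoute, so here is a sketch of one. The field names match the listing above; the concrete values (the `roundRobin` policy, the `retryPolicy` subfields, the model names) are illustrative assumptions, since the source only specifies field names and types:

```yaml
apiVersion: ai-proxy.io/v1
kind: AIProxyRoute
metadata:
  name: chat-route
spec:
  pathPrefix: /v1/chat
  targetModels:
    - gpt-4o
    - gpt-4o-mini
  loadBalance: roundRobin   # assumed LoadBalancePolicy value
  timeout: 30s
  retryPolicy:              # assumed RetryPolicy shape
    maxRetries: 3
    backoff: exponential
```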

Deployment Example

Complete example of deploying an AI proxy with the operator

ai-proxy-deployment.yaml
# AI Proxy Configuration
apiVersion: ai-proxy.io/v1
kind: AIProxyConfig
metadata:
  name: openai-config
spec:
  provider: openai
  apiEndpoint: https://api.openai.com/v1
  authSecretRef:
    name: openai-api-key
    key: api-key
  rateLimit:
    requestsPerSecond: 100
    burst: 50
  cacheConfig:
    enabled: true
    ttl: "5m"

---

# AI Proxy Deployment
apiVersion: ai-proxy.io/v1
kind: AIProxyDeployment
metadata:
  name: ai-gateway
spec:
  replicas: 3
  image: ai-proxy:1.0.0
  configRef: openai-config
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2000m"
      memory: "2Gi"
  hpa:
    minReplicas: 3
    maxReplicas: 20
    targetCPUUtilization: 70
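The `authSecretRef` in the config above expects a Secret named `openai-api-key` with an `api-key` entry. A minimal sketch of that Secret, assuming a plain Opaque Secret (in production you might source it from an external secret manager instead):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openai-api-key
type: Opaque
stringData:
  api-key: sk-REPLACE-ME   # placeholder; substitute your real API key
```

Using `stringData` lets you supply the value in plain text; Kubernetes base64-encodes it into `data` on write.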

Operator Lifecycle Management

The operator handles all aspects of the proxy lifecycle

1. Reconcile: watch for CR changes and compare against the current state.
2. Plan: calculate the changes required to reach the desired state.
3. Execute: apply changes to deployments, services, and configs.
4. Verify: check health status and update status conditions.
5. Heal: auto-recover from failures with retry logic.
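The reconcile/plan steps above boil down to diffing desired state against observed state. A minimal self-contained Go sketch of that pattern, with illustrative types and action strings that are not the operator's actual API (a real controller would use controller-runtime's `Reconcile` method and patch Kubernetes objects instead of returning strings):

```go
package main

import "fmt"

// State is an illustrative slice of an AIProxyDeployment's spec/status.
type State struct {
	Replicas int
	Image    string
}

// reconcile compares desired vs. observed state and returns the actions
// needed to converge them -- the core of the watch-and-reconcile loop.
func reconcile(desired, observed State) []string {
	var actions []string
	if observed.Image != desired.Image {
		actions = append(actions, fmt.Sprintf("update image to %s", desired.Image))
	}
	if observed.Replicas != desired.Replicas {
		actions = append(actions, fmt.Sprintf("scale to %d replicas", desired.Replicas))
	}
	return actions // empty slice means the system is already converged
}

func main() {
	desired := State{Replicas: 3, Image: "ai-proxy:1.0.0"}
	observed := State{Replicas: 1, Image: "ai-proxy:0.9.0"}
	for _, a := range reconcile(desired, observed) {
		fmt.Println(a)
	}
}
```

When the observed state already matches the desired state, `reconcile` returns no actions, which is why repeated reconciliation is safe (idempotent) in the operator pattern.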

Installation

Deploy the operator using kubectl or Helm

Terminal
# Install using kubectl
kubectl apply -f https://github.com/ai-proxy/operator/releases/download/v1.0.0/crds.yaml
kubectl apply -f https://github.com/ai-proxy/operator/releases/download/v1.0.0/operator.yaml

# Or using Helm
helm repo add ai-proxy https://charts.ai-proxy.io
helm install ai-proxy-operator ai-proxy/operator --namespace ai-proxy-system --create-namespace

# Verify installation
kubectl get pods -n ai-proxy-system
kubectl get crd | grep ai-proxy