Kubernetes Native

AI API Proxy
Kubernetes Operator

Deploy and manage AI API proxies with a GitOps-ready Kubernetes operator. Automate scaling, self-healing, and lifecycle management for production AI workloads.

Operator Architecture

Declarative configuration with automatic reconciliation and self-healing

📝 CRD (Custom Resource) → 🔄 Controller (Watch & Reconcile) → 🐳 Proxy Pods (Managed Workloads) → 🌐 Service (Load Balancer)
🎯 Declarative Configuration: Define the desired state in YAML; the operator handles the rest automatically.
🔧 Self-Healing: Automatic recovery from failures with configurable retry policies.
📈 Auto-Scaling: HPA integration for dynamic scaling based on API traffic.
🔒 Secret Management: Secure API key handling via Kubernetes Secrets integration.
📊 Observability: Built-in Prometheus metrics and structured logging.
🔄 GitOps Ready: Compatible with ArgoCD and Flux for declarative deployments.
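For the GitOps workflow, the operator's custom resources can be synced from Git like any other manifest. A minimal sketch of an ArgoCD Application that deploys AIProxy manifests from a repository; the repository URL and path are hypothetical placeholders, not part of this project:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ai-proxy
  namespace: argocd
spec:
  project: default
  source:
    # Hypothetical Git repo holding AIProxyConfig/AIProxyDeployment manifests
    repoURL: https://github.com/example/ai-proxy-manifests
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: ai-proxy
  syncPolicy:
    automated:
      prune: true      # remove resources deleted from Git
      selfHeal: true   # revert manual drift back to the Git state
```

With `automated.selfHeal` enabled, ArgoCD's reconciliation complements the operator's own: Git drives the custom resources, and the operator drives the workloads they describe.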

Custom Resource Definitions

Kubernetes-native API for managing AI proxy configurations

AIProxyConfig
  apiEndpoint    string
  provider       enum
  rateLimit      RateLimitSpec
  cacheConfig    CacheSpec
  authSecretRef  SecretReference

AIProxyDeployment
  replicas   integer
  image      string
  resources  ResourceRequirements
  configRef  AIProxyConfig
  hpa        HPASpec

AIProxyRoute
  pathPrefix    string
  targetModels  []string
  loadBalance   LoadBalancePolicy
  timeout       Duration
  retryPolicy   RetryPolicy
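The deployment example below covers AIProxyConfig and AIProxyDeployment but not AIProxyRoute, so here is a sketch of one. The field names match the listing above; the concrete values (the `roundRobin` policy, the `retryPolicy` subfields, the model names) are illustrative assumptions, since the source only specifies field names and types:

```yaml
apiVersion: ai-proxy.io/v1
kind: AIProxyRoute
metadata:
  name: chat-route
spec:
  pathPrefix: /v1/chat
  targetModels:
    - gpt-4o
    - gpt-4o-mini
  loadBalance: roundRobin   # assumed LoadBalancePolicy value
  timeout: 30s
  retryPolicy:              # assumed RetryPolicy shape
    maxRetries: 3
    backoff: exponential
```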

Deployment Example

Complete example of deploying an AI proxy with the operator

ai-proxy-deployment.yaml
# AI Proxy Configuration
apiVersion: ai-proxy.io/v1
kind: AIProxyConfig
metadata:
  name: openai-config
spec:
  provider: openai
  apiEndpoint: https://api.openai.com/v1
  authSecretRef:
    name: openai-api-key
    key: api-key
  rateLimit:
    requestsPerSecond: 100
    burst: 50
  cacheConfig:
    enabled: true
    ttl: "5m"

---

# AI Proxy Deployment
apiVersion: ai-proxy.io/v1
kind: AIProxyDeployment
metadata:
  name: ai-gateway
spec:
  replicas: 3
  image: ai-proxy:1.0.0
  configRef: openai-config
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2000m"
      memory: "2Gi"
  hpa:
    minReplicas: 3
    maxReplicas: 20
    targetCPUUtilization: 70
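The `authSecretRef` in the config above expects a Secret named `openai-api-key` with an `api-key` entry. A minimal sketch of that Secret, assuming a plain Opaque Secret (in production you might source it from an external secret manager instead):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openai-api-key
type: Opaque
stringData:
  api-key: sk-REPLACE-ME   # placeholder; substitute your real API key
```

Using `stringData` lets you supply the value in plain text; Kubernetes base64-encodes it into `data` on write.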

Operator Lifecycle Management

The operator handles all aspects of the proxy lifecycle

1. Reconcile: watch for CR changes and compare against the current state.
2. Plan: calculate the changes required to reach the desired state.
3. Execute: apply changes to deployments, services, and configs.
4. Verify: check health status and update status conditions.
5. Heal: auto-recover from failures with retry logic.
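The reconcile/plan steps above boil down to diffing desired state against observed state. A minimal self-contained Go sketch of that pattern, with illustrative types and action strings that are not the operator's actual API (a real controller would use controller-runtime's `Reconcile` method and patch Kubernetes objects instead of returning strings):

```go
package main

import "fmt"

// State is an illustrative slice of an AIProxyDeployment's spec/status.
type State struct {
	Replicas int
	Image    string
}

// reconcile compares desired vs. observed state and returns the actions
// needed to converge them -- the core of the watch-and-reconcile loop.
func reconcile(desired, observed State) []string {
	var actions []string
	if observed.Image != desired.Image {
		actions = append(actions, fmt.Sprintf("update image to %s", desired.Image))
	}
	if observed.Replicas != desired.Replicas {
		actions = append(actions, fmt.Sprintf("scale to %d replicas", desired.Replicas))
	}
	return actions // empty slice means the system is already converged
}

func main() {
	desired := State{Replicas: 3, Image: "ai-proxy:1.0.0"}
	observed := State{Replicas: 1, Image: "ai-proxy:0.9.0"}
	for _, a := range reconcile(desired, observed) {
		fmt.Println(a)
	}
}
```

When the observed state already matches the desired state, `reconcile` returns no actions, which is why repeated reconciliation is safe (idempotent) in the operator pattern.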

Installation

Deploy the operator using kubectl or Helm

Terminal
# Install using kubectl
kubectl apply -f https://github.com/ai-proxy/operator/releases/download/v1.0.0/crds.yaml
kubectl apply -f https://github.com/ai-proxy/operator/releases/download/v1.0.0/operator.yaml

# Or using Helm
helm repo add ai-proxy https://charts.ai-proxy.io
helm install ai-proxy-operator ai-proxy/operator --namespace ai-proxy-system --create-namespace

# Verify installation
kubectl get pods -n ai-proxy-system
kubectl get crd | grep ai-proxy