🐳 Containerized Deployment Guide

LiteLLM Proxy Docker Deployment

Master containerized LiteLLM proxy deployment with Docker and Kubernetes. Learn Dockerfile creation, docker-compose orchestration, Kubernetes deployment manifests, environment configuration, secrets management, and production best practices for scalable AI infrastructure.

5 min
Deployment Time
~200 MB
Image Size
100%
Reproducible
0
Downtime During Updates

Why Docker Deployment?

Understanding the benefits of containerizing your LiteLLM proxy

Docker deployment provides consistency, isolation, and portability for your LiteLLM proxy. Containers ensure your proxy runs identically across development, staging, and production environments. This eliminates environment-specific issues and simplifies the deployment process significantly. Containerization also enables easy scaling, rolling updates, and efficient resource utilization through orchestration platforms like Kubernetes or Docker Swarm.

📦

Consistent Environments

Eliminate "works on my machine" problems. Docker ensures your LiteLLM proxy runs identically everywhere by packaging all dependencies, runtime, and configuration into a single container image.

🔒

Isolation & Security

Containers provide process isolation, limiting the blast radius of potential security issues. Each LiteLLM instance runs in its own isolated environment with controlled resource access.

🚀

Fast Deployment

Deploy new instances in seconds rather than minutes. Docker images start quickly, enabling rapid scaling in response to traffic changes and fast recovery from failures.

🔄

Easy Updates

Perform rolling updates with zero downtime. Pull new images and restart containers without affecting running traffic, ensuring continuous availability of your AI services.

📊

Resource Efficiency

Containers share the host OS kernel, making them lightweight compared to VMs. Run more LiteLLM instances on the same hardware, optimizing resource utilization and reducing costs.

🛠️

Simplified CI/CD

Integrate Docker builds into your CI/CD pipeline for automated testing and deployment. Every code change can trigger container builds and deployments automatically.
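
As an illustration of that pipeline, a minimal GitHub Actions workflow might look like the sketch below. The file path, image name, and trigger are assumptions for illustration, not part of LiteLLM itself.

.github/workflows/docker.yml YAML
name: build-and-push
on:
  push:
    branches: [main]
jobs:
  docker:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      # Check out the repo containing the Dockerfile (see Quick Start below)
      - uses: actions/checkout@v4
      # Authenticate against GitHub Container Registry
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      # Build the image and push it tagged with the commit SHA
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}/litellm-proxy:${{ github.sha }}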

Quick Start

Get LiteLLM running in Docker in under 5 minutes

Container Deployment Flow: Dockerfile (image definition) → Build (create image) → Container (running instance) → Proxy (API endpoint)

Option 1: Use Official Image

Terminal Bash
# Pull official LiteLLM image
docker pull ghcr.io/berriai/litellm:main-latest

# Run with environment variables
docker run -d \
  --name litellm-proxy \
  -p 4000:4000 \
  -e OPENAI_API_KEY=sk-your-key \
  -e ANTHROPIC_API_KEY=sk-ant-your-key \
  -e LITELLM_MASTER_KEY=sk-master-key \
  ghcr.io/berriai/litellm:main-latest

# Test the proxy (the liveness endpoint needs no key;
# /health itself requires the master key)
curl http://localhost:4000/health/liveliness
                    

Option 2: Custom Dockerfile

Dockerfile Docker
# Use Python slim image for smaller size
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies (curl is used by the health check below)
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install LiteLLM with proxy extras
RUN pip install --no-cache-dir "litellm[proxy]"

# Copy configuration file
COPY litellm_config.yaml /app/config.yaml

# Create non-root user
RUN useradd -m -u 1000 litellm && \
    chown -R litellm:litellm /app

USER litellm

# Expose port
EXPOSE 4000

# Health check against the unauthenticated liveness endpoint
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:4000/health/liveliness || exit 1

# Run LiteLLM proxy
CMD ["litellm", "--config", "/app/config.yaml", "--port", "4000"]
                    
✅ Image Size Optimization

Using python:3.11-slim instead of the full python image reduces the container size from ~1GB to ~200MB. For even smaller images, consider distroless or alpine-based images, though they require additional configuration for some LiteLLM dependencies.
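
Both the custom Dockerfile above and the Compose file in the next section mount a litellm_config.yaml that this guide does not otherwise show. The following is a minimal sketch; the model names are illustrative placeholders, and the os.environ/ references pull keys from the container environment (see the LiteLLM docs for the full schema).

litellm_config.yaml YAML
# Minimal proxy configuration: map public model names to provider models
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  # Clients must present this key (or a key derived from it) to call the proxy
  master_key: os.environ/LITELLM_MASTER_KEY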

Docker Compose Setup

Orchestrate LiteLLM with databases, caches, and monitoring

docker-compose.yml YAML
version: '3.8'

services:
  litellm:
    build: .
    container_name: litellm-proxy
    ports:
      - "4000:4000"
    environment:
      - DATABASE_URL=postgresql://litellm:${POSTGRES_PASSWORD:-password}@postgres:5432/litellm
      - REDIS_URL=redis://redis:6379
      - LITELLM_MASTER_KEY=${LITELLM_MASTER_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./litellm_config.yaml:/app/config.yaml:ro
    depends_on:
      - postgres
      - redis
    restart: unless-stopped
    networks:
      - litellm-network

  postgres:
    image: postgres:15-alpine
    container_name: litellm-postgres
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-password}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - litellm-network

  redis:
    image: redis:7-alpine
    container_name: litellm-redis
    volumes:
      - redis_data:/data
    networks:
      - litellm-network

volumes:
  postgres_data:
  redis_data:

networks:
  litellm-network:
    driver: bridge
                    

Run with Docker Compose

Terminal Bash
# Create .env file with secrets
cat > .env << EOF
LITELLM_MASTER_KEY=sk-your-master-key
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-key
POSTGRES_PASSWORD=use-a-strong-password
EOF

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f litellm

# Scale horizontally (first remove container_name and change the fixed
# "4000:4000" host port to "4000" in the litellm service, then front the
# replicas with a load balancer)
docker-compose up -d --scale litellm=3

# Stop services
docker-compose down
                    

Kubernetes Deployment

Deploy LiteLLM to Kubernetes for production scalability

Step 1: Create Namespace & Secrets

Set up a dedicated namespace and store API keys securely using Kubernetes Secrets (a declarative Secret sketch follows this list).

  • kubectl create namespace litellm
  • kubectl create secret generic litellm-secrets
  • Store API keys and master key
  • Use sealed-secrets for GitOps

Step 2: Deploy with ConfigMap

Create a ConfigMap for the LiteLLM configuration and a Deployment for the running instances.

  • ConfigMap for config.yaml
  • Deployment with replicas
  • Resource limits and requests
  • Liveness and readiness probes

Step 3: Configure Services

Create a Service for internal communication and an Ingress for external access (see the Service and autoscaler sketch after the Deployment manifest below).

  • ClusterIP Service
  • Ingress with TLS
  • LoadBalancer for cloud
  • Horizontal Pod Autoscaler

Step 4: Set Up Monitoring

Deploy Prometheus and Grafana for comprehensive monitoring and alerting.

  • Prometheus metrics endpoint
  • Grafana dashboards
  • AlertManager rules
  • Log aggregation with Loki
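
For steps 1 and 2, the Secret and ConfigMap can also be written declaratively, as sketched below. The key names match the envFrom reference in the Deployment manifest that follows; note that to actually consume the ConfigMap, the Deployment would additionally need a volume, a volumeMount, and a --config argument.

litellm-secrets.yaml YAML
# Secret consumed via envFrom in the Deployment (apply with: kubectl apply -f)
apiVersion: v1
kind: Secret
metadata:
  name: litellm-secrets
  namespace: litellm
type: Opaque
stringData:
  LITELLM_MASTER_KEY: sk-your-master-key
  OPENAI_API_KEY: sk-your-openai-key
  ANTHROPIC_API_KEY: sk-ant-your-key
---
# ConfigMap holding the proxy configuration (content abbreviated)
apiVersion: v1
kind: ConfigMap
metadata:
  name: litellm-config
  namespace: litellm
data:
  config.yaml: |
    model_list:
      - model_name: gpt-4o
        litellm_params:
          model: openai/gpt-4o
          api_key: os.environ/OPENAI_API_KEY
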
litellm-deployment.yaml YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm-proxy
  namespace: litellm
spec:
  replicas: 3
  selector:
    matchLabels:
      app: litellm-proxy
  template:
    metadata:
      labels:
        app: litellm-proxy
    spec:
      containers:
      - name: litellm
        image: ghcr.io/berriai/litellm:main-latest  # pin a release tag in production
        ports:
        - containerPort: 4000
        envFrom:
        - secretRef:
            name: litellm-secrets
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health/liveliness
            port: 4000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/readiness
            port: 4000
          initialDelaySeconds: 10
          periodSeconds: 10
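
For step 3, here is a minimal ClusterIP Service and Horizontal Pod Autoscaler to pair with the Deployment above; an Ingress is omitted because its annotations vary by controller. When the HPA manages scaling, the fixed replicas: 3 in the Deployment can be dropped.

litellm-service.yaml YAML
# ClusterIP Service exposing the proxy pods inside the cluster
apiVersion: v1
kind: Service
metadata:
  name: litellm-proxy
  namespace: litellm
spec:
  type: ClusterIP
  selector:
    app: litellm-proxy
  ports:
  - port: 4000
    targetPort: 4000
---
# Autoscaler keeping average CPU utilization near 70%
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: litellm-proxy
  namespace: litellm
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: litellm-proxy
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70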

Production Best Practices

Ensure reliability, security, and performance in production

Production Checklist

Area               Requirement                                             Priority
Security           Use secrets management, not env vars in code            Critical
High Availability  Run minimum 3 replicas across availability zones        Critical
Monitoring         Implement metrics, logging, and alerting                Critical
Backups            Regular database backups with tested restores           High
Resource Limits    Set appropriate CPU and memory limits                   High
SSL/TLS            Enable HTTPS with valid certificates                    Critical
Rate Limiting      Implement limits to protect against quota exhaustion    High
Disaster Recovery  Document and test recovery procedures                   High
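
As one concrete illustration of the rate-limiting row: LiteLLM's proxy config accepts per-deployment rpm and tpm values that the router uses to stay under provider quotas. The numbers below are placeholders; verify the current schema against the LiteLLM docs.

litellm_config.yaml YAML
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      rpm: 500       # requests per minute budgeted for this deployment
      tpm: 300000    # tokens per minute budgeted for this deployment
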
⚠️ Security Warning

Never commit API keys or secrets to version control. Use Kubernetes Secrets, Docker secrets, or external secret management tools like HashiCorp Vault. Rotate credentials regularly and implement proper access controls for production environments.