Why Docker Deployment?
Understanding the benefits of containerizing your LiteLLM proxy
Docker deployment provides consistency, isolation, and portability for your LiteLLM proxy. Containers ensure your proxy runs identically across development, staging, and production environments. This eliminates environment-specific issues and simplifies the deployment process significantly. Containerization also enables easy scaling, rolling updates, and efficient resource utilization through orchestration platforms like Kubernetes or Docker Swarm.
Consistent Environments
Eliminate "works on my machine" problems. Docker ensures your LiteLLM proxy runs identically everywhere by packaging all dependencies, runtime, and configuration into a single container image.
Isolation & Security
Containers provide process isolation, limiting the blast radius of potential security issues. Each LiteLLM instance runs in its own isolated environment with controlled resource access.
Fast Deployment
Deploy new instances in seconds rather than minutes. Docker images start quickly, enabling rapid scaling in response to traffic changes and fast recovery from failures.
Easy Updates
Perform rolling updates with zero downtime. Pull new images and restart containers without affecting running traffic, ensuring continuous availability of your AI services.
Resource Efficiency
Containers share the host OS kernel, making them lightweight compared to VMs. Run more LiteLLM instances on the same hardware, optimizing resource utilization and reducing costs.
Simplified CI/CD
Integrate Docker builds into your CI/CD pipeline for automated testing and deployment. Every code change can trigger container builds and deployments automatically.
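As one illustration, a minimal GitHub Actions workflow that builds and pushes an image on every push to `main` (the registry path, image name, and action versions here are placeholders to adapt, not part of LiteLLM itself):

```yaml
name: build-litellm-image

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to the container registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push the proxy image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/your-org/litellm-proxy:${{ github.sha }}
```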
Quick Start
Get LiteLLM running in Docker in under 5 minutes
Option 1: Use Official Image
```bash
# Pull the official LiteLLM image
docker pull ghcr.io/berriai/litellm:main-latest

# Run with environment variables
docker run -d \
  --name litellm-proxy \
  -p 4000:4000 \
  -e OPENAI_API_KEY=sk-your-key \
  -e ANTHROPIC_API_KEY=sk-ant-your-key \
  -e LITELLM_MASTER_KEY=sk-master-key \
  ghcr.io/berriai/litellm:main-latest

# Test the proxy
curl http://localhost:4000/health
```
Option 2: Custom Dockerfile
```dockerfile
# Use the Python slim image for a smaller footprint
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install LiteLLM with proxy extras
RUN pip install --no-cache-dir "litellm[proxy]"

# Copy configuration file
COPY litellm_config.yaml /app/config.yaml

# Create non-root user
RUN useradd -m -u 1000 litellm && \
    chown -R litellm:litellm /app
USER litellm

# Expose port
EXPOSE 4000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:4000/health || exit 1

# Run LiteLLM proxy
CMD ["litellm", "--config", "/app/config.yaml", "--port", "4000"]
```
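The Dockerfile above copies a `litellm_config.yaml` into the image. A minimal configuration might look like the following sketch; the model names and the Anthropic model ID are illustrative, so substitute the models you actually route:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
```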
Using python:3.11-slim instead of full python image reduces container size from ~1GB to ~200MB. For even smaller images, consider using distroless or alpine-based images, though they require additional configuration for some LiteLLM dependencies.
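If image size matters further, a multi-stage build is one common approach: compile wheels in a full Python image, then install them into the slim runtime. A rough sketch under those assumptions:

```dockerfile
# Stage 1: build wheels in a full build environment
FROM python:3.11 AS builder
WORKDIR /wheels
RUN pip wheel --no-cache-dir --wheel-dir /wheels "litellm[proxy]"

# Stage 2: install from local wheels into the slim runtime
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir --no-index --find-links=/wheels "litellm[proxy]" \
    && rm -rf /wheels
COPY litellm_config.yaml /app/config.yaml
EXPOSE 4000
CMD ["litellm", "--config", "/app/config.yaml", "--port", "4000"]
```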
Docker Compose Setup
Orchestrate LiteLLM with databases, caches, and monitoring
```yaml
version: '3.8'

services:
  litellm:
    build: .
    container_name: litellm-proxy
    ports:
      - "4000:4000"
    environment:
      - DATABASE_URL=postgresql://litellm:password@postgres:5432/litellm
      - REDIS_URL=redis://redis:6379
      - LITELLM_MASTER_KEY=${LITELLM_MASTER_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./litellm_config.yaml:/app/config.yaml:ro
    depends_on:
      - postgres
      - redis
    restart: unless-stopped
    networks:
      - litellm-network

  postgres:
    image: postgres:15-alpine
    container_name: litellm-postgres
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - litellm-network

  redis:
    image: redis:7-alpine
    container_name: litellm-redis
    volumes:
      - redis_data:/data
    networks:
      - litellm-network

volumes:
  postgres_data:
  redis_data:

networks:
  litellm-network:
    driver: bridge
```
Run with Docker Compose
```bash
# Create a .env file with secrets
cat > .env << EOF
LITELLM_MASTER_KEY=sk-your-master-key
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-key
EOF

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f litellm

# Scale horizontally (first remove container_name and the fixed
# "4000:4000" host port from the litellm service, otherwise Compose
# cannot create additional replicas)
docker-compose up -d --scale litellm=3

# Stop services
docker-compose down
```
Kubernetes Deployment
Deploy LiteLLM to Kubernetes for production scalability
Create Namespace & Secrets
Set up a dedicated namespace and store API keys securely using Kubernetes Secrets.
- kubectl create namespace litellm
- kubectl create secret generic litellm-secrets
- Store API keys and master key
- Use sealed-secrets for GitOps
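The steps above, sketched as commands (the key values are placeholders):

```shell
# Dedicated namespace
kubectl create namespace litellm

# Store provider keys and the master key as a Secret
kubectl create secret generic litellm-secrets \
  --namespace litellm \
  --from-literal=OPENAI_API_KEY=sk-your-openai-key \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-your-key \
  --from-literal=LITELLM_MASTER_KEY=sk-your-master-key
```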
Deploy with ConfigMap
Create ConfigMap for LiteLLM configuration and Deployments for running instances.
- ConfigMap for config.yaml
- Deployment with replicas
- Resource limits and requests
- Liveness and readiness probes
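For example, the proxy configuration can be stored in a ConfigMap and mounted into the Deployment at `/app/config.yaml`; the embedded `config.yaml` below is a stub with an illustrative model entry:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: litellm-config
  namespace: litellm
data:
  config.yaml: |
    model_list:
      - model_name: gpt-4o
        litellm_params:
          model: openai/gpt-4o
          api_key: os.environ/OPENAI_API_KEY
```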
Configure Services
Create Service for internal communication and Ingress for external access.
- ClusterIP Service
- Ingress with TLS
- LoadBalancer for cloud
- Horizontal Pod Autoscaler
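A Horizontal Pod Autoscaler targeting the Deployment might look like the following; the replica bounds and CPU threshold are illustrative starting points, not tuned values:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: litellm-proxy
  namespace: litellm
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: litellm-proxy
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```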
Set Up Monitoring
Deploy Prometheus and Grafana for comprehensive monitoring and alerting.
- Prometheus metrics endpoint
- Grafana dashboards
- AlertManager rules
- Log aggregation with Loki
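If the proxy's Prometheus metrics endpoint is enabled in your LiteLLM configuration, a static Prometheus scrape job could look like this; the `/metrics` path, port, and in-cluster DNS name are assumptions to verify against your LiteLLM version and Service name:

```yaml
scrape_configs:
  - job_name: litellm-proxy
    metrics_path: /metrics
    static_configs:
      - targets: ['litellm-proxy.litellm.svc.cluster.local:4000']
```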
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm-proxy
  namespace: litellm
spec:
  replicas: 3
  selector:
    matchLabels:
      app: litellm-proxy
  template:
    metadata:
      labels:
        app: litellm-proxy
    spec:
      containers:
        - name: litellm
          image: ghcr.io/berriai/litellm:main-latest
          ports:
            - containerPort: 4000
          envFrom:
            - secretRef:
                name: litellm-secrets
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 4000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 5
```
Production Best Practices
Ensure reliability, security, and performance in production
Production Checklist
| Area | Requirement | Priority |
|---|---|---|
| Security | Use secrets management, not env vars in code | Critical |
| High Availability | Run minimum 3 replicas across availability zones | Critical |
| Monitoring | Implement metrics, logging, and alerting | Critical |
| Backups | Regular database backups with tested restore | High |
| Resource Limits | Set appropriate CPU and memory limits | High |
| SSL/TLS | Enable HTTPS with valid certificates | Critical |
| Rate Limiting | Implement to protect against quota exhaustion | High |
| Disaster Recovery | Document and test recovery procedures | High |
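For the rate-limiting item, LiteLLM's proxy config supports per-model request and token limits; a hedged sketch (the numbers are illustrative, and exact keys may vary by LiteLLM version, so verify against the proxy configuration docs):

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      rpm: 500      # requests per minute for this deployment
      tpm: 200000   # tokens per minute for this deployment
```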
Never commit API keys or secrets to version control. Use Kubernetes Secrets, Docker secrets, or external secret management tools like HashiCorp Vault. Rotate credentials regularly and implement proper access controls for production environments.