Why Choose LiteLLM?
Understanding the benefits before getting started
LiteLLM is a lightweight, open-source library that provides a unified interface for calling multiple LLM providers using OpenAI's format. It eliminates the need to learn different APIs for each provider, making it incredibly easy to switch between models or use multiple providers in the same application. The built-in proxy server adds powerful features like load balancing, caching, and rate limiting.
Unified API
Call OpenAI, Anthropic, Google, Azure, AWS Bedrock, and 100+ models using the exact same OpenAI-format API. Zero code changes needed when switching providers.
Quick Setup
Get running in under 5 minutes with pip install and a simple configuration file. No complex setup or infrastructure required for development and testing.
Cost Optimization
Built-in response caching cuts API costs by serving repeated requests from the cache instead of the provider. Automatic fallbacks keep requests flowing when a provider fails. Track usage and costs across all providers in one place.
Load Balancing
Distribute requests across multiple providers automatically. Implement fallback chains for high availability and optimal performance.
Analytics Dashboard
Track API usage, costs, and performance metrics with the built-in dashboard. Monitor token consumption across all providers.
Enterprise Security
API key management, rate limiting per user, and audit logging. Keep your LLM provider keys secure behind your proxy.
Quick Start Guide
Get your LiteLLM proxy running in minutes
Install LiteLLM
Install LiteLLM with proxy support using pip. The installation includes all core dependencies needed to run a fully-featured proxy server.
```bash
# Install LiteLLM with proxy support
# (quotes prevent shells like zsh from expanding the brackets)
pip install 'litellm[proxy]'

# Verify installation
litellm --version
# Expected output: litellm version x.x.x
```
Set Up API Keys
Configure your LLM provider API keys as environment variables. LiteLLM will automatically detect and use these keys for authentication with each provider.
```bash
# Set OpenAI API key
export OPENAI_API_KEY="sk-your-openai-key"

# Set Anthropic API key
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"

# Set Google AI key
export GEMINI_API_KEY="your-google-ai-key"

# Set Azure OpenAI credentials
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-resource.openai.azure.com"
```
Never commit API keys to version control. Store them in environment variables, use .env files (add to .gitignore), or use a secrets manager for production deployments.
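If you go the .env route for your own Python code, the python-dotenv package (an assumption here, not a LiteLLM requirement) loads the same keys into the environment before anything else runs. A minimal sketch:

```python
# Sketch: loading provider keys from a local .env file.
# Assumes python-dotenv is installed (pip install python-dotenv)
# and that .env is listed in .gitignore.
import os

from dotenv import load_dotenv

load_dotenv()  # reads KEY=value pairs from ./.env into os.environ

# Fail fast if a required key is missing
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY missing from .env"
```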
Create Configuration File
Create a YAML configuration file that defines which models to expose through your proxy. This file maps friendly model names to actual provider models.
```yaml
model_list:
  # OpenAI models
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-35
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

  # Anthropic models
  - model_name: claude-3-opus
    litellm_params:
      model: anthropic/claude-3-opus-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-sonnet
    litellm_params:
      model: anthropic/claude-3-sonnet-20240229
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: sk-1234                    # Proxy authentication key
  database_url: os.environ/DATABASE_URL  # Optional: for persistence
```
Start the Proxy Server
Launch your LiteLLM proxy server using the configuration file. The server will start on port 4000 by default and provide an OpenAI-compatible API.
```bash
# Start the proxy server
litellm --config litellm_config.yaml

# Or specify a custom port
litellm --config litellm_config.yaml --port 8080

# Server will start at http://localhost:4000
# API endpoint: http://localhost:4000/v1/chat/completions
```
Your proxy is now running at http://localhost:4000. The master key you configured (sk-1234) is required for all API requests; include it as an Authorization: Bearer sk-1234 header on every request.
Test Your Proxy
Make your first API request through the proxy to verify everything is working correctly. Use curl or any HTTP client to send a chat completion request.
```bash
# Test with curl
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
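If you'd rather verify from Python, here is a small sketch (assuming the requests package is installed) that lists the models your proxy exposes via its OpenAI-compatible /v1/models endpoint:

```python
# Sanity check: list the models the proxy exposes.
# Assumes the proxy is running locally and `requests` is installed.
import requests

resp = requests.get(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer sk-1234"},  # your master key
    timeout=10,
)
resp.raise_for_status()

# Each entry's "id" should match a model_name from litellm_config.yaml
print([m["id"] for m in resp.json()["data"]])
```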
Supported Providers
LiteLLM supports 100+ models from major providers
| Provider | Models | Example Model String |
|---|---|---|
| OpenAI | GPT-4, GPT-3.5, DALL-E, Whisper | openai/gpt-4-turbo-preview |
| Anthropic | Claude 3 Opus, Sonnet, Haiku | anthropic/claude-3-opus-20240229 |
| Google AI | Gemini Pro, PaLM 2 | gemini/gemini-pro |
| Azure OpenAI | GPT-4, GPT-3.5 (Azure) | azure/gpt-4-deployment-name |
| AWS Bedrock | Claude, Llama 2, Titan | bedrock/anthropic.claude-3-sonnet-20240229-v1:0 |
| Cohere | Command, Embed, Summarize | cohere/command-r-plus |
| Replicate | Llama 2, Mistral, Vicuna | replicate/meta/llama-2-70b |
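The same provider-prefixed model strings also work with the LiteLLM Python SDK directly, without running the proxy. A minimal sketch, assuming the relevant provider keys are exported as in step 2:

```python
# Sketch: calling two different providers through the LiteLLM SDK.
# Assumes OPENAI_API_KEY and ANTHROPIC_API_KEY are set in the environment.
from litellm import completion

# The provider prefix in the model string selects the backend;
# the request and response stay in OpenAI's format either way.
for model in ["openai/gpt-3.5-turbo", "anthropic/claude-3-sonnet-20240229"]:
    response = completion(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one word."}],
    )
    print(model, "->", response.choices[0].message.content)
```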
API Usage Examples
Learn how to use the proxy with your applications
Python SDK
Use the OpenAI Python SDK with your LiteLLM proxy by simply changing the base_url parameter.
```python
from openai import OpenAI

# Point to your LiteLLM proxy
client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="sk-1234"  # Your proxy master key
)

# Make requests exactly like OpenAI
response = client.chat.completions.create(
    model="gpt-4",  # Use model names from your config
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
```
Streaming Responses
Enable streaming for better user experience with long responses.
```python
stream = client.chat.completions.create(
    model="claude-3-opus",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    # Guard: the final chunk may carry no choices or an empty delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
For production, configure proper authentication, enable SSL/TLS, set up a database for persistence, implement rate limiting, and deploy with Docker or Kubernetes. See the deployment guides for detailed instructions.