🚀 Beginner-Friendly Tutorial

LiteLLM Proxy Getting Started

Get the LiteLLM proxy running in under 10 minutes with this beginner's guide. It covers installation, configuration, multi-provider setup, API integration, and best practices for building AI applications on a single unified interface across major LLM providers.

1. Install  →  2. Configure  →  3. Run Proxy  →  4. Test API

Why Choose LiteLLM?

Understanding the benefits before getting started

LiteLLM is a lightweight, open-source library that provides a unified interface for calling multiple LLM providers using OpenAI's format. It eliminates the need to learn different APIs for each provider, making it incredibly easy to switch between models or use multiple providers in the same application. The built-in proxy server adds powerful features like load balancing, caching, and rate limiting.
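
For example, calling two different providers through the litellm Python SDK only requires changing the model string. A minimal sketch, assuming OPENAI_API_KEY and ANTHROPIC_API_KEY are already set in the environment:

unified_example.py Python
import litellm

# Same call, different providers -- only the model string changes
openai_response = litellm.completion(
    model="openai/gpt-4-turbo-preview",
    messages=[{"role": "user", "content": "Hello!"}],
)

anthropic_response = litellm.completion(
    model="anthropic/claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Both responses come back in the OpenAI response format
print(openai_response.choices[0].message.content)
print(anthropic_response.choices[0].message.content)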

🔌 Unified API

Call OpenAI, Anthropic, Google, Azure, AWS Bedrock, and 100+ other models through the exact same OpenAI-format API. Switching providers only means changing the model name, not your application code.

Quick Setup

Get running in under 5 minutes with pip install and a simple configuration file. No complex setup or infrastructure required for development and testing.

💰 Cost Optimization

Built-in response caching avoids paying again for repeated identical requests, and automatic fallbacks keep requests flowing when a provider fails. Track usage and costs across all providers in one place.

🔄 Load Balancing

Distribute requests across multiple providers automatically. Implement fallback chains for high availability and optimal performance.

📊 Analytics Dashboard

Track API usage, costs, and performance metrics with the built-in dashboard. Monitor token consumption across all providers.

🔐 Enterprise Security

API key management, rate limiting per user, and audit logging. Keep your LLM provider keys secure behind your proxy.
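
Several of the features above, such as caching and fallbacks, are switched on through the proxy's YAML config file (introduced in the Quick Start below). A hedged sketch, with key names following the LiteLLM docs; verify them against your installed version:

litellm_config.yaml YAML
litellm_settings:
  cache: true                       # cache responses to identical requests
  cache_params:
    type: redis                     # assumes a reachable Redis instance
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
  fallbacks:
    - gpt-4: ["claude-3-sonnet"]    # if gpt-4 fails, retry the request on claude-3-sonnet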

Quick Start Guide

Get your LiteLLM proxy running in minutes

1. Install LiteLLM

Install LiteLLM with proxy support using pip. The installation includes all core dependencies needed to run a fully-featured proxy server.

Terminal Bash
# Install LiteLLM with proxy support
pip install litellm[proxy]

# Verify installation
litellm --version

# Expected output: litellm version x.x.x
                        
2. Set Up API Keys

Configure your LLM provider API keys as environment variables. LiteLLM will automatically detect and use these keys for authentication with each provider.

Terminal Bash
# Set OpenAI API key
export OPENAI_API_KEY="sk-your-openai-key"

# Set Anthropic API key
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"

# Set Google AI key
export GEMINI_API_KEY="your-google-ai-key"

# Set Azure OpenAI credentials
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-resource.openai.azure.com"
                        
💡 Security Tip

Never commit API keys to version control. Store them in environment variables, use .env files (add to .gitignore), or use a secrets manager for production deployments.
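
For local development, one common approach (sketched below; adapt it to your shell and tooling) is to keep the keys in a .env file that never enters version control and load it before starting the proxy:

Terminal Bash
# Keep keys in a local .env file
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
EOF

# Make sure it never gets committed
echo ".env" >> .gitignore

# Load the variables into the current shell before starting the proxy
set -a; source .env; set +a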

3. Create Configuration File

Create a YAML configuration file that defines which models to expose through your proxy. This file maps friendly model names to actual provider models.

litellm_config.yaml YAML
model_list:
  # OpenAI models
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY

  - model_name: gpt-35
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

  # Anthropic models
  - model_name: claude-3-opus
    litellm_params:
      model: anthropic/claude-3-opus-20240229
      api_key: os.environ/ANTHROPIC_API_KEY

  - model_name: claude-3-sonnet
    litellm_params:
      model: anthropic/claude-3-sonnet-20240229
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: sk-1234  # Proxy authentication key
  database_url: os.environ/DATABASE_URL  # Optional: enables persistent keys and spend tracking
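
To get the load balancing described earlier, you can map the same model_name to more than one deployment; the proxy then spreads requests across them. A sketch, with a placeholder Azure deployment name and router_settings keys taken from the LiteLLM docs (verify against your version):

litellm_config.yaml YAML
model_list:
  # Two deployments behind one public name -- requests are distributed across them
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY

  - model_name: gpt-4
    litellm_params:
      model: azure/your-gpt-4-deployment    # placeholder Azure deployment name
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

router_settings:
  routing_strategy: simple-shuffle          # default strategy; others are documented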
                        
4. Start the Proxy Server

Launch your LiteLLM proxy server using the configuration file. The server will start on port 4000 by default and provide an OpenAI-compatible API.

Terminal Bash
# Start the proxy server
litellm --config litellm_config.yaml

# Or specify a custom port
litellm --config litellm_config.yaml --port 8080

# Server will start at http://localhost:4000
# API endpoint: http://localhost:4000/v1/chat/completions
                        
📋 Access Your Proxy

Your proxy is now running at http://localhost:4000. The master key you configured (sk-1234) is required for all API requests; pass it as an Authorization: Bearer sk-1234 header.
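
Before sending a chat request, you can confirm the proxy picked up your config by listing the models it exposes; /v1/models is part of the proxy's OpenAI-compatible surface:

Terminal Bash
# List the models the proxy is serving
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer sk-1234"

# The response should include the model_name entries from litellm_config.yaml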

5. Test Your Proxy

Make your first API request through the proxy to verify everything is working correctly. Use curl or any HTTP client to send a chat completion request.

Test Request Bash
# Test with curl
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
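
A successful call returns an OpenAI-format chat completion. The shape looks roughly like the sketch below (abbreviated, with illustrative values; exact fields vary by provider and LiteLLM version):

Response JSON
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 9, "total_tokens": 18}
}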
                        

Supported Providers

LiteLLM supports 100+ models from major providers

Provider        Models                             Configuration (model string)
OpenAI          GPT-4, GPT-3.5, DALL-E, Whisper    openai/gpt-4-turbo-preview
Anthropic       Claude 3 Opus, Sonnet, Haiku       anthropic/claude-3-opus-20240229
Google AI       Gemini Pro, PaLM 2                 gemini/gemini-pro
Azure OpenAI    GPT-4, GPT-3.5 (Azure)             azure/gpt-4-deployment-name
AWS Bedrock     Claude, Llama 2, Titan             bedrock/anthropic.claude-3
Cohere          Command, Embed, Summarize          cohere/command-r-plus
Replicate       Llama 2, Mistral, Vicuna           replicate/meta/llama-2-70b
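
The Configuration column is the value that goes in the model field of a model_list entry. For example, exposing Gemini through the proxy could look like this (the friendly model_name is your choice):

litellm_config.yaml YAML
model_list:
  - model_name: gemini-pro               # the name clients will request
    litellm_params:
      model: gemini/gemini-pro           # provider/model string from the table above
      api_key: os.environ/GEMINI_API_KEY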

API Usage Examples

Learn how to use the proxy with your applications

Python SDK

Use the OpenAI Python SDK with your LiteLLM proxy by simply changing the base_url parameter.

python_example.py Python
from openai import OpenAI

# Point to your LiteLLM proxy
client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="sk-1234"  # Your proxy master key
)

# Make requests exactly like OpenAI
response = client.chat.completions.create(
    model="gpt-4",  # Use model names from your config
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
                        

Streaming Responses

Enable streaming for better user experience with long responses.

streaming_example.py Python
stream = client.chat.completions.create(
    model="claude-3-opus",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
                        
⚠️ Production Deployment

For production, configure proper authentication, enable SSL/TLS, set up a database for persistence, implement rate limiting, and deploy with Docker or Kubernetes. See the deployment guides for detailed instructions.
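
As a starting point for container deployments, the official LiteLLM image can be run with your config mounted in. This is a sketch based on the LiteLLM deployment docs (image name and flags may change; check the current guide):

Terminal Bash
# Run the proxy in Docker with your config file and provider keys
docker run -d \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY -e ANTHROPIC_API_KEY \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml --port 4000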