Why Choose LiteLLM?
Understanding the benefits before getting started
LiteLLM is a lightweight, open-source library that provides a unified interface for calling multiple LLM providers using OpenAI's format. It eliminates the need to learn different APIs for each provider, making it incredibly easy to switch between models or use multiple providers in the same application. The built-in proxy server adds powerful features like load balancing, caching, and rate limiting.
Unified API
Call OpenAI, Anthropic, Google, Azure, AWS Bedrock, and 100+ models using the exact same OpenAI-format API. Zero code changes needed when switching providers.
Quick Setup
Get running in under 5 minutes with pip install and a simple configuration file. No complex setup or infrastructure required for development and testing.
Cost Optimization
Built-in response caching cuts API costs by serving repeated requests from the cache instead of the provider. Automatic fallbacks keep requests flowing when a provider fails. Track usage and costs across all providers in one place.
Load Balancing
Distribute requests across multiple providers automatically. Implement fallback chains for high availability and optimal performance.
Analytics Dashboard
Track API usage, costs, and performance metrics with the built-in dashboard. Monitor token consumption across all providers.
Enterprise Security
API key management, rate limiting per user, and audit logging. Keep your LLM provider keys secure behind your proxy.
Quick Start Guide
Get your LiteLLM proxy running in minutes
Install LiteLLM
Install LiteLLM with proxy support using pip. The installation includes all core dependencies needed to run a fully-featured proxy server.
```bash
# Install LiteLLM with proxy support
# (quotes prevent shells like zsh from expanding the brackets)
pip install 'litellm[proxy]'

# Verify installation
litellm --version
# Expected output: litellm version x.x.x
```
Set Up API Keys
Configure your LLM provider API keys as environment variables. LiteLLM will automatically detect and use these keys for authentication with each provider.
```bash
# Set OpenAI API key
export OPENAI_API_KEY="sk-your-openai-key"

# Set Anthropic API key
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"

# Set Google AI key
export GEMINI_API_KEY="your-google-ai-key"

# Set Azure OpenAI credentials
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-resource.openai.azure.com"
```
Never commit API keys to version control. Store them in environment variables, use .env files (add to .gitignore), or use a secrets manager for production deployments.
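If you go the .env route for your own Python code, the python-dotenv package (an assumption here, not a LiteLLM requirement) loads the same keys into the environment before anything else runs. A minimal sketch:

```python
# Sketch: loading provider keys from a local .env file.
# Assumes python-dotenv is installed (pip install python-dotenv)
# and that .env is listed in .gitignore.
import os

from dotenv import load_dotenv

load_dotenv()  # reads KEY=value pairs from ./.env into os.environ

# Fail fast if a required key is missing
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY missing from .env"
```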
Create Configuration File
Create a YAML configuration file that defines which models to expose through your proxy. This file maps friendly model names to actual provider models.
```yaml
model_list:
  # OpenAI models
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-35
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

  # Anthropic models
  - model_name: claude-3-opus
    litellm_params:
      model: anthropic/claude-3-opus-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-sonnet
    litellm_params:
      model: anthropic/claude-3-sonnet-20240229
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: sk-1234                    # Proxy authentication key
  database_url: os.environ/DATABASE_URL  # Optional: for persistence
```
Start the Proxy Server
Launch your LiteLLM proxy server using the configuration file. The server will start on port 4000 by default and provide an OpenAI-compatible API.
```bash
# Start the proxy server
litellm --config litellm_config.yaml

# Or specify a custom port
litellm --config litellm_config.yaml --port 8080

# Server will start at http://localhost:4000
# API endpoint: http://localhost:4000/v1/chat/completions
```
Your proxy is now running at http://localhost:4000. The master key you configured (sk-1234) is required for all API requests; include it as an Authorization: Bearer sk-1234 header on every request.
Test Your Proxy
Make your first API request through the proxy to verify everything is working correctly. Use curl or any HTTP client to send a chat completion request.
```bash
# Test with curl
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
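If you'd rather verify from Python, here is a small sketch (assuming the requests package is installed) that lists the models your proxy exposes via its OpenAI-compatible /v1/models endpoint:

```python
# Sanity check: list the models the proxy exposes.
# Assumes the proxy is running locally and `requests` is installed.
import requests

resp = requests.get(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer sk-1234"},  # your master key
    timeout=10,
)
resp.raise_for_status()

# Each entry's "id" should match a model_name from litellm_config.yaml
print([m["id"] for m in resp.json()["data"]])
```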
Supported Providers
LiteLLM supports 100+ models from major providers
| Provider | Models | Example Model String |
|---|---|---|
| OpenAI | GPT-4, GPT-3.5, DALL-E, Whisper | openai/gpt-4-turbo-preview |
| Anthropic | Claude 3 Opus, Sonnet, Haiku | anthropic/claude-3-opus-20240229 |
| Google AI | Gemini Pro, PaLM 2 | gemini/gemini-pro |
| Azure OpenAI | GPT-4, GPT-3.5 (Azure) | azure/gpt-4-deployment-name |
| AWS Bedrock | Claude, Llama 2, Titan | bedrock/anthropic.claude-3-sonnet-20240229-v1:0 |
| Cohere | Command, Embed, Summarize | cohere/command-r-plus |
| Replicate | Llama 2, Mistral, Vicuna | replicate/meta/llama-2-70b |
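The same provider-prefixed model strings also work with the LiteLLM Python SDK directly, without running the proxy. A minimal sketch, assuming the relevant provider keys are exported as in step 2:

```python
# Sketch: calling two different providers through the LiteLLM SDK.
# Assumes OPENAI_API_KEY and ANTHROPIC_API_KEY are set in the environment.
from litellm import completion

# The provider prefix in the model string selects the backend;
# the request and response stay in OpenAI's format either way.
for model in ["openai/gpt-3.5-turbo", "anthropic/claude-3-sonnet-20240229"]:
    response = completion(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one word."}],
    )
    print(model, "->", response.choices[0].message.content)
```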
API Usage Examples
Learn how to use the proxy with your applications
Python SDK
Use the OpenAI Python SDK with your LiteLLM proxy by simply changing the base_url parameter.
```python
from openai import OpenAI

# Point to your LiteLLM proxy
client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="sk-1234"  # Your proxy master key
)

# Make requests exactly like OpenAI
response = client.chat.completions.create(
    model="gpt-4",  # Use model names from your config
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
```
Streaming Responses
Enable streaming for better user experience with long responses.
```python
stream = client.chat.completions.create(
    model="claude-3-opus",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    # Guard: the final chunk may carry no choices or an empty delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
For production, configure proper authentication, enable SSL/TLS, set up a database for persistence, implement rate limiting, and deploy with Docker or Kubernetes. See the deployment guides for detailed instructions.