Quick Install
Get LiteLLM running in under 5 minutes with pip installation
Easy Config
Simple YAML configuration for multiple AI providers
Deploy Ready
Production deployment with Docker and Kubernetes support
Installation Guide
Follow these steps to install and configure LiteLLM proxy on your system
Prerequisites Check
Before installing LiteLLM, ensure your system meets the following requirements. LiteLLM requires Python 3.8 or higher and works best inside a virtual environment. Checking these prerequisites first avoids the most common setup issues.
- Python 3.8+ installed on your system
- pip package manager (usually comes with Python)
- Virtual environment tool (venv or conda recommended)
- API keys for your chosen LLM providers
- At least 512MB of available RAM
# Check Python version (requires 3.8+)
python --version
# Create virtual environment
python -m venv litellm-env
# Activate virtual environment
source litellm-env/bin/activate # Linux/Mac
litellm-env\Scripts\activate # Windows
Install LiteLLM Package
Install LiteLLM using pip, the Python package manager. The base package includes all core dependencies needed to call LLM APIs. For proxy deployments, we recommend installing the proxy extras, which add the proxy server along with rate limiting, caching, and authentication middleware.
# Install LiteLLM with proxy support
pip install 'litellm[proxy]'
# Or install basic version
pip install litellm
# Verify installation
litellm --version
Set Environment Variables
Configure your API keys as environment variables. LiteLLM reads provider credentials from the environment, which keeps keys out of version control and makes it easy to use different credentials across development, staging, and production. Store your keys securely and never commit them to a repository.
# OpenAI API key
export OPENAI_API_KEY="sk-your-openai-key"
# Anthropic API key
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
# Azure OpenAI configuration
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-resource.openai.azure.com"
# Google AI / PaLM
export PALM_API_KEY="your-palm-key"
Create Configuration File
Create a configuration file to define your LLM providers, models, and proxy settings. The YAML configuration file provides a centralized way to manage all your settings, making it easy to version control and deploy across different environments. LiteLLM supports extensive configuration options for load balancing, fallbacks, rate limiting, and custom routing rules.
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-3
    litellm_params:
      model: anthropic/claude-3-opus-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-35
    litellm_params:
      model: azure/gpt-35-turbo
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
general_settings:
  master_key: sk-1234  # Proxy API key (change for production)
  database_url: os.environ/DATABASE_URL
Start the Proxy Server
Launch your LiteLLM proxy server with the configuration file. The server will start on port 4000 by default and provide an OpenAI-compatible API endpoint that your applications can use. You can customize the port, enable authentication, configure logging, and set up monitoring through additional command-line flags or configuration options.
# Start proxy with config file
litellm --config litellm_config.yaml
# Start with custom port
litellm --config litellm_config.yaml --port 8080
# Start with detailed logging
litellm --config litellm_config.yaml --detailed_debug
# Run with multiple worker processes
litellm --config litellm_config.yaml --num_workers 4
Your proxy is now running at http://localhost:4000. You can test it by making a request to http://localhost:4000/v1/chat/completions with your master key in the Authorization header. The proxy automatically routes requests to the appropriate LLM provider based on the model name specified in your configuration file.
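For example, here is a quick end-to-end test with the OpenAI Python SDK, a minimal sketch assuming the default port and the sk-1234 master key from the sample config above:
# Test the proxy with the standard OpenAI client
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000/v1",  # the LiteLLM proxy endpoint
    api_key="sk-1234",  # the master_key from litellm_config.yaml
)

response = client.chat.completions.create(
    model="gpt-4",  # a model_name from your config, not the provider's raw name
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)
print(response.choices[0].message.content)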
Configuration Options
Explore the comprehensive configuration options available for LiteLLM proxy
Essential Configuration Parameters
| Parameter | Description | Default |
|---|---|---|
| master_key | API key for authenticating requests to the proxy server | None |
| database_url | PostgreSQL connection URL for storing virtual keys and usage data | None |
| max_parallel_requests | Maximum number of concurrent requests to handle | 100 |
| request_timeout | Timeout in seconds for upstream LLM API requests | 600 |
| fallbacks | List of fallback models if primary model fails | Empty |
| cache | Enable response caching for repeated queries | false |
| success_callback | Callbacks to run on successful requests (e.g. logging integrations) | None |
| failure_callback | Callbacks to run on failed requests | None |
Supported LLM Providers
LiteLLM supports 100+ models from multiple providers through a unified API
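With the LiteLLM Python SDK, the unified API means the call shape never changes; only the model string does. A minimal sketch, assuming OPENAI_API_KEY and ANTHROPIC_API_KEY are exported:
import litellm

# Identical call shape for different providers; only the model string changes
openai_reply = litellm.completion(
    model="openai/gpt-4-turbo-preview",
    messages=[{"role": "user", "content": "Hello!"}],
)
anthropic_reply = litellm.completion(
    model="anthropic/claude-3-opus-20240229",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(openai_reply.choices[0].message.content)
print(anthropic_reply.choices[0].message.content)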
Key Features
Discover the powerful features that make LiteLLM the preferred choice for AI proxying
Automatic Load Balancing
Distribute requests across multiple providers and models automatically. LiteLLM intelligently routes traffic based on availability, cost, and performance metrics to optimize your AI operations.
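In the Python SDK, the same idea is exposed through the Router: deployments registered under one model_name are balanced automatically. A sketch assuming both sets of credentials are set; the Azure deployment name is hypothetical:
import os
from litellm import Router

# Two deployments share the model_name "gpt-4"; the Router spreads traffic across them
router = Router(model_list=[
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "openai/gpt-4-turbo-preview",
            "api_key": os.environ["OPENAI_API_KEY"],
        },
    },
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/my-gpt4-deployment",  # hypothetical Azure deployment name
            "api_key": os.environ["AZURE_API_KEY"],
            "api_base": os.environ["AZURE_API_BASE"],
        },
    },
])

response = router.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hi"}],
)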
Response Caching
Reduce API costs by up to 70% with intelligent response caching. Cache identical or similar queries and serve instant responses without hitting upstream providers, dramatically improving latency.
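A minimal sketch of enabling this in the Python SDK with an in-process cache; Redis and semantic caching can be configured instead, and import paths may differ across LiteLLM versions:
import litellm
from litellm.caching import Cache

litellm.cache = Cache()  # in-memory cache; Cache(type="redis", ...) for a shared one

# The second identical call is answered from cache instead of the provider
for _ in range(2):
    reply = litellm.completion(
        model="openai/gpt-4-turbo-preview",
        messages=[{"role": "user", "content": "What is LiteLLM?"}],
        caching=True,
    )
print(reply.choices[0].message.content)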
Authentication & Security
Secure your proxy with API key authentication, rate limiting, and usage tracking. Set up virtual keys for different teams and control access with fine-grained permissions.
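Virtual keys are issued through the proxy's /key/generate endpoint. A sketch using requests, assuming a local proxy, the sk-1234 master key, and a database configured for key storage:
import requests

# Ask the proxy to mint a scoped virtual key for a team
resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # the proxy master key
    json={"models": ["gpt-4"], "max_budget": 10.0},  # restrict models and spend
)
print(resp.json()["key"])  # distribute this virtual key instead of the master key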
Usage Analytics
Track token usage, costs, and performance metrics across all your AI applications. Generate detailed reports and set up budget alerts to control spending.
Fallback & Retry
Ensure high availability with automatic fallback to backup models and intelligent retry logic. Handle provider outages gracefully without affecting your applications.
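Fallbacks and retries can be declared on the Router as well. A sketch in which repeated gpt-4 failures are retried, then rerouted to claude-3:
import os
from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-4",
         "litellm_params": {"model": "openai/gpt-4-turbo-preview",
                            "api_key": os.environ["OPENAI_API_KEY"]}},
        {"model_name": "claude-3",
         "litellm_params": {"model": "anthropic/claude-3-opus-20240229",
                            "api_key": os.environ["ANTHROPIC_API_KEY"]}},
    ],
    num_retries=2,  # retry transient provider errors first
    fallbacks=[{"gpt-4": ["claude-3"]}],  # then fail over to claude-3
)

response = router.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hi"}],
)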
OpenAI-Compatible API
Use the standard OpenAI SDK and API format with any LLM provider. No code changes required when switching between models or providers in your applications.
Production Deployment
Choose the deployment method that best fits your infrastructure requirements
For production deployments, always:
- Configure SSL/TLS certificates
- Set up proper authentication
- Implement rate limiting
- Enable logging and monitoring
- Use managed databases for persistence
- Configure health checks and auto-restart policies

Test your deployment thoroughly before going live to ensure reliability and security.