AI API Proxy Documentation
Complete API reference documentation for AI API Proxy implementations. Version 2026.3
🚀 Quick Start
Get started in minutes with these basic examples:
# Install the SDK
npm install ai-api-proxy
// Make your first API call
import { AIProxyClient } from 'ai-api-proxy';

const client = new AIProxyClient({
  apiKey: 'your_api_key_here',
  baseURL: 'https://api.example.com/v1'
});

const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
Overview
The AI API Proxy provides a unified interface for accessing multiple AI providers through a single API. It handles authentication, rate limiting, request routing, and response formatting for you.
🔑 Authentication
All API requests require authentication using API keys. Include your API key in the request headers:
GET /v1/models
Authorization: Bearer YOUR_API_KEY
X-API-Key: YOUR_API_KEY
API keys can be obtained from your dashboard. Keep them secure and never expose them in client-side code.
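The header layout above can be produced by a small server-side helper. This is a minimal sketch, not part of the SDK: `buildAuthHeaders` and `listModels` are hypothetical names, and sending both headers on every request is an assumption based on the request shown above.

```javascript
// Hypothetical helper: builds the two authentication headers shown above.
function buildAuthHeaders(apiKey) {
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    throw new Error('API key is missing');
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    'X-API-Key': apiKey,
  };
}

// Assumed usage: call GET {baseURL}/models with the headers attached.
// Run this on the server so the key never reaches client-side code.
async function listModels(baseURL, apiKey) {
  const response = await fetch(`${baseURL}/models`, {
    method: 'GET',
    headers: buildAuthHeaders(apiKey),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Reading the key from an environment variable (for example `process.env.AI_PROXY_API_KEY`, an illustrative name) keeps it out of source control.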
⚡ Rate Limiting
The API implements rate limiting to ensure fair usage. Limits are applied per API key and IP address.
| Limit Type | Requests | Window | Description |
|---|---|---|---|
| Standard | 60 | 1 minute | Basic rate limit for all endpoints |
| Chat Completions | 20 | 1 minute | Specific limit for chat endpoints |
| Embeddings | 100 | 1 minute | Higher limit for embeddings |
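A client can stay under these limits by throttling itself before sending. The sketch below is one possible client-side approach, a sliding-window counter; it is an assumption that this matches how the server counts requests, and `SlidingWindowLimiter` is an illustrative name.

```javascript
// Sliding-window limiter: allows at most `limit` calls per `windowMs`.
class SlidingWindowLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.timestamps = [];
  }

  // Returns 0 if a request may be sent now (and records it),
  // otherwise the number of milliseconds to wait before trying again.
  tryAcquire() {
    const current = this.now();
    // Drop timestamps that have left the window
    while (this.timestamps.length > 0 && current - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(current);
      return 0;
    }
    return this.windowMs - (current - this.timestamps[0]);
  }
}

// Limit taken from the table above: 20 chat completion requests per minute.
const chatLimiter = new SlidingWindowLimiter(20, 60_000);
```

Call `chatLimiter.tryAcquire()` before each chat request and wait for the returned number of milliseconds when it is not zero.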
API Endpoints
The API follows RESTful conventions and returns JSON responses. All endpoints are prefixed with /v1/.
Create chat completions with AI models. Supports streaming responses.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | ID of the model to use (e.g., gpt-4, claude-3) |
| messages | array | Required | Array of message objects |
| temperature | number | Optional | Sampling temperature (0-2) |
| max_tokens | number | Optional | Maximum tokens to generate |
| stream | boolean | Optional | Enable streaming responses |
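Checking a request body locally against the parameter table can catch mistakes before they cost a request. The following is a rough sketch; `validateChatRequest` is a hypothetical helper, and the rule that `max_tokens` must be a positive integer is an assumption (the table only says "number").

```javascript
// Validates a chat completion request body against the parameter table above.
// Returns an array of problem descriptions; an empty array means no problems found.
function validateChatRequest(body) {
  const problems = [];
  if (typeof body.model !== 'string' || body.model === '') {
    problems.push('model is required and must be a string');
  }
  if (!Array.isArray(body.messages)) {
    problems.push('messages is required and must be an array');
  }
  if (body.temperature !== undefined) {
    const t = body.temperature;
    if (typeof t !== 'number' || Number.isNaN(t) || t < 0 || t > 2) {
      problems.push('temperature must be a number between 0 and 2');
    }
  }
  // Assumption: max_tokens is treated as a positive integer
  if (body.max_tokens !== undefined) {
    if (!Number.isInteger(body.max_tokens) || body.max_tokens <= 0) {
      problems.push('max_tokens must be a positive integer');
    }
  }
  if (body.stream !== undefined && typeof body.stream !== 'boolean') {
    problems.push('stream must be a boolean');
  }
  return problems;
}
```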
Request Example
{
  "model": "gpt-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
Response Example
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
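Most callers only need the assistant text, the finish reason, and the token count from this response. A minimal sketch of pulling those fields out, assuming the response has the shape shown above; `summarizeCompletion` is an illustrative name.

```javascript
// Extracts the commonly used fields from a non-streaming chat completion response.
function summarizeCompletion(response) {
  const choice = response.choices && response.choices[0];
  if (!choice) {
    throw new Error('Response contains no choices');
  }
  return {
    text: choice.message.content,
    finishReason: choice.finish_reason,
    // usage may be absent, so fall back to null rather than failing
    totalTokens: response.usage ? response.usage.total_tokens : null,
  };
}
```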
Generate embeddings for input text using AI models.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | ID of the embedding model |
| input | string/array | Required | Text to embed (string or array of strings) |
| encoding_format | string | Optional | Format of embeddings (float, base64) |
Response Example
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        ... (1536 floats total for text-embedding-ada-002)
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
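A common next step is comparing two embeddings. The sketch below assumes the `float` encoding format and the response shape shown above; `extractEmbeddings` and `cosineSimilarity` are illustrative helpers, and cosine similarity is one typical comparison measure, not something this API prescribes.

```javascript
// Pulls the vectors out of an embeddings response, ordered by their index field.
function extractEmbeddings(response) {
  return [...response.data]
    .sort((x, y) => x.index - y.index)
    .map((item) => item.embedding);
}

// Cosine similarity of two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Passing an array of strings as `input` and then calling `extractEmbeddings` should give the vectors in input order, provided each `index` refers to the position in the input array (an assumption here).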
List available AI models and their capabilities.
Response Example
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai",
      "permission": [...],
      "root": "gpt-4",
      "parent": null,
      "context_length": 8192,
      "capabilities": ["chat", "completion"]
    },
    {
      "id": "claude-3-opus",
      "object": "model",
      "created": 1680000000,
      "owned_by": "anthropic",
      "permission": [...],
      "root": "claude-3-opus",
      "parent": null,
      "context_length": 200000,
      "capabilities": ["chat", "reasoning"]
    }
  ]
}
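The `capabilities` and `context_length` fields make it possible to pick a model programmatically. A small sketch, assuming the list response has the shape shown above; `findModels` is a hypothetical helper.

```javascript
// Returns the IDs of models that advertise `capability` and whose
// context_length is at least `minContextLength`.
function findModels(modelList, capability, minContextLength = 0) {
  return modelList.data
    .filter((m) => Array.isArray(m.capabilities) && m.capabilities.includes(capability))
    .filter((m) => m.context_length >= minContextLength)
    .map((m) => m.id);
}
```

For example, `findModels(list, 'chat', 100000)` keeps only chat-capable models with a context window of at least 100,000 tokens.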
Error Codes
The API uses conventional HTTP response codes to indicate success or failure of requests.
- The request was malformed or missing required parameters.
- Invalid or missing authentication credentials.
- Valid credentials but insufficient permissions.
- Rate limit exceeded. Check rate limit headers.
- Server-side error. Contact support if persistent.
- Upstream provider unavailable. Retry with exponential backoff.
Error Response Format
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "param": null,
    "code": "invalid_api_key"
  }
}
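Wrapping this error body in a typed exception lets calling code branch on `code` or `type` instead of parsing message strings. The following is a sketch under the assumption that every error response follows the format above; `ProxyApiError` and `toApiError` are illustrative names, not part of the SDK.

```javascript
// Error type carrying the fields of the error response format above.
class ProxyApiError extends Error {
  constructor(status, error) {
    super(error.message);
    this.name = 'ProxyApiError';
    this.status = status; // HTTP status code of the response
    this.type = error.type;
    this.code = error.code;
    this.param = error.param;
  }
}

// Converts an HTTP status and a parsed response body into a ProxyApiError.
// Falls back to a generic message when the body lacks the expected shape.
function toApiError(status, body) {
  const hasError = body && typeof body.error === 'object' && body.error !== null;
  const error = hasError
    ? body.error
    : { message: `Request failed with status ${status}`, type: null, code: null, param: null };
  return new ProxyApiError(status, error);
}
```

A caller might then write `if (err.code === 'invalid_api_key') { ... }` in its catch block.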
Best Practices
🔄 Retry Logic
Implement exponential backoff for retrying failed requests:
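async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const isLastAttempt = attempt === maxRetries - 1;
    try {
      const response = await fetch(url, options);
      // Retry only rate-limit (429) and server-side (5xx) responses
      const retryable = response.status === 429 || response.status >= 500;
      if (response.ok || !retryable || isLastAttempt) return response;
    } catch (error) {
      // Network failure: rethrow once all attempts are used
      if (isLastAttempt) throw error;
    }
    // Exponential backoff: 1s, 2s, 4s, ...
    const delay = Math.pow(2, attempt) * 1000;
    await new Promise(resolve => setTimeout(resolve, delay));
  }
}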
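When many clients retry on the same schedule, their retries can arrive in bursts. One common variation, not described in this reference, is to add random jitter and an upper bound to the delay. A minimal sketch with a hypothetical `backoffDelayWithJitter` helper; the 30-second cap is an arbitrary assumption.

```javascript
// Computes a backoff delay in milliseconds for a zero-based attempt number.
// The delay is drawn uniformly from [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayWithJitter(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

The result can replace the fixed `delay` value inside a retry loop such as the one above.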
📊 Monitoring & Logging
- Log all API requests and responses (excluding sensitive data)
- Monitor rate limit usage using response headers
- Track token usage for cost optimization
- Implement alerting for error rate spikes
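Two of the points above, redacting sensitive data before logging and tracking token usage, can be sketched in a few lines. The names `redactHeaders` and `TokenUsageTracker` are hypothetical, and the set of headers treated as sensitive is an assumption based on the authentication headers used by this API.

```javascript
// Header names whose values should never appear in logs (compared case-insensitively).
const SENSITIVE_HEADERS = ['authorization', 'x-api-key'];

// Returns a copy of `headers` that is safe to log.
function redactHeaders(headers) {
  const safe = {};
  for (const [name, value] of Object.entries(headers)) {
    safe[name] = SENSITIVE_HEADERS.includes(name.toLowerCase()) ? '[REDACTED]' : value;
  }
  return safe;
}

// Accumulates request counts and token totals per model from API responses.
class TokenUsageTracker {
  constructor() {
    this.totals = new Map();
  }

  // Records one response; responses without a usage object are ignored.
  record(response) {
    if (!response.usage) return;
    const entry = this.totals.get(response.model) || { requests: 0, totalTokens: 0 };
    entry.requests += 1;
    entry.totalTokens += response.usage.total_tokens;
    this.totals.set(response.model, entry);
  }
}
```

Calling `tracker.record(response)` after each successful request gives per-model totals that can feed a cost dashboard or an alert threshold.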
Related Documentation
Explore these related guides for comprehensive API integration knowledge.
- API Gateway Proxy Guide: Comprehensive guide to API gateway proxy implementation
- AI API Gateway Tutorial: Step-by-step tutorial for building AI gateways
- OpenAI Gateway Setup: Specific setup guide for OpenAI gateway integration
- API Gateway Proxy Comparison: Comparison of different proxy solutions and architectures