AI API Proxy Documentation

Complete API reference documentation for AI API Proxy implementations. Version 2026.3

🚀 Quick Start

Get started in minutes with these basic examples:

bash

# Install the SDK
npm install ai-api-proxy

javascript

// Make your first API call
import { AIProxyClient } from 'ai-api-proxy';

const client = new AIProxyClient({
    apiKey: 'your_api_key_here',
    baseURL: 'https://api.example.com/v1'
});

const response = await client.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello!' }]
});

Overview

The AI API Proxy provides a unified interface for accessing multiple AI providers through a single API. It handles authentication, rate limiting, request routing, and response formatting for you.

🔑 Authentication

All API requests require authentication using API keys. Include your API key in the request headers:

http

GET /v1/models
Authorization: Bearer YOUR_API_KEY
X-API-Key: YOUR_API_KEY

API keys can be obtained from your dashboard. Keep them secure and never expose them in client-side code.
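
One way to keep the key out of client-side code is to read it from the server's environment and build the headers there. The sketch below assumes a Node.js runtime; the helper name `buildAuthHeaders` and the environment variable `AI_PROXY_API_KEY` are hypothetical names chosen for illustration. The example request above shows both headers without saying whether one alone is enough, so the sketch sends both.

```javascript
// Build the authentication headers shown above from a key held server-side.
// buildAuthHeaders and AI_PROXY_API_KEY are hypothetical names for this sketch.
function buildAuthHeaders(apiKey) {
    if (typeof apiKey !== 'string' || apiKey.trim() === '') {
        throw new Error('Missing API key: set it on the server, never in client-side code');
    }
    return {
        'Authorization': `Bearer ${apiKey}`,
        'X-API-Key': apiKey
    };
}

// Example usage (Node.js 18+ provides a global fetch):
// const headers = buildAuthHeaders(process.env.AI_PROXY_API_KEY);
// const response = await fetch('https://api.example.com/v1/models', { headers });
```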

⚡ Rate Limiting

The API implements rate limiting to ensure fair usage. Limits are applied per API key and IP address.

Limit Type	Requests	Window	Description
Standard	60	1 minute	Basic rate limit for all endpoints
Chat Completions	20	1 minute	Specific limit for chat endpoints
Embeddings	100	1 minute	Higher limit for embeddings
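
To stay under these limits, a client can throttle itself before sending. The following is a minimal sketch of a client-side sliding-window limiter, not the proxy's own algorithm: the documentation does not say whether the server counts with fixed or sliding windows, and the class and method names are illustrative.

```javascript
// Client-side sliding-window limiter: allows at most `limit` calls per `windowMs`.
// The numbers come from the table above; the class itself is an illustrative sketch.
class SlidingWindowLimiter {
    constructor(limit, windowMs, now = () => Date.now()) {
        this.limit = limit;
        this.windowMs = windowMs;
        this.now = now;          // injectable clock, useful for testing
        this.timestamps = [];    // times of calls still inside the window
    }

    // Returns 0 if a request may be sent now (and records it),
    // otherwise the number of milliseconds to wait before trying again.
    tryAcquire() {
        const t = this.now();
        // Drop calls that have left the window
        while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
            this.timestamps.shift();
        }
        if (this.timestamps.length < this.limit) {
            this.timestamps.push(t);
            return 0;
        }
        return this.timestamps[0] + this.windowMs - t;
    }
}

// Chat Completions: 20 requests per 1 minute
const chatLimiter = new SlidingWindowLimiter(20, 60000);
```

Before each chat request, call `chatLimiter.tryAcquire()` and sleep for the returned number of milliseconds when it is non-zero. A server-side 429 can still occur, for example when several clients share one key.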

API Endpoints

The API follows RESTful conventions and returns JSON responses. All endpoints are prefixed with /v1/.

POST /v1/chat/completions

Create chat completions with AI models. Supports streaming responses.

Request Parameters

Parameter	Type	Required	Description
`model`	string	Required	ID of the model to use (e.g., gpt-4, claude-3)
`messages`	array	Required	Array of message objects
`temperature`	number	Optional	Sampling temperature (0-2)
`max_tokens`	number	Optional	Maximum tokens to generate
`stream`	boolean	Optional	Enable streaming responses
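
Checking a request body against this table before sending it avoids a round trip that would end in a 400. The sketch below is a hypothetical helper, not part of the SDK. Two of its checks are assumptions beyond what the table states: that `messages` must be non-empty and that `max_tokens` must be a positive integer.

```javascript
// Check a chat completion request body against the parameter table above.
// Returns a list of problems; an empty list means the body passed these checks.
function validateChatRequest(body) {
    const problems = [];
    if (typeof body.model !== 'string' || body.model === '') {
        problems.push('model is required and must be a string');
    }
    // Assumption: an empty messages array is treated as invalid
    if (!Array.isArray(body.messages) || body.messages.length === 0) {
        problems.push('messages is required and must be a non-empty array');
    }
    if (body.temperature !== undefined &&
        (!Number.isFinite(body.temperature) || body.temperature < 0 || body.temperature > 2)) {
        problems.push('temperature must be a number between 0 and 2');
    }
    // Assumption: max_tokens is a positive integer (the table only says "number")
    if (body.max_tokens !== undefined &&
        (!Number.isInteger(body.max_tokens) || body.max_tokens < 1)) {
        problems.push('max_tokens must be a positive integer');
    }
    if (body.stream !== undefined && typeof body.stream !== 'boolean') {
        problems.push('stream must be a boolean');
    }
    return problems;
}
```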

Request Example

json

{
  "model": "gpt-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}

Response Example

200 OK Success Response

json

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
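
Most callers only need the reply text and the token counts from this structure. A minimal sketch of extracting them, assuming the response has the shape shown in the example (the function name `summarizeCompletion` is hypothetical):

```javascript
// Pull the assistant text and token usage out of a chat completion response
// shaped like the example above.
function summarizeCompletion(completion) {
    const choice = completion.choices && completion.choices[0];
    if (!choice || !choice.message) {
        throw new Error('Response contains no choices');
    }
    return {
        text: choice.message.content,
        finishReason: choice.finish_reason,
        // usage may be absent on some responses, so guard the lookup
        totalTokens: completion.usage ? completion.usage.total_tokens : undefined
    };
}
```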

POST /v1/embeddings

Generate embeddings for input text using AI models.

Request Parameters

Parameter	Type	Required	Description
`model`	string	Required	ID of the embedding model
`input`	string/array	Required	Text to embed (string or array of strings)
`encoding_format`	string	Optional	Format of embeddings (float, base64)

Response Example

200 OK Success Response

json

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        ... (1536 floats total for text-embedding-ada-002)
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
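
When `input` is an array, each entry in `data` carries an `index`. The sketch below pairs every embedding with its source text by that field instead of relying on array order. It assumes `index` refers to the position in the submitted `input` array, which the example suggests but the documentation does not state outright; the function name is hypothetical.

```javascript
// Match each embedding in a /v1/embeddings response back to its input text,
// using the `index` field rather than relying on array order.
// Assumption: `index` is the position of the text in the submitted input array.
function pairEmbeddings(input, response) {
    const inputs = Array.isArray(input) ? input : [input];
    const paired = new Array(inputs.length);
    for (const item of response.data) {
        if (!Number.isInteger(item.index) || item.index < 0 || item.index >= inputs.length) {
            throw new Error(`Unexpected embedding index ${item.index}`);
        }
        paired[item.index] = { text: inputs[item.index], embedding: item.embedding };
    }
    return paired;
}
```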

GET /v1/models

List available AI models and their capabilities.

Response Example

200 OK Success Response

json

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai",
      "permission": [...],
      "root": "gpt-4",
      "parent": null,
      "context_length": 8192,
      "capabilities": ["chat", "completion"]
    },
    {
      "id": "claude-3-opus",
      "object": "model",
      "created": 1680000000,
      "owned_by": "anthropic",
      "permission": [...],
      "root": "claude-3-opus",
      "parent": null,
      "context_length": 200000,
      "capabilities": ["chat", "reasoning"]
    }
  ]
}
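
The `capabilities` and `context_length` fields make it possible to choose a model programmatically. A minimal sketch, assuming the response shape shown above (the helper name `findModels` is hypothetical):

```javascript
// Return the IDs of models that advertise a capability and offer at least
// `minContextLength` tokens of context, based on a /v1/models response.
function findModels(modelList, capability, minContextLength = 0) {
    return modelList.data
        .filter(m => Array.isArray(m.capabilities) && m.capabilities.includes(capability))
        .filter(m => (m.context_length || 0) >= minContextLength)
        .map(m => m.id);
}
```

With the example response, `findModels(list, 'chat', 100000)` selects only `claude-3-opus`, since `gpt-4` lists a context length of 8192.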

Error Codes

The API uses conventional HTTP response codes to indicate success or failure of requests.

400 Bad Request

The request was malformed or missing required parameters.

401 Unauthorized

Invalid or missing authentication credentials.

403 Forbidden

Valid credentials but insufficient permissions.

429 Too Many Requests

Rate limit exceeded. Check rate limit headers.

500 Internal Server Error

Server-side error. Contact support if persistent.

502 Bad Gateway

Upstream provider unavailable. Retry with exponential backoff.
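
The rate limit header names are not listed here. The sketch below therefore assumes the proxy may send the standard HTTP `Retry-After` header on 429 or 502 responses and falls back to exponential backoff when it is absent. Whether this API actually sends that header is an assumption, and the function name is hypothetical.

```javascript
// Decide how long to wait before retrying a 429 or 502 response.
// Assumption: the proxy may send the standard Retry-After header; when it
// does not, fall back to exponential backoff (1s, 2s, 4s, ...).
function retryDelayMs(retryAfter, attempt, now = Date.now()) {
    if (retryAfter) {
        const seconds = Number(retryAfter);
        if (Number.isFinite(seconds) && seconds >= 0) {
            return seconds * 1000;              // delay-seconds form
        }
        const date = Date.parse(retryAfter);    // HTTP-date form
        if (!Number.isNaN(date)) {
            return Math.max(0, date - now);
        }
    }
    return Math.pow(2, attempt) * 1000;
}

// Usage: retryDelayMs(response.headers.get('Retry-After'), attempt)
```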

Error Response Format

json

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "param": null,
    "code": "invalid_api_key"
  }
}
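
Combining the HTTP status with this body gives calling code one object to act on. The following is a minimal sketch with a hypothetical helper name. Treating 429 and 5xx responses as retryable follows the error code descriptions above; counting 500 among them is a judgment call.

```javascript
// Turn an HTTP status plus an error body in the format above into a single
// object that calling code can log or branch on.
function describeApiError(status, body) {
    const err = (body && body.error) || {};
    return {
        status,
        code: err.code || null,
        type: err.type || null,
        message: err.message || `HTTP ${status}`,
        // 400, 401, and 403 will fail again until the request or credentials
        // change; 429 and 5xx may succeed on a later attempt.
        retryable: status === 429 || status >= 500
    };
}
```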

Best Practices

🔄 Retry Logic

Implement exponential backoff for retrying failed requests:

javascript

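async function makeRequestWithRetry(url, options, maxRetries = 3) {
    for (let i = 0; i < maxRetries; i++) {
        const isLastAttempt = i === maxRetries - 1;
        try {
            const response = await fetch(url, options);
            // Retry only rate limits and server errors; 400, 401, and 403 will not succeed on retry
            const retryable = response.status === 429 || response.status >= 500;
            if (!retryable || isLastAttempt) return response;
        } catch (error) {
            // Network failure: rethrow once retries are exhausted
            if (isLastAttempt) throw error;
        }
        // Exponential backoff: 1s, 2s, 4s, ...
        const delay = Math.pow(2, i) * 1000;
        await new Promise(resolve => setTimeout(resolve, delay));
    }
}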

📊 Monitoring & Logging

Log all API requests and responses (excluding sensitive data)
Monitor rate limit usage using response headers
Track token usage for cost optimization
Implement alerting for error rate spikes
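
Token tracking can start from the `usage` object that chat and embedding responses already return. The sketch below is one illustrative way to accumulate it per model; the class name and the shape of its report are hypothetical.

```javascript
// Accumulate the `usage` object from each response, grouped by model.
class TokenUsageTracker {
    constructor() {
        this.byModel = new Map();
    }

    record(response) {
        if (!response || !response.usage) return;   // error bodies carry no usage
        const totals = this.byModel.get(response.model) ||
            { requests: 0, promptTokens: 0, completionTokens: 0, totalTokens: 0 };
        totals.requests += 1;
        totals.promptTokens += response.usage.prompt_tokens || 0;
        // The embeddings example has no completion_tokens field, so default to 0
        totals.completionTokens += response.usage.completion_tokens || 0;
        totals.totalTokens += response.usage.total_tokens || 0;
        this.byModel.set(response.model, totals);
    }

    // Plain object keyed by model ID, convenient for logging as JSON
    report() {
        return Object.fromEntries(this.byModel);
    }
}
```

Calling `record` on every successful response and logging `report()` periodically gives a per-model view of where tokens are being spent.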
