AI API Proxy
API Documentation

Complete reference for AI proxy API endpoints, authentication methods, request/response formats, and integration patterns for production deployments.

Base URL: https://api.aiproxy.io/v1

Authentication

All API requests require authentication using Bearer tokens. Include your API key in the Authorization header of each request.

curl https://api.aiproxy.io/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

API keys can be obtained from your dashboard. Keys are scoped to specific projects and can have different permission levels.

Note: Never expose your API key in client-side code or public repositories. Use environment variables or secret management services in production.
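Following the note above, a minimal sketch of loading the key from an environment variable and attaching the Bearer header, using only the Python standard library. The variable name `AIPROXY_API_KEY` is an assumption, not part of the API; the actual network call is commented out.

```python
import os
import urllib.request

# AIPROXY_API_KEY is a hypothetical env var name -- use whatever your
# secret manager or deployment environment provides.
api_key = os.environ.get("AIPROXY_API_KEY", "YOUR_API_KEY")

request = urllib.request.Request(
    "https://api.aiproxy.io/v1/models",
    headers={"Authorization": f"Bearer {api_key}"},
)
# urllib.request.urlopen(request) would send the authenticated request.
print(request.get_header("Authorization"))
```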

GET /models

List Available Models

Retrieves a list of all AI models available through the proxy, including their capabilities, context windows, and pricing information.

Request Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| provider | string | No | Filter by provider (openai, anthropic, google) |
| type | string | No | Filter by model type (chat, completion, embedding) |

Response

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4-turbo",
      "object": "model",
      "provider": "openai",
      "type": "chat",
      "context_window": 128000,
      "pricing": {
        "input": 0.01,
        "output": 0.03
      }
    }
  ]
}

Status Codes

| Code | Description |
|------|-------------|
| 200 | Successful response |
| 401 | Invalid API key |
| 500 | Server error |
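A hedged sketch of calling GET /models with the optional filters from the parameter table, stdlib only. The function builds the filtered URL; the actual request is commented out so the example runs without network access, and `AIPROXY_API_KEY` is an assumed env var name.

```python
import json
import os
import urllib.parse
import urllib.request

def list_models(provider=None, model_type=None):
    """Build a GET /models request using the documented query
    parameters (provider, type). Returns the final URL here; in real
    use you would uncomment the urlopen call and return the data."""
    params = {}
    if provider:
        params["provider"] = provider
    if model_type:
        params["type"] = model_type
    query = urllib.parse.urlencode(params)
    url = "https://api.aiproxy.io/v1/models" + ("?" + query if query else "")
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {os.environ.get('AIPROXY_API_KEY', '')}"},
    )
    # with urllib.request.urlopen(req) as resp:
    #     return json.load(resp)["data"]
    return req.full_url

print(list_models(provider="openai", model_type="chat"))
```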

POST /completions

Create Completion

Generates text completions based on a provided prompt. Supports multiple AI providers with automatic failover and load balancing.

Request Body

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID to use for completion |
| prompt | string or array | Yes | Input prompt(s) for completion |
| max_tokens | integer | No | Maximum tokens to generate (default: 16) |
| temperature | number | No | Sampling temperature 0-2 (default: 1) |
| provider | string | No | Preferred provider (auto-selected if omitted) |

Example Request

curl https://api.aiproxy.io/v1/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "prompt": "Explain quantum computing in simple terms",
    "max_tokens": 500,
    "temperature": 0.7
  }'

Response

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1699000000,
  "model": "gpt-4-turbo",
  "provider": "openai",
  "choices": [
    {
      "text": "Quantum computing is a revolutionary approach...",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 488,
    "total_tokens": 500
  }
}
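The curl request above can be reproduced in Python with the standard library alone. This is a sketch, not an official client: the payload mirrors the documented request body, the send is commented out, and `AIPROXY_API_KEY` is an assumed env var name.

```python
import json
import os
import urllib.request

# Build the POST /completions request documented above.
payload = {
    "model": "gpt-4-turbo",
    "prompt": "Explain quantum computing in simple terms",
    "max_tokens": 500,
    "temperature": 0.7,
}
req = urllib.request.Request(
    "https://api.aiproxy.io/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('AIPROXY_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     text = body["choices"][0]["text"]  # per the response shape above
```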

POST /chat/completions

Create Chat Completion

Creates a chat completion with conversation context. Supports multi-turn conversations, system prompts, and various message roles.

Request Body

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID for chat completion |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature 0-2 |
| stream | boolean | No | Enable streaming responses |

Example Request

{
  "model": "claude-3-opus",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is machine learning?"}
  ],
  "temperature": 0.8,
  "stream": false
}

Status Codes

| Code | Description |
|------|-------------|
| 200 | Chat completion created |
| 400 | Invalid request body |
| 429 | Rate limit exceeded |
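Multi-turn conversations work by appending each assistant reply back onto the messages array before the next request. A minimal sketch of that bookkeeping (no network calls; `add_turn` is an illustrative helper, not part of the API):

```python
# Conversation history in the documented message format.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is machine learning?"},
]

def add_turn(history, assistant_reply, next_user_message):
    """Append the model's reply and the user's follow-up so the next
    /chat/completions request carries the full conversation context."""
    history.append({"role": "assistant", "content": assistant_reply})
    history.append({"role": "user", "content": next_user_message})
    return history

add_turn(messages, "Machine learning is...", "Give me an example.")
print([m["role"] for m in messages])
# → ['system', 'user', 'assistant', 'user']
```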

POST /embeddings

Create Embeddings

Generates vector embeddings for text inputs. Useful for semantic search, clustering, and similarity comparisons.

Request Body

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Embedding model ID (e.g., text-embedding-3-large) |
| input | string or array | Yes | Text input(s) to embed |
| dimensions | integer | No | Output dimensions (model-dependent) |

Example Request

curl https://api.aiproxy.io/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

Error Handling

The API uses standard HTTP status codes to indicate errors. Error responses include detailed messages to help diagnose issues.

Error Response Format

{
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_api_key",
    "message": "The API key provided is invalid",
    "param": null
  }
}

Common Error Codes

| Code | Type | Description |
|------|------|-------------|
| invalid_api_key | authentication_error | API key is missing or invalid |
| rate_limit_exceeded | rate_limit_error | Too many requests in time window |
| model_not_found | invalid_request_error | Specified model does not exist |
| context_length_exceeded | invalid_request_error | Input exceeds model context window |
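A sketch of parsing the documented error envelope and deciding whether a failure is worth retrying (rate limits) or terminal (bad key, unknown model). The classification policy here is illustrative, not prescribed by the API:

```python
import json

# Error types from the table above that typically resolve on retry.
RETRYABLE_TYPES = {"rate_limit_error"}

def classify_error(response_body: str) -> str:
    """Read the error envelope's `type` field and return a coarse
    handling decision: 'retry' or 'fail'."""
    err = json.loads(response_body)["error"]
    return "retry" if err["type"] in RETRYABLE_TYPES else "fail"

body = '{"error": {"type": "rate_limit_error", "code": "rate_limit_exceeded", "message": "Too many requests", "param": null}}'
print(classify_error(body))  # → retry
```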

Rate Limits

Rate limits are applied per API key and vary based on your subscription tier. Rate limit information is included in response headers.

Rate Limit Headers

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 950
X-RateLimit-Reset: 1699000000

Best Practice: Implement exponential backoff when receiving 429 errors. Start with a 1 second delay and double it on each subsequent failure, up to a maximum of 60 seconds.
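The backoff policy above (1 s initial delay, doubling per failure, 60 s cap) can be sketched as a small retry wrapper. `send_request` is a placeholder for your HTTP call that returns a status code; the parameters are assumptions for illustration:

```python
import time

def with_backoff(send_request, max_retries=8, base_delay=1.0, max_delay=60.0):
    """Retry `send_request` on 429 responses with exponential backoff:
    sleep base_delay, then double each time, capped at max_delay."""
    delay = base_delay
    for _ in range(max_retries):
        status = send_request()
        if status != 429:
            return status
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
    raise RuntimeError("still rate limited after retries")

# Example: fail twice with 429, then succeed (tiny delay for the demo).
responses = iter([429, 429, 200])
print(with_backoff(lambda: next(responses), base_delay=0.01))  # → 200
```

In production, adding a small random jitter to each delay helps avoid many clients retrying in lockstep.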
