Authentication
All API requests require authentication using Bearer tokens. Include your API key in the Authorization header of each request.
curl https://api.aiproxy.io/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"
API keys can be obtained from your dashboard. Keys are scoped to specific projects and can have different permission levels.
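As a sketch of the header described above, the following builds a request object carrying the Bearer token (using Python's standard library; `YOUR_API_KEY` is a placeholder):

```python
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; use a key from your dashboard

def build_request(url: str) -> urllib.request.Request:
    """Return a Request carrying the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {API_KEY}"},
    )

req = build_request("https://api.aiproxy.io/v1/models")
print(req.get_header("Authorization"))  # → Bearer YOUR_API_KEY
```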
List Available Models
Retrieves a list of all AI models available through the proxy, including their capabilities, context windows, and pricing information.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| provider | string | No | Filter by provider (openai, anthropic, google) |
| type | string | No | Filter by model type (chat, completion, embedding) |
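Since both parameters are optional, a client typically appends them as query parameters only when set. A minimal sketch (the helper name `models_url` is illustrative, not part of the API):

```python
from urllib.parse import urlencode

def models_url(provider=None, model_type=None):
    """Build the /v1/models URL with optional provider/type filters."""
    base = "https://api.aiproxy.io/v1/models"
    params = {k: v for k, v in
              {"provider": provider, "type": model_type}.items() if v}
    return f"{base}?{urlencode(params)}" if params else base

print(models_url(provider="anthropic", model_type="chat"))
# → https://api.aiproxy.io/v1/models?provider=anthropic&type=chat
```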
Response
{
"object": "list",
"data": [
{
"id": "gpt-4-turbo",
"object": "model",
"provider": "openai",
"type": "chat",
"context_window": 128000,
"pricing": {
"input": 0.01,
"output": 0.03
}
}
]
}
Response Codes
| Status | Description |
|---|---|
| 200 | Successful response |
| 401 | Invalid API key |
| 500 | Server error |
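Clients commonly index the `data` array by model `id` for lookups. A sketch using the example response above:

```python
import json

# Example /v1/models response body (copied from the docs above).
response_body = '''{
  "object": "list",
  "data": [
    {"id": "gpt-4-turbo", "object": "model", "provider": "openai",
     "type": "chat", "context_window": 128000,
     "pricing": {"input": 0.01, "output": 0.03}}
  ]
}'''

# Index models by id for convenient lookup.
models = {m["id"]: m for m in json.loads(response_body)["data"]}
print(models["gpt-4-turbo"]["context_window"])  # → 128000
```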
Create Completion
Generates text completions based on a provided prompt. Supports multiple AI providers with automatic failover and load balancing.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID to use for completion |
| prompt | string or array | Yes | Input prompt(s) for completion |
| max_tokens | integer | No | Maximum tokens to generate (default: 16) |
| temperature | number | No | Sampling temperature 0-2 (default: 1) |
| provider | string | No | Preferred provider (auto-selected if omitted) |
Example Request
curl https://api.aiproxy.io/v1/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4-turbo",
"prompt": "Explain quantum computing in simple terms",
"max_tokens": 500,
"temperature": 0.7
}'
Response
{
"id": "cmpl-abc123",
"object": "text_completion",
"created": 1699000000,
"model": "gpt-4-turbo",
"provider": "openai",
"choices": [
{
"text": "Quantum computing is a revolutionary approach...",
"index": 0,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 12,
"completion_tokens": 488,
"total_tokens": 500
}
}
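The `usage` block can be combined with the pricing from `/v1/models` to estimate request cost. The sketch below assumes the pricing figures are quoted per 1,000 tokens; that unit is not stated above, so confirm it against your dashboard before relying on the math:

```python
# Pricing and usage values from the example responses above.
pricing = {"input": 0.01, "output": 0.03}   # assumed to be per 1K tokens
usage = {"prompt_tokens": 12, "completion_tokens": 488}

# Estimated cost = input tokens * input rate + output tokens * output rate.
cost = (usage["prompt_tokens"] / 1000 * pricing["input"]
        + usage["completion_tokens"] / 1000 * pricing["output"])
print(f"${cost:.5f}")  # → $0.01476
```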
Create Chat Completion
Creates a chat completion with conversation context. Supports multi-turn conversations, system prompts, and various message roles.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID for chat completion |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature 0-2 |
| stream | boolean | No | Enable streaming responses |
Example Request
{
"model": "claude-3-opus",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is machine learning?"}
],
"temperature": 0.8,
"stream": false
}
Response Codes
| Status | Description |
|---|---|
| 200 | Chat completion created |
| 400 | Invalid request body |
| 429 | Rate limit exceeded |
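For multi-turn conversations, the client keeps the full `messages` array and appends both the assistant's reply and the next user turn before re-sending. A sketch (the helper `add_turn` is illustrative, not part of the API):

```python
# Conversation state, starting from the example request above.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is machine learning?"},
]

def add_turn(history, assistant_reply, next_user_message):
    """Extend the conversation with the model's reply and a follow-up."""
    history.append({"role": "assistant", "content": assistant_reply})
    history.append({"role": "user", "content": next_user_message})
    return history

add_turn(messages, "Machine learning is...", "Give me an example.")
print([m["role"] for m in messages])
# → ['system', 'user', 'assistant', 'user']
```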
Create Embeddings
Generates vector embeddings for text inputs. Useful for semantic search, clustering, and similarity comparisons.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Embedding model ID (e.g., text-embedding-3-large) |
| input | string or array | Yes | Text input(s) to embed |
| dimensions | integer | No | Output dimensions (model-dependent) |
curl https://api.aiproxy.io/v1/embeddings \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-3-large",
"input": "The quick brown fox jumps over the lazy dog"
}'
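For the semantic-search use case mentioned above, returned embedding vectors are usually compared by cosine similarity. A minimal pure-Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm

# Toy vectors; real embeddings have hundreds to thousands of dimensions.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```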
Error Handling
The API uses standard HTTP status codes to indicate errors. Error responses include detailed messages to help diagnose issues.
Error Response Format
{
"error": {
"type": "invalid_request_error",
"code": "invalid_api_key",
"message": "The API key provided is invalid",
"param": null
}
}
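A client can unpack this envelope and branch on the error type. A sketch using the example body above:

```python
import json

def parse_error(body: str):
    """Extract (type, code, message) from an error response body."""
    err = json.loads(body)["error"]
    return err["type"], err["code"], err["message"]

body = '''{"error": {"type": "invalid_request_error",
           "code": "invalid_api_key",
           "message": "The API key provided is invalid",
           "param": null}}'''
etype, code, message = parse_error(body)
print(code)  # → invalid_api_key
```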
Common Error Codes
| Code | Type | Description |
|---|---|---|
| invalid_api_key | authentication_error | API key is missing or invalid |
| rate_limit_exceeded | rate_limit_error | Too many requests in time window |
| model_not_found | invalid_request_error | Specified model does not exist |
| context_length_exceeded | invalid_request_error | Input exceeds model context window |
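Of these, `rate_limit_exceeded` is transient and worth retrying. The sketch below shows one common client-side policy, exponential backoff; the retry logic is an assumption on the caller's side, not part of the API:

```python
import time

class RateLimitError(Exception):
    """Raised by the caller's transport layer on a 429 response."""

def with_retries(call, max_attempts=3, base_delay=0.01):
    """Retry `call` on rate limiting, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky call: fails twice with a rate limit, then succeeds.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError("rate_limit_exceeded")
    return "ok"

print(with_retries(flaky))  # → ok
```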
Rate Limits
Rate limits are applied per API key and vary based on your subscription tier. Rate limit information is included in response headers.
Rate Limit Headers
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 950
X-RateLimit-Reset: 1699000000
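A client can use these headers to decide how long to pause before the next request. The sketch below treats `X-RateLimit-Reset` as a Unix timestamp, as in the example values above:

```python
def seconds_until_reset(headers, now):
    """Seconds to wait: 0 if quota remains, else time until the reset."""
    if int(headers["X-RateLimit-Remaining"]) > 0:
        return 0
    return max(0, int(headers["X-RateLimit-Reset"]) - now)

headers = {"X-RateLimit-Limit": "1000",
           "X-RateLimit-Remaining": "0",
           "X-RateLimit-Reset": "1699000000"}
print(seconds_until_reset(headers, now=1698999990))  # → 10
```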