LLM API Gateway Node.js

Build high-performance AI API gateways with the Node.js ecosystem, using JavaScript/TypeScript and modern frameworks for real-time streaming and scalable AI applications.

18M+ Downloads/Day · 98% Uptime · <50ms Latency · 10K Concurrent Users

Node.js Framework Options

Express.js

Minimalist web framework with a vast middleware ecosystem, perfect for lightweight AI gateways.

Pros: Simple, flexible, huge community
Best for: Quick prototypes, simple gateways

Nest.js

Progressive framework with built-in TypeScript support, dependency injection, and modular architecture.

Pros: Type-safe, enterprise-ready, scalable
Best for: Large-scale, complex AI gateways

Fastify

Low-overhead web framework focused on performance with JSON schema validation.

Pros: Among the fastest Node.js frameworks, low memory overhead
Best for: High-performance AI APIs

Node.js Gateway Architecture

HTTP Server

Express, Fastify, or Nest.js handles incoming AI requests through a middleware pipeline.
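The middleware pipeline these frameworks share can be sketched as a chain of functions, each deciding whether to pass control on. The types below are simplified stand-ins for illustration, not the real Express types:

```typescript
// Simplified request type and middleware signature (illustrative only;
// real framework types are much richer).
type Req = { path: string; headers: Record<string, string>; log: string[] };
type Middleware = (req: Req, next: () => void) => void;

// Run middleware in order; each one calls next() to continue the chain.
function runPipeline(req: Req, middleware: Middleware[]): void {
  let i = 0;
  const next = (): void => {
    const mw = middleware[i++];
    if (mw) mw(req, next);
  };
  next();
}

// Example: an auth check followed by a logger, as in a real gateway.
const auth: Middleware = (req, next) => {
  if (req.headers['authorization']) {
    req.log.push('auth ok');
    next(); // only continue when a key is present
  }
};
const logger: Middleware = (req, next) => {
  req.log.push(`hit ${req.path}`);
  next();
};
```

A request without an Authorization header simply stops at the auth step, which is exactly how Express-style short-circuiting works: a middleware that never calls next() ends the chain.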

Request Router

Intelligent routing to different AI providers based on model, priority, and cost.
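A routing decision like this can be sketched as a pure function over a provider table. The provider names, model lists, and per-token costs below are hypothetical placeholders, not real pricing:

```typescript
// Hypothetical provider table: names and costs are illustrative only.
interface Provider {
  name: string;
  models: string[];
  costPer1kTokens: number;
  priority: number; // lower = preferred on cost ties
}

const providers: Provider[] = [
  { name: 'openai',    models: ['gpt-4', 'gpt-3.5-turbo'], costPer1kTokens: 0.03,  priority: 1 },
  { name: 'anthropic', models: ['claude-3-opus'],          costPer1kTokens: 0.015, priority: 2 },
];

// Pick the cheapest provider that serves the requested model,
// breaking ties by priority. Returns undefined for unknown models.
function routeRequest(model: string): Provider | undefined {
  return providers
    .filter(p => p.models.includes(model))
    .sort((a, b) => a.costPer1kTokens - b.costPer1kTokens || a.priority - b.priority)[0];
}
```

Keeping the decision in a pure function like this makes it trivial to unit-test routing policy separately from the HTTP layer.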

Streaming Engine

Real-time streaming responses for chat completions and AI-generated content.

Security Layer

API key validation, rate limiting, request signing, and DDoS protection.
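The rate limiting that libraries like express-rate-limit provide can be sketched as a token bucket. This in-memory version is illustrative only; production gateways usually keep counters in a shared store such as Redis so limits hold across instances:

```typescript
// Minimal in-memory token bucket, one bucket per API key (sketch only).
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // maximum burst size
    private refillPerSec: number, // sustained request rate
    now: number = Date.now()
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Refill based on elapsed time, then try to spend one token.
  // Returns true if the request is allowed.
  tryConsume(now: number = Date.now()): boolean {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Passing the clock in as a parameter keeps the class deterministic and easy to test; in the gateway you would call tryConsume() with the default Date.now().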

Node.js Code Examples

TypeScript
import express, { Request, Response } from 'express';
import cors from 'cors';
import rateLimit from 'express-rate-limit';
import { OpenAI } from 'openai';

const app = express();
const port = process.env.PORT || 3000;

// Middleware
app.use(cors());
app.use(express.json());

// Rate limiting
const limiter = rateLimit({
    windowMs: 15 * 60 * 1000, // 15 minutes
    max: 100, // limit each IP to 100 requests per windowMs
    message: 'Too many requests, please try again later.'
});
app.use('/api/', limiter);

// OpenAI client
const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY
});

// Placeholder helpers so the example compiles: swap in your own
// key store and usage metering.
function validateApiKey(apiKey: string): boolean {
    // e.g. look the key up in a database or cache
    return apiKey.length > 0;
}

async function logUsage(apiKey: string, usage: unknown): Promise<void> {
    // e.g. record token counts for billing
    console.log(`usage for key ${apiKey.slice(0, 8)}:`, usage);
}

// AI Gateway endpoint
app.post('/api/v1/chat/completions', async (req: Request, res: Response) => {
    try {
        const { messages, model, temperature } = req.body;

        // Validate API key
        const apiKey = req.headers.authorization?.replace('Bearer ', '');
        if (!apiKey || !validateApiKey(apiKey)) {
            return res.status(401).json({ error: 'Invalid API key' });
        }

        // Forward to OpenAI
        const completion = await openai.chat.completions.create({
            model: model || 'gpt-4',
            messages,
            temperature: temperature ?? 0.7, // ?? so an explicit 0 is respected
            stream: false
        });

        // Log usage
        await logUsage(apiKey, completion.usage);

        // Return response
        res.json({
            id: completion.id,
            object: completion.object,
            created: completion.created,
            model: completion.model,
            choices: completion.choices,
            usage: completion.usage
        });

    } catch (error) {
        console.error('Chat completion error:', error);
        // `error` is `unknown` in modern TypeScript, so narrow before use
        const message = error instanceof Error ? error.message : 'Unknown error';
        res.status(500).json({
            error: 'Internal server error',
            message
        });
    }
});

// Streaming endpoint
app.post('/api/v1/chat/completions/stream', async (req: Request, res: Response) => {
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');

    try {
        const { messages, model, temperature } = req.body;

        const stream = await openai.chat.completions.create({
            model: model || 'gpt-4',
            messages,
            temperature: temperature ?? 0.7,
            stream: true
        });

        // Relay each chunk to the client as a server-sent event
        for await (const chunk of stream) {
            res.write(`data: ${JSON.stringify(chunk)}\n\n`);
        }

        res.write('data: [DONE]\n\n');
        res.end();

    } catch (error) {
        const message = error instanceof Error ? error.message : 'Unknown error';
        res.write(`data: ${JSON.stringify({ error: message })}\n\n`);
        res.end();
    }
});

// Start server
app.listen(port, () => {
    console.log(`AI API Gateway running on port ${port}`);
});

Real-time Streaming Demo

Node.js enables true real-time AI responses with streaming capabilities.

$ curl -X POST https://api.gateway.example.com/v1/chat/completions/stream \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "model": "gpt-4"}'

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1700000000,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1700000000,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}
data: [DONE]
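On the client side, those data: lines can be reassembled into the full message. A minimal parser for this output, assuming the OpenAI chat.completion.chunk shape shown above, might look like:

```typescript
// Shapes follow the OpenAI chat.completion.chunk format, trimmed to
// the fields this parser actually reads.
interface ChunkChoice {
  index: number;
  delta: { content?: string };
  finish_reason: string | null;
}
interface Chunk {
  choices: ChunkChoice[];
}

// Walk raw SSE text, parse each `data:` payload, and concatenate
// the content deltas into the complete assistant message.
function collectStream(sse: string): string {
  let text = '';
  for (const line of sse.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') break; // end-of-stream sentinel
    const chunk: Chunk = JSON.parse(payload);
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return text;
}
```

In a browser you would feed this the decoded bytes from fetch's ReadableStream; the sketch assumes the whole SSE body is already available as a string, and ignores SSE features the gateway does not emit (event names, retry fields, multi-line data).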

Essential NPM Packages

Express: Web framework
Nest.js: Progressive TypeScript framework
Fastify: Fast, low-overhead framework
express-rate-limit: Request rate limiting
openai (OpenAI SDK): AI provider integration
Axios: HTTP client
Winston: Logging
Jest: Testing