
Node.js OpenAI Proxy

Build async OpenAI proxy servers with Node.js and Express. Non-blocking I/O, streaming support, and the npm ecosystem make Node.js a strong fit for production-ready AI API gateways.

Why Node.js for OpenAI Proxy?

Node.js excels at building network applications thanks to its event-driven, non-blocking I/O model. This architecture is ideal for proxy servers that must handle many concurrent connections to LLM APIs, each potentially involving long-running requests or streaming responses.

The JavaScript ecosystem provides mature HTTP clients like axios and node-fetch, along with Express.js for building RESTful APIs. Combined with Node's native streaming capabilities, you can efficiently proxy LLM streaming responses without buffering entire payloads in memory.

NPM offers thousands of packages for every conceivable need: rate limiting with express-rate-limit, caching with node-cache, logging with winston, and metrics with prom-client. This rich ecosystem accelerates development while maintaining flexibility.

~50MB Memory Footprint
10K+ Concurrent Connections
2M+ NPM Packages
Event Loop Single Thread

Basic Proxy Implementation

Create a simple Express server that forwards requests to OpenAI's API. This implementation includes error handling and injects the upstream Authorization header from an environment variable, so clients never see your OpenAI key.

server.js
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

// Proxy endpoint for chat completions
app.post('/v1/chat/completions', async (req, res) => {
  try {
    const response = await axios.post(
      'https://api.openai.com/v1/chat/completions',
      req.body,
      {
        headers: {
          'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
          'Content-Type': 'application/json'
        }
      }
    );
    res.json(response.data);
  } catch (error) {
    res.status(error.response?.status || 500).json({
      error: error.response?.data || { message: 'Proxy error' }
    });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Proxy server running on port ${PORT}`);
});

Streaming Implementation

Streaming responses are essential for real-time chat experiences. Node.js streams enable efficient proxying of server-sent events without buffering entire responses.

streaming.js
const express = require('express');
const { Readable } = require('stream');

const app = express();
app.use(express.json());

app.post('/v1/chat/completions', async (req, res) => {
  try {
    // Set headers for SSE
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');

    // Global fetch requires Node 18+
    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ ...req.body, stream: true })
    });

    // fetch() returns a Web ReadableStream, which has no .pipe() method.
    // Convert it to a Node stream first so backpressure is handled for us.
    Readable.fromWeb(response.body).pipe(res);
  } catch (error) {
    res.status(502).end();
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT);
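On the consuming side, each SSE chunk from the proxy is a series of `data:` lines ending with a `[DONE]` sentinel, following OpenAI's chat-completion stream format. A minimal parser sketch, useful for clients or for inspecting tokens inside the proxy:

```javascript
// Parse one SSE line from an OpenAI-style stream.
// Returns the parsed JSON event, or null for blank lines and the
// [DONE] sentinel.
function parseSSELine(line) {
  if (!line.startsWith('data: ')) return null;
  const payload = line.slice('data: '.length).trim();
  if (payload === '[DONE]') return null;
  return JSON.parse(payload);
}

// Extract the token text from a chat-completion delta, if present.
function tokenFromEvent(event) {
  return event?.choices?.[0]?.delta?.content ?? '';
}
```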

Key Features

Non-Blocking I/O

Handle thousands of concurrent connections with a single thread. Event loop architecture eliminates thread-per-connection overhead.

📦

NPM Ecosystem

Access millions of packages for every need. From authentication to caching, find battle-tested solutions ready to use.

🔄

Native Streaming

Built-in stream support for handling large payloads and real-time data. Pipe data efficiently without memory overhead.

🚀

Express.js

Minimal, flexible web framework for building APIs. Middleware pattern enables clean separation of concerns.

🐳

Docker Ready

Lightweight container images for easy deployment. Official Node.js Docker images optimized for production.

📊

Observability

Winston for logging, Prometheus for metrics, OpenTelemetry for tracing. Comprehensive monitoring tools available.

Middleware Pattern

Express middleware enables composable request processing. Add authentication, rate limiting, logging, and caching as separate middleware functions for clean, maintainable code.

middleware.js
const rateLimit = require('express-rate-limit');

// Rate limiting middleware
const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 100,            // limit each IP to 100 requests per windowMs
  message: { error: 'Too many requests' }
});

// Keys issued to your own clients (load from a real store in production)
const validApiKeys = new Set((process.env.PROXY_API_KEYS || '').split(','));

// Authentication middleware
function authenticate(req, res, next) {
  const apiKey = req.headers['x-api-key'];
  if (!apiKey || !validApiKeys.has(apiKey)) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  req.user = { apiKey };
  next();
}

// Apply middleware
app.use('/v1/', limiter, authenticate);

Benefits

JavaScript Everywhere

Share code between frontend and backend. Use the same language across your entire stack for easier maintenance.

Fast Development

Dynamic typing and extensive NPM ecosystem accelerate prototyping. Go from idea to production quickly.

JSON Native

JSON is JavaScript's native format. No serialization overhead for API request/response handling.

Large Community

Massive developer community and extensive documentation. Find solutions to common problems easily.

Async/Await

Modern async/await syntax keeps asynchronous code clean, avoiding callback hell while allowing standard try/catch error handling.

Serverless Ready

Deploy to AWS Lambda, Google Cloud Functions, or Azure Functions. Node.js is among the most widely supported serverless runtimes.

Production Considerations

Process Management: Use PM2 for process management and clustering. PM2 enables zero-downtime reloads, automatic restarts, and multi-core utilization.

Environment Variables: Store API keys and configuration in environment variables. Use dotenv for local development and secure secret management in production.
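A small fail-fast helper keeps a missing key from surfacing later as a confusing 401 from OpenAI at request time. A sketch (in local development you would call `require('dotenv').config()` first, assuming dotenv is installed):

```javascript
// Read a required environment variable, failing fast at startup if absent.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// At startup:
// const OPENAI_API_KEY = requireEnv('OPENAI_API_KEY');
```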

Health Checks: Implement /health and /ready endpoints for orchestration platforms. Return meaningful status information for monitoring.
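The two endpoints answer different questions: /health asks "is the process alive?", /ready asks "can it serve traffic?". A sketch of the payload builders, where `upstreamReachable` is a stand-in for whatever readiness signal you actually track:

```javascript
// Liveness: the process is up and the event loop is responsive.
function healthPayload() {
  return { status: 'ok', uptime: process.uptime() };
}

// Readiness: dependencies are usable. `upstreamReachable` is a placeholder
// you would update from a periodic check against the OpenAI API.
let upstreamReachable = true;
function readyPayload() {
  return { status: upstreamReachable ? 'ready' : 'not-ready' };
}

// Express wiring (assumes `app` from the proxy server):
// app.get('/health', (req, res) => res.json(healthPayload()));
// app.get('/ready', (req, res) =>
//   res.status(upstreamReachable ? 200 : 503).json(readyPayload()));
```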

Graceful Shutdown: Handle SIGTERM and SIGINT signals properly. Drain in-flight requests before terminating the process.
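A sketch of the drain logic: stop accepting new connections, let in-flight requests finish, and force-exit after a deadline. The 10-second timeout is an assumption; tune it to your longest expected completion:

```javascript
// Close the server, waiting up to timeoutMs for in-flight requests,
// then invoke onDone (which defaults to exiting the process).
function gracefulShutdown(server, { timeoutMs = 10000, onDone = () => process.exit(0) } = {}) {
  let finished = false;
  const finish = () => {
    if (finished) return; // guard against the timer and close() both firing
    finished = true;
    onDone();
  };
  const timer = setTimeout(finish, timeoutMs);
  server.close(() => {
    clearTimeout(timer); // all connections drained before the deadline
    finish();
  });
}

// Wiring (assumes `const server = app.listen(PORT)`):
// process.on('SIGTERM', () => gracefulShutdown(server));
// process.on('SIGINT', () => gracefulShutdown(server));
```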

Build Your Node.js OpenAI Proxy

Create production-ready AI API gateways with Node.js and Express.
