Production-Ready Node.js: Scalable Web Server Setup with Load Balancing, Monitoring & Security



For mission-critical applications, you'll want a process manager that automatically restarts crashed instances. PM2 is a widely used choice for Node.js:

# Install PM2 globally
npm install pm2 -g

# Start your app in cluster mode (auto-restart + load balancing)
pm2 start app.js -i max --name "api-cluster"

# Save current process list
pm2 save

# Generate startup script (for server reboots)
pm2 startup

# Monitor logs in real-time
pm2 logs
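
When PM2 stops or reloads a process it sends SIGINT first and force-kills the process if it has not exited after a timeout. A minimal sketch of a handler that lets in-flight requests finish, assuming an http.Server-like object; createShutdownHandler and the 1500 ms default are hypothetical names and values chosen for illustration, not part of PM2:

```javascript
// Sketch: stop accepting connections on SIGINT, then exit.
// createShutdownHandler is a hypothetical helper, not a PM2 API.
function createShutdownHandler(server, options = {}) {
  const { timeoutMs = 1500, exit = (code) => process.exit(code) } = options;
  let shuttingDown = false;

  return function shutdown() {
    if (shuttingDown) return false; // ignore repeated signals
    shuttingDown = true;

    // Force a non-zero exit if open connections do not drain in time.
    const timer = setTimeout(() => exit(1), timeoutMs);
    if (typeof timer.unref === 'function') timer.unref();

    // server.close() refuses new connections and calls back once existing ones end.
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
    return true;
  };
}

// Usage (assumes `server` is the value returned by app.listen(...)):
// process.on('SIGINT', createShutdownHandler(server));
```

The exit function is injectable only so the handler can be exercised without terminating the process.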

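There are two common ways to spread load across CPU cores: Node.js's built-in cluster module, or a reverse proxy in front of several Node.js processes. With the native cluster module: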

// Native cluster module example
const cluster = require('cluster');
const os = require('os');

// cluster.isPrimary needs Node.js 16+; older versions only have cluster.isMaster
if (cluster.isPrimary) {
  // Fork one worker per CPU core
  const numCPUs = os.cpus().length;
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Replace any worker that exits
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork();
  });
} else {
  require('./app');
}
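
The exit handler above forks a replacement unconditionally, so a worker that crashes during startup would be restarted in a tight loop. A sketch of one way to guard against that, assuming a simple sliding window; createRestartGuard and the limits (5 restarts per 60 seconds) are hypothetical choices for illustration:

```javascript
// Sketch: allow at most `maxRestarts` worker restarts within `windowMs`.
// createRestartGuard is a hypothetical helper; the limits are arbitrary.
function createRestartGuard(options = {}) {
  const { maxRestarts = 5, windowMs = 60000, now = Date.now } = options;
  let recent = []; // timestamps of restarts inside the current window

  return function shouldRestart() {
    const t = now();
    // Drop restarts that have aged out of the window.
    recent = recent.filter((ts) => t - ts < windowMs);
    if (recent.length >= maxRestarts) return false; // likely a crash loop
    recent.push(t);
    return true;
  };
}

// Usage inside the primary process:
// const shouldRestart = createRestartGuard();
// cluster.on('exit', () => { if (shouldRestart()) cluster.fork(); });
```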

Alternatively, use Nginx as a reverse proxy in front of several Node.js processes, each listening on its own port:

# nginx.conf example
upstream node_app {
  server 127.0.0.1:3000;
  server 127.0.0.1:3001;
  server 127.0.0.1:3002;
  keepalive 64;
}

server {
  listen 80;
  
  location / {
    proxy_pass http://node_app;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
  }
}
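
With no balancing method specified in the upstream block, Nginx distributes requests round-robin across the listed servers. The idea can be sketched in a few lines of JavaScript; this illustrates the selection order only and is not how Nginx is implemented (its version also supports weights). createRoundRobin is a hypothetical name:

```javascript
// Sketch: hand out upstream servers in rotation.
// createRoundRobin is a hypothetical helper for illustration.
function createRoundRobin(servers) {
  if (servers.length === 0) {
    throw new Error('at least one upstream server is required');
  }
  let index = 0;
  return function next() {
    const server = servers[index];
    index = (index + 1) % servers.length; // wrap around after the last server
    return server;
  };
}

// Usage with the three upstreams from the config above:
// const next = createRoundRobin(['127.0.0.1:3000', '127.0.0.1:3001', '127.0.0.1:3002']);
// next(); next(); next(); next(); // the fourth call returns the first server again
```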

Essential tools for production monitoring:

  • PM2 monitoring: pm2 monit
  • Node Clinic for performance profiling
  • New Relic/DataDog APM integration
  • Custom health check endpoints

// Health check endpoint example (assumes an Express app)
app.get('/health', (req, res) => {
  const healthcheck = {
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    cpu: process.cpuUsage(),
    status: 'OK',
    timestamp: Date.now()
  };
  res.json(healthcheck);
});
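
The endpoint above always answers 200, so a load balancer or uptime monitor cannot tell a struggling instance from a healthy one. A sketch of one possible refinement that returns 503 when heap usage crosses a threshold; buildHealthReport, maxHeapBytes and the 1 GiB default are assumptions for illustration:

```javascript
// Sketch: derive an HTTP status code from a simple heap-usage check.
// buildHealthReport and maxHeapBytes are hypothetical; pick limits that fit your app.
function buildHealthReport(options = {}) {
  const { maxHeapBytes = 1024 ** 3, memory = process.memoryUsage() } = options;
  const healthy = memory.heapUsed < maxHeapBytes;
  return {
    statusCode: healthy ? 200 : 503,
    body: {
      status: healthy ? 'OK' : 'DEGRADED',
      uptime: process.uptime(),
      heapUsed: memory.heapUsed,
      timestamp: Date.now()
    }
  };
}

// Usage in an Express route:
// app.get('/health', (req, res) => {
//   const { statusCode, body } = buildHealthReport();
//   res.status(statusCode).json(body);
// });
```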

Critical security measures for production Node.js:

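  1. Set secure HTTP headers with the helmet middleware
  2. Implement rate limiting
  3. Audit dependencies with npm audit and keep them updated
  4. Keep secrets and configuration in environment variables, not in source code

// Basic security setup (assumes an Express app)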
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');

app.use(helmet());
app.use(rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // limit each IP to 100 requests per window
}));
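
To make the windowMs and max settings concrete, here is a sketch of fixed-window counting. It is not express-rate-limit's actual implementation, and createFixedWindowLimiter is a hypothetical name. Note that counters held in process memory are per process, so under cluster mode each worker would count separately unless a shared store is used:

```javascript
// Sketch: fixed-window rate limiting keyed by client (for example, by IP).
// createFixedWindowLimiter is a hypothetical helper for illustration.
function createFixedWindowLimiter({ windowMs, max, now = Date.now }) {
  const hits = new Map(); // key -> { count, windowStart }

  return function isAllowed(key) {
    const t = now();
    const entry = hits.get(key);
    // Start a fresh window for new keys or once the old window has elapsed.
    if (!entry || t - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: t });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

// Usage: 100 requests per 15 minutes per IP, as in the config above.
// const isAllowed = createFixedWindowLimiter({ windowMs: 15 * 60 * 1000, max: 100 });
// if (!isAllowed(req.ip)) res.status(429).send('Too many requests');
```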

Example GitHub Actions workflow for zero-downtime deployments:

# .github/workflows/deploy.yml
name: Node.js CI/CD

on:
  push:
    branches: [ main ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '20'
        
    - name: Install dependencies
      run: npm ci
      
    - name: Run tests
      run: npm test
      
    - name: Deploy to production
      # Assumes the runner already has an SSH key authorized on the server
      run: |
        ssh user@server "cd /var/www/app && \
        git pull origin main && \
        npm ci --omit=dev && \
        pm2 reload ecosystem.config.js"
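
A reload that exits cleanly does not prove the new version is serving traffic. A sketch of a post-deploy smoke check that polls a health endpoint until it responds; waitForHealthy, its options and the URL in the usage comment are assumptions for illustration:

```javascript
// Sketch: poll a health check until it passes or the attempts run out.
// waitForHealthy and its options are hypothetical names for illustration.
async function waitForHealthy(check, options = {}) {
  const { attempts = 10, delayMs = 1000 } = options;
  for (let i = 1; i <= attempts; i++) {
    try {
      if (await check()) return true;
    } catch (err) {
      // A refused connection usually means the app is not up yet; retry.
    }
    if (i < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Usage (Node.js 18+ has a global fetch; the URL is an assumption):
// waitForHealthy(async () => (await fetch('http://127.0.0.1:3000/health')).ok)
//   .then((ok) => process.exit(ok ? 0 : 1));
```

Exiting non-zero when the check fails lets the CI job mark the deployment as failed.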

For production deployments, avoid starting the app with a bare node command, since nothing restarts the process if it crashes. Use a process manager such as PM2 or Forever. Here's a PM2 configuration example:

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'api-server',
    script: './server.js',
    instances: 'max',
    exec_mode: 'cluster',
    max_memory_restart: '1G',
    error_file: './logs/err.log',
    out_file: './logs/out.log',
    merge_logs: true,
    env: {
      NODE_ENV: 'production'
    }
  }]
}

Key options to configure:

  • watch: useful in development, keep it disabled in production
  • autorestart: restart the app when it crashes
  • max_restarts: cap consecutive unstable restarts to prevent infinite crash loops
  • exec_mode: 'cluster' together with instances to use multiple CPU cores
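
The options in the list above could be set in the ecosystem file roughly as follows. This is a sketch with illustrative values, not recommendations; min_uptime and restart_delay are additional PM2 options included here as an assumption about what a crash-loop setup might need:

```javascript
// ecosystem.config.js (restart-related options only)
// The values are illustrative assumptions, not recommendations.
const config = {
  apps: [{
    name: 'api-server',
    script: './server.js',
    instances: 'max',
    exec_mode: 'cluster',   // run one worker per CPU core
    watch: false,           // do not restart on file changes in production
    autorestart: true,      // restart the app if it exits unexpectedly
    max_restarts: 10,       // give up after 10 consecutive unstable restarts
    min_uptime: '10s',      // a run shorter than this counts as unstable
    restart_delay: 4000     // wait 4 seconds before each restart
  }]
};

module.exports = config;
```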

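To scale beyond a single process, either fork workers with the native cluster module or put a reverse proxy in front of several instances. With the native cluster module: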

// Native cluster module example
const cluster = require('cluster');
const os = require('os');

// cluster.isPrimary needs Node.js 16+; older versions only have cluster.isMaster
if (cluster.isPrimary) {
  // Fork one worker per CPU core
  const cpuCount = os.cpus().length;
  for (let i = 0; i < cpuCount; i++) {
    cluster.fork();
  }

  // Replace any worker that exits
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.id} died`);
    cluster.fork();
  });
} else {
  require('./server');
}

Alternatively, use Nginx as a reverse proxy:

# nginx.conf
upstream node_backend {
  server 127.0.0.1:3000;
  server 127.0.0.1:3001;
  keepalive 64;
}

server {
  listen 80;
  
  location / {
    proxy_pass http://node_backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
  }
}

Essential monitoring tools:

  1. PM2 monitoring: pm2 monit
  2. New Relic Node.js agent
  3. Datadog APM
  4. Custom health check endpoints:

app.get('/health', (req, res) => {
  const health = {
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    cpu: process.cpuUsage(),
    status: 'OK'
  };
  res.json(health);
});

Critical security practices:

// In your Express app
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');

app.use(helmet());
app.use(rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100
}));

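// Environment configuration: load variables from a local .env file.
// Set NODE_ENV=production in the environment itself (for example in the
// PM2 env block) rather than assigning it in code, so that modules loaded
// earlier see the same value.
require('dotenv').config();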
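
Missing or malformed environment variables are easier to diagnose at startup than on the first request that needs them. A sketch of a fail-fast check; loadConfig and the variable names NODE_ENV and PORT are assumptions for illustration:

```javascript
// Sketch: validate required environment variables at startup.
// loadConfig and the variable names are hypothetical examples.
function loadConfig(env = process.env) {
  const required = ['NODE_ENV', 'PORT'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  const port = Number(env.PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  return { nodeEnv: env.NODE_ENV, port };
}

// Usage, after require('dotenv').config():
// const config = loadConfig();
// app.listen(config.port);
```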

Additional measures:

  • Regular dependency updates with npm audit
  • HTTPS enforcement via LetsEncrypt
  • Proper CORS configuration
  • Input validation for all endpoints
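
As an illustration of the input validation point, a request body can be checked before any business logic runs. This is a hand-rolled sketch; validateSignup and its fields (email, age) are hypothetical, and a schema validation library may be a better fit for larger APIs:

```javascript
// Sketch: validate a request body before using it.
// validateSignup and its fields are hypothetical examples.
function validateSignup(body) {
  if (body === null || typeof body !== 'object') {
    return { valid: false, errors: ['body must be a JSON object'] };
  }

  const errors = [];
  // Deliberately loose email check: something@something.tld without spaces.
  if (typeof body.email !== 'string' || !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(body.email)) {
    errors.push('email must be a valid address');
  }
  if (!Number.isInteger(body.age) || body.age < 0 || body.age > 150) {
    errors.push('age must be an integer between 0 and 150');
  }
  return { valid: errors.length === 0, errors };
}

// Usage in an Express route (assumes express.json() is enabled):
// const result = validateSignup(req.body);
// if (!result.valid) return res.status(400).json({ errors: result.errors });
```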

Production-grade error management:

const winston = require('winston');
const { Loggly } = require('winston-loggly-bulk');

const logger = winston.createLogger({
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new Loggly({
      token: 'YOUR_TOKEN',
      subdomain: 'YOUR_SUBDOMAIN',
      tags: ['NodeJS'],
      json: true
    })
  ]
});

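process.on('uncaughtException', (err) => {
  // The process is in an unknown state after an uncaught exception:
  // log it, then exit and let the process manager start a fresh instance.
  logger.error('Uncaught exception:', err);
  process.exit(1);
});

process.on('unhandledRejection', (reason) => {
  // Registering this handler overrides the Node.js 15+ default of crashing
  // on unhandled rejections, so the process keeps running after logging.
  logger.error('Unhandled rejection:', reason);
});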
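
A promise can be rejected with any value, not only an Error, so the reason passed to the unhandledRejection handler may be a string or undefined. A sketch of normalizing it before logging; toLogPayload and the NonErrorRejection label are hypothetical names:

```javascript
// Sketch: turn any thrown or rejected value into a consistent log payload.
// toLogPayload and the 'NonErrorRejection' label are hypothetical.
function toLogPayload(reason) {
  if (reason instanceof Error) {
    return { name: reason.name, message: reason.message, stack: reason.stack };
  }
  // Strings, numbers, undefined, etc. carry no stack trace.
  return { name: 'NonErrorRejection', message: String(reason), stack: undefined };
}

// Usage with the logger defined above:
// process.on('unhandledRejection', (reason) => {
//   logger.error('Unhandled rejection', toLogPayload(reason));
// });
```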

Sample CI/CD pipeline (.github/workflows/deploy.yml):

name: Node.js CI

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Use Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '20.x'
    - run: npm ci
    - run: npm test
    - name: Deploy to production
      if: success()
      uses: appleboy/ssh-action@master
      with:
        host: ${{ secrets.PRODUCTION_HOST }}
        username: ${{ secrets.PRODUCTION_USER }}
        key: ${{ secrets.PRODUCTION_SSH_KEY }}
        script: |
          cd /var/www/app
          git pull origin main
          npm ci --omit=dev
          pm2 reload ecosystem.config.js