Implement Rate Limiting in Node.js with Redis (2026 Guide)

Mar 16, 2026
9 min read

Why Rate Limiting Matters

Rate limiting protects your SaaS APIs from abuse, prevents resource exhaustion, and ensures fair usage across tenants. Simple INCR + EXPIRE approaches allow ~2x bursts at window boundaries—production systems need sliding window or token bucket algorithms with Redis for precision and horizontal scalability.

This guide implements both patterns using atomic Lua scripts in Redis with ioredis for Express.js middleware.

Algorithm Comparison (Redis-Based)

| Algorithm | Redis Structures | Memory per Client | Accuracy | Burst Behavior | Best For |
|---|---|---|---|---|---|
| Sliding Window Counter | 2 STRING keys + Lua | 2 keys | Near-exact | Smoothed edges | General APIs; approximates true sliding by overlapping counters |
| Sliding Window Log | SORTED SET + Lua | O(n) entries | Exact | No bursts | Audit-heavy APIs; logs timestamps, trims old ones |
| Token Bucket | 1 HASH (tokens, last_refill) + Lua | 1 key (2 fields) | Exact | Controlled bursts | Bursty traffic; refills tokens at a fixed rate |

Fixed window (simple INCR + EXPIRE) is easier but allows ~2x bursts at edges—use sliding/token for production.
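To see the boundary burst concretely, here is a toy in-memory fixed-window counter (illustrative only, not the Redis implementation): a client that times its requests around a window edge gets roughly double the limit through.

```javascript
// Toy fixed-window counter (in-memory, for illustration only).
function makeFixedWindow(limit, windowSize) {
  const counts = new Map();
  return (t) => {
    const win = Math.floor(t / windowSize);  // which window t falls in
    const n = (counts.get(win) || 0) + 1;
    counts.set(win, n);
    return n <= limit;                       // allow while under the limit
  };
}

// 100 req/min limit: 100 requests at t=59s land in window 0,
// 100 more at t=61s land in window 1 -- 200 accepted in ~2 seconds.
const allow = makeFixedWindow(100, 60);
let passed = 0;
for (let i = 0; i < 100; i++) if (allow(59)) passed++;
for (let i = 0; i < 100; i++) if (allow(61)) passed++;
console.log(passed); // 200
```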

Sliding Window Counter Implementation

Uses two counters, one for the current window and one for the previous. The previous count is weighted by the fraction of the previous window that still overlaps the sliding window, which smooths boundary bursts. An atomic Lua script prevents race conditions between concurrent requests.

Lua Script for Atomic Rate Check

local key = KEYS[1]
local window_start = tonumber(ARGV[1])   -- start of current window (unix seconds)
local window_size = tonumber(ARGV[2])    -- window length in seconds
local max_requests = tonumber(ARGV[3])
local current_time = tonumber(ARGV[4])   -- unix seconds

local curr_key = key .. ':' .. window_start
local prev_key = key .. ':' .. (window_start - window_size)

local curr_count = tonumber(redis.call('GET', curr_key) or 0)
local prev_count = tonumber(redis.call('GET', prev_key) or 0)

-- Fraction of the previous window still covered by the sliding window
local weight = (window_size - (current_time - window_start)) / window_size

local total = curr_count + (prev_count * weight)
if total >= max_requests then
  local ttl = redis.call('TTL', curr_key)
  if ttl < 0 then ttl = window_size end
  return {0, ttl}
end

redis.call('INCR', curr_key)
-- Two-window TTL so this counter can later serve as the previous window
redis.call('EXPIRE', curr_key, window_size * 2)
return {1, total}

Node.js Express Middleware

const Redis = require('ioredis');
const redis = new Redis();

const WINDOW_SIZE = 60; // seconds
const MAX_REQUESTS = 100;

const slidingWindowScript = `
local key = KEYS[1]
local window_start = tonumber(ARGV[1])
local window_size = tonumber(ARGV[2])
local max_requests = tonumber(ARGV[3])
local current_time = tonumber(ARGV[4])

local curr_key = key .. ':' .. window_start
local prev_key = key .. ':' .. (window_start - window_size)

local curr_count = tonumber(redis.call('GET', curr_key) or 0)
local prev_count = tonumber(redis.call('GET', prev_key) or 0)

-- Fraction of the previous window still covered by the sliding window
local weight = (window_size - (current_time - window_start)) / window_size

local total = curr_count + (prev_count * weight)
if total >= max_requests then
  local ttl = redis.call('TTL', curr_key)
  if ttl < 0 then ttl = window_size end
  return {0, ttl}
end

redis.call('INCR', curr_key)
-- Two-window TTL so this counter can later serve as the previous window
redis.call('EXPIRE', curr_key, window_size * 2)
return {1, total}
`;

// Register the script as a custom command (ioredis has no Script class)
redis.defineCommand('slidingWindow', {
  numberOfKeys: 1,
  lua: slidingWindowScript,
});

async function slidingWindowRateLimit(req, res, next) {
  const identifier = req.ip;
  const key = `ratelimit:${identifier}`;
  const now = Math.floor(Date.now() / 1000);
  const windowStart = Math.floor(now / WINDOW_SIZE) * WINDOW_SIZE;

  try {
    const [allowed, value] = await redis.slidingWindow(
      key, windowStart, WINDOW_SIZE, MAX_REQUESTS, now
    );

    if (allowed === 0) {
      res.set('Retry-After', String(value));
      return res.status(429).json({ error: 'Rate limit exceeded', retryAfter: value });
    }

    res.set('X-RateLimit-Limit', String(MAX_REQUESTS));
    res.set('X-RateLimit-Remaining', String(Math.max(0, MAX_REQUESTS - value - 1)));
    next();
  } catch (err) {
    // Fail open rather than blocking all traffic if Redis is unreachable
    next();
  }
}

module.exports = slidingWindowRateLimit;

How it works:

  • Computes the current and previous window from the request timestamp
  • The Lua script atomically weights the previous count, checks the limit, increments, and sets a two-window TTL
  • Returns allowed (1) plus the weighted count, or blocked (0) plus seconds until reset
  • The Express middleware translates the result into Retry-After and X-RateLimit-* headers
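A worked example of the weighting step in plain JavaScript (assuming the previous count is weighted by the fraction of the previous window that still overlaps the sliding window):

```javascript
// Weighted sliding-window count: the previous window's requests decay
// linearly as the current window progresses.
function weightedTotal(currCount, prevCount, elapsedInWindow, windowSize) {
  const weight = (windowSize - elapsedInWindow) / windowSize;
  return currCount + prevCount * weight;
}

// 15s into a 60s window: 75% of the previous window still counts.
// 30 current + 80 * 0.75 previous = 90 effective requests.
console.log(weightedTotal(30, 80, 15, 60)); // 90
```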

Token Bucket Implementation

Tracks available tokens in a HASH; refills at rate (e.g., 100/hour = ~0.028/sec). Allows bursts up to bucket capacity—ideal for uneven traffic patterns.

Token Bucket Lua Script

local key = KEYS[1]
local rate = tonumber(ARGV[1])      -- tokens per second
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

local data = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(data[1]) or capacity
local last_refill = tonumber(data[2]) or now

-- Refill proportionally to elapsed time, capped at capacity
local elapsed = now - last_refill
tokens = math.min(capacity, tokens + (elapsed * rate))

if tokens >= 1 then
  tokens = tokens - 1
  redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
  redis.call('EXPIRE', key, 3600)  -- drop idle buckets after 1 hour
  return {1, tokens}
else
  -- Seconds until one full token has accrued at the refill rate
  return {0, math.ceil((1 - tokens) / rate)}
end

Node.js Middleware

const tokenBucketScript = `
local key = KEYS[1]
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

local data = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(data[1]) or capacity
local last_refill = tonumber(data[2]) or now

-- Refill proportionally to elapsed time, capped at capacity
local elapsed = now - last_refill
tokens = math.min(capacity, tokens + (elapsed * rate))

if tokens >= 1 then
  tokens = tokens - 1
  redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
  redis.call('EXPIRE', key, 3600)  -- drop idle buckets after 1 hour
  return {1, tokens}
else
  -- Seconds until one full token has accrued at the refill rate
  return {0, math.ceil((1 - tokens) / rate)}
end
`;

redis.defineCommand('tokenBucket', {
  numberOfKeys: 1,
  lua: tokenBucketScript,
});

const RATE = 100 / 3600; // refill rate: 100 tokens/hour, ~0.028/sec
const CAPACITY = 10;     // allow bursts of up to 10 requests

async function tokenBucketRateLimit(req, res, next) {
  const identifier = req.ip;
  const key = `tokenbucket:${identifier}`;
  const now = Math.floor(Date.now() / 1000);

  const [allowed, value] = await redis.tokenBucket(key, RATE, CAPACITY, now);

  if (allowed === 0) {
    res.set('Retry-After', String(value));
    return res.status(429).json({ error: 'Rate limit exceeded', retryAfter: value });
  }

  res.set('X-RateLimit-Remaining', String(value));
  next();
}

module.exports = tokenBucketRateLimit;

How it works:

  • Refills tokens proportionally to the time elapsed since the last successful request
  • Consumes one token per request; blocks when fewer than one token remains
  • Retry-After reports the seconds until the next full token accrues
  • Lua keeps read-refill-consume atomic; burst capacity absorbs short spikes without penalty
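The refill arithmetic in isolation, as a plain JavaScript mirror of the Lua step (the 0.5 tokens/sec rate here is purely illustrative):

```javascript
// Mirror of the Lua refill step: gain elapsed * rate tokens, cap at capacity.
function refill(tokens, elapsedSeconds, rate, capacity) {
  return Math.min(capacity, tokens + elapsedSeconds * rate);
}

console.log(refill(0, 4, 0.5, 10));    // 2  (4s at 0.5 tokens/sec)
console.log(refill(9, 3600, 0.5, 10)); // 10 (capped at capacity)
```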

Production Setup & Best Practices

Express.js Integration

const express = require('express');
const slidingWindowRateLimit = require('./middleware/slidingWindow');
const tokenBucketRateLimit = require('./middleware/tokenBucket');

const app = express();

// Global rate limit (sliding window)
app.use(slidingWindowRateLimit);

// Per-endpoint rate limit (token bucket for bursty endpoints)
app.post('/api/upload', tokenBucketRateLimit, (req, res) => {
  res.json({ status: 'uploaded' });
});

app.get('/', (req, res) => res.send('Rate limited API'));
app.listen(3000, () => console.log('Server on :3000'));

Per-Endpoint Customization

// Different limits per endpoint
function createRateLimiter(endpoint, maxRequests, windowSize) {
  return async (req, res, next) => {
    const identifier = req.ip;
    const key = `ratelimit:${endpoint}:${identifier}`;
    // ... (use key in Lua script)
  };
}

app.post('/api/heavy', createRateLimiter('heavy', 10, 60), handler);
app.get('/api/light', createRateLimiter('light', 1000, 60), handler);
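One way to fill in the factory, as a sketch: `checkLimit` here stands for any async function wrapping the Redis call (for example, one that evaluates the sliding-window Lua script against the endpoint-specific key); its name and `{ allowed, retryAfter }` return shape are assumptions for illustration.

```javascript
// Sketch of a per-endpoint limiter factory. checkLimit(key, max, window)
// is assumed to resolve to { allowed, retryAfter }; in production it
// would run the sliding-window Lua script against Redis.
function createRateLimiter(checkLimit, endpoint, maxRequests, windowSize) {
  return async (req, res, next) => {
    const key = `ratelimit:${endpoint}:${req.ip}`;
    const { allowed, retryAfter } = await checkLimit(key, maxRequests, windowSize);
    if (!allowed) {
      res.set('Retry-After', String(retryAfter));
      return res.status(429).json({ error: 'Rate limit exceeded' });
    }
    next();
  };
}
```

Passing the Redis-backed check function in (rather than importing it) keeps the factory trivially unit-testable with a stub.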

Production Deployment Tips

  • Shared Redis Cluster: Use Redis Cluster or Sentinel for high availability; all app instances share the same rate limit state
  • Headers: Always add X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After for client awareness
  • Fallback: Handle Redis failures with in-memory fallback (e.g., rate-limit-redis + express-rate-limit)
  • Monitoring: Track Redis INFO metrics for memory usage and command latency
  • Authentication: Use req.user.id instead of req.ip for authenticated endpoints to prevent IP-based bypasses
  • Libraries: For simpler setup, use rate-limit-redis with express-rate-limit (supports sliding windows)
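The identifier choice from the list above can be centralized in a small helper (a sketch; the `user:`/`ip:` prefixes are illustrative):

```javascript
// Prefer a stable authenticated identity; fall back to the client IP.
// Prefixes keep the two key spaces from colliding.
function identifierFor(req) {
  if (req.user && req.user.id) return `user:${req.user.id}`;
  return `ip:${req.ip}`;
}

console.log(identifierFor({ ip: '203.0.113.7' })); // "ip:203.0.113.7"
```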

Horizontal Scaling

| Strategy | Pros | Cons | Use Case |
|---|---|---|---|
| Shared Redis | Exact limits across instances | Single point of failure (use Cluster) | Multi-instance production |
| Local + Redis Sync | Low latency | Drift between instances | Approximate rate limiting |
| Edge Rate Limiting | Offloads to CDN/API gateway | Less control | Simple global limits |

FAQs

What's the difference between sliding window and token bucket?

Sliding window provides smooth, predictable rate limiting by overlapping time windows. Token bucket allows controlled bursts up to capacity while maintaining average rate. Use sliding window for fairness, token bucket for handling traffic spikes.

Why use Lua scripts instead of multiple Redis commands?

Lua scripts execute atomically on Redis, preventing race conditions when multiple app instances check/update limits simultaneously. Without Lua, two requests could both pass the limit check before either increments the counter.

How do I handle multi-tenant rate limiting?

Include tenant ID in the Redis key: ratelimit:${tenantId}:${endpoint}:${userId}. Configure different limits per tier (free/pro/enterprise) by adjusting MAX_REQUESTS based on req.user.tier.
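A sketch of tier-aware limits and key construction (the tier names and numbers are illustrative assumptions):

```javascript
// Per-tier request limits; unknown or missing tiers default to free.
const TIER_LIMITS = { free: 100, pro: 1000, enterprise: 10000 };

function maxRequestsFor(user) {
  return TIER_LIMITS[user && user.tier] || TIER_LIMITS.free;
}

function rateLimitKey(tenantId, endpoint, userId) {
  return `ratelimit:${tenantId}:${endpoint}:${userId}`;
}

console.log(maxRequestsFor({ tier: 'pro' }));       // 1000
console.log(rateLimitKey('acme', 'upload', 'u42')); // "ratelimit:acme:upload:u42"
```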

What happens if Redis goes down?

Implement fallback to in-memory rate limiting using libraries like express-rate-limit with memory store. Set shorter TTLs on fallback to prevent over-blocking. Monitor Redis health and alert on failures.

Can I rate limit by API key instead of IP?

Yes—replace req.ip with req.headers['x-api-key'] or req.user.id for authenticated endpoints. Always validate the identifier exists before building the Redis key.

© 2026 Propelius Technologies. All rights reserved.