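Server-Side Caching Strategies

Question

What caching strategies are available on the server side? How do you design a multi-level cache architecture? How do you deal with cache penetration, cache breakdown, and cache avalanche? How do you choose between a local cache and a distributed cache? How should cache keys be designed? How do you keep the cache consistent with the database?

Answer

Multi-Level Cache Architecture

In high-concurrency systems a single cache layer is rarely enough. A multi-level cache places caches at several layers so that each layer intercepts part of the traffic and as few requests as possible reach the database.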

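Cache layer	Typical hit rate	Latency	Use cases	Capacity	Consistency
CDN	60-95%	< 10 ms	Static assets, cacheable API responses	Massive	Eventual
Nginx	40-80%	< 5 ms	Hot APIs, page fragments	GB scale	TTL-controlled
Application memory	30-60%	< 0.1 ms	Config, dictionaries, hot data	MB scale	Consistent within a process
Redis	80-99%	1-5 ms	General-purpose data caching, sessions	TB scale	Must be maintained explicitly
Database	—	10-100 ms	Source of truth	—	Strong

Core idea of multi-level caching

The closer a cache sits to the user, the more latency it saves and the more traffic it keeps away from the layers behind it, but the harder it is to keep consistent. In practice 2-3 layers are usually enough; do not stack layers blindly.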

CDN Caching

CDN caching is mainly used for static assets and cacheable API responses. It is controlled through HTTP cache headers:

nestjs/cdn-cache-headers.ts
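import { Controller, Get, Header } from '@nestjs/common';
// Assumed local providers; the import paths are illustrative
import { ProductService } from './product.service';
import { UserService } from './user.service';

@Controller('api')
export class ProductController {
  constructor(
    private readonly productService: ProductService,
    private readonly userService: UserService,
  ) {}

  // Rarely changing data: allow browser and CDN caching
  // (browsers use max-age, shared caches prefer s-maxage)
  @Header('Cache-Control', 'public, max-age=300, s-maxage=600')
  @Header('CDN-Cache-Control', 'max-age=600') // CDN-targeted header, honored by Cloudflare and some other CDNs
  @Get('products/popular')
  async getPopularProducts() {
    return this.productService.getPopular();
  }

  // User-specific data must never be cached by a CDN
  @Header('Cache-Control', 'private, no-store')
  @Get('user/profile')
  async getUserProfile() {
    return this.userService.getProfile();
  }
}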

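Nginx Proxy Cache

Nginx's proxy_cache stores upstream responses on local disk (with key metadata in shared memory), which reduces the number of requests that reach the application servers.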

nginx.conf
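# Define the cache zone
proxy_cache_path /var/cache/nginx levels=1:2
  keys_zone=api_cache:100m    # shared memory zone holding cache key metadata
  max_size=10g                # maximum disk space used by the cache
  inactive=60m                # entries not accessed for 60 minutes are removed
  use_temp_path=off;          # write files directly into the cache directory instead of moving them from a temp path

server {
    location /api/ {
        proxy_pass http://backend;

        # Enable caching
        proxy_cache api_cache;
        proxy_cache_valid 200 10m;      # cache 200 responses for 10 minutes
        proxy_cache_valid 404 1m;       # cache 404 responses for 1 minute
        proxy_cache_key "$scheme$request_method$host$request_uri";

        # Expose the cache status for debugging
        add_header X-Cache-Status $upstream_cache_status;

        # Serve stale entries when the backend fails
        proxy_cache_use_stale error timeout updating http_500 http_502;

        # Guard against cache breakdown: only one request per key goes to the backend at a time
        proxy_cache_lock on;
        proxy_cache_lock_timeout 5s;
    }

    # Long-lived caching for static assets
    location /static/ {
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}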

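Nginx cache status

The X-Cache-Status header carries the value of $upstream_cache_status: MISS (not in cache), HIT (served from cache), EXPIRED (entry expired, backend was contacted again), STALE (an expired entry was served), UPDATING (a stale entry was served while it is being refreshed), REVALIDATED (entry confirmed fresh by a conditional request), and BYPASS (cache was skipped).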

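Application Memory Cache

The in-process memory cache is the fastest layer: a lookup involves no network hop and typically completes in well under a microsecond to a few microseconds. It suits small data sets that are read very frequently.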

nestjs/memory-cache.service.ts
import { Injectable, OnModuleInit } from '@nestjs/common';
import { LRUCache } from 'lru-cache';

interface CacheEntry<T> {
  value: T;
  refreshAt: number; // 后台刷新时间戳
}

@Injectable()
export class MemoryCacheService implements OnModuleInit {
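  // LRU cache bounded by entry count, estimated size, and TTL to prevent unbounded memory growth
  private cache: LRUCache<string, CacheEntry<unknown>>;

  onModuleInit() {
    this.cache = new LRUCache({
      max: 5000,                 // at most 5000 keys
      maxSize: 50 * 1024 * 1024, // roughly 50 MB, as estimated by sizeCalculation
      sizeCalculation: (entry) => {
        // Approximate the entry size by its serialized byte length
        return Buffer.byteLength(JSON.stringify(entry)) || 1;
      },
      ttl: 5 * 60 * 1000,        // default TTL: 5 minutes
      updateAgeOnGet: false,     // do not extend the TTL on reads, otherwise hot keys would never expire
    });
  }

  // `refresh` is an optional loader; when given, it is invoked once in the background
  get<T>(key: string, refresh?: () => Promise<T>): T | undefined {
    const entry = this.cache.get(key) as CacheEntry<T> | undefined;
    if (!entry) return undefined;

    // Refresh-ahead: once 80% of the TTL has elapsed, trigger an asynchronous refresh
    if (Date.now() > entry.refreshAt) {
      // Mutate in place so the TTL is not reset and later reads do not trigger a second refresh
      entry.refreshAt = Infinity;
      if (refresh) {
        void refresh()
          .then((value) => this.set(key, value))
          .catch(() => undefined); // keep serving the old value if the refresh fails
      }
    }

    // Return the current value without blocking the request
    return entry.value;
  }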

  set<T>(key: string, value: T, ttlMs: number = 300_000): void {
    this.cache.set(key, {
      value,
      // 在 80% 的 TTL 时间点开始后台刷新
      refreshAt: Date.now() + ttlMs * 0.8,
    }, { ttl: ttlMs });
  }

  del(key: string): void {
    this.cache.delete(key);
  }

  // 批量操作
  mget<T>(keys: string[]): Map<string, T> {
    const result = new Map<string, T>();
    for (const key of keys) {
      const value = this.get<T>(key);
      if (value !== undefined) result.set(key, value);
    }
    return result;
  }

  getStats() {
    return {
      size: this.cache.size,
      calculatedSize: this.cache.calculatedSize,
    };
  }
}

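Limitations of the application memory cache

Inconsistency across instances: each service instance has its own local cache, and nothing is shared between them
Limited capacity: bounded by the memory available to the process (the default V8 heap limit depends on the Node.js version and machine, historically about 1.5 GB on 64-bit systems, and can be raised with --max-old-space-size)
Lost on restart: the whole cache disappears when the process restarts and has to be warmed up again
GC pressure: a large number of cached objects increases the load on V8's garbage collector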
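
One common way to soften the cross-instance inconsistency listed above is to broadcast invalidations so that every instance drops its local copy after a write. The following is a minimal sketch of that idea, not a production implementation: `InvalidationBus`, `InMemoryBus`, and `InstanceLocalCache` are hypothetical names, and the in-memory bus only stands in for a real channel such as Redis Pub/Sub.

```typescript
// Hypothetical minimal pub/sub contract; a real deployment could back it with Redis Pub/Sub.
interface InvalidationBus {
  publish(channel: string, message: string): void;
  subscribe(channel: string, handler: (message: string) => void): void;
}

// In-memory stand-in used only to make the sketch runnable in one process.
class InMemoryBus implements InvalidationBus {
  private handlers = new Map<string, Array<(message: string) => void>>();

  publish(channel: string, message: string): void {
    for (const handler of this.handlers.get(channel) ?? []) handler(message);
  }

  subscribe(channel: string, handler: (message: string) => void): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }
}

// One of these lives in each service instance.
class InstanceLocalCache {
  private store = new Map<string, unknown>();
  private bus: InvalidationBus;
  private channel: string;

  constructor(bus: InvalidationBus, channel: string = 'cache:invalidate') {
    this.bus = bus;
    this.channel = channel;
    // Drop the local copy whenever any instance announces an invalidation
    bus.subscribe(channel, (key) => this.store.delete(key));
  }

  get(key: string): unknown {
    return this.store.get(key);
  }

  set(key: string, value: unknown): void {
    this.store.set(key, value);
  }

  // Call after the database write succeeds; every subscriber, including this one, evicts the key
  invalidate(key: string): void {
    this.bus.publish(this.channel, key);
  }
}

const bus = new InMemoryBus();
const instanceA = new InstanceLocalCache(bus);
const instanceB = new InstanceLocalCache(bus);
instanceA.set('config:site', { theme: 'dark' });
instanceB.set('config:site', { theme: 'dark' });
instanceA.invalidate('config:site');
```

Pub/Sub delivery is best effort, so a short local TTL is still needed as a backstop for instances that miss a message.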

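Caching Patterns in Detail

Cache Aside

The most widely used caching pattern. The application code manages both the cache and the database itself; the cache is a plain key-value store that knows nothing about the database.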

cache-aside-pattern.ts
import { Injectable } from '@nestjs/common';
import Redis from 'ioredis';

@Injectable()
export class UserService {
  constructor(
    private readonly redis: Redis,
    private readonly userRepo: UserRepository,
  ) {}

  // 读操作：先查缓存 → 未命中查 DB → 写入缓存
  async getUserById(id: string): Promise<User | null> {
    const cacheKey = `user:${id}`;

    // 1. 先查缓存
    const cached = await this.redis.get(cacheKey);
    if (cached) {
      return JSON.parse(cached) as User;
    }

    // 2. 缓存未命中，查数据库
    const user = await this.userRepo.findById(id);
    if (!user) {
      // 缓存空值，防止缓存穿透，设置较短 TTL
      await this.redis.set(cacheKey, 'null', 'EX', 60);
      return null;
    }

    // 3. 写入缓存，设置过期时间加随机偏移（防止雪崩）
    const ttl = 3600 + Math.floor(Math.random() * 600);
    await this.redis.set(cacheKey, JSON.stringify(user), 'EX', ttl);
    return user;
  }

  // 写操作：先更新 DB → 再删缓存
  async updateUser(id: string, data: Partial<User>): Promise<User> {
    // 1. 先更新数据库
    const user = await this.userRepo.update(id, data);

    // 2. 再删除缓存，下次读取时重建
    const cacheKey = `user:${id}`;
    await this.redis.del(cacheKey);

    return user;
  }
}

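Read Through / Write Through

The application talks only to the cache layer, and the cache layer reads from and writes to the database internally. The database becomes transparent to the application, but the pattern needs a cache layer or middleware that supports it.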

read-write-through.ts
/**
 * Read Through / Write Through 模式
 * 应用只和 CacheManager 交互，不直接操作数据库
 */
interface CacheManager<T> {
  get(key: string): Promise<T | null>;
  set(key: string, value: T): Promise<void>;
  del(key: string): Promise<void>;
}

class ReadWriteThroughCache<T> implements CacheManager<T> {
  constructor(
    private redis: Redis,
    private loader: (key: string) => Promise<T | null>,   // 数据加载函数
    private writer: (key: string, value: T) => Promise<void>, // 数据写入函数
    private ttl: number = 3600,
    private prefix: string = '',
  ) {}

  // Read Through：缓存未命中时自动从数据源加载
  async get(key: string): Promise<T | null> {
    const cacheKey = `${this.prefix}${key}`;
    const cached = await this.redis.get(cacheKey);

    if (cached !== null) {
      return cached === 'null' ? null : JSON.parse(cached);
    }

    // 缓存自动去 DB 加载
    const data = await this.loader(key);
    const serialized = data !== null ? JSON.stringify(data) : 'null';
    await this.redis.set(cacheKey, serialized, 'EX', this.ttl);
    return data;
  }

  // Write Through：写入时同时更新缓存和数据库（同步）
  async set(key: string, value: T): Promise<void> {
    const cacheKey = `${this.prefix}${key}`;

    // 同步写入数据库
    await this.writer(key, value);

    // 同步更新缓存
    await this.redis.set(cacheKey, JSON.stringify(value), 'EX', this.ttl);
  }

  async del(key: string): Promise<void> {
    const cacheKey = `${this.prefix}${key}`;
    await this.redis.del(cacheKey);
  }
}

// 使用示例
const userCache = new ReadWriteThroughCache<User>(
  redis,
  (key) => userRepo.findById(key),      // loader
  (key, value) => userRepo.save(value),  // writer
  3600,
  'user:',
);

// 应用代码只和缓存交互，不感知数据库
const user = await userCache.get('123');
await userCache.set('123', updatedUser);

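Write Behind (Write Back)

Writes update only the cache, and a background task flushes them to the database asynchronously in batches. This gives the highest write throughput, at the cost of possible data loss.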

write-behind.ts
import { Injectable, OnModuleDestroy } from '@nestjs/common';

/**
 * Write Behind 模式
 * 写操作先写缓存，异步批量刷入数据库
 * 适合：统计计数、日志、不要求强一致的场景
 */
@Injectable()
export class WriteBehindCache implements OnModuleDestroy {
  private dirtyKeys = new Set<string>(); // 记录待刷入 DB 的 key
  private flushTimer: NodeJS.Timeout | null = null;
  private readonly FLUSH_INTERVAL = 5000; // 5 秒刷一次
  private readonly BATCH_SIZE = 100;

  constructor(
    private readonly redis: Redis,
    private readonly repo: Repository,
  ) {
    this.startFlushTimer();
  }

  async write(key: string, value: unknown): Promise<void> {
    // 1. 只写缓存
    await this.redis.set(key, JSON.stringify(value), 'EX', 7200);
    // 2. 标记为脏数据
    this.dirtyKeys.add(key);

    // 脏数据过多时立即刷入
    if (this.dirtyKeys.size >= this.BATCH_SIZE) {
      await this.flush();
    }
  }

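  // Periodically flush dirty entries to the database in batches
  private async flush(): Promise<void> {
    if (this.dirtyKeys.size === 0) return;

    const keysToFlush = [...this.dirtyKeys].slice(0, this.BATCH_SIZE);
    keysToFlush.forEach((k) => this.dirtyKeys.delete(k));

    try {
      // Read the batch back from Redis in one round trip
      const pipeline = this.redis.pipeline();
      keysToFlush.forEach((key) => pipeline.get(key));
      const results = await pipeline.exec();

      // Skip keys that errored or have already expired
      const records = (results ?? []).flatMap(([err, val], i) =>
        err || typeof val !== 'string'
          ? []
          : [{ key: keysToFlush[i], value: JSON.parse(val) }],
      );

      // Batch write to the database
      if (records.length) {
        await this.repo.batchUpsert(records);
      }
    } catch (err) {
      // Put the keys back so the next flush retries them instead of losing the writes
      keysToFlush.forEach((k) => this.dirtyKeys.add(k));
      throw err;
    }
  }

  private startFlushTimer(): void {
    this.flushTimer = setInterval(() => {
      this.flush().catch((err) => console.error('Write-behind flush failed', err));
    }, this.FLUSH_INTERVAL);
  }

  // On shutdown, drain every batch so that no dirty entry is left behind
  async onModuleDestroy(): Promise<void> {
    if (this.flushTimer) clearInterval(this.flushTimer);
    while (this.dirtyKeys.size > 0) {
      await this.flush();
    }
  }
}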
}

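Pattern Comparison

Pattern	Read performance	Write performance	Consistency	Complexity	Use cases
Cache Aside	High	Medium	Eventual	Low	Most scenarios; the default recommendation
Read Through	High	Medium	Eventual	Medium	Read-heavy workloads where cache logic should be encapsulated
Write Through	High	Low	Stronger (cache and DB are written together, though not atomically)	Medium	Higher consistency requirements
Write Behind	High	Highest	Weak	High	Logs, counters, data that may be lost

Choosing in real projects

Cache Aside is enough for the vast majority of cases; consider the other patterns only when there is a specific need. Write Behind has the best write performance but a real risk of data loss, so it is not suitable for core data such as transactions and orders.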

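Local Cache (Application Memory Cache)

A local cache lives in the application process's own memory. With no network overhead, it is the fastest of all the caching options.

A simple option: node-cache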

simple-node-cache.ts
import NodeCache from 'node-cache';

// node-cache：简单的 key-value 内存缓存
const cache = new NodeCache({
  stdTTL: 600,         // 默认 TTL 10 分钟
  checkperiod: 120,    // 每 2 分钟检查过期 key
  maxKeys: 10000,      // 最大 key 数量
  useClones: false,    // 不克隆对象，提升性能（注意引用安全）
});

// 基本使用
cache.set('config', { maxUploadSize: 10 * 1024 * 1024 });
const config = cache.get<AppConfig>('config');

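// Get-or-load. This is not atomic; an in-flight map deduplicates concurrent loads of the same key
const inflight = new Map<string, Promise<unknown>>();

function getOrLoad<T>(key: string, loader: () => Promise<T>, ttl?: number): Promise<T> {
  const cached = cache.get<T>(key);
  if (cached !== undefined) return Promise.resolve(cached);

  const pending = inflight.get(key) as Promise<T> | undefined;
  if (pending) return pending;

  const promise = loader()
    .then((data) => {
      if (ttl === undefined) cache.set(key, data);
      else cache.set(key, data, ttl);
      return data;
    })
    .finally(() => inflight.delete(key));
  inflight.set(key, promise);
  return promise;
}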

A higher-performance option: lru-cache

lru-cache-advanced.ts
import { LRUCache } from 'lru-cache';

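// lru-cache is one of the most widely used LRU cache libraries in the Node.js ecosystem
// It supports TTL, a maximum entry count, a maximum total size, and custom size calculation
// (in v10+ the value type may not include null or undefined, hence `object` rather than `unknown`)
const lru = new LRUCache<string, object>({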
  max: 10000,
  maxSize: 100 * 1024 * 1024, // 100MB
  sizeCalculation: (value) => {
    return Buffer.byteLength(JSON.stringify(value));
  },
  ttl: 5 * 60 * 1000,        // 5 分钟
  allowStale: true,           // 允许返回过期数据（用于 stale-while-revalidate）
  updateAgeOnGet: false,      // GET 时不重置过期时间

  // fetchMethod：内置的 Read Through 支持
  // 缓存未命中时自动调用此方法加载数据
  fetchMethod: async (key, staleValue, { signal }) => {
    const response = await fetch(`/api/data/${key}`, { signal });
    return response.json();
  },
});

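// fetch() handles cache misses by calling fetchMethod
const data = await lru.fetch('user:123');

// Stale-while-revalidate
// With allowStale: true, fetch() returns an expired value immediately and refreshes it in the background
const staleData = await lru.fetch('user:123');
// get() with allowStale also returns the expired value, but then removes it without refreshing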

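Local Cache vs Distributed Cache

Dimension	Local cache	Distributed cache (Redis)
Access latency	Microseconds or less (< 0.1 ms)	Milliseconds (1-5 ms)
Capacity	Bounded by process memory (MB to GB)	Scales with the cluster (TB)
Sharing across instances	No, each instance is independent	Yes, all instances share it
Persistence	None, lost on restart	RDB + AOF
Consistency	Strong within a process	Must be maintained explicitly
Network overhead	None	Yes
Use cases	Config, dictionaries, hot data	User data, sessions, distributed locks

When to use a local cache

Small data that changes rarely: system configuration, country/city lists, permission dictionaries
Short-lived inconsistency is acceptable: local caches on different instances may differ for seconds to minutes
Latency matters most: a local lookup is typically 10-100x faster than a Redis round trip

Best practice: combine a local cache and Redis into a two-level cache. The local cache is level 1 (short TTL) and Redis is level 2 (long TTL).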

Redis Caching in Practice

Cache Key Design Principles

Good key design directly affects how maintainable and how fast the cache is.

cache-key-design.ts
/**
 * Redis Key 设计规范
 *
 * 格式：{业务}:{对象}:{标识}:{属性}
 * 例如：order:detail:123456:status
 *
 * 原则：
 * 1. 使用冒号分隔，便于管理和查看
 * 2. key 长度适中，避免过长（影响内存和网络）
 * 3. 不要包含特殊字符和空格
 * 4. 加上业务前缀，避免不同业务冲突
 */

// 定义 key 生成器，统一管理所有缓存 key
const CacheKeys = {
  // 用户相关
  user: (id: string) => `user:info:${id}`,
  userProfile: (id: string) => `user:profile:${id}`,
  userPermissions: (id: string) => `user:perm:${id}`,

  // 商品相关
  product: (id: string) => `product:detail:${id}`,
  productList: (category: string, page: number) =>
    `product:list:${category}:p${page}`,
  productStock: (id: string) => `product:stock:${id}`,

  // 排行榜
  ranking: (type: string, date: string) => `rank:${type}:${date}`,

  // 分布式锁
  lock: (resource: string) => `lock:${resource}`,

  // 限流
  rateLimit: (ip: string, api: string) => `rate:${ip}:${api}`,
} as const;

// 使用示例
await redis.get(CacheKeys.user('123'));
await redis.get(CacheKeys.productList('electronics', 1));

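Key design caveats

Avoid big keys: as a rule of thumb, keep a single value under 10 KB and a collection under 5000 elements
Avoid hot keys: the QPS of a single key must stay within what one cluster shard can handle
Set a TTL: every cache key should have a TTL so that Redis memory does not grow without bound
Do not use KEYS *: in production, iterate with the SCAN command instead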
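
To make the last point concrete, the sketch below walks the keyspace with a SCAN cursor loop instead of KEYS. It is an illustration under assumptions: `ScanClient` is a hypothetical narrow interface (ioredis's `scan(cursor, 'MATCH', pattern, 'COUNT', n)` has this shape), and `fakeClient` is a stub that exists only so the example runs without a Redis server.

```typescript
// Hypothetical narrow interface covering the single call this sketch needs.
interface ScanClient {
  scan(
    cursor: string,
    matchToken: 'MATCH',
    pattern: string,
    countToken: 'COUNT',
    count: number,
  ): Promise<[string, string[]]>;
}

// Collect all keys matching a pattern, one bounded page at a time.
async function scanKeys(client: ScanClient, pattern: string, count: number = 100): Promise<string[]> {
  const found = new Set<string>(); // SCAN may return the same key more than once
  let cursor = '0';
  do {
    const [next, keys] = await client.scan(cursor, 'MATCH', pattern, 'COUNT', count);
    keys.forEach((k) => found.add(k));
    cursor = next;
  } while (cursor !== '0'); // a returned cursor of "0" marks the end of the iteration
  return [...found];
}

// Stub that serves two pages, with one duplicate key across them.
const fakeClient: ScanClient = {
  async scan(cursor: string): Promise<[string, string[]]> {
    const firstPage: [string, string[]] = ['17', ['product:list:books:p1', 'product:list:books:p2']];
    const lastPage: [string, string[]] = ['0', ['product:list:books:p2', 'product:list:books:p3']];
    return cursor === '0' ? firstPage : lastPage;
  },
};

const scanResult = scanKeys(fakeClient, 'product:list:books:*');
```

Each SCAN call does a bounded amount of work, so other commands can run between pages; KEYS, by contrast, walks the whole keyspace in one blocking call.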

Serialization and Batch Operations

redis-serialization.ts
import Redis from 'ioredis';

class RedisCacheService {
  constructor(private readonly redis: Redis) {}

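  // Store the object as a Hash to avoid serializing and deserializing it as a whole
  // Use case: user records with many fields, of which only a few are usually read
  async setUserHash(user: User): Promise<void> {
    const key = `user:hash:${user.id}`;
    // MULTI/EXEC so the hash never exists without a TTL if the process dies between the two commands
    await this.redis
      .multi()
      .hset(key, {
        name: user.name,
        email: user.email,
        avatar: user.avatar,
        role: user.role,
      })
      .expire(key, 3600)
      .exec();
  }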

  // 只获取需要的字段，减少网络传输
  async getUserName(id: string): Promise<string | null> {
    return this.redis.hget(`user:hash:${id}`, 'name');
  }

  // Pipeline：批量操作，一次网络往返处理多个命令
  // 适合：一次性读取多个不相关的 key
  async batchGet(keys: string[]): Promise<Map<string, unknown>> {
    const pipeline = this.redis.pipeline();
    keys.forEach((key) => pipeline.get(key));

    const results = await pipeline.exec();
    const map = new Map<string, unknown>();

    results?.forEach(([err, val], index) => {
      if (!err && val) {
        map.set(keys[index], JSON.parse(val as string));
      }
    });

    return map;
  }

  // 使用 Lua 脚本保证原子性
  // 场景：查缓存 + 未命中时加锁 + 写缓存（多步操作需要原子性）
  private readonly GET_OR_LOCK_SCRIPT = `
    local value = redis.call('GET', KEYS[1])
    if value then
      return value
    end
    local locked = redis.call('SET', KEYS[2], '1', 'EX', 10, 'NX')
    if locked then
      return nil  -- 获取到锁，调用方去加载数据
    end
    return '__LOCKED__'  -- 未获取到锁，调用方需等待重试
  `;

  async getOrLock(key: string): Promise<{ value: string | null; hasLock: boolean }> {
    // The hash tag keeps the lock in the same Cluster slot as the data key
    // (assumes `key` itself contains no braces); otherwise a multi-key script fails with CROSSSLOT
    const lockKey = `lock:{${key}}`;
    const result = await this.redis.eval(
      this.GET_OR_LOCK_SCRIPT, 2, key, lockKey,
    ) as string | null;

    if (result === '__LOCKED__') {
      return { value: null, hasLock: false };
    }
    return { value: result, hasLock: result === null };
  }
}

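The Three Classic Cache Problems

Cache Penetration

The requested data exists in neither the cache nor the database, so every request falls through to the database. This typically comes from malicious traffic or invalid parameters.

Solution 1: cache null values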

cache-null-value.ts
async function getUser(id: string): Promise<User | null> {
  const cacheKey = `user:${id}`;
  const cached = await redis.get(cacheKey);

  // 区分「key 不存在」和「缓存了空值」
  if (cached === '__NULL__') return null; // 空值命中
  if (cached) return JSON.parse(cached);

  const user = await db.user.findById(id);
  if (!user) {
    // 缓存空值，设置较短 TTL（60s），防止缓存过多无效 key
    await redis.set(cacheKey, '__NULL__', 'EX', 60);
    return null;
  }

  const ttl = 3600 + Math.floor(Math.random() * 600);
  await redis.set(cacheKey, JSON.stringify(user), 'EX', ttl);
  return user;
}

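Solution 2: Bloom filter

A Bloom filter is a highly space-efficient probabilistic data structure for testing whether an element may be in a set. Its key property: it can wrongly report that an element is present (false positive), but it never wrongly reports that an added element is absent (no false negatives).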

bloom-filter.ts
import { createHash } from 'crypto';

/**
 * 布隆过滤器实现
 *
 * 原理：使用多个哈希函数将元素映射到位数组的不同位置
 * - 添加元素：将所有哈希位置设为 1
 * - 查询元素：检查所有哈希位置是否都为 1
 *   - 都为 1 → 可能存在（存在误判概率）
 *   - 有 0 → 一定不存在
 */
class BloomFilter {
  private bits: Uint8Array;
  private readonly size: number;
  private readonly hashCount: number;

  /**
   * @param expectedItems - 预期存储的元素数量
   * @param falsePositiveRate - 可接受的误判率，如 0.01 表示 1%
   */
  constructor(expectedItems: number, falsePositiveRate: number = 0.01) {
    // 根据预期元素数量和误判率计算最优参数
    // 位数组大小：m = -n * ln(p) / (ln2)^2
    this.size = Math.ceil(
      (-expectedItems * Math.log(falsePositiveRate)) / (Math.LN2 ** 2),
    );
    // 哈希函数个数：k = (m/n) * ln2
    this.hashCount = Math.ceil((this.size / expectedItems) * Math.LN2);
    this.bits = new Uint8Array(Math.ceil(this.size / 8));
  }

  // 多个哈希函数通过双重哈希模拟
  private getHashPositions(value: string): number[] {
    const hash1 = this.murmurHash(value, 0);
    const hash2 = this.murmurHash(value, hash1);

    const positions: number[] = [];
    for (let i = 0; i < this.hashCount; i++) {
      const pos = Math.abs((hash1 + i * hash2) % this.size);
      positions.push(pos);
    }
    return positions;
  }

  add(value: string): void {
    const positions = this.getHashPositions(value);
    for (const pos of positions) {
      const byteIndex = Math.floor(pos / 8);
      const bitIndex = pos % 8;
      this.bits[byteIndex] |= 1 << bitIndex;
    }
  }

  /**
   * 检查元素是否可能存在
   * @returns true = 可能存在（有误判概率）, false = 一定不存在
   */
  mightContain(value: string): boolean {
    const positions = this.getHashPositions(value);
    return positions.every((pos) => {
      const byteIndex = Math.floor(pos / 8);
      const bitIndex = pos % 8;
      return (this.bits[byteIndex] & (1 << bitIndex)) !== 0;
    });
  }

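  // Despite its name, this helper derives a 32-bit hash from an MD5 digest, not MurmurHash.
  // A real MurmurHash implementation would be faster; MD5 keeps the example dependency-free.
  private murmurHash(key: string, seed: number): number {
    const hash = createHash('md5')
      .update(`${seed}:${key}`)
      .digest();
    return hash.readUInt32LE(0);
  }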
}

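// Use the Bloom filter to guard against cache penetration
const userBloom = new BloomFilter(1_000_000, 0.01); // 1 million users, 1% false-positive rate

// Load every user ID at startup
// IDs created later must also be added (for example in the signup path),
// otherwise new users would be rejected as "definitely absent"
async function initBloomFilter(): Promise<void> {
  const userIds = await db.user.findAllIds();
  userIds.forEach((id) => userBloom.add(id));
}

async function getUserSafe(id: string): Promise<User | null> {
  // Check the Bloom filter first and return early for IDs that were never added
  if (!userBloom.mightContain(id)) {
    return null; // definitely absent, provided the filter is kept in sync with the database
  }

  // Possibly present: continue with the normal cache flow
  return getUser(id);
}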
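
As a worked example of the sizing formulas used in the constructor (m = -n · ln(p) / (ln 2)² and k = (m/n) · ln 2), the sketch below computes the parameters for the one-million-user filter. `bloomParams` is a hypothetical helper introduced only for illustration.

```typescript
// Compute Bloom filter parameters from the expected item count and target false-positive rate.
function bloomParams(expectedItems: number, falsePositiveRate: number) {
  // Bit array size: m = -n * ln(p) / (ln 2)^2
  const bits = Math.ceil((-expectedItems * Math.log(falsePositiveRate)) / (Math.LN2 ** 2));
  // Number of hash functions: k = (m / n) * ln 2
  const hashCount = Math.ceil((bits / expectedItems) * Math.LN2);
  // Memory needed for the bit array
  const bytes = Math.ceil(bits / 8);
  return { bits, hashCount, bytes };
}

// 1 million items at a 1% false-positive rate:
// about 9.59 million bits (roughly 1.14 MiB) and 7 hash functions
const bloomSizing = bloomParams(1_000_000, 0.01);
```

About 9.6 bits per element is enough for a 1% false-positive rate, far less than storing the IDs themselves.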

Bloom filters in Redis

In production, prefer the RedisBloom module over a hand-written implementation:

# Redis commands
BF.ADD user_filter user:123
BF.EXISTS user_filter user:999

Cache Breakdown

At the moment a hot key expires, a burst of concurrent requests hits the database at once and causes a sudden load spike.

cache-breakdown.ts
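import { randomUUID } from 'crypto';

// Solution 1: mutex lock
// Only one request loads from the database; the others wait and retry
const RELEASE_LOCK_SCRIPT = `
  if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
  end
  return 0
`;

async function getHotData(key: string, retries = 50): Promise<unknown> {
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const lockKey = `lock:${key}`;
  const token = randomUUID(); // identifies this holder so it never releases someone else's lock
  // Try to take the distributed lock; the 10-second expiry prevents deadlock if the holder crashes
  const locked = await redis.set(lockKey, token, 'EX', 10, 'NX');

  if (locked) {
    try {
      // Double check: another request may have filled the cache before this one got the lock
      const doubleCheck = await redis.get(key);
      if (doubleCheck) return JSON.parse(doubleCheck);

      const data = await db.query(key);
      await redis.set(key, JSON.stringify(data), 'EX', 3600);
      return data;
    } finally {
      // Release only if this request still owns the lock
      await redis.eval(RELEASE_LOCK_SCRIPT, 1, lockKey, token);
    }
  }

  // Lock not acquired: wait briefly and retry, with a bound so the recursion cannot run forever
  if (retries <= 0) throw new Error(`Timed out waiting for cache key ${key}`);
  await new Promise((r) => setTimeout(r, 100));
  return getHotData(key, retries - 1);
}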

// 解决方案 2：逻辑过期（不设置 Redis TTL，由应用控制过期）
// 数据永不真正过期，避免击穿。后台异步刷新过期数据
interface CachedData<T> {
  data: T;
  expireAt: number; // 逻辑过期时间戳
}

async function getWithLogicalExpiry<T>(
  key: string,
  loader: () => Promise<T>,
  ttlSeconds: number = 3600,
): Promise<T> {
  const raw = await redis.get(key);
  if (!raw) {
    // 首次加载，同步获取
    const data = await loader();
    const entry: CachedData<T> = {
      data,
      expireAt: Date.now() + ttlSeconds * 1000,
    };
    // 不设置 Redis TTL，永不过期
    await redis.set(key, JSON.stringify(entry));
    return data;
  }

  const entry = JSON.parse(raw) as CachedData<T>;

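  if (Date.now() > entry.expireAt) {
    // Logically expired: refresh in the background and return the old data to this request
    refreshInBackground(key, loader, ttlSeconds).catch((err) =>
      console.error(`Background refresh failed for ${key}`, err),
    );
  }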

  return entry.data;
}

// 后台刷新，不阻塞当前请求
async function refreshInBackground<T>(
  key: string,
  loader: () => Promise<T>,
  ttlSeconds: number,
): Promise<void> {
  const lockKey = `refresh:${key}`;
  const locked = await redis.set(lockKey, '1', 'EX', 30, 'NX');
  if (!locked) return; // 已有其他线程在刷新

  try {
    const data = await loader();
    const entry: CachedData<T> = {
      data,
      expireAt: Date.now() + ttlSeconds * 1000,
    };
    await redis.set(key, JSON.stringify(entry));
  } finally {
    await redis.del(lockKey);
  }
}

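Cache Avalanche

A large number of keys expire at the same time, or the cache service goes down, and a flood of requests goes straight to the database.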

cache-avalanche.ts
// 方案 1：过期时间加随机偏移，避免大量 key 同时过期
function setWithRandomTTL(
  key: string,
  value: unknown,
  baseTTL: number = 3600,
  jitter: number = 600, // 随机偏移范围
): Promise<'OK'> {
  const ttl = baseTTL + Math.floor(Math.random() * jitter);
  return redis.set(key, JSON.stringify(value), 'EX', ttl);
}

// 方案 2：永不过期 + 后台刷新（同缓存击穿的逻辑过期方案）

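// Approach 3: multi-level cache with degradation
async function getWithFallback(key: string): Promise<unknown> {
  // L1: local cache
  const local = localCache.get(key);
  if (local !== undefined) return local;

  let redisAvailable = true;
  try {
    // L2: Redis
    const remote = await redis.get(key);
    if (remote) {
      const data = JSON.parse(remote);
      localCache.set(key, data, 60_000); // keep locally for 1 minute
      return data;
    }
  } catch {
    // Redis is down: degrade to the database
    redisAvailable = false;
    console.warn('Redis unavailable, fallback to DB');
  }

  // L3: database
  const data = await db.query(key);
  if (redisAvailable) {
    // Plain miss: repopulate Redis so later requests do not reach the database
    await redis.set(key, JSON.stringify(data), 'EX', 3600).catch(() => undefined);
    localCache.set(key, data, 60_000);
  } else {
    localCache.set(key, data, 30_000); // shorter local TTL while degraded
  }
  return data;
}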

// 方案 4：Redis 集群高可用
// - Redis Sentinel：自动故障转移
// - Redis Cluster：数据分片 + 副本
// - 多 AZ 部署：跨可用区部署从节点

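Problem	Cause	Primary solution	Supporting measures
Penetration	Queries for data that does not exist	Bloom filter	Cache null values, validate parameters
Breakdown	A hot key expires	Mutex lock	Logical expiry, never expire
Avalanche	Many keys expire at once	TTL with random jitter	Multi-level cache, highly available cluster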

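Cache Warmup

Cache warmup loads hot data into the cache ahead of time, at service startup or before a major promotion, so that a cold start does not send a wave of requests to the database.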

cache-warmup.ts
import { Injectable, OnApplicationBootstrap, Logger } from '@nestjs/common';

@Injectable()
export class CacheWarmupService implements OnApplicationBootstrap {
  private readonly logger = new Logger(CacheWarmupService.name);

  constructor(
    private readonly redis: Redis,
    private readonly productService: ProductService,
    private readonly configService: ConfigService,
  ) {}

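  // Runs once NestJS has finished bootstrapping
  // Nest awaits this hook before accepting traffic, so the warmup is started without awaiting it
  onApplicationBootstrap(): void {
    void this.warmup();
  }

  private async warmup(): Promise<void> {
    this.logger.log('Starting cache warmup...');
    const start = Date.now();

    const results = await Promise.allSettled([
      this.warmupConfig(),
      this.warmupHotProducts(),
      this.warmupDictionary(),
    ]);
    // allSettled never rejects, so failed tasks have to be reported explicitly
    results.forEach((r) => {
      if (r.status === 'rejected') this.logger.warn(`Warmup task failed: ${r.reason}`);
    });

    this.logger.log(`Cache warmup completed in ${Date.now() - start}ms`);
  }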

  // 1. 系统配置预热
  private async warmupConfig(): Promise<void> {
    const configs = await this.configService.loadAll();
    const pipeline = this.redis.pipeline();
    configs.forEach((config) => {
      pipeline.set(`config:${config.key}`, JSON.stringify(config.value), 'EX', 86400);
    });
    await pipeline.exec();
    this.logger.log(`Warmed up ${configs.length} config entries`);
  }

  // 2. 热门商品预热（分批加载，避免一次性打爆数据库）
  private async warmupHotProducts(): Promise<void> {
    const BATCH_SIZE = 200;
    let offset = 0;
    let total = 0;

    while (true) {
      // 分批加载，每批 200 条，避免一次性查询过多数据
      const products = await this.productService.findHot(BATCH_SIZE, offset);
      if (products.length === 0) break;

      const pipeline = this.redis.pipeline();
      products.forEach((product) => {
        const ttl = 3600 + Math.floor(Math.random() * 600);
        pipeline.set(`product:${product.id}`, JSON.stringify(product), 'EX', ttl);
      });
      await pipeline.exec();

      total += products.length;
      offset += BATCH_SIZE;

      // 控制预热速度，避免给数据库太大压力
      await new Promise((r) => setTimeout(r, 100));
    }

    this.logger.log(`Warmed up ${total} hot products`);
  }

  // 3. 字典数据预热
  private async warmupDictionary(): Promise<void> {
    const dicts = await this.configService.loadDictionaries();
    const pipeline = this.redis.pipeline();
    Object.entries(dicts).forEach(([key, value]) => {
      pipeline.set(`dict:${key}`, JSON.stringify(value), 'EX', 86400);
    });
    await pipeline.exec();
    this.logger.log(`Warmed up ${Object.keys(dicts).length} dictionaries`);
  }
}

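Warmup strategy summary

Strategy	Use case	Implementation
Startup warmup	Essential data such as config and dictionaries	`OnApplicationBootstrap` hook
Scheduled warmup	Periodic refresh of hot data	Cron job / scheduled task
Traffic replay	Simulating real traffic before a major promotion	Record production requests and replay them
Manual warmup	Specific data before a marketing campaign	Triggered from the admin console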

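Cache Update Strategies

TTL Expiry

The simplest strategy: set a sensible TTL, and after the entry expires the next access reloads it from the database.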

ttl-strategy.ts
// 不同数据类型设置不同的 TTL
const TTL_CONFIG = {
  user: 3600,           // 用户信息 1 小时
  product: 1800,        // 商品详情 30 分钟
  config: 86400,        // 系统配置 24 小时
  hotData: 300,         // 热点数据 5 分钟
  session: 7200,        // Session 2 小时
} as const;

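Event-Driven Updates (Binlog Subscription)

The cache is updated in near real time by subscribing to the database change log (for example the MySQL binlog). This suits scenarios with higher consistency requirements.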

binlog-cache-update.ts
import { Injectable } from '@nestjs/common';

/**
 * 通过消息队列消费 Canal/Debezium 推送的 Binlog 变更事件
 * 实现缓存的近实时更新
 */
@Injectable()
export class BinlogCacheUpdater {
  constructor(private readonly redis: Redis) {}

  // 处理 Binlog 变更事件
  async handleBinlogEvent(event: BinlogEvent): Promise<void> {
    const { table, type, data, old } = event;

    switch (table) {
      case 'users':
        await this.handleUserChange(type, data, old);
        break;
      case 'products':
        await this.handleProductChange(type, data, old);
        break;
    }
  }

  private async handleUserChange(
    type: 'INSERT' | 'UPDATE' | 'DELETE',
    data: Record<string, unknown>,
    old?: Record<string, unknown>,
  ): Promise<void> {
    const userId = data.id as string;
    const cacheKey = `user:${userId}`;

    switch (type) {
      case 'INSERT':
      case 'UPDATE':
        // 方式 1：直接删除缓存，让下次读取重建
        await this.redis.del(cacheKey);
        // 方式 2：直接更新缓存（需要确保数据完整性）
        // await this.redis.set(cacheKey, JSON.stringify(data), 'EX', 3600);
        break;
      case 'DELETE':
        await this.redis.del(cacheKey);
        break;
    }
  }

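  private async handleProductChange(
    type: 'INSERT' | 'UPDATE' | 'DELETE',
    data: Record<string, unknown>,
    old?: Record<string, unknown>,
  ): Promise<void> {
    const productId = data.id as string;
    await this.redis.del(`product:${productId}`);
    // Also clear related list caches, including the old category if the product moved
    const categories = new Set(
      [data.category, old?.category].filter(Boolean) as string[],
    );
    for (const category of categories) {
      await this.deleteByPattern(`product:list:${category}:*`);
    }
  }

  // SCAN instead of KEYS: KEYS walks the whole keyspace in one blocking call
  private async deleteByPattern(pattern: string): Promise<void> {
    let cursor = '0';
    do {
      const [next, keys] = await this.redis.scan(cursor, 'MATCH', pattern, 'COUNT', 100);
      if (keys.length > 0) await this.redis.unlink(...keys);
      cursor = next;
    } while (cursor !== '0');
  }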
}

interface BinlogEvent {
  table: string;
  type: 'INSERT' | 'UPDATE' | 'DELETE';
  data: Record<string, unknown>;
  old?: Record<string, unknown>;
}

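Version Number Strategy

A version number decides whether a cached entry is still valid. This suits data whose update frequency is unpredictable.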

version-cache.ts
class VersionedCache {
  constructor(private readonly redis: Redis) {}

  async get<T>(key: string): Promise<T | null> {
    // Fetch the data, its version, and the current version in one round trip
    const [data, cachedVersion, currentVersion] = await this.redis.mget(
      `data:${key}`,
      `version:${key}`,
      `version:current:${key}`,
    );

    if (!data || !cachedVersion) return null;

    // A version mismatch means the cached entry is stale
    if (cachedVersion !== currentVersion) {
      await this.redis.del(`data:${key}`, `version:${key}`);
      return null;
    }

    return JSON.parse(data) as T;
  }

  async set<T>(key: string, value: T, version: string): Promise<void> {
    const pipeline = this.redis.pipeline();
    pipeline.set(`data:${key}`, JSON.stringify(value), 'EX', 3600);
    pipeline.set(`version:${key}`, version, 'EX', 3600);
    pipeline.set(`version:current:${key}`, version);
    await pipeline.exec();
  }

  // 只需更新版本号，所有旧版本缓存自动失效
  async invalidate(key: string): Promise<void> {
    const newVersion = Date.now().toString();
    await this.redis.set(`version:current:${key}`, newVersion);
  }
}

Cache Monitoring

Monitoring is essential to keeping the cache healthy. The core metric is the cache hit rate.

cache-monitor.ts
import { Injectable } from '@nestjs/common';

@Injectable()
export class CacheMonitor {
  private metrics = {
    hits: 0,
    misses: 0,
    errors: 0,
    latencySum: 0,
    latencyCount: 0,
  };

  recordHit(latencyMs: number): void {
    this.metrics.hits++;
    this.metrics.latencySum += latencyMs;
    this.metrics.latencyCount++;
  }

  recordMiss(latencyMs: number): void {
    this.metrics.misses++;
    this.metrics.latencySum += latencyMs;
    this.metrics.latencyCount++;
  }

  recordError(): void {
    this.metrics.errors++;
  }

  getStats() {
    const total = this.metrics.hits + this.metrics.misses;
    return {
      hitRate: total > 0 ? (this.metrics.hits / total * 100).toFixed(2) + '%' : 'N/A',
      totalRequests: total,
      hits: this.metrics.hits,
      misses: this.metrics.misses,
      errors: this.metrics.errors,
      avgLatencyMs: this.metrics.latencyCount > 0
        ? (this.metrics.latencySum / this.metrics.latencyCount).toFixed(2)
        : 'N/A',
    };
  }

  // 定期上报到监控系统
  async report(): Promise<void> {
    const stats = this.getStats();
    // 上报到 Prometheus / Grafana / 自定义监控
    console.log('Cache Stats:', stats);
    this.reset();
  }

  private reset(): void {
    this.metrics = { hits: 0, misses: 0, errors: 0, latencySum: 0, latencyCount: 0 };
  }
}

// 封装带监控的缓存操作
class MonitoredCache {
  constructor(
    private readonly redis: Redis,
    private readonly monitor: CacheMonitor,
  ) {}

  async get<T>(key: string): Promise<T | null> {
    const start = Date.now();
    try {
      const result = await this.redis.get(key);
      const latency = Date.now() - start;

      if (result) {
        this.monitor.recordHit(latency);
        return JSON.parse(result);
      } else {
        this.monitor.recordMiss(latency);
        return null;
      }
    } catch {
      this.monitor.recordError();
      return null;
    }
  }
}

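Key monitoring metrics

Metric	Healthy	Alert threshold	Notes
Hit rate	> 95%	< 80%	Investigate the cause when below 80%
Average latency	< 2 ms	> 10 ms	High Redis latency often points to big keys or network issues
Memory usage	< 70%	> 85%	Scale out or optimize when approaching the limit
Connections	—	> 80% of max connections	Prevents connection exhaustion
Evicted keys	Near 0	Sustained growth	Some eviction is normal under allkeys-lru; sustained eviction means memory is too small for the working set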
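
Besides application-side counters, the server-wide hit rate can be derived from the `keyspace_hits` and `keyspace_misses` fields in the output of the Redis `INFO stats` command. The sketch below parses such output; `parseInfo` and `hitRate` are hypothetical helper names, and the sample text is made up for illustration.

```typescript
// Parse "field:value" lines from Redis INFO output, skipping section headers and blank lines.
function parseInfo(info: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of info.split(/\r?\n/)) {
    if (!line || line.startsWith('#')) continue;
    const idx = line.indexOf(':');
    if (idx > 0) fields[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return fields;
}

// Hit rate = hits / (hits + misses); null when there has been no traffic yet.
function hitRate(info: string): number | null {
  const fields = parseInfo(info);
  const hits = Number(fields['keyspace_hits'] ?? 0);
  const misses = Number(fields['keyspace_misses'] ?? 0);
  const total = hits + misses;
  return total > 0 ? hits / total : null;
}

// Made-up sample in the shape of INFO output
const sampleInfo = '# Stats\r\nkeyspace_hits:950\r\nkeyspace_misses:50\r\nevicted_keys:0\r\n';
const sampleRate = hitRate(sampleInfo); // 950 / (950 + 50) = 0.95
```

These counters are cumulative since the last restart or `CONFIG RESETSTAT`, so a dashboard would normally chart the difference between two samples rather than the raw ratio.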

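Common Interview Questions

Q1: Why does Cache Aside "update the DB first, then delete the cache" instead of deleting the cache first?

Answer:

Deleting the cache first opens a race under concurrency:

Request A (write) deletes the cache
Request B (read) misses the cache, reads the old value from the database, and writes it back to the cache
Request A updates the database
The cache now holds the old value and stays stale until its TTL expires

Updating the DB first and then deleting the cache (Cache Aside) can also become inconsistent, but only in a rare interleaving: a read misses the cache and fetches the old value, the write then updates the database and deletes the cache, and only afterwards does the read write the old value back. That requires the read to take longer than the whole write, which is unlikely because a database write is usually slower than a read.

If a stronger consistency guarantee is needed, use delayed double delete.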
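
The two orderings can be compared with a small deterministic simulation. This is only an illustration: plain in-memory maps stand in for the database and the cache, the function names are hypothetical, and the concurrent read is placed by hand between the writer's two steps.

```typescript
type Store = Map<string, string>;

// Ordering A: delete the cache first, then update the database.
function deleteThenUpdate(): { db: string | undefined; cache: string | undefined } {
  const db: Store = new Map([['user:1', 'old']]);
  const cache: Store = new Map([['user:1', 'old']]);

  cache.delete('user:1');                       // writer, step 1: delete cache
  const seen = cache.get('user:1') ?? db.get('user:1')!; // reader: cache miss, reads "old" from DB
  cache.set('user:1', seen);                    // reader: writes "old" back to the cache
  db.set('user:1', 'new');                      // writer, step 2: update DB

  return { db: db.get('user:1'), cache: cache.get('user:1') };
}

// Ordering B: update the database first, then delete the cache.
function updateThenDelete(): { db: string | undefined; cache: string | undefined } {
  const db: Store = new Map([['user:1', 'old']]);
  const cache: Store = new Map([['user:1', 'old']]);

  db.set('user:1', 'new');                      // writer, step 1: update DB
  const seen = cache.get('user:1') ?? db.get('user:1')!; // reader: cache hit, one brief stale read
  void seen;                                    // a hit does not write anything back
  cache.delete('user:1');                       // writer, step 2: delete cache

  return { db: db.get('user:1'), cache: cache.get('user:1') };
}

const orderingA = deleteThenUpdate(); // cache keeps "old" while the DB holds "new"
const orderingB = updateThenDelete(); // cache is empty, so the next read reloads "new"
```

With the same interleaving point, ordering A leaves a stale entry that survives until the TTL expires, while ordering B serves at most one stale read and then converges.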

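Q2: How is delayed double delete implemented?

Answer:

Delayed double delete strengthens Cache Aside: the cache is deleted again after a delay, which lowers the probability of a lasting inconsistency: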

delayed-double-delete.ts
async function updateWithDoubleDelete(
  id: string,
  data: Partial<User>,
): Promise<void> {
  const cacheKey = `user:${id}`;

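  // 1. Delete the cache first (optional; narrows the window in which stale data is served)
  await redis.del(cacheKey);

  // 2. Update the database
  await db.user.update(id, data);

  // 3. Delete the cache right after the write
  await redis.del(cacheKey);

  // 4. Delete once more after a delay, to evict a stale value that a concurrent read
  //    loaded before step 2 and wrote back after step 3
  // Delay = duration of one read request (plus replica lag if reads go to replicas) + a few hundred ms of margin
  setTimeout(() => {
    redis.del(cacheKey).catch((err) => console.error('Delayed cache delete failed', err));
  }, 500);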
}

// 更可靠的实现：通过消息队列延迟删除
async function updateWithMQDoubleDelete(
  id: string,
  data: Partial<User>,
): Promise<void> {
  const cacheKey = `user:${id}`;

  await db.user.update(id, data);
  await redis.del(cacheKey);

  // 发送延迟消息，500ms 后再次删除缓存
  await messageQueue.sendDelayed({
    action: 'DELETE_CACHE',
    key: cacheKey,
    delay: 500,
  });
}

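Limitations of delayed double delete

The delay is hard to choose: it has to be estimated from the typical duration of the read path
It does not guarantee consistency: it only lowers the probability of inconsistency
It adds complexity: a delayed task or a message queue is required

If consistency requirements are very high, prefer binlog subscription (Canal/Debezium), so that cache updates are driven by database change events.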

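Q3: How do you keep the cache consistent with the database?

Answer:

There is no silver bullet; pick the approach that fits the business scenario. From weakest to strongest:

Approach	Consistency level	Complexity	Use cases
Cache Aside + TTL	Eventual (seconds)	Low	Most scenarios
Delayed double delete	Eventual (sub-second)	Medium	Moderate consistency requirements
Binlog subscription	Near real time	High	Core business such as e-commerce and finance
Write Through	Stronger (cache and DB written together)	Medium	When the cache middleware supports it
Distributed transaction	Strong	Very high	Not recommended; the performance cost is high

Recommended combination for real projects: Cache Aside + TTL + delayed double delete. In most scenarios the inconsistency window stays under one second, which is acceptable.

For a more detailed discussion, see: Cache and Database Consistency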

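Q4: How do you choose between a local cache and a distributed cache?

Answer:

Scenario	Recommended	Reason
System config, dictionary data	Local cache	Small, rarely changed, must be very fast
User sessions	Redis	Shared across instances, needs persistence
Hot product details	Local + Redis, two levels	Local cache absorbs high-frequency reads, Redis is the second level
Distributed locks	Redis	Must be visible across instances
Leaderboards	Redis ZSet	Needs sorting
API rate limiting	Redis	One counter shared by all instances

Two-level cache architecture: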

two-level-cache.ts
async function getFromTwoLevelCache<T>(
  key: string,
  loader: () => Promise<T>,
): Promise<T> {
  // L1: 本地缓存（纳秒级）
  const local = localCache.get<T>(key);
  if (local !== undefined) return local;

  // L2: Redis（毫秒级）
  const remote = await redis.get(key);
  if (remote) {
    const data = JSON.parse(remote) as T;
    localCache.set(key, data, 60_000); // 本地缓存 1 分钟
    return data;
  }

  // L3: 数据库
  const data = await loader();
  await redis.set(key, JSON.stringify(data), 'EX', 3600);
  localCache.set(key, data, 60_000);
  return data;
}

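Q5: How should cache keys be designed?

Answer:

Naming convention: {business}:{object}:{id}[:{attribute}]

Good designs:

user:info:123 - user information
product:detail:456 - product details
order:list:user:123:p1 - page 1 of the order list for user 123

Bad designs:

u123 - meaning unclear
getUserInfoByIdAndReturnFullProfile:123 - too long
user info 123 - contains spaces

Core principles:

Readability: the key alone tells you what data it holds
Uniqueness: add a business prefix so that different domains never collide
Moderate length: keys consume memory too, so keep them short (under 128 bytes is a reasonable target)
Central management: generate keys through a single key builder instead of hard-coding them throughout the code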

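Q6: How do you warm up a cache?

Answer:

The main goal of cache warmup is to keep a cold start from sending a wave of requests to the database.

When to warm up:

At service startup: load data in a lifecycle hook such as OnApplicationBootstrap
Before a major promotion: preload campaign products, stock levels, and similar data
On a schedule: a cron job refreshes hot data that is about to expire
Traffic replay: record production requests and replay them against the new cluster

Things to watch:

Load in batches: avoid a single huge query that overwhelms the database
Throttle: add a small delay between batches to limit write pressure on the database and Redis
Run asynchronously: warmup should not block service startup
Spread the TTLs: add random jitter to the TTL of warmed entries so they do not all expire together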

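Q7: How do you handle big keys?

Answer:

A big key is a single key whose value is too large (as a rule of thumb, a String over 10 KB or a collection with more than 5000 elements). The costs: slow network transfer, memory fragmentation in Redis, and a delete that can block Redis.

How to find them:

# Built-in scan (recommended)
redis-cli --bigkeys

# Measure a specific key with MEMORY USAGE
MEMORY USAGE my_big_key

Solutions: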

big-key-solution.ts
// 方案 1：拆分大 Key
// 原来：一个 Hash 存储所有用户属性
// 拆分后：按属性组拆分为多个 Hash
await redis.hset('user:123:basic', { name: '张三', age: '25' });
await redis.hset('user:123:address', { city: '北京', street: '...' });
await redis.hset('user:123:preference', { theme: 'dark', lang: 'zh' });

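// Approach 2: compress the value
import { gzipSync, gunzipSync } from 'zlib';

async function setCompressed(key: string, data: unknown): Promise<void> {
  const json = JSON.stringify(data);
  // Compress payloads larger than 1 KB (measured in bytes, not characters)
  if (Buffer.byteLength(json) > 1024) {
    const compressed = gzipSync(json).toString('base64');
    await redis.set(key, `gz:${compressed}`, 'EX', 3600);
  } else {
    await redis.set(key, json, 'EX', 3600);
  }
}

// Matching reader: undo the compression when the "gz:" marker is present
async function getCompressed<T>(key: string): Promise<T | null> {
  const raw = await redis.get(key);
  if (raw === null) return null;
  const json = raw.startsWith('gz:')
    ? gunzipSync(Buffer.from(raw.slice(3), 'base64')).toString('utf8')
    : raw;
  return JSON.parse(json) as T;
}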

// 方案 3：大 Key 异步删除（Redis 4.0+）
// UNLINK 命令在后台线程异步删除，不阻塞主线程
await redis.unlink('my_big_key');

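Q8: How do you handle hot keys?

Answer:

A hot key is a key that is accessed extremely often (breaking celebrity news, a flash-sale product), which overloads the single Redis shard that holds it.

Solutions:

Approach	Description	Use case
Local cache	Cache the hot key in application memory and skip Redis	The most effective option
Key sharding	Split one key into several replicas and read one at random	Spreads reads across multiple shards
Rate limiting	Limit requests for the hot key	Last line of defense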

hot-key-solution.ts
// 方案 1：本地缓存热 key
// 通过实时监控发现热 key，自动加入本地缓存

// 方案 2：Key 分片（读扩散）
const REPLICAS = 10;

async function getHotKey(key: string): Promise<unknown> {
  // 随机选一个副本读取，将请求分散到不同的 Redis 分片
  const replica = Math.floor(Math.random() * REPLICAS);
  const shardKey = `${key}:r${replica}`;

  const cached = await redis.get(shardKey);
  if (cached) return JSON.parse(cached);

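  // On a miss, load from the source of truth
  const data = await loadFromDB(key);

  // Write every replica. The replicas hash to different slots, so on Redis Cluster
  // they are sent as independent commands rather than as one pipeline
  await Promise.all(
    Array.from({ length: REPLICAS }, (_, i) => {
      const ttl = 3600 + Math.floor(Math.random() * 600);
      return redis.set(`${key}:r${i}`, JSON.stringify(data), 'EX', ttl);
    }),
  );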

  return data;
}

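Q9: What is the difference between Redis and Memcached?

Answer:

Dimension	Redis	Memcached
Data structures	String/Hash/List/Set/ZSet/Stream	Key-value only
Persistence	RDB snapshots + AOF log	None
Clustering	Native Cluster (16384 slots)	Client-side consistent-hash sharding
Memory management	8 eviction policies	LRU only
Threading model	Single-threaded command execution + multi-threaded I/O (6.0+)	Multi-threaded
Pub/Sub	Supported	Not supported
Lua scripting	Supported	Not supported
Transactions	MULTI/EXEC	Not supported
Typical uses	Cache + queue + lock + leaderboard + session	Pure cache acceleration

Recommendation: new projects almost always choose Redis. Memcached's only advantage is that its multi-threaded model gives slightly higher throughput for pure key-value caching, and that gap has become small since Redis 6.0 introduced multi-threaded I/O.

For more on Redis, see: Redis Data Structures and Applications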

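Q10: What cache eviction policies are there?

Answer:

Redis supports 8 eviction policies (maxmemory-policy):

Policy	Scope	Algorithm	Notes
`noeviction`	—	—	Default. Nothing is evicted; writes fail when memory is full
`volatile-lru`	Keys with a TTL	LRU	Evicts the least recently used
`allkeys-lru`	All keys	LRU	Recommended for caching
`volatile-lfu`	Keys with a TTL	LFU	Evicts the least frequently used (Redis 4.0+)
`allkeys-lfu`	All keys	LFU	Good when hot and cold data are clearly separated
`volatile-random`	Keys with a TTL	Random	Random eviction
`allkeys-random`	All keys	Random	Random eviction
`volatile-ttl`	Keys with a TTL	TTL	Evicts keys with the shortest remaining TTL first

LRU vs LFU

LRU (Least Recently Used): evicts the entry that has gone longest without access. Suits fairly uniform access patterns
LFU (Least Frequently Used): evicts the entry with the lowest access frequency. Suits workloads with a clear hot/cold split (such as popular products)

Redis implements an approximate LRU: it samples a few keys at random (5 by default) and evicts the least recently used among them. It is not a strict LRU, but it is cheaper in memory and CPU.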
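
The sampling idea can be sketched in a few lines. This is a simplified illustration of sampled eviction, not how Redis is implemented internally: `SampledLruCache` is a hypothetical class, a logical counter replaces the real clock, and the random source is injectable so that the demonstration is deterministic.

```typescript
interface SampledEntry {
  value: string;
  lastAccess: number; // logical timestamp of the most recent read or write
}

class SampledLruCache {
  private entries = new Map<string, SampledEntry>();
  private clock = 0;
  private capacity: number;
  private samples: number;
  private random: () => number;

  constructor(capacity: number, samples: number = 5, random: () => number = Math.random) {
    this.capacity = capacity;
    this.samples = samples;
    this.random = random;
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    entry.lastAccess = ++this.clock;
    return entry.value;
  }

  set(key: string, value: string): void {
    if (!this.entries.has(key) && this.entries.size >= this.capacity) this.evictOne();
    this.entries.set(key, { value, lastAccess: ++this.clock });
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }

  // Sample a few keys at random and evict the least recently used among the sample only.
  private evictOne(): void {
    const keys = [...this.entries.keys()];
    let victim: string | undefined;
    let oldest = Infinity;
    for (let i = 0; i < Math.min(this.samples, keys.length); i++) {
      const candidate = keys[Math.floor(this.random() * keys.length)];
      const at = this.entries.get(candidate)!.lastAccess;
      if (at < oldest) {
        oldest = at;
        victim = candidate;
      }
    }
    if (victim !== undefined) this.entries.delete(victim);
  }
}

// Deterministic "random" source that cycles through all three slots
const cycle = [0, 0.4, 0.8];
let cycleIndex = 0;
const sampled = new SampledLruCache(3, 5, () => cycle[cycleIndex++ % cycle.length]);
sampled.set('a', '1');
sampled.set('b', '2');
sampled.set('c', '3');
sampled.get('a');      // "a" becomes the most recently used
sampled.set('d', '4'); // the sample covers a, b, c; "b" is the oldest and is evicted
```

With a truly random sample the oldest key is not always among the candidates, which is exactly the approximation: a larger sample size gets closer to strict LRU at a higher cost per eviction.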
