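Designing a URL Shortening Service

Question

How would you design a URL shortening service in Python, and how do you guarantee that short codes stay unique under high concurrency?

Answer

Architecture

Core Implementation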

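short_url/service.py
import string
import redis
from sqlalchemy.orm import Session

# Assumed location: an ORM model with id, long_url and short_code columns
from short_url.models import URLMapping

CHARSET = string.digits + string.ascii_letters  # 0-9, a-z, A-Z: 62 characters
BASE = len(CHARSET)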

def id_to_short(num: int) -> str:
    """Convert an auto-increment ID to a Base62 short code."""
    if num == 0:
        return CHARSET[0]
    result = []
    while num > 0:
        result.append(CHARSET[num % BASE])
        num //= BASE
    return "".join(reversed(result))


class ShortURLService:
    def __init__(self, db: Session, cache: redis.Redis, domain: str = "s.example.com"):
        self.db = db
        self.cache = cache
        self.domain = domain

    def shorten(self, long_url: str) -> str:
        # 1. Deduplicate (via a Bloom filter or a database lookup)
        existing = self.db.query(URLMapping).filter_by(long_url=long_url).first()
        if existing:
            return f"https://{self.domain}/{existing.short_code}"

        # 2. Insert the row to obtain its auto-increment ID; flush() sends the
        #    INSERT without committing, so steps 2 and 3 share one transaction
        mapping = URLMapping(long_url=long_url)
        self.db.add(mapping)
        self.db.flush()

        # 3. ID -> Base62
        short_code = id_to_short(mapping.id)
        mapping.short_code = short_code
        self.db.commit()

        # 4. Populate the cache (30-day TTL)
        self.cache.set(f"url:{short_code}", long_url, ex=86400 * 30)
        return f"https://{self.domain}/{short_code}"

    def resolve(self, short_code: str) -> str | None:
        # Check the cache first
        cached = self.cache.get(f"url:{short_code}")
        if cached:
            # redis-py returns bytes unless decode_responses=True was set
            return cached.decode()

        # Cache miss: fall back to the database and repopulate the cache
        mapping = self.db.query(URLMapping).filter_by(short_code=short_code).first()
        if mapping:
            self.cache.set(f"url:{short_code}", mapping.long_url, ex=86400 * 30)
            return mapping.long_url
        return None
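
The lookup-then-insert sequence in `shorten` can race: two concurrent requests for the same long URL may both miss the lookup and insert two rows. One common way to close that gap is to let the database enforce uniqueness and treat a constraint violation as "already exists". The following is a minimal sketch of that idea, not the service's actual code. It uses the standard-library `sqlite3` module in place of SQLAlchemy so that it runs on its own, and the table name `url_mapping` and the helpers `init_db` and `shorten_once` are hypothetical.

```python
import sqlite3
import string

CHARSET = string.digits + string.ascii_letters
BASE = len(CHARSET)


def id_to_short(num: int) -> str:
    """Encode a non-negative integer as Base62 (same scheme as above)."""
    if num == 0:
        return CHARSET[0]
    digits = []
    while num > 0:
        digits.append(CHARSET[num % BASE])
        num //= BASE
    return "".join(reversed(digits))


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: the UNIQUE constraint on long_url makes the
    # database, not the application, the arbiter of duplicates.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS url_mapping (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            long_url TEXT NOT NULL UNIQUE,
            short_code TEXT UNIQUE
        )
        """
    )


def shorten_once(conn: sqlite3.Connection, long_url: str) -> str:
    """Return the short code for long_url, inserting it if it is new."""
    try:
        with conn:  # one transaction: insert the row, then store its code
            cur = conn.execute(
                "INSERT INTO url_mapping (long_url) VALUES (?)", (long_url,)
            )
            code = id_to_short(cur.lastrowid)
            conn.execute(
                "UPDATE url_mapping SET short_code = ? WHERE id = ?",
                (code, cur.lastrowid),
            )
            return code
    except sqlite3.IntegrityError:
        # Another request inserted the same URL first; reuse its code.
        row = conn.execute(
            "SELECT short_code FROM url_mapping WHERE long_url = ?", (long_url,)
        ).fetchone()
        return row[0]
```

Calling `shorten_once` twice with the same URL returns the same code and leaves a single row. With SQLAlchemy the same pattern would catch `sqlalchemy.exc.IntegrityError`, roll back, and re-query.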

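FastAPI Routes

short_url/api.py
from fastapi import FastAPI, HTTPException
from fastapi.responses import RedirectResponse

app = FastAPI()
# `service` is assumed to be a ShortURLService instance created at startup
# with a database session and a Redis client.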

@app.post("/shorten")
def create_short_url(long_url: str):
    short = service.shorten(long_url)
    return {"short_url": short}

@app.get("/{short_code}")
def redirect(short_code: str):
    long_url = service.resolve(short_code)
    if not long_url:
        raise HTTPException(status_code=404, detail="Not found")
    # 301 = permanent redirect (cached by browsers); 302 = temporary redirect (easier to track clicks)
    return RedirectResponse(url=long_url, status_code=302)

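Hash-Based Approach (Alternative)

short_url/hash_approach.py
import hashlib

from short_url.service import id_to_short

def hash_shorten(long_url: str, length: int = 6) -> str:
    """Hash the URL with MD5 and return a Base62 code of at most `length` characters."""
    h = hashlib.md5(long_url.encode()).hexdigest()
    # Take the first 8 hex characters (32 bits) and convert them to Base62
    num = int(h[:8], 16)
    return id_to_short(num)[:length]

# Collision handling: append a random salt and rehash, or fall back to the auto-increment ID approach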
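
The salt-and-rehash strategy mentioned in the comment above could look roughly like this. It is a sketch under stated assumptions: a plain `dict` stands in for the table that maps short codes to URLs (in a real deployment a unique constraint on the short code would play that role), and `hash_code`, `to_base62` and `shorten_with_retry` are hypothetical names.

```python
import hashlib
import secrets
import string

CHARSET = string.digits + string.ascii_letters
BASE = len(CHARSET)


def to_base62(num: int) -> str:
    """Encode a non-negative integer as Base62."""
    if num == 0:
        return CHARSET[0]
    digits = []
    while num > 0:
        digits.append(CHARSET[num % BASE])
        num //= BASE
    return "".join(reversed(digits))


def hash_code(text: str, length: int = 6) -> str:
    """First 32 bits of the MD5 digest, Base62-encoded and truncated."""
    num = int(hashlib.md5(text.encode()).hexdigest()[:8], 16)
    return to_base62(num)[:length]


def shorten_with_retry(
    long_url: str,
    store: dict[str, str],
    length: int = 6,
    max_attempts: int = 5,
) -> str:
    """Claim a short code for long_url, salting the input on collisions."""
    candidate = long_url
    for _ in range(max_attempts):
        code = hash_code(candidate, length)
        # setdefault claims the code if it is free, else returns its owner
        owner = store.setdefault(code, long_url)
        if owner == long_url:
            return code
        # The code belongs to a different URL: add a random salt and retry
        candidate = long_url + secrets.token_hex(4)
    raise RuntimeError("could not find a free short code")
```

One trade-off of random salts is that a URL whose first code collided gets a different code each time it is submitted, so deduplication by hash alone no longer holds for it.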

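Common Interview Questions

Q1: Base62 vs. the hash-based approach?

Answer:

Approach	Pros	Cons
Auto-increment ID + Base62	No collisions, predictable	Depends on a single auto-increment source
Hash prefix	No central coordination needed	Collisions are possible
Pre-generated ID pool	Handles high concurrency well	The ID pool must be maintained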
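
To make the ID pool row more concrete, one possible shape is for each application instance to reserve a block of IDs from a shared counter and hand them out locally, so the shared counter is touched once per block rather than once per request. The sketch below assumes that design; `CentralCounter` is an in-process stand-in for the shared store (for example a database row), and both class names are hypothetical.

```python
import threading


class CentralCounter:
    """Stand-in for a shared counter that every instance can reach."""

    def __init__(self) -> None:
        self._next = 1
        self._lock = threading.Lock()

    def reserve(self, size: int) -> range:
        # Atomically hand out the next `size` consecutive IDs
        with self._lock:
            start = self._next
            self._next += size
        return range(start, start + size)


class IdPool:
    """Per-instance pool that refills itself one block at a time."""

    def __init__(self, counter: CentralCounter, block_size: int = 1000) -> None:
        self._counter = counter
        self._block_size = block_size
        self._block = iter(())  # empty until the first request
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            try:
                return next(self._block)
            except StopIteration:
                # Local block exhausted: reserve a fresh one
                self._block = iter(self._counter.reserve(self._block_size))
                return next(self._block)
```

IDs stay unique across instances but are no longer strictly increasing in time, and any IDs left in a block when an instance stops are simply skipped.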

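Q2: How do you track visits to a short link?

Answer:

Real-time counting with Redis INCR
Asynchronous writes to ClickHouse for analysis
Record dimensions such as IP, User-Agent, and Referer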
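
The three points above can be sketched as a counter that is bumped on every click plus a buffer of click events that a background job drains in batches. This is an in-process illustration only: a `Counter` stands in for Redis INCR keys, a `queue.Queue` stands in for the pipeline feeding the analytics store, and `ClickEvent` and `ClickTracker` are hypothetical names.

```python
import queue
import time
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClickEvent:
    """One redirect, with the dimensions worth analysing later."""
    short_code: str
    ip: str
    user_agent: str
    referer: str
    ts: float = field(default_factory=time.time)


class ClickTracker:
    def __init__(self) -> None:
        self.counts = Counter()      # stand-in for per-code Redis counters
        self.events = queue.Queue()  # buffer drained by a background writer

    def record(self, event: ClickEvent) -> int:
        """Bump the live counter and queue the event; return the new count."""
        self.counts[event.short_code] += 1
        self.events.put(event)
        return self.counts[event.short_code]

    def drain(self, max_batch: int = 500) -> list:
        """Pull up to max_batch queued events for one bulk insert."""
        batch = []
        while len(batch) < max_batch:
            try:
                batch.append(self.events.get_nowait())
            except queue.Empty:
                break
        return batch
```

The redirect handler would call `record` and return immediately, while a separate worker periodically calls `drain` and writes the batch to the analytics database.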

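Q3: 301 vs. 302 redirects?

Answer:

301 (permanent redirect): the browser caches it, which reduces server load, but repeat clicks from that browser bypass the server and cannot be counted
302 (temporary redirect): every request goes through the server, which makes it easy to count clicks and to change the target dynamically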
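
As a small illustration of this trade-off, the choice can be reduced to one flag per link. The helper below is a hypothetical sketch, not part of the service above: `redirect_response` and `track_clicks` are assumed names, and the one-day `max-age` is an arbitrary example value.

```python
def redirect_response(long_url: str, track_clicks: bool) -> tuple[int, dict[str, str]]:
    """Pick a status code and headers for a short-link redirect."""
    if track_clicks:
        # Temporary redirect; no-store keeps caches from replaying it,
        # so every click reaches the server and can be counted
        return 302, {"Location": long_url, "Cache-Control": "no-store"}
    # Permanent redirect; browsers may reuse it without contacting the server
    return 301, {"Location": long_url, "Cache-Control": "public, max-age=86400"}
```

A link that needs analytics or a changeable target would use `track_clicks=True`; a fixed link where server load matters more could use `False`.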
