Fluentd vs Fluent Bit
Comparison
| | Fluent Bit | Fluentd | Vector |
|---|---|---|---|
| Language | C | Ruby + C | Rust |
| Memory | ~5 MB | ~40 MB | ~15 MB |
| Role | Lightweight collector (DaemonSet) | Aggregation/processing tier | Collector + aggregator in one |
| Plugins | ~100 | ~1000 | ~100 |
| K8s usage | Preferred node collector | Aggregation tier | All-in-one replacement |
Recommended combinations
- Small scale: Fluent Bit → Loki/ES (direct)
- Large scale: Fluent Bit → Kafka → Fluentd → ES/Loki (tiered architecture)
- New projects: Vector can replace both Fluent Bit and Fluentd
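In the tiered layout, the hand-off from Fluent Bit to Kafka is one extra OUTPUT block; a minimal sketch (the broker address and topic name are placeholders):

```
[OUTPUT]
    Name    kafka
    Match   kube.*
    Brokers kafka:9092
    Topics  logs-k8s
```

Fluentd then consumes the topic with its Kafka input plugin, so a slow Elasticsearch never backs up into the node collectors.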
Fluent Bit
K8s DaemonSet configuration
fluent-bit-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush            5
        Daemon           Off
        Log_Level        info
        Parsers_File     parsers.conf

    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           cri
        Tag              kube.*
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On
        Refresh_Interval 10

    [FILTER]
        Name             kubernetes
        Match            kube.*
        Kube_URL         https://kubernetes.default.svc:443
        # merge JSON log bodies into the top-level record
        Merge_Log        On
        Keep_Log         Off
        K8S-Logging.Parser  On
        # allow excluding logs via Pod annotations
        K8S-Logging.Exclude On

    # drop health-check log lines
    [FILTER]
        Name     grep
        Match    kube.*
        Exclude  log healthcheck

    [OUTPUT]
        Name            es
        Match           kube.*
        Host            elasticsearch
        Port            9200
        Index           logs-k8s
        Type            _doc
        Logstash_Format On
        Retry_Limit     3

    [OUTPUT]
        Name    loki
        Match   kube.*
        Host    loki
        Port    3100
        Labels  job=fluent-bit, namespace=$kubernetes['namespace_name'], app=$kubernetes['labels']['app']
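The ConfigMap is mounted by the DaemonSet itself; a minimal sketch (image tag, mount paths, and the ServiceAccount name are illustrative, and the kubernetes filter needs RBAC permission to read Pod metadata):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit   # RBAC: get/list/watch Pods
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.2.0
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: config
        configMap:
          name: fluent-bit-config
```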
Fluentd
Aggregation-tier configuration
fluentd.conf
# receive logs forwarded by Fluent Bit
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# parse JSON log bodies (tags arrive from Fluent Bit as kube.*)
<filter kube.**>
  @type parser
  key_name log
  <parse>
    @type json
    time_key timestamp
    time_format %Y-%m-%dT%H:%M:%S.%LZ
  </parse>
</filter>

# mask sensitive data (e.g. 11-digit phone numbers)
<filter **>
  @type record_transformer
  enable_ruby true   # required for arbitrary Ruby such as gsub below
  <record>
    message ${record["message"]&.gsub(/\d{11}/, '***')}
  </record>
</filter>

# ship to Elasticsearch
<match **>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix logs
  <buffer>
    @type file
    path /var/log/fluentd-buffers
    flush_mode interval
    flush_interval 5s
    chunk_limit_size 8MB
    retry_max_interval 30
    retry_forever true
  </buffer>
</match>
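The masking rule in the record_transformer filter is easiest to reason about in isolation; a small Python sketch of the same regex (the sample message is made up):

```python
import re

# Same rule as the gsub above: any 11-digit run
# (e.g. a CN mobile number) is replaced with "***".
MASK = re.compile(r"\d{11}")

def mask_message(message: str) -> str:
    """Replace every 11-digit sequence in a log message with ***."""
    return MASK.sub("***", message)

print(mask_message("user 13812345678 logged in"))  # user *** logged in
```

Note that `\d{11}` also matches the first 11 digits of a longer run; a stricter production rule would add boundaries such as `(?<!\d)\d{11}(?!\d)`.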
Vector
Configuration example
vector.toml
# collect K8s container logs
[sources.kubernetes]
type = "kubernetes_logs"
# merge parsed JSON fields into the event (keeps .kubernetes metadata,
# which would be lost if the whole event were replaced by the parsed body)
[transforms.parse]
type = "remap"
inputs = ["kubernetes"]
source = '''
structured = object!(parse_json!(string!(.message)))
. = merge(., structured)
.namespace = .kubernetes.pod_namespace
.app = .kubernetes.pod_labels.app
'''
# drop DEBUG-level events
[transforms.filter]
type = "filter"
inputs = ["parse"]
condition = '.level != "DEBUG"'
# ship to Loki
[sinks.loki]
type = "loki"
inputs = ["filter"]
endpoint = "http://loki:3100"
labels.app = "{{ app }}"
labels.namespace = "{{ namespace }}"
# also archive to S3
[sinks.s3]
type = "aws_s3"
inputs = ["filter"]
bucket = "logs-archive"
region = "ap-southeast-1"
compression = "gzip"
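When a sink's endpoint is down, Vector buffers in memory by default; to survive longer outages and restarts, each sink can be given a disk buffer (sizes below are arbitrary examples):

```toml
[sinks.loki.buffer]
type = "disk"
max_size = 536870912      # 512 MiB on-disk buffer
when_full = "block"       # apply backpressure instead of dropping events
```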
Common interview questions
Q1: For K8s log collection, Fluent Bit or Fluentd?
Answer:
- Fluent Bit: deployed as a DaemonSet on every node for collection and simple filtering; at ~5 MB of memory it is well suited to node-level collection
- Fluentd: deployed as a Deployment (a few replicas) for complex parsing, filtering, and enrichment
- Recommended at scale: Fluent Bit (collect) → Kafka (buffer) → Fluentd (process) → ES/Loki (store)
Q2: How do you ensure logs are not lost during collection?
Answer:
- File offset checkpoints: Filebeat/Fluent Bit record the read position and resume after a restart
- Memory + disk buffering: cap memory with Mem_Buf_Limit, and enable filesystem buffering (Fluent Bit's `storage.type filesystem`) to spill to disk
- Kafka buffering: decouples collection from consumption; Kafka's persistence provides the guarantee
- Retries: output plugins retry automatically on failure
- Backpressure: pause collection when downstream is unavailable, avoiding OOM
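The first bullet (offset checkpoints) can be sketched in a few lines of Python; the class and the state-file layout are illustrative, not how Filebeat/Fluent Bit actually store state:

```python
import json

class CheckpointedTail:
    """Tail a log file, persisting the read offset so a restart resumes
    where the previous run stopped instead of re-shipping or dropping lines."""

    def __init__(self, log_path: str, state_path: str):
        self.log_path = log_path
        self.state_path = state_path

    def _load_offset(self) -> int:
        try:
            with open(self.state_path) as f:
                return json.load(f).get(self.log_path, 0)
        except FileNotFoundError:
            return 0  # first run: start from the beginning

    def _save_offset(self, offset: int) -> None:
        with open(self.state_path, "w") as f:
            json.dump({self.log_path: offset}, f)

    def read_new_lines(self) -> list[str]:
        with open(self.log_path) as f:
            f.seek(self._load_offset())   # resume at the checkpoint
            lines = f.readlines()
            self._save_offset(f.tell())   # checkpoint after reading
        return [line.rstrip("\n") for line in lines]
```

A real collector must also handle rotation (a new inode invalidates the saved offset), which is why Fluent Bit's tail input tracks files in a small database (the `DB` option) rather than a flat offset file.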