Self-Hosting Langfuse on a VPS for LLM Observability
Deploy Langfuse v3 on your own VPS with Docker Compose. Trace LLM calls, monitor costs, run automated evaluations with DeepEval, and wire quality gates into your CI/CD pipeline.
Langfuse is an open-source LLM observability platform. It traces every LLM call in your application: latency, token usage, cost, and prompt/completion pairs. Self-hosting with Docker Compose keeps all trace data on your own infrastructure. No per-event billing. No data leaving your network.
This tutorial covers the full lifecycle: deploying Langfuse v3, adding TLS, instrumenting Python and TypeScript applications, and building an automated DeepEval evaluation pipeline integrated with CI/CD.
What does Langfuse v3 need to run on a VPS?
Langfuse v3 runs six containers: the web UI, an async worker, PostgreSQL for metadata, ClickHouse for trace analytics, Redis for queuing, and MinIO for object storage. This is a major change from v2, which needed only PostgreSQL.
| Component | Purpose | Default port | Baseline RAM |
|---|---|---|---|
| langfuse-web | Web UI and API | 3000 | ~512 MB |
| langfuse-worker | Async event processing | 3030 | ~512 MB |
| PostgreSQL 17 | Transactional metadata | 5432 | ~256 MB |
| ClickHouse | OLAP trace analytics | 8123 (HTTP), 9000 (native) | ~1 GB |
| Redis 7 | Queue and cache | 6379 | ~128 MB |
| MinIO | Object/media storage | 9000 (API), 9001 (console) | ~256 MB |
Allocate at least 4 vCPUs and 8 GB RAM. A Virtua Cloud VCS-8 (4 vCPU, 8 GB RAM, NVMe) handles this comfortably. Start with a 100 GB disk. ClickHouse grows roughly 1-2 GB per million traces, depending on prompt/completion sizes.
Resource planning by scale
| Traces/month | Disk growth/month | Recommended VPS |
|---|---|---|
| < 100K | ~500 MB | 4 vCPU / 8 GB |
| 100K - 1M | 1-2 GB | 4 vCPU / 8 GB |
| 1M - 10M | 10-20 GB | 8 vCPU / 16 GB |
| > 10M | 50+ GB | Dedicated server / Kubernetes |
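The planning table above can be turned into a quick headroom estimate. A minimal sketch, assuming the midpoint of the 1-2 GB per million traces range quoted earlier (the function name and default are mine, not from Langfuse):

```python
def months_until_full(disk_gb: float, used_gb: float,
                      traces_per_month: int,
                      gb_per_million: float = 1.5) -> float:
    """Estimate months of disk headroom left for ClickHouse trace data.

    gb_per_million assumes the midpoint of the 1-2 GB per million
    traces range above; measure your real growth after the first month.
    """
    monthly_growth_gb = traces_per_month / 1_000_000 * gb_per_million
    if monthly_growth_gb == 0:
        return float("inf")
    return (disk_gb - used_gb) / monthly_growth_gb

# 100 GB disk, 10 GB already used, 1M traces/month -> 60 months of headroom
print(months_until_full(100, 10, 1_000_000))
```

If the result is under 12 months, either size up the disk now or plan a data-retention policy from day one.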
Prerequisites
- A VPS running Debian 12 or Ubuntu 24.04 with at least 4 vCPUs and 8 GB RAM
- Docker and Docker Compose installed
- A domain name with an A record pointing at your VPS IP (needed for TLS)
- SSH access with key-based authentication
How do you deploy Langfuse with Docker Compose?
Create a working directory and use the official repository's docker-compose.yml as a starting point. The critical step is generating real secrets instead of shipping the placeholder values.
mkdir -p /opt/langfuse && cd /opt/langfuse
Create an environment file with generated secrets:
cat > .env << 'ENVEOF'
# PostgreSQL
POSTGRES_USER=langfuse
POSTGRES_PASSWORD=REPLACE_PG
POSTGRES_DB=langfuse
# ClickHouse
CLICKHOUSE_USER=clickhouse
CLICKHOUSE_PASSWORD=REPLACE_CH
# MinIO
MINIO_ROOT_USER=minio
MINIO_ROOT_PASSWORD=REPLACE_MINIO
# Redis
REDIS_AUTH=REPLACE_REDIS
# Langfuse secrets
NEXTAUTH_SECRET=REPLACE_NEXTAUTH
SALT=REPLACE_SALT
ENCRYPTION_KEY=REPLACE_ENCRYPTION
# Langfuse config
NEXTAUTH_URL=https://langfuse.example.com
LANGFUSE_CSP_ENFORCE_HTTPS=true
KEEP_ALIVE_TIMEOUT=70
ENVEOF
Now replace the placeholders with real random values. Use openssl rand -hex rather than -base64, because base64 output contains /, + and = characters that break PostgreSQL connection URLs:
sed -i "s|REPLACE_PG|$(openssl rand -hex 32)|" .env
sed -i "s|REPLACE_CH|$(openssl rand -hex 32)|" .env
sed -i "s|REPLACE_MINIO|$(openssl rand -hex 32)|" .env
sed -i "s|REPLACE_REDIS|$(openssl rand -hex 32)|" .env
sed -i "s|REPLACE_NEXTAUTH|$(openssl rand -hex 32)|" .env
sed -i "s|REPLACE_SALT|$(openssl rand -hex 32)|" .env
sed -i "s|REPLACE_ENCRYPTION|$(openssl rand -hex 32)|" .env
Lock down the file permissions. Only root should be able to read it:
chmod 600 .env
ls -la .env
-rw------- 1 root root 715 Mar 19 10:00 .env
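If you script the provisioning, it is worth verifying that every placeholder actually got replaced before starting the stack. A small sketch (the check_env helper is mine; the 64-hex-char expectation simply mirrors what openssl rand -hex 32 produces):

```python
import re
from pathlib import Path

def check_env(path: str = "/opt/langfuse/.env") -> list[str]:
    """Return a list of problems found in the Langfuse .env file."""
    problems = []
    text = Path(path).read_text()
    for line in text.splitlines():
        if "REPLACE_" in line:
            problems.append(f"placeholder not replaced: {line.split('=')[0]}")
    # openssl rand -hex 32 produces exactly 64 lowercase hex characters
    for key in ("NEXTAUTH_SECRET", "SALT", "ENCRYPTION_KEY"):
        m = re.search(rf"^{key}=(.+)$", text, flags=re.MULTILINE)
        if not m or not re.fullmatch(r"[0-9a-f]{64}", m.group(1).strip()):
            problems.append(f"{key} is not a 64-char hex string")
    return problems

if __name__ == "__main__":
    for problem in check_env():
        print("WARN:", problem)
```

An empty result means the secrets look sane; anything printed should be fixed before docker compose up.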
How do you resolve the port 9000 conflict between MinIO and ClickHouse?
Both MinIO and ClickHouse default to port 9000. The official docker-compose.yml avoids the conflict by mapping MinIO's API port to 9090 on the host (9090:9000). If you write a custom compose file, make sure you remap one of them.
The compose file also binds every infrastructure port to 127.0.0.1 only (PostgreSQL, the ClickHouse native protocol, Redis, the MinIO console), preventing external access. In the file below, even the web UI's port 3000 is bound to 127.0.0.1; the only thing reachable from outside will be the reverse proxy we add later.
Create docker-compose.yml:
services:
langfuse-web:
image: docker.io/langfuse/langfuse:3
ports:
- "127.0.0.1:3000:3000"
environment:
- DATABASE_URL=postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
- NEXTAUTH_URL=${NEXTAUTH_URL}
- NEXTAUTH_SECRET=${NEXTAUTH_SECRET}
- SALT=${SALT}
- ENCRYPTION_KEY=${ENCRYPTION_KEY}
- CLICKHOUSE_MIGRATION_URL=clickhouse://clickhouse:9000
- CLICKHOUSE_URL=http://clickhouse:8123
- CLICKHOUSE_USER=${CLICKHOUSE_USER}
- CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
- CLICKHOUSE_CLUSTER_ENABLED=false
- REDIS_HOST=redis
- REDIS_PORT=6379
- REDIS_AUTH=${REDIS_AUTH}
- LANGFUSE_S3_EVENT_UPLOAD_BUCKET=langfuse
- LANGFUSE_S3_EVENT_UPLOAD_ENDPOINT=http://minio:9000
- LANGFUSE_S3_EVENT_UPLOAD_ACCESS_KEY_ID=${MINIO_ROOT_USER}
- LANGFUSE_S3_EVENT_UPLOAD_SECRET_ACCESS_KEY=${MINIO_ROOT_PASSWORD}
- LANGFUSE_S3_EVENT_UPLOAD_FORCE_PATH_STYLE=true
- LANGFUSE_S3_EVENT_UPLOAD_REGION=auto
- LANGFUSE_S3_MEDIA_UPLOAD_BUCKET=langfuse
- LANGFUSE_S3_MEDIA_UPLOAD_ENDPOINT=http://minio:9000
- LANGFUSE_S3_MEDIA_UPLOAD_ACCESS_KEY_ID=${MINIO_ROOT_USER}
- LANGFUSE_S3_MEDIA_UPLOAD_SECRET_ACCESS_KEY=${MINIO_ROOT_PASSWORD}
- LANGFUSE_S3_MEDIA_UPLOAD_FORCE_PATH_STYLE=true
- LANGFUSE_S3_MEDIA_UPLOAD_REGION=auto
- LANGFUSE_CSP_ENFORCE_HTTPS=${LANGFUSE_CSP_ENFORCE_HTTPS}
- KEEP_ALIVE_TIMEOUT=${KEEP_ALIVE_TIMEOUT}
- LANGFUSE_LOG_LEVEL=info
depends_on:
postgres:
condition: service_healthy
clickhouse:
condition: service_healthy
minio:
condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
langfuse-worker:
image: docker.io/langfuse/langfuse-worker:3
ports:
- "127.0.0.1:3030:3030"
environment:
- DATABASE_URL=postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
- NEXTAUTH_URL=${NEXTAUTH_URL}
- SALT=${SALT}
- ENCRYPTION_KEY=${ENCRYPTION_KEY}
- CLICKHOUSE_MIGRATION_URL=clickhouse://clickhouse:9000
- CLICKHOUSE_URL=http://clickhouse:8123
- CLICKHOUSE_USER=${CLICKHOUSE_USER}
- CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
- CLICKHOUSE_CLUSTER_ENABLED=false
- REDIS_HOST=redis
- REDIS_PORT=6379
- REDIS_AUTH=${REDIS_AUTH}
- LANGFUSE_S3_EVENT_UPLOAD_BUCKET=langfuse
- LANGFUSE_S3_EVENT_UPLOAD_ENDPOINT=http://minio:9000
- LANGFUSE_S3_EVENT_UPLOAD_ACCESS_KEY_ID=${MINIO_ROOT_USER}
- LANGFUSE_S3_EVENT_UPLOAD_SECRET_ACCESS_KEY=${MINIO_ROOT_PASSWORD}
- LANGFUSE_S3_EVENT_UPLOAD_FORCE_PATH_STYLE=true
- LANGFUSE_S3_EVENT_UPLOAD_REGION=auto
- LANGFUSE_S3_MEDIA_UPLOAD_BUCKET=langfuse
- LANGFUSE_S3_MEDIA_UPLOAD_ENDPOINT=http://minio:9000
- LANGFUSE_S3_MEDIA_UPLOAD_ACCESS_KEY_ID=${MINIO_ROOT_USER}
- LANGFUSE_S3_MEDIA_UPLOAD_SECRET_ACCESS_KEY=${MINIO_ROOT_PASSWORD}
- LANGFUSE_S3_MEDIA_UPLOAD_FORCE_PATH_STYLE=true
- LANGFUSE_S3_MEDIA_UPLOAD_REGION=auto
- LANGFUSE_LOG_LEVEL=info
depends_on:
postgres:
condition: service_healthy
clickhouse:
condition: service_healthy
minio:
condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
postgres:
image: docker.io/postgres:17
ports:
- "127.0.0.1:5432:5432"
environment:
- POSTGRES_USER=${POSTGRES_USER}
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
- POSTGRES_DB=${POSTGRES_DB}
volumes:
- langfuse_postgres_data:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
interval: 3s
timeout: 3s
retries: 10
restart: unless-stopped
clickhouse:
image: docker.io/clickhouse/clickhouse-server
user: "101:101"
ports:
- "127.0.0.1:8123:8123"
- "127.0.0.1:9000:9000"
volumes:
- langfuse_clickhouse_data:/var/lib/clickhouse
- langfuse_clickhouse_logs:/var/log/clickhouse-server
environment:
- CLICKHOUSE_DB=default
- CLICKHOUSE_USER=${CLICKHOUSE_USER}
- CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
healthcheck:
test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:8123/ping || exit 1"]
interval: 5s
timeout: 5s
retries: 10
restart: unless-stopped
minio:
image: docker.io/minio/minio
entrypoint: sh
command: -c 'mkdir -p /data/langfuse && minio server --address ":9000" --console-address ":9001" /data'
ports:
- "127.0.0.1:9090:9000"
- "127.0.0.1:9091:9001"
environment:
- MINIO_ROOT_USER=${MINIO_ROOT_USER}
- MINIO_ROOT_PASSWORD=${MINIO_ROOT_PASSWORD}
volumes:
- langfuse_minio_data:/data
healthcheck:
test: ["CMD", "mc", "ready", "local"]
interval: 3s
timeout: 5s
retries: 5
restart: unless-stopped
redis:
image: docker.io/redis:7
command: redis-server --requirepass ${REDIS_AUTH} --maxmemory-policy noeviction
ports:
- "127.0.0.1:6379:6379"
volumes:
- langfuse_redis_data:/data
healthcheck:
test: ["CMD", "redis-cli", "-a", "${REDIS_AUTH}", "ping"]
interval: 3s
timeout: 3s
retries: 10
restart: unless-stopped
volumes:
langfuse_postgres_data:
langfuse_clickhouse_data:
langfuse_clickhouse_logs:
langfuse_minio_data:
langfuse_redis_data:
A few things to note about this compose file:
- The web container binds to 127.0.0.1:3000, not 0.0.0.0:3000. All traffic goes through the reverse proxy.
- CLICKHOUSE_MIGRATION_URL uses the native protocol (clickhouse://) on port 9000, while CLICKHOUSE_URL uses HTTP on port 8123. Both are required. Omitting CLICKHOUSE_MIGRATION_URL crashes the web container on startup.
- Redis is configured with separate REDIS_HOST, REDIS_PORT and REDIS_AUTH variables rather than a connection string. This is the format Langfuse v3 expects.
- MinIO's entrypoint creates the langfuse bucket directory on first start (mkdir -p /data/langfuse). Without it, S3 uploads fail until the bucket exists.
- ClickHouse runs as user: "101:101" to match the clickhouse user inside the container.
- Redis runs with --maxmemory-policy noeviction to prevent data loss under memory pressure. Langfuse relies on Redis as its job queue, and evicting keys causes silent data loss.
Start the whole stack:
docker compose up -d
Give the containers about 2-3 minutes to initialize. ClickHouse and PostgreSQL run migrations on first start. Check the status:
docker compose ps
All six containers should show Up. The infrastructure containers (postgres, clickhouse, minio, redis) additionally show (healthy). The langfuse-web and langfuse-worker containers have no health check defined, so they show Up without a health label. If any container is stuck in a restart loop, check its logs:
docker compose logs langfuse-web --tail 50
Hit the health endpoint to confirm the API is up and the database connection works (quote the URL so the shell does not interpret the ?):
curl -s "http://localhost:3000/api/public/health?failIfDatabaseUnavailable=true" | python3 -m json.tool
{
"status": "OK",
"version": "3.160.0"
}
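If you automate the deployment, a short poll loop beats guessing when the stack is ready. A standard-library sketch (the function name and timing defaults are mine):

```python
import json
import time
import urllib.request

def wait_for_langfuse(url: str, timeout: float = 180) -> dict:
    """Poll the Langfuse health endpoint until it reports OK or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                body = json.loads(resp.read())
                if body.get("status") == "OK":
                    return body
        except OSError:
            pass  # containers still starting; retry
        time.sleep(1)
    raise TimeoutError(f"Langfuse not healthy after {timeout}s")

if __name__ == "__main__":
    info = wait_for_langfuse(
        "http://localhost:3000/api/public/health?failIfDatabaseUnavailable=true"
    )
    print("Langfuse ready, version", info.get("version"))
```

Drop this at the end of a provisioning script so the next step (TLS, smoke tests) only runs against a live instance.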
How do you add TLS to a self-hosted Langfuse instance?
Put Caddy in front of Langfuse. Caddy obtains TLS certificates from Let's Encrypt automatically. No certbot cron jobs, no manual renewals.
Install Caddy:
apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list
apt update && apt install -y caddy
Create the Caddyfile:
cat > /etc/caddy/Caddyfile << 'EOF'
langfuse.example.com {
header -Server
reverse_proxy 127.0.0.1:3000 {
transport http {
keepalive 65s
keepalive_idle_conns 10
}
}
}
EOF
The keepalive directive controls how long Caddy holds idle connections to the backend. Keep it below Langfuse's KEEP_ALIVE_TIMEOUT=70 so Caddy never reuses a connection Langfuse has already closed; that mismatch is the root cause of the intermittent 502/504 errors many self-hosters hit. The header -Server directive strips version information from responses.
systemctl enable --now caddy
enable makes Caddy start automatically after a reboot; --now starts it immediately.
systemctl status caddy
● caddy.service - Caddy
Loaded: loaded (/usr/lib/systemd/system/caddy.service; enabled; preset: enabled)
Active: active (running)
Test the TLS endpoint from your local machine:
curl -I https://langfuse.example.com/api/public/health
HTTP/2 200
content-type: application/json
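Caddy renews certificates on its own, but an independent expiry check from cron is cheap insurance. A sketch using only the standard library (function names are mine; parse_not_after is split out so the date handling is testable without a network):

```python
import socket
import ssl
from datetime import datetime, timezone

def parse_not_after(not_after: str) -> datetime:
    """Parse the notAfter field from ssl.getpeercert(),
    e.g. 'Jun  1 12:00:00 2030 GMT'."""
    dt = datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z")
    return dt.replace(tzinfo=timezone.utc)

def days_until_expiry(host: str, port: int = 443) -> float:
    """Connect to host and return days remaining on its TLS certificate."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    remaining = parse_not_after(cert["notAfter"]) - datetime.now(timezone.utc)
    return remaining.total_seconds() / 86400
```

Run days_until_expiry("langfuse.example.com") daily and alert below, say, 14 days; if Caddy's renewal ever breaks you will hear about it before visitors do.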
Configure authentication and API keys
Open https://langfuse.example.com in a browser and create the first user account. This becomes the admin account.
After logging in:
- Create a new project (for example, "production")
- Go to Settings > API Keys
- Click Create API Key
- Save both the Public Key and the Secret Key. You need them to instrument your applications.
The public key identifies your project. The secret key authenticates write operations. Treat the secret key like a database password.
How do you instrument a Python application to send traces to Langfuse?
The Langfuse Python SDK is built on OpenTelemetry. The @observe() decorator automatically creates traces and spans for decorated functions. Nested calls produce nested spans in the Langfuse UI.
Install the SDK:
pip install langfuse openai
Set the environment variables pointing at your self-hosted instance:
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://langfuse.example.com"
export OPENAI_API_KEY="sk-..."
Here is an instrumented application:
from langfuse import observe, propagate_attributes, get_client
from langfuse.openai import openai # patched OpenAI client
@observe()
def retrieve_context(query: str) -> str:
# Simulating a retrieval step
return "Paris is the capital of France, with a population of 2.1 million."
@observe()
def answer_question(query: str) -> str:
context = retrieve_context(query)
response = openai.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": f"Answer based on this context: {context}"},
{"role": "user", "content": query},
],
)
return response.choices[0].message.content
@observe()
def run_pipeline(query: str) -> str:
with propagate_attributes(
user_id="user_42",
session_id="session_abc",
tags=["production", "rag-pipeline"],
):
return answer_question(query)
result = run_pipeline("What is the capital of France?")
print(result)
# Flush traces before exit in short-lived scripts
get_client().shutdown()
The @observe() on run_pipeline creates the root trace; answer_question and retrieve_context become nested spans. The patched openai import (from langfuse.openai import openai) automatically captures the model name, token counts, latency, and cost as a generation span. The propagate_attributes context manager attaches user and session metadata to all nested observations.
After running the script, open the Langfuse dashboard. The Trace Explorer shows the full call tree with timing for every span.
How do you instrument a TypeScript application?
Install the Langfuse TypeScript SDK and the OpenTelemetry Node SDK:
npm install @langfuse/tracing @langfuse/otel @opentelemetry/sdk-node
Set the same environment variables:
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_BASEURL="https://langfuse.example.com"
import {
observe,
startActiveObservation,
propagateAttributes,
} from "@langfuse/tracing";
const fetchContext = observe(
async (query: string) => {
return "Paris is the capital of France.";
},
{ name: "fetch-context", asType: "span" }
);
const callLLM = observe(
async (query: string, context: string) => {
// Replace with your actual LLM call
return `Based on context: ${context}, the answer is Paris.`;
},
{ name: "llm-call", asType: "generation" }
);
await startActiveObservation("rag-pipeline", async (root) => {
await propagateAttributes(
{ userId: "user_42", sessionId: "session_abc" },
async () => {
const context = await fetchContext("Capital of France?");
const answer = await callLLM("Capital of France?", context);
root.update({ output: { answer } });
}
);
});
In serverless functions or short-lived scripts, call the span processor's forceFlush() before exit to avoid losing data.
Dashboard overview
The Langfuse dashboard provides four key views:
Trace Explorer shows individual request traces. Click any trace to see the full span tree: which functions ran, how long each took, and the exact prompt/completion sent to the LLM. Filter by user ID, tag, or time range.
Cost tracking breaks down spend by model. You can see which models consume the most tokens and which API calls are the most expensive. Use it to spot prompts that waste tokens on needlessly long completions.
Latency percentiles (p50, p90, p99) by endpoint or span name. If your p99 latency spikes, drill into the slowest traces to find the bottleneck.
Token usage trends over time. Watch for unexpected jumps, which can indicate a prompt change or a bug producing longer outputs.
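It helps to be precise about what p50/p90/p99 mean when reading these charts. Dashboards commonly use nearest-rank percentiles, which you can reproduce over exported latencies. A minimal sketch (nearest-rank method; Langfuse's exact interpolation may differ):

```python
import math

def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of the samples are <= it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

latencies_ms = [95, 98, 105, 110, 115, 120, 125, 130, 400, 2500]
for p in (50, 90, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
# A single 2.5 s outlier dominates p99 while leaving p50 untouched
```

This is why averages hide problems: the mean of that list looks mediocre, while p50 says the typical request is fine and p99 says one in a hundred is terrible.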
如何使用DeepEval和Langfuse设置自动化LLM评估?
DeepEval是一个开源LLM评估框架,它根据幻觉、忠实度和相关性等指标为模型输出打分。结合Langfuse,你可以获取生产追踪、对其运行评估,并将分数推送回Langfuse仪表板。
安装DeepEval:
pip install deepeval
Evaluation metrics
| Metric | What it measures | Score range | Use case |
|---|---|---|---|
| Hallucination | Factual accuracy relative to the provided context | 0-1 (1 = no hallucination) | RAG pipelines |
| Faithfulness | Whether the output is consistent with the retrieved context | 0-1 (1 = faithful) | RAG pipelines |
| Answer Relevancy | Whether the answer addresses the question | 0-1 (1 = relevant) | Any LLM app |
| G-Eval | Custom criteria via LLM-as-judge | 0-1 | Custom quality checks |
The evaluation script
This script fetches recent traces from Langfuse, runs DeepEval metrics on them, and pushes the scores back:
import os
from langfuse import Langfuse
from deepeval.metrics import GEval, AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase, LLMTestCaseParams
langfuse = Langfuse(
public_key=os.environ["LANGFUSE_PUBLIC_KEY"],
secret_key=os.environ["LANGFUSE_SECRET_KEY"],
host=os.environ["LANGFUSE_HOST"],
)
# Fetch traces tagged for evaluation
traces = langfuse.api.trace.list(
tags=["eval-candidate"],
limit=50,
).data
# Define metrics
relevancy = AnswerRelevancyMetric(threshold=0.7)
correctness = GEval(
name="Correctness",
criteria="Determine whether the output is factually correct based on the context.",
evaluation_params=[
LLMTestCaseParams.ACTUAL_OUTPUT,
LLMTestCaseParams.EXPECTED_OUTPUT,
],
)
for trace in traces:
test_case = LLMTestCase(
input=trace.input,
actual_output=trace.output,
retrieval_context=[trace.metadata.get("context", "")],
)
relevancy.measure(test_case)
correctness.measure(test_case)
# Push scores back to Langfuse
langfuse.create_score(
trace_id=trace.id,
name="relevancy",
value=relevancy.score,
comment=relevancy.reason,
)
langfuse.create_score(
trace_id=trace.id,
name="correctness",
value=correctness.score,
comment=correctness.reason,
)
langfuse.shutdown()
print(f"Evaluated {len(traces)} traces")
After it runs, the evaluation scores appear in the Langfuse dashboard alongside the original traces. You can filter traces by score to find low-quality outputs.
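If you prefer to triage programmatically rather than in the UI, the scores pushed above can be pulled back and filtered. A sketch over a plain dict (the shape is hypothetical; adapt it to whatever your fetch code returns):

```python
def low_quality_traces(scores: dict[str, dict[str, float]],
                       thresholds: dict[str, float]) -> list[str]:
    """Return trace IDs where any metric falls below its threshold.

    scores maps trace_id -> {metric_name: value}, mirroring the
    relevancy/correctness scores the evaluation script pushes.
    """
    return [
        trace_id
        for trace_id, metrics in scores.items()
        if any(metrics.get(name, 1.0) < cutoff
               for name, cutoff in thresholds.items())
    ]

scores = {
    "trace-1": {"relevancy": 0.92, "correctness": 0.88},
    "trace-2": {"relevancy": 0.41, "correctness": 0.90},
    "trace-3": {"relevancy": 0.85, "correctness": 0.55},
}
print(low_quality_traces(scores, {"relevancy": 0.7, "correctness": 0.7}))
# -> ['trace-2', 'trace-3']
```

The flagged IDs can feed a review queue or be re-tagged in Langfuse for a second evaluation pass.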
How do you run LLM evaluations in a CI/CD pipeline?
Integrate DeepEval into GitHub Actions to catch quality regressions before they reach production. The workflow runs the evaluation suite against a test dataset on every pull request.
Create .github/workflows/llm-eval.yml:
name: LLM Evaluation
on:
pull_request:
paths:
- 'prompts/**'
- 'src/llm/**'
jobs:
evaluate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install dependencies
run: pip install deepeval langfuse openai
- name: Run evaluation suite
env:
LANGFUSE_PUBLIC_KEY: ${{ secrets.LANGFUSE_PUBLIC_KEY }}
LANGFUSE_SECRET_KEY: ${{ secrets.LANGFUSE_SECRET_KEY }}
LANGFUSE_HOST: ${{ secrets.LANGFUSE_HOST }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
run: deepeval test run tests/test_llm_quality.py
- name: Upload results
if: always()
uses: actions/upload-artifact@v4
with:
name: eval-results
path: .deepeval/
Create the test file tests/test_llm_quality.py:
import pytest
from deepeval import assert_test
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric
from deepeval.dataset import EvaluationDataset
from my_app import my_llm_function  # placeholder: your app's LLM entry point
# Load test cases from a JSON file or define inline
test_cases = [
    LLMTestCase(
        input="What is the capital of France?",
        actual_output=my_llm_function("What is the capital of France?"),
        retrieval_context=["France is a country in Western Europe. Its capital is Paris."],
    ),
    # Add more test cases covering your prompt changes
]
dataset = EvaluationDataset(test_cases=test_cases)
@pytest.mark.parametrize("test_case", dataset.test_cases)
def test_answer_relevancy(test_case: LLMTestCase):
    metric = AnswerRelevancyMetric(threshold=0.7)
    assert_test(test_case, [metric])
@pytest.mark.parametrize("test_case", dataset.test_cases)
def test_faithfulness(test_case: LLMTestCase):
    metric = FaithfulnessMetric(threshold=0.7)
    assert_test(test_case, [metric])
The workflow triggers only when prompt templates or LLM code change. If any metric falls below its threshold, the PR build fails, and the GitHub Actions log shows which test case failed and by how much.
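Beyond per-test assertions, you can add an aggregate gate as a final CI step, failing the job when a metric's average drifts down even if no single case crosses its threshold. A sketch reading a hypothetical JSON results file shaped {"metric_name": [scores...]} (DeepEval's own output format differs, so adapt the loader):

```python
import json
import sys

def quality_gate(results_path: str, thresholds: dict[str, float]) -> int:
    """Exit-code-style gate: 0 if every metric's average meets its
    threshold, 1 otherwise. Missing metrics count as a score of 0."""
    with open(results_path) as f:
        results = json.load(f)  # assumed shape: {"metric_name": [score, ...]}
    ok = True
    for name, cutoff in thresholds.items():
        scores = results.get(name) or [0.0]
        avg = sum(scores) / len(scores)
        print(f"{name}: avg={avg:.2f} threshold={cutoff} "
              f"{'OK' if avg >= cutoff else 'FAIL'}")
        ok = ok and avg >= cutoff
    return 0 if ok else 1

if __name__ == "__main__":
    sys.exit(quality_gate(sys.argv[1], {"relevancy": 0.7, "faithfulness": 0.7}))
```

Returning the result through the exit code means a plain `run:` step in the workflow fails the build with no extra wiring.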
如何备份Langfuse的PostgreSQL和ClickHouse?
没有备份的自托管是不负责任的。PostgreSQL保存用户账户、项目设置和API密钥。ClickHouse保存所有追踪数据。MinIO保存媒体上传和批量导出。
PostgreSQL备份
docker compose exec postgres pg_dump -U langfuse langfuse | gzip > /opt/backups/langfuse-pg-$(date +%F).sql.gz
ClickHouse backup
Export table by table with the ClickHouse client:
docker compose exec clickhouse clickhouse-client \
--user clickhouse \
--password "$(grep CLICKHOUSE_PASSWORD /opt/langfuse/.env | cut -d= -f2)" \
--query "SELECT * FROM traces FORMAT Native" > /opt/backups/traces-$(date +%F).native
ClickHouse also supports a BACKUP DATABASE statement for full backups, but it requires an allowed_disk entry in the ClickHouse server configuration. For Docker deployments, the per-table export above is simpler.
MinIO backup
Mirror the MinIO data to a remote S3-compatible bucket or a local directory:
docker run --rm --network langfuse_default \
  -v /opt/backups/minio-langfuse:/backup \
  --entrypoint sh minio/mc \
  -c "mc alias set local http://minio:9000 minio $(grep MINIO_ROOT_PASSWORD /opt/langfuse/.env | cut -d= -f2) \
      && mc mirror local/langfuse /backup"
Both mc commands run inside one container via sh -c; the mirror target /backup is bind-mounted from /opt/backups/minio-langfuse on the host.
Automate it with cron
cat > /etc/cron.d/langfuse-backup << 'EOF'
0 3 * * * root cd /opt/langfuse && docker compose exec -T postgres pg_dump -U langfuse langfuse | gzip > /opt/backups/langfuse-pg-$(date +\%F).sql.gz
0 4 * * * root find /opt/backups -name "langfuse-pg-*.sql.gz" -mtime +14 -delete
EOF
chmod 644 /etc/cron.d/langfuse-backup
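A backup job that silently stopped running is worse than no backup, because you think you are covered. A small freshness check you can wire into whatever alerting you already have (the function name and thresholds are mine):

```python
import glob
import gzip
import os
import time

def latest_backup_ok(pattern: str = "/opt/backups/langfuse-pg-*.sql.gz",
                     max_age_hours: float = 26) -> bool:
    """Check that the newest backup exists, is recent, and is readable gzip."""
    files = glob.glob(pattern)
    if not files:
        return False
    newest = max(files, key=os.path.getmtime)
    if time.time() - os.path.getmtime(newest) > max_age_hours * 3600:
        return False
    try:
        with gzip.open(newest, "rb") as f:
            f.read(1024)  # raises if the archive head is corrupt
    except OSError:
        return False
    return True
```

The 26-hour window leaves slack around the 03:00 daily cron run; reading the first kilobyte catches truncated or non-gzip files without decompressing the whole dump.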
Production hardening
Update procedure
Pull new images and restart:
cd /opt/langfuse
docker compose pull
docker compose up -d
Langfuse runs database migrations automatically on startup. After updating, check the web container logs:
docker compose logs langfuse-web --tail 20
Look for Ready in the output. If a migration fails, the container will not start; roll back by pinning the previous version tag in docker-compose.yml (for example langfuse/langfuse:3.x.x).
Monitoring Langfuse itself
The health endpoint at /api/public/health returns 200 OK when the API is up. Add ?failIfDatabaseUnavailable=true for a deep check that includes database connectivity.
Watch disk usage on the ClickHouse volume. It is the fastest-growing component:
docker system df -v | grep clickhouse
Monitor container memory with:
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.CPUPerc}}"
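For disk, a few lines of standard-library Python give you an alertable number (the /var/lib/docker default is an assumption; point it at wherever your Docker data and volumes actually live):

```python
import shutil

def disk_used_pct(path: str = "/var/lib/docker") -> float:
    """Percent used of the filesystem containing the given path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100

if __name__ == "__main__":
    pct = disk_used_pct()
    print(f"docker data filesystem: {pct:.1f}% used")
    if pct > 80:
        print("WARNING: consider trace retention or a bigger disk")
```

Run it from the same cron that handles backups and pipe the WARNING line into your alert channel of choice.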
Firewall
Only ports 22 (SSH), 80 and 443 should be open. All database ports are already bound to 127.0.0.1 in the compose file, but a firewall adds defense in depth:
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp
ufw allow 80/tcp
ufw allow 443/tcp
ufw enable
Langfuse vs LangSmith vs Opik
| Feature | Langfuse | LangSmith | Opik |
|---|---|---|---|
| License | MIT (open source) | Proprietary | Apache 2.0 |
| Self-hosting | Docker Compose / K8s | Enterprise license only | Docker Compose / K8s |
| Trace storage | Your infrastructure | LangChain cloud | Your infrastructure |
| Evaluation framework | External (DeepEval, etc.) | Built in | Built in |
| OpenTelemetry | Native | No | Partial |
Langfuse and Opik are the two viable open-source options for self-hosting. Langfuse has the larger community and more integrations. LangSmith requires an enterprise license to self-host. Check the Langfuse self-hosting docs for the latest deployment options.
Troubleshooting
**Containers stay unhealthy:** check docker compose logs <service> --tail 100. Common causes: a wrong password in .env (ClickHouse usernames are case-sensitive), or MinIO failing to initialize the bucket on first start. Restart the whole stack with docker compose down && docker compose up -d.
**Web container crash-loops with "invalid port number":** your PostgreSQL password contains special characters (/, +, =) that break the postgresql:// connection URL. Regenerate all secrets with openssl rand -hex 32 instead of -base64, then run docker compose down -v && docker compose up -d to reset with the new credentials (note that -v deletes the data volumes, so this is a full reset).
**Web container crashes with "CLICKHOUSE_MIGRATION_URL is not configured":** add CLICKHOUSE_MIGRATION_URL=clickhouse://clickhouse:9000 to the environment section of both langfuse-web and langfuse-worker. This uses the ClickHouse native protocol on port 9000, separate from the HTTP API on port 8123.
**502/504 errors through the reverse proxy:** make sure the proxy's idle keepalive toward the backend is shorter than the Langfuse web container's KEEP_ALIVE_TIMEOUT (70s in this setup), so the proxy never reuses a connection the backend has already closed.
**Worker takes a long time to shut down:** under load, the worker drains its queue before stopping, which can take up to an hour. For a faster shutdown, scale it down first with docker compose stop langfuse-worker, wait for the queue to drain, then continue.
**ClickHouse disk full:** set up data retention. Langfuse has built-in retention settings under Settings > Data Retention in the dashboard. Configure a trace TTL that matches your storage capacity.
**Logs:** all Langfuse containers write to stdout. View them with:
journalctl -u docker -f
docker compose logs -f