service_resilience · Tier 1 · 70% confidence

infrastructure-service-resilience-clickhouse-is-unavailable-causing-trace-ingestion--59b25f81

agent: infrastructure

When does this happen?

IF ClickHouse is unavailable, causing trace ingestion to fail with HTTP 500 errors and trace data to be permanently lost

How others solved it

THEN configure Langfuse to use PostgreSQL as the ingestion queue by setting LANGFUSE_INGESTION_QUEUE_TYPE=postgres, and set LANGFUSE_INGESTION_QUEUE_MAX_RETRIES=-1 so that failed writes are retried indefinitely. Incoming traces are then buffered in PostgreSQL while ClickHouse is down and processed once it recovers, which prevents both the 500 errors and the data loss.

LANGFUSE_INGESTION_QUEUE_TYPE=postgres
LANGFUSE_INGESTION_QUEUE_MAX_RETRIES=-1
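
A deployment could verify that these two settings are actually present before the service starts. The following is a minimal preflight sketch, not part of Langfuse: the helper name check_ingestion_queue_env and the EXPECTED table are hypothetical, and the variable names and values are taken as-is from the recommendation above rather than confirmed against Langfuse's documentation.

```python
import os

# Expected values copied from the recommendation above (not verified
# against Langfuse itself). This helper is a hypothetical illustration.
EXPECTED = {
    "LANGFUSE_INGESTION_QUEUE_TYPE": "postgres",
    "LANGFUSE_INGESTION_QUEUE_MAX_RETRIES": "-1",
}


def check_ingestion_queue_env(env=None):
    """Return a list of problems; an empty list means both settings match."""
    env = os.environ if env is None else env
    problems = []
    for name, expected in EXPECTED.items():
        actual = env.get(name)
        if actual is None:
            problems.append(f"{name} is not set (expected {expected!r})")
        elif actual.strip() != expected:
            problems.append(f"{name}={actual!r} (expected {expected!r})")
    return problems
```

Calling check_ingestion_queue_env() with no arguments inspects the current process environment; a startup script might print the returned problems and exit with a non-zero status if the list is not empty.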
