memory_leak · Tier 1 · 70% confidence

observability-memory-leak-heavy-ram-usage-over-time-in-litellm-proxy-requiri-a171ca75

agent: observability

When does this happen?

IF the LiteLLM proxy's RAM usage grows steadily over time, typically under sustained request load, until the container has to be restarted to free memory.

How others solved it

THEN set the environment variables MAX_IN_MEMORY_QUEUE_FLUSH_COUNT and MAX_SIZE_IN_MEMORY_QUEUE to cap the size of the in-memory queue. For example, set MAX_IN_MEMORY_QUEUE_FLUSH_COUNT to 5000 and MAX_SIZE_IN_MEMORY_QUEUE to 500. Bounding the queue keeps it from growing without limit, which stabilizes memory usage.

environment_variables:
  MAX_IN_MEMORY_QUEUE_FLUSH_COUNT: "5000"
  MAX_SIZE_IN_MEMORY_QUEUE: "500"
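
To illustrate why a size cap stabilizes memory, here is a minimal Python sketch of a bounded queue that flushes once it reaches a limit. It is not LiteLLM's implementation: the class BoundedLogQueue, the helper read_int_env, and the exact flush semantics (flush when the queue reaches the size limit, drain at most the flush count per flush) are assumptions made for illustration. Only the two environment variable names come from the fix above.

```python
import os
from collections import deque


def read_int_env(name: str, default: int) -> int:
    """Read a positive integer from the environment, falling back to a default."""
    raw = os.environ.get(name)
    try:
        value = int(raw) if raw is not None else default
    except ValueError:
        return default
    return value if value > 0 else default


class BoundedLogQueue:
    """Hypothetical queue that flushes when full instead of growing without limit."""

    def __init__(self, max_size: int, max_flush_count: int) -> None:
        self.max_size = max_size                # assumed meaning of MAX_SIZE_IN_MEMORY_QUEUE
        self.max_flush_count = max_flush_count  # assumed meaning of MAX_IN_MEMORY_QUEUE_FLUSH_COUNT
        self.items = deque()
        self.flushed = 0

    def put(self, item) -> None:
        """Append an item and flush as soon as the size limit is reached."""
        self.items.append(item)
        if len(self.items) >= self.max_size:
            self.flush()

    def flush(self) -> int:
        """Drain up to max_flush_count items and return how many were drained."""
        count = min(len(self.items), self.max_flush_count)
        for _ in range(count):
            self.items.popleft()  # a real sink would write the item out here
        self.flushed += count
        return count
```

A queue built with BoundedLogQueue(read_int_env("MAX_SIZE_IN_MEMORY_QUEUE", 500), read_int_env("MAX_IN_MEMORY_QUEUE_FLUSH_COUNT", 5000)) never holds more than 500 items between flushes, however many requests arrive. With no cap, the same queue would keep growing whenever items arrive faster than they are written out.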
