high_cpu_idle

Tier 1 · 70% confidence

performance-high-cpu-idle-100-cpu-usage-on-two-cores-when-server-is-idle-and-df25468c

agent: performance

When does this happen?

IF two CPU cores sit at 100% usage while the server is idle and tensor parallelism is set to 2 or higher (`--tensor-parallel-size` in vLLM) on XPU devices.

How others solved it

THEN Apply the patch from vLLM PR #16226 to `/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/shm_broadcast.py`. The patch fixes a busy-waiting loop in shared memory broadcast logic. After editing the file, restart the container or service.
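The path above assumes Python 3.12 and a `dist-packages` install, which may not match every container image. As a minimal sketch, the file to patch can be located from the interpreter that runs vLLM. The helper name `locate_module_file` is hypothetical, and only the standard-library `importlib.util.find_spec` call is used.

```python
import importlib.util


def locate_module_file(module_name):
    """Return the source file path of module_name, or None if it cannot be found.

    Hypothetical helper for illustration. Note that find_spec imports the
    parent packages of a dotted name, so this can be slow for large packages.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return None
    return spec.origin if spec is not None else None


# Example (run inside the container, with the same Python that serves vLLM):
# print(locate_module_file("vllm.distributed.device_communicators.shm_broadcast"))
```

If the call returns None, vLLM is probably installed under a different interpreter than the one running the snippet.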

# Inside the container, edit shm_broadcast.py so the wait loop no longer busy-waits (see PR #16226). Example modification: block on a condition variable, or yield or sleep between polls, instead of spinning.
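To illustrate the general idea of replacing a spin loop with a yielding one, here is a minimal sketch. It is not the code from PR #16226 and does not reproduce vLLM's internals; the function `wait_until_ready` and its parameters (`spin_limit`, `max_sleep`, `timeout`) are assumptions made up for this example.

```python
import threading
import time


def wait_until_ready(is_ready, spin_limit=100, max_sleep=0.01, timeout=None):
    """Poll is_ready() without pinning a CPU core (illustrative sketch only).

    The first spin_limit empty polls only yield the CPU, which keeps latency
    low when data arrives quickly. After that, the loop sleeps with an
    exponentially growing delay capped at max_sleep seconds.
    Returns True once is_ready() is true, or False if timeout elapses first.
    """
    start = time.monotonic()
    empty_polls = 0
    sleep_s = 0.0001
    while not is_ready():
        if timeout is not None and time.monotonic() - start > timeout:
            return False
        empty_polls += 1
        if empty_polls < spin_limit:
            time.sleep(0)  # give up the rest of the time slice
        else:
            time.sleep(sleep_s)  # idle path: back off instead of spinning
            sleep_s = min(sleep_s * 2, max_sleep)
    return True


# Usage: a writer sets the flag after 50 ms; the reader waits without spinning.
flag = threading.Event()
writer = threading.Timer(0.05, flag.set)
writer.start()
ready = wait_until_ready(flag.is_set, timeout=2.0)
writer.join()
```

The trade-off is a small added wake-up latency (at most `max_sleep`) in exchange for near-zero CPU use while idle. A condition variable or a blocking primitive avoids polling entirely where the data structure allows it.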
