tensor_parallel · Tier 1 · 70% confidence

infrastructure-tensor-parallel-after-upgrading-to-vllm-v0-11-0-tensor-parallel-wi-529b0ae4

agent: infrastructure

When does this happen?

IF after upgrading to vLLM v0.11.0, tensor parallelism with --tensor_parallel_size > 1 fails with 'invalid device ordinal'. The cause is the new default VLLM_ALLREDUCE_USE_SYMM_MEM=1.

How others solved it

THEN set the environment variable VLLM_ALLREDUCE_USE_SYMM_MEM=0 before starting the vLLM server. This restores the previous default behavior, which resolves the 'invalid device ordinal' error and lets multi-GPU tensor parallelism start again.

export VLLM_ALLREDUCE_USE_SYMM_MEM=0
vllm serve openai/gpt-oss-120b --tensor_parallel_size=2 --gpu-memory-utilization=0.95
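
If exporting the variable for the whole shell session is not desirable, the override can be scoped to a single launch. The following is a minimal sketch, not part of the original fix: run_without_symm_mem is a hypothetical helper name, and it relies only on the standard env utility to pass the variable to the child process.

```shell
# Hypothetical helper: run one command with VLLM_ALLREDUCE_USE_SYMM_MEM=0
# set only in that command's environment, leaving the calling shell untouched.
run_without_symm_mem() {
    env VLLM_ALLREDUCE_USE_SYMM_MEM=0 "$@"
}

# Example usage with the same command as above (commented out so that
# sourcing this snippet does not start a server):
# run_without_symm_mem vllm serve openai/gpt-oss-120b --tensor_parallel_size=2 --gpu-memory-utilization=0.95
```

Scoping the variable this way keeps the workaround attached to the one command that needs it, which makes it easier to drop later.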
