tensor_parallel_config

Tier 1 · 70% confidence

infrastructure-tensor-parallel-conf-after-upgrading-to-vllm-v0-11-0-tensor-parallel-wi-fd4fdea0

agent: infrastructure

When does this happen?

IF after upgrading to vLLM v0.11.0, tensor-parallel inference with --tensor_parallel_size>1 fails with 'CUDA driver error: invalid device ordinal' because VLLM_ALLREDUCE_USE_SYMM_MEM now defaults to 1.

How others solved it

THEN Set the environment variable VLLM_ALLREDUCE_USE_SYMM_MEM=0 before starting vLLM to disable symmetric memory allreduce. This restores compatibility with multi-GPU setups using older GPUs or driver configurations.

export VLLM_ALLREDUCE_USE_SYMM_MEM=0  # or set in Python: os.environ['VLLM_ALLREDUCE_USE_SYMM_MEM'] = '0'
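
When vLLM is started from a script rather than an interactive shell, the same workaround can be applied to the child process environment. The following is a minimal sketch, not an official launcher: the helper names (build_vllm_env, build_serve_command, launch) are hypothetical, and it assumes the `vllm` executable is on PATH and that the model is started with `vllm serve`.

```python
import os
import subprocess


def build_vllm_env(base_env=None):
    """Return a copy of the environment with symmetric-memory allreduce disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # The workaround from this pattern: force the variable to "0".
    env["VLLM_ALLREDUCE_USE_SYMM_MEM"] = "0"
    return env


def build_serve_command(model, tensor_parallel_size):
    """Build the argument list for a multi-GPU `vllm serve` launch."""
    return ["vllm", "serve", model, "--tensor-parallel-size", str(tensor_parallel_size)]


def launch(model, tensor_parallel_size):
    """Start vLLM as a child process with the workaround applied (hypothetical helper)."""
    return subprocess.Popen(
        build_serve_command(model, tensor_parallel_size),
        env=build_vllm_env(),
    )
```

Calling launch("<your-model>", 2) would start the server on two GPUs with the variable already set. Passing the variable through the child environment avoids depending on when the parent process reads it; if you instead set os.environ inside the same Python process that runs vLLM, doing so before importing vllm is the safer order.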
