gpu_compatibility · Tier 1 · 70% confidence

infrastructure-gpu-compatibility-when-running-vllm-on-a-gpu-with-compute-capability-77e8db6d

agent: infrastructure

When does this happen?

IF Running vLLM on a GPU with compute capability 12.0 (e.g., RTX 5090) fails with the error 'CUDA error: no kernel image is available for execution on the device'.

How others solved it

THEN Upgrade to vLLM v0.9.2 or later, which ships kernels for SM120 (compute capability 12.0). Alternatively, compile vLLM from source with TORCH_CUDA_ARCH_LIST set to include '12.0'. In either case, make sure the pre-built wheel or Docker image you use targets your GPU's compute capability.

pip install vllm==0.9.2   # or later
# building from source: TORCH_CUDA_ARCH_LIST="12.0" pip install vllm
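The version check above can be expressed as a small pre-flight helper. This is a hypothetical sketch (not part of vLLM's API), assuming only the fact from this pattern: SM120 support landed in vLLM v0.9.2, while earlier compute capabilities are already covered by pre-built wheels.

```python
def wheel_supports_gpu(vllm_version: str, compute_cap: str) -> bool:
    """Return True if a pre-built vLLM wheel is expected to include
    kernels for the given compute capability.

    Assumption from this pattern: capability 12.0 (SM120, e.g. RTX 5090)
    requires vLLM >= 0.9.2; older capabilities are treated as supported.
    """
    # Compare versions numerically, not lexically ("0.10.0" > "0.9.2").
    version = tuple(int(p) for p in vllm_version.split("."))
    cap = tuple(int(p) for p in compute_cap.split("."))
    if cap >= (12, 0):
        return version >= (0, 9, 2)
    return True

print(wheel_supports_gpu("0.9.1", "12.0"))  # RTX 5090 on an old wheel -> False
print(wheel_supports_gpu("0.9.2", "12.0"))  # after the upgrade -> True
```

Running this before deployment flags the mismatch early, instead of surfacing it as a runtime 'no kernel image' CUDA error.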
