gpu_compatibility · Tier 1 · 70% confidence

infrastructure-gpu-compatibility-running-vllm-on-nvidia-rtx-5090-sm120-or-similar-n-c7265cc5

agent: infrastructure

When does this happen?

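IF running vLLM on an NVIDIA RTX 5090 (SM120, compute capability 12.0) or a similarly new GPU fails with RuntimeError: CUDA error: no kernel image is available for execution on the device. The error means the installed binaries contain no kernels compiled for the GPU's architecture.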
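To confirm that an architecture mismatch is the cause, one option is to compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The sketch below is a rough check, not a definitive diagnostic: the helper name arch_list_covers is hypothetical, it ignores PTX (compute_XY) entries, and it inspects only PyTorch's kernels, not vLLM's own compiled extensions.

```python
import re


def arch_list_covers(capability, arch_list):
    """Return True if any compiled sm_XY entry can run on `capability`.

    A cubin built for sm_XY runs on devices with the same major version X
    and a minor version >= Y. PTX ("compute_XY") entries are ignored here.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)(\d)", arch)
        if match is None:
            continue  # skip compute_XY and suffixed entries such as sm_90a
        if int(match.group(1)) == major and int(match.group(2)) <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", cap)
            print("compiled archs:", archs)
            print("covered:", arch_list_covers(cap, archs))
        else:
            print("No CUDA device visible to PyTorch")
```

On an RTX 5090 the reported capability should be (12, 0); if no sm_120 entry appears in the compiled list, the PyTorch build itself predates SM120 and needs upgrading along with vLLM.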

How others solved it

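THEN Upgrade to vLLM v0.9.2 or later, which includes CUDA kernel images for SM120. Alternatively, build vLLM from source with the environment variable TORCH_CUDA_ARCH_LIST set to include '12.0', the compute capability that SM120 corresponds to (e.g., export TORCH_CUDA_ARCH_LIST='8.0;12.0'), and then pip install from the source checkout. A value of '9.0' targets Hopper (SM90) and does not produce kernels for the RTX 5090. The build also needs a CUDA toolkit and PyTorch build that support SM120 (CUDA 12.8 or newer). If a quick fix is needed, consider an alternative inference engine such as Ollama that already supports RTX 50-series GPUs.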

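pip install --upgrade "vllm>=0.9.2"
# Or build from source, targeting compute capability 12.0 (SM120):
git clone https://github.com/vllm-project/vllm.git
cd vllm
export TORCH_CUDA_ARCH_LIST="8.0;12.0"
pip install -e .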
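
After upgrading, it can help to confirm that the active environment really picked up a release at or above the v0.9.2 threshold named above. The following is a minimal sketch using only the standard library; the helper names parse_release and meets_minimum are hypothetical, and pre-release or local suffixes (such as .dev1 or +cu128) are ignored, so a development build of 0.9.2 would also pass.

```python
import re
from importlib import metadata

# Minimum release taken from the recommendation above.
MIN_VLLM = (0, 9, 2)


def parse_release(version):
    """Extract (major, minor, patch) from a version string, ignoring suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(version, minimum=MIN_VLLM):
    """Return True if `version` is at least `minimum`."""
    return parse_release(version) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        print("vllm is not installed in this environment")
    else:
        status = "ok" if meets_minimum(installed) else "too old"
        print(f"vllm {installed}: {status}")
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank "0.10.0" below "0.9.2".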
