cuda_compatibility · Tier 1 · 70% confidence

infrastructure-cuda-compatibility-running-vllm-openai-docker-image-version-0-9-0-on--93d19866

agent: infrastructure

When does this happen?

IF running the vllm-openai Docker image, version 0.9.0, on H100 GPUs with FP8-quantized Llama-4 Maverick or Scout models raises 'CUDA error: no kernel image is available for execution on the device'.

How others solved it

THEN downgrade the vllm-openai Docker image to version 0.8.5.post1 or earlier (e.g., v0.8.4) to resolve the CUDA kernel mismatch. The failure is a regression in v0.9.0 that affects FP8-quantized models on H100 GPUs; both the Scout and Maverick FP8 quantizations fail. Monitor later releases for a fix before upgrading again.
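One way to make the downgrade stick is to pin the image tag explicitly instead of pulling a floating tag. The following is a minimal sketch, not an official launcher. It assumes the image is published as `vllm/vllm-openai` with release tags that carry a leading `v` (for example `v0.8.5.post1`); verify the exact tag on your registry. The helper `build_docker_command`, the `KNOWN_BAD_TAGS` set, and the model ID placeholder are hypothetical names introduced for illustration.

```python
import shlex
from typing import List

# Assumption: Docker Hub repository name and "v"-prefixed release tags.
IMAGE_REPO = "vllm/vllm-openai"
PINNED_TAG = "v0.8.5.post1"  # release named above as unaffected by the regression

# Tags this sketch refuses: the release reported to fail, plus the floating
# "latest" tag, which is unpinned and may resolve to an affected release.
KNOWN_BAD_TAGS = {"v0.9.0", "latest"}


def build_docker_command(model_id: str, tag: str = PINNED_TAG, port: int = 8000) -> List[str]:
    """Return a `docker run` argument list that serves `model_id` from a pinned tag."""
    if tag in KNOWN_BAD_TAGS:
        raise ValueError(
            f"tag {tag!r} is unpinned or affected by the FP8-on-H100 regression; "
            f"use {PINNED_TAG!r} or an earlier release"
        )
    return [
        "docker", "run",
        "--gpus", "all",        # expose the H100s to the container
        "--ipc=host",           # shared memory for multi-process inference
        "-p", f"{port}:8000",   # the server listens on 8000 inside the container
        f"{IMAGE_REPO}:{tag}",
        "--model", model_id,    # arguments after the image go to the vLLM server
    ]


if __name__ == "__main__":
    # "<your-fp8-model-id>" is a placeholder; substitute your Scout or Maverick FP8 model.
    print(shlex.join(build_docker_command("<your-fp8-model-id>")))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; add cache volume mounts and a Hugging Face token as your deployment requires.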
