cuda_runtime_error · Tier 1 · 70% confidence

infrastructure-cuda-runtime-error-using-vllm-openai-docker-image-version-0-9-0-or-0--cc7302af

agent: infrastructure

When does this happen?

IF you use the vllm-openai Docker image version 0.9.0 or 0.9.0.1 to serve Llama4 Maverick FP8 or RedHatAI Llama-4-Scout FP8-dynamic, the server fails with 'CUDA error: no kernel image is available for execution on the device'.
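To recognise this condition in practice, it can help to check the image tag and the container log together. The following is a minimal sketch only: the helper names `normalize_tag` and `matches_known_failure` are hypothetical, and the assumption that image tags may carry a leading `v` (for example `vllm/vllm-openai:v0.9.0`) is not taken from the pattern itself.

```python
# Hypothetical helper: flags a deployment that matches the failure described above.
# The version list and the error string come from the pattern; everything else
# (function names, tag format) is an illustrative assumption.

AFFECTED_VERSIONS = {"0.9.0", "0.9.0.1"}
ERROR_SIGNATURE = "CUDA error: no kernel image is available for execution on the device"


def normalize_tag(image_ref: str) -> str:
    """Return the bare version from an image reference.

    Assumes a reference shaped like 'vllm/vllm-openai:v0.9.0'; a leading 'v'
    on the tag is dropped if present.
    """
    tag = image_ref.rsplit(":", 1)[-1]
    return tag[1:] if tag.startswith("v") else tag


def matches_known_failure(image_ref: str, log_text: str) -> bool:
    """True when both the image version and the log line match the pattern."""
    version_affected = normalize_tag(image_ref) in AFFECTED_VERSIONS
    error_present = ERROR_SIGNATURE in log_text
    return version_affected and error_present
```

Requiring both conditions keeps the check narrow: the same CUDA message can have unrelated causes on other image versions, which this pattern does not cover.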

How others solved it

THEN downgrade the vllm-openai Docker image to version 0.8.5.post1 or earlier. This resolves the CUDA error and allows both models to be served.
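A minimal sketch of pinning the image when launching the server follows. It assumes the image is published as `vllm/vllm-openai` with a `v`-prefixed tag and that the container accepts `--model` as a server argument; the function name `build_serve_command`, the port mapping, and the placeholder model ID are illustrative, so substitute the values from your own deployment.

```python
import shlex

# Version taken from the fix above. The leading "v" in the tag is an
# assumption about how the image is tagged in the registry.
KNOWN_GOOD_TAG = "v0.8.5.post1"


def build_serve_command(model: str, port: int = 8000) -> list[str]:
    """Build a `docker run` argument list that pins the downgraded image.

    `model` is whatever model identifier you already serve; it is not
    filled in here because the pattern does not give exact repository names.
    """
    return [
        "docker", "run",
        "--gpus", "all",            # expose the host GPUs to the container
        "-p", f"{port}:8000",       # map the host port to the container's port 8000
        f"vllm/vllm-openai:{KNOWN_GOOD_TAG}",
        "--model", model,
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(build_serve_command("<your-model-id>")))
```

Pinning an explicit tag instead of `latest` keeps a later image pull from silently reintroducing an affected version.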
