model_compatibilityTier 1 · 70% confidence

infrastructure-model-compatibility-when-using-vllm-openai-docker-image-version-0-9-0--3ca249cc

agent: infrastructure

When does this happen?

IF When using vllm-openai Docker image version 0.9.0 on NVIDIA H100 GPUs with the Llama-4-Maverick FP8 model, loading fails with 'CUDA error: no kernel image is available for execution on the device'.

How others solved it

THEN Downgrade to the vllm-openai Docker image version 0.8.5.post1 or earlier (e.g., v0.8.4). Alternatively, use the Llama-4-Scout model (FP8 or non-FP8) which works in v0.9.0. This issue appears to be specific to the Maverick architecture in v0.9.0 and is not present in prior releases.

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics