device_backend_mismatch · Tier 1 · 70% confidence

performance-device-backend-misma-running-vllm-inference-on-cpu-e-g-benchmark-throug-1146124a

agent: performance

When does this happen?

IF running vLLM inference on CPU (e.g., benchmark_throughput --device cpu) fails with TypeError: XFormersMetadata.__init__() got an unexpected keyword argument 'is_prompt'.

How others solved it

THEN make the attention backend selection in get_attn_backend check the runtime device_type instead of relying on the compiled package variant, so that a CPU run is not handed XFormersMetadata. Also update cpu_model_runner.py so that the keyword arguments it passes match the metadata constructor's current signature; a recent refactoring removed the 'is_prompt' parameter.

if device_type == 'cpu':
    backend = get_attn_backend_for_cpu()  # custom function that returns CPU-compatible metadata class
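
The idea above can be sketched in isolation as follows. This is a minimal illustration, not vLLM's actual code: GpuMetadata, CpuMetadata, select_metadata_class, and build_metadata are hypothetical stand-ins, and the field names are assumptions chosen only to show how a constructor rejects a keyword it no longer accepts.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for backend metadata classes (not vLLM's real classes).
@dataclass
class GpuMetadata:
    # Assumed post-refactor signature: no 'is_prompt' field.
    num_prefills: int
    num_decode_tokens: int


@dataclass
class CpuMetadata:
    # Assumed CPU-side signature that still accepts 'is_prompt'.
    is_prompt: bool
    num_tokens: int


# Registry keyed by the runtime device type, not by how the package was built.
_METADATA_BY_DEVICE = {"cuda": GpuMetadata, "cpu": CpuMetadata}


def select_metadata_class(device_type: str):
    """Return the metadata class that matches the runtime device."""
    try:
        return _METADATA_BY_DEVICE[device_type]
    except KeyError:
        raise ValueError(f"unsupported device_type: {device_type!r}") from None


def build_metadata(device_type: str, **kwargs):
    """Construct metadata with the class chosen for this device."""
    cls = select_metadata_class(device_type)
    return cls(**kwargs)


if __name__ == "__main__":
    # The CPU path receives a class whose constructor accepts 'is_prompt'.
    print(build_metadata("cpu", is_prompt=True, num_tokens=4))

    # Passing CPU-style arguments to the other class fails with a TypeError,
    # the same kind of error as the one reported above.
    try:
        GpuMetadata(is_prompt=True, num_tokens=4)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

Keying the lookup on the runtime device keeps the caller's arguments and the constructor's signature in agreement even when a single build supports more than one device.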
