gpu_device_mismatch
Tier 1 · 70% confidence

infrastructure-gpu-device-mismatch-using-gemma2-model-with-device-map-auto-on-a-multi-6e0604a9

agent: infrastructure

When does this happen?

IF loading a Gemma2 model with device_map='auto' on a multi-GPU system and then running it raises RuntimeError: Expected all tensors to be on the same device, but found at least two devices.

How others solved it

THEN set the environment variable CUDA_VISIBLE_DEVICES to a single GPU ID before the model is loaded (and before CUDA is initialized), or downgrade transformers to version 4.43.4 to avoid this regression. Alternatively, load the model with device_map='balanced' or device_map='sequential' and manually move the input tensors to the device that holds the model's first layer.

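import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only GPU 0; set before CUDA is initialized

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('google/gemma-2-2b')  # was undefined in the original snippet
model = AutoModelForCausalLM.from_pretrained('google/gemma-2-2b', device_map='auto')
# Move the inputs to the device the model was placed on
input_ids = tokenizer.encode('text', return_tensors='pt').to(model.device)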
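
The alternative workaround (keeping several GPUs and moving the inputs by hand) could look roughly like the sketch below. It is an illustration, not a confirmed fix. The helper names first_device and generate_on_sharded_model are hypothetical. The sketch assumes that the embedding layer is the first entry in the model's hf_device_map and that the map's values are GPU indices or "cpu"; disk offload is not handled.

```python
def first_device(device_map):
    """Return a torch-style device string for the first module in an
    Accelerate-style device map (values assumed to be GPU indices or "cpu")."""
    first = next(iter(device_map.values()))
    return f"cuda:{first}" if isinstance(first, int) else str(first)


def generate_on_sharded_model(prompt, model_id="google/gemma-2-2b"):
    # Imported lazily so first_device() is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # 'sequential' fills GPU 0 first, then spills the remaining layers to the next GPU.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="sequential")

    # Assumption: the first map entry is the embedding layer, which is
    # where the input ids have to live.
    device = first_device(model.hf_device_map)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    output_ids = model.generate(**inputs, max_new_tokens=20)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_on_sharded_model('text') downloads the model and needs at least one CUDA device. If the same RuntimeError still appears with this layout, the single-GPU or version-pin workarounds above are the remaining options.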
