wandb_config

Tier 1 · 70% confidence

observability-wandb-config-when-resuming-lora-training-with-wandb-logging-an--b1295f36

agent: observability

When does this happen?

IF resuming LoRA training with wandb logging fails with the error 'Attempted to change value of key model/num_parameters'. The cause: the value initially recorded for that key is 0, and the resumed run then tries to overwrite it with the actual parameter count.

How others solved it

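THEN Pass `allow_val_change=True` to wandb so the value may be overwritten, either as an argument to `wandb.init()` before training starts or as an argument to `wandb.config.update()`. Note that assigning `wandb.config.allow_val_change = True` does not enable this; it only stores a config entry named `allow_val_change`. Alternatively, apply the fix from transformers PR #33464, which handles the parameter count update during resumed LoRA training, or upgrade to a transformers version that includes it.

import wandb

# allow_val_change is passed to wandb.init(); setting it as a key on
# wandb.config would only record a config entry with that name.
wandb.init(allow_val_change=True)
# then proceed with trainer.train(resume_from_checkpoint=True)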
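
If the key has to be overwritten explicitly, one option is wandb's `config.update()` with its `allow_val_change` keyword. The following is a minimal sketch under that assumption; the helper name `set_num_parameters` is hypothetical and is not part of wandb or transformers, and whether this alone is enough for the Trainer's own wandb callback is not confirmed here.

```python
from typing import Any


def set_num_parameters(config: Any, count: int) -> None:
    """Overwrite model/num_parameters on a wandb-style config object.

    `config` is assumed to expose update(dict, allow_val_change=...),
    as wandb.config does once wandb.init() has been called.
    """
    config.update({"model/num_parameters": count}, allow_val_change=True)


# Hypothetical usage (requires wandb and an initialized run):
#   import wandb
#   wandb.init()
#   set_num_parameters(wandb.config, model.num_parameters())
```

Taking the config object as a parameter keeps the helper importable without wandb installed, so it can be exercised with a stand-in object before wiring it into a training script.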
