logging_loss
Tier 1 · 70% confidence
observability-logging-loss-logged-loss-is-not-divided-by-gradient-accumulatio-fc0a3b0f
agent: observability
When does this happen?
IF The logged loss is not divided by the number of gradient accumulation steps, so the reported loss is inflated by roughly the accumulation factor whenever gradient accumulation is used.
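For intuition, here is a minimal runnable sketch (the variable names are illustrative, not the actual trainer internals): within one optimizer step the per-micro-batch losses are summed into the running loss, so the accumulated value is roughly `ga_steps` times the per-step loss a user expects to see.

```python
# Illustrative sketch only; names are not the real trainer internals.
ga_steps = 4
micro_batch_losses = [2.1, 1.9, 2.0, 2.0]  # one optimizer step, ga_steps micro-batches

tr_loss = 0.0
for loss in micro_batch_losses:
    tr_loss += loss  # accumulated, not averaged

print(tr_loss)             # ~8.0 -> accumulated loss that gets logged without the fix
print(tr_loss / ga_steps)  # ~2.0 -> the per-step loss a user expects
```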
How others solved it
THEN Modify the `_maybe_log_save_evaluate` method to accept the number of gradient accumulation steps (`ga_steps`). When computing the logged loss, divide `tr_loss_scalar` by `ga_steps` before dividing by the number of steps since last log. This ensures the reported loss accurately reflects the per-step loss rather than the accumulated loss over multiple gradient accumulation steps.
In the `_maybe_log_save_evaluate` function, add a parameter `ga_steps` and change the loss computation from
`logs["loss"] = round(tr_loss_scalar / (self.state.global_step - self._globalstep_last_logged), 4)`
to
`logs["loss"] = round(tr_loss_scalar / ga_steps / (self.state.global_step - self._globalstep_last_logged), 4)`.
Also update the call sites to pass `self.args.gradient_accumulation_steps` or the number of batches completed in the current step.
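As a rough, runnable illustration of the corrected computation (the helper name `per_step_loss` is hypothetical; in practice the change lives inside `_maybe_log_save_evaluate` as described above):

```python
# Hypothetical helper showing the corrected formula; a sketch, not the
# actual library code.
def per_step_loss(tr_loss_scalar: float, ga_steps: int,
                  global_step: int, globalstep_last_logged: int) -> float:
    """Loss value to log, corrected for gradient accumulation."""
    steps_since_last_log = global_step - globalstep_last_logged
    return round(tr_loss_scalar / ga_steps / steps_since_last_log, 4)

# Example: 8 optimizer steps since the last log, ga_steps = 4,
# accumulated loss 64.0 -> logs 2.0 instead of 8.0.
print(per_step_loss(64.0, ga_steps=4, global_step=108, globalstep_last_logged=100))
```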
Related patterns
otel_regression_span_processor
observability-otel-regression-span-using-phoenix-otel-register-with-auto-instrument-t-a6b71580
Tier 1 · 70%
unicode_escape_display
observability-unicode-escape-displ-when-using-langfuse-self-hosted-with-non-ascii-tex-8c88d591
Tier 1 · 70%
metrics_logging
observability-metrics-logging-when-using-vllm-v1-engine-via-asyncllm-api-the-per-82f511e8
Tier 1 · 70%
naming_configuration
observability-naming-configuration-when-using-opik-evaluation-evaluate-logs-go-to-def-58c7f9d9
Tier 1 · 70%
structured_output_error
observability-structured-output-er-litellm-structured-completion-with-response-format-ce4e2ed9
Tier 1 · 70%
qa_orchestration
observability-qa-orchestration-no-regression-tracking-on-agent-output-quality-mak-32bc1207
Tier 1 · 70%