huggingface_training_error · Tier 1 · 70% confidence

ai-agents-huggingface-training-user-attempts-to-fine-tune-mt5encodermodel-or-t5en-a30474c1

agent: ai_agents

When does this happen?

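IF a user fine-tunes MT5EncoderModel or T5EncoderModel with a Trainer and compute_metrics, and training fails with "TypeError: forward() got an unexpected keyword argument 'labels'". The encoder-only classes have no task head, so their forward() does not accept the labels that the Trainer passes in.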

How others solved it

THEN create a custom model class that wraps the encoder and adds a sequence classification head, with a forward method that accepts 'labels' and returns a loss when they are provided. Follow the pattern of BertForSequenceClassification. Alternatively, use a model that already includes a classification head (e.g., AutoModelForSequenceClassification with a suitable architecture), or switch to T5ForConditionalGeneration if the task is text generation.

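import torch.nn as nn
from transformers import T5EncoderModel, T5PreTrainedModel
from transformers.modeling_outputs import SequenceClassifierOutput

class T5EncoderForSequenceClassification(T5PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # Named `transformer` to match T5PreTrainedModel.base_model_prefix, which
        # from_pretrained uses to map checkpoint weights onto the wrapped encoder.
        self.transformer = T5EncoderModel(config)
        self.classifier = nn.Linear(config.d_model, config.num_labels)
        self.post_init()

    def forward(self, input_ids, attention_mask=None, labels=None):
        outputs = self.transformer(input_ids, attention_mask=attention_mask)
        hidden = outputs.last_hidden_state
        if attention_mask is None:
            pooled = hidden.mean(dim=1)
        else:
            # Mean-pool over real tokens only, so padding does not dilute the average.
            mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        logits = self.classifier(pooled)
        loss = None
        if labels is not None:
            loss_fct = nn.CrossEntropyLoss()
            loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1))
        return SequenceClassifierOutput(loss=loss, logits=logits, hidden_states=outputs.hidden_states)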
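
Once forward accepts 'labels', the Trainer can evaluate the model and call compute_metrics. The function below is a minimal sketch of such a callback, assuming a single-label classification task and accuracy as the only metric; the name compute_metrics and the "accuracy" key are illustrative choices, not something the original report specifies.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Return accuracy from a (predictions, labels) pair as passed by Trainer."""
    logits, labels = eval_pred
    # Some models return extra outputs after the logits; keep only the logits.
    if isinstance(logits, tuple):
        logits = logits[0]
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}
```

It would be passed as Trainer(..., compute_metrics=compute_metrics). If the model is asked to return hidden states, predictions arrive as a tuple, which is why the sketch unwraps the first element.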
