BERT language model fine-tuning error: AttributeError: 'str' object has no attribute 'item'

The following code raises AttributeError: 'str' object has no attribute 'item'

for step, batch in enumerate(self.train_data):
    if step % 40 == 0 and not step == 0:
        elapsed = format_time(time.time() - t0)
        print('  Batch {:>5,}  of  {:>5,}.    Elapsed: {:}.'.format(step, len(self.train_data), elapsed))
    b_input_ids = batch[0].to(device)
    b_input_mask = batch[1].to(device)
    b_labels = batch[2].to(device)
    model.zero_grad()
    loss, logits = model(b_input_ids,
                         token_type_ids=None,
                         attention_mask=b_input_mask,
                         labels=b_labels)
    # accumulates the training loss over batches.
    total_train_loss += loss.item()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    scheduler.step()
avg_train_loss = total_train_loss / len(self.train_data)
training_time = format_time(time.time() - t0)

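Loss type: make sure the loss variable really is a PyTorch tensor, not a string or some other non-numeric type. The loss should be a scalar tensor holding the value of the loss function. The error message says that .item() was called on a str, so here loss is a string rather than a tensor.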
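
A likely reason loss ends up as a string is that the model call returns a dict-like output object instead of a plain tuple (Hugging Face transformers does this by default from version 4 on; the post does not name the library, so treat that as an assumption). Unpacking a dict-like object iterates over its keys, not its values. The sketch below reproduces the effect with a hypothetical stand-in class, FakeModelOutput, which is not part of any library:

```python
from collections import OrderedDict


class FakeModelOutput(OrderedDict):
    """Hypothetical stand-in for a dict-like model output (not a real library class)."""

    def __getattr__(self, name):
        # Allow attribute-style access such as outputs.loss.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Plain floats and lists stand in for the real tensors.
outputs = FakeModelOutput(loss=0.6931, logits=[0.1, -0.2])

# Tuple-style unpacking iterates over the KEYS of the mapping, not its values.
loss, logits = outputs
print(type(loss), repr(loss))        # <class 'str'> 'loss'

# Attribute access returns the stored value instead.
loss_value = outputs.loss
print(type(loss_value), loss_value)  # <class 'float'> 0.6931
```

Calling loss.item() on the unpacked value would fail in the same way as in the training loop, because loss is the key 'loss' rather than the loss tensor.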

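Run print(type(loss)) and check whether the result is <class 'torch.Tensor'>. It is not, so change the code as follows: keep the model's return value as a single object and read the loss from its loss attribute.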

for step, batch in enumerate(self.train_data):
    if step % 40 == 0 and not step == 0:
        elapsed = format_time(time.time() - t0)
        print('  Batch {:>5,}  of  {:>5,}.    Elapsed: {:}.'.format(step, len(self.train_data), elapsed))
    b_input_ids = batch[0].to(device)
    b_input_mask = batch[1].to(device)
    b_labels = batch[2].to(device)
    model.zero_grad()
    # keep the whole return value instead of unpacking it like a tuple
    outputs = model(b_input_ids,
                    token_type_ids=None,
                    attention_mask=b_input_mask,
                    labels=b_labels)
    # read the loss tensor from the output object
    loss = outputs.loss
    # accumulates the training loss over batches.
    total_train_loss += loss.item()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    scheduler.step()
avg_train_loss = total_train_loss / len(self.train_data)
training_time = format_time(time.time() - t0)

Checking print(type(loss)) again now gives <class 'torch.Tensor'>.

With that, the problem is solved.
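
If the same training loop has to work with both return styles (an older tuple return and a newer object with a loss attribute), a small helper can pick the loss out of either. This is only a sketch: extract_loss is a hypothetical name, and the stand-in values below use plain floats in place of tensors. It also assumes that, when labels are passed, the loss is the first element of a tuple-style return.

```python
from types import SimpleNamespace


def extract_loss(outputs):
    """Hypothetical helper: return the loss from either return style."""
    # Object-style return: the loss is exposed as an attribute.
    loss = getattr(outputs, "loss", None)
    if loss is not None:
        return loss
    # Tuple-style return: assumed to carry the loss as its first element.
    return outputs[0]


# Stand-ins for the two return styles (floats instead of tensors).
tuple_style = (0.25, [0.1, 0.9])
object_style = SimpleNamespace(loss=0.25, logits=[0.1, 0.9])

print(extract_loss(tuple_style))   # 0.25
print(extract_loss(object_style))  # 0.25
```

In the loop above this would read loss = extract_loss(model(...)). If the model comes from Hugging Face transformers, passing return_dict=False to the model call is another way to get the tuple-style return back, in which case the original loss, logits = model(...) unpacking works again.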

posted @ 2023-11-17 09:46  夕月一弯