BERT
🥥 Table of Contents
00 - Overview
01 - Input Data Preprocessing
03 - Fine Tune & Adapter
04 - NLP Tasks Examples
🥑 Get Started!
00 - Overview
Article 1: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Article 2: DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
Article 3: Exploring Efficient-tuning Methods in Self-supervised Speech Models
Pre-training objectives: Masked Language Modeling (MLM) and Next Sentence Prediction (NSP)
attention_mask is applied as an element-wise product: 1 for real tokens, 0 for padding
token_type_ids (segment ids) mark sentence A vs. sentence B and are returned by encode_plus (see the sketch below)
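A minimal sketch of the notes above, assuming the Hugging Face transformers tokenizer for bert-base-uncased; the sentence pair is made up for illustration:
# Sketch: encode_plus on an NSP-style sentence pair (assumed checkpoint: bert-base-uncased)
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
encoded = tokenizer.encode_plus('today is not that bad',  # sentence A
                                'so good tonight',        # sentence B
                                padding='max_length', max_length=16,
                                truncation=True, return_tensors='pt')
print(encoded['input_ids'])       # [CLS] A tokens [SEP] B tokens [SEP] followed by padding zeros
print(encoded['token_type_ids'])  # 0 over sentence A, 1 over sentence B
print(encoded['attention_mask'])  # 1 for real tokens, 0 for padding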




01 - Input Data Preprocessing
<1> Tokenization & Dataset & DataLoader
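A minimal sketch of this preprocessing step, assuming Hugging Face transformers plus PyTorch; the dataset class, toy texts, and labels are illustrative, not from the original post:
# Sketch: tokenize raw texts and wrap them into a Dataset / DataLoader (illustrative)
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer

class TextDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_length=64):
        # Tokenize once up front; padding/truncation give fixed-length tensors
        self.encodings = tokenizer(texts, truncation=True, padding='max_length',
                                   max_length=max_length, return_tensors='pt')
        self.labels = torch.tensor(labels)
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.encodings.items()}
        item['labels'] = self.labels[idx]
        return item

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
train_loader = DataLoader(TextDataset(['today is not that bad', 'today is so bad'], [1, 0], tokenizer),
                          batch_size=2, shuffle=True)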
03 - Fine Tune & Adapter
Video 1: [Generative AI] Finetuning vs. Prompting: Two Ways of Using Large Language Models Arising from Different Expectations (1/3)
Video 2: [Generative AI] Finetuning vs. Prompting: Two Ways of Using Large Language Models Arising from Different Expectations (2/3)
Video 3: [Generative AI] Finetuning vs. Prompting: Two Ways of Using Large Language Models Arising from Different Expectations (3/3)
Resource 1: Why use Efficient Fine-Tuning?
<1> Full Fine Tuning
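A minimal full fine-tuning sketch, assuming Hugging Face transformers with a plain PyTorch loop; the checkpoint, learning rate, and toy batch are illustrative:
# Sketch: full fine-tuning updates every parameter of the pretrained backbone (illustrative)
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

batch = tokenizer(['today is not that bad', 'today is so bad'],
                  padding=True, truncation=True, return_tensors='pt')
batch['labels'] = torch.tensor([1, 0])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # all weights are trainable
model.train()
loss = model(**batch).loss  # passing labels makes the model return a loss
loss.backward()
optimizer.step()
optimizer.zero_grad()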
<2> Adapter (Parameter-Efficient Fine-Tuning, peft)
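In contrast, an adapter-style approach freezes most of the backbone and trains only a small number of injected parameters. A minimal sketch using the Hugging Face peft library with LoRA (one concrete parameter-efficient method; the config values are illustrative):
# Sketch: attach LoRA adapters to a frozen backbone via peft (illustrative config)
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
lora_config = LoraConfig(task_type=TaskType.SEQ_CLS,  # keep the classification head trainable
                         r=8,                         # rank of the low-rank adapter matrices
                         lora_alpha=16,
                         lora_dropout=0.1)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction of weights are updated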
04 - NLP Tasks Examples
<1> Sentiment Analysis with DistilBERT
# Load Data and Configuration
texts = ['today is not that bad',
         'today is so bad',
         'so good tonight']
model_name = 'distilbert/distilbert-base-uncased-finetuned-sst-2-english'
# Instantiation
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Tokenizer
batch_input = tokenizer(texts, truncation=True, padding=True, return_tensors='pt')
'''
{'input_ids': tensor([[ 101, 2651, 2003, 2025, 2008, 2919,  102],
                      [ 101, 2651, 2003, 2061, 2919,  102,    0],
                      [ 101, 2061, 2204, 3892,  102,    0,    0]]),
 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1],
                           [1, 1, 1, 1, 1, 1, 0],
                           [1, 1, 1, 1, 1, 0, 0]])}
'''
# Model
import numpy as np
import pandas as pd
import torch
import torch.nn.functional as F
with torch.no_grad():
    outputs = model(**batch_input)  # SequenceClassifierOutput(loss=None, logits=tensor([...]), hidden_states=None, attentions=None)
logits = outputs.logits  # tensor([[-3.4620,  3.6118], [ 4.7508, -3.7899], [-4.2113,  4.5724]])
scores = F.softmax(logits, dim=-1)  # tensor([[8.4632e-04, 9.9915e-01], [9.9980e-01, 1.9531e-04], [1.5318e-04, 9.9985e-01]])
labels_ids = torch.argmax(scores, dim=-1)  # tensor([1, 0, 1])
labels = [model.config.id2label[label_id] for label_id in labels_ids.tolist()]  # ['POSITIVE', 'NEGATIVE', 'POSITIVE']
# Save
target_cols = ['label']
submission = pd.DataFrame(labels, columns=target_cols)
submission.to_csv('submission.csv', index=False)
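For a quick sanity check, the high-level pipeline API should give the same labels for these texts (a sketch using the same checkpoint as above):
# Sketch: one-line sentiment analysis with the pipeline API (same checkpoint)
from transformers import pipeline
classifier = pipeline('sentiment-analysis',
                      model='distilbert/distilbert-base-uncased-finetuned-sst-2-english')
print(classifier(['today is not that bad', 'today is so bad', 'so good tonight']))
# e.g. [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}, {'label': 'POSITIVE', 'score': ...}]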