October 2021 Archives
《RETHINKING POSITIONAL ENCODING IN LANGUAGE PRE-TRAINING》: Reproducing the TUPE paper
Abstract: Reproducing the TUPE paper. After the original attention-score formula is decomposed into four terms, the middle two terms (word-to-position, position-to-word) turn out to have no obvious effect on recognition, while the first (word-to-word) and fourth (position-to-position) terms are the ones kept. The paper therefore proposes decoupling positional information from the word embeddings, giving each its own projection matrices to learn. The reason for proposing this …
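For reference, the decomposition the abstract refers to, written out in the standard TUPE formulation (the notation here is mine: $x_i = w_i + p_i$ is the sum of word and positional embeddings, and $W^Q, W^K$ are the shared query/key projections):

$$(x_i W^Q)(x_j W^K)^\top = \underbrace{(w_i W^Q)(w_j W^K)^\top}_{\text{word-to-word}} + \underbrace{(w_i W^Q)(p_j W^K)^\top}_{\text{word-to-position}} + \underbrace{(p_i W^Q)(w_j W^K)^\top}_{\text{position-to-word}} + \underbrace{(p_i W^Q)(p_j W^K)^\top}_{\text{position-to-position}}$$

TUPE drops the two cross terms and unties the remaining two: word correlations still use $W^Q, W^K$, while positional correlations get their own matrices $U^Q, U^K$,

$$\alpha_{ij} = \frac{1}{\sqrt{2d}}\Big[(w_i W^Q)(w_j W^K)^\top + (p_i U^Q)(p_j U^K)^\top\Big].$$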
Transformer code notes ---- pre_process.py
Abstract: import os import pickle from tqdm import tqdm from config import wav_folder, transcript_file, pickle_file from utils import ensure_folder def get_data
Transformer code notes ---- transformer.py
Abstract: import torch.nn as nn from .decoder import Decoder from .encoder import Encoder class Transformer(nn.Module): # define the class, subclassing nn.Module """An encoder-decoder
Transformer code notes ---- decoder.py
Abstract: import torch import torch.nn as nn import torch.nn.functional as F from config import IGNORE_ID from .attention import MultiHeadAttention from .module
Transformer code notes ---- attention.py
Abstract: import numpy as np import torch import torch.nn as nn class MultiHeadAttention(nn.Module): ''' Multi-Head Attention module ''' def __init__(self, n_he
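As a companion to the attention.py note above, here is a minimal, self-contained sketch of a multi-head attention module in the same PyTorch style. It is an illustrative reimplementation, not the post's actual code; the constructor arguments (n_head, d_model, d_k, d_v) and the mask convention are assumptions.

import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Illustrative multi-head scaled dot-product attention (sketch)."""
    def __init__(self, n_head, d_model, d_k, d_v):
        super().__init__()
        self.n_head, self.d_k, self.d_v = n_head, d_k, d_v
        self.w_q = nn.Linear(d_model, n_head * d_k)
        self.w_k = nn.Linear(d_model, n_head * d_k)
        self.w_v = nn.Linear(d_model, n_head * d_v)
        self.fc = nn.Linear(n_head * d_v, d_model)

    def forward(self, q, k, v, mask=None):
        b, len_q, _ = q.size()
        len_k = k.size(1)
        # project and split into heads: (b, n_head, len, d)
        q = self.w_q(q).view(b, len_q, self.n_head, self.d_k).transpose(1, 2)
        k = self.w_k(k).view(b, len_k, self.n_head, self.d_k).transpose(1, 2)
        v = self.w_v(v).view(b, len_k, self.n_head, self.d_v).transpose(1, 2)
        # scaled dot-product attention scores: (b, n_head, len_q, len_k)
        attn = torch.matmul(q, k.transpose(-2, -1)) / self.d_k ** 0.5
        if mask is not None:
            # mask assumed boolean (b, len_q, len_k), True = position to ignore
            attn = attn.masked_fill(mask.unsqueeze(1), float('-inf'))
        attn = torch.softmax(attn, dim=-1)
        out = torch.matmul(attn, v)                       # (b, n_head, len_q, d_v)
        out = out.transpose(1, 2).contiguous().view(b, len_q, -1)
        return self.fc(out), attn

# usage (self-attention):
# mha = MultiHeadAttention(n_head=8, d_model=512, d_k=64, d_v=64)
# x = torch.randn(2, 10, 512)
# out, attn = mha(x, x, x)   # out: (2, 10, 512)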
Transformer code notes ---- encoder.py
Abstract: import torch.nn as nn from .attention import MultiHeadAttention # import the multi-head attention module from .module import PositionalEncoding, PositionwiseFeedForward # positional encoding and position-wise feed-forward network
Transformer code notes ---- train.py
Abstract: import numpy as np import torch # from torch.utils.tensorboard import SummaryWriter import torch.nn as nn import argparse from tqdm import tqdm from c