摘要: import torch import torch.nn.functional as F import math def fmha(q, k, v, is_causal=True): return F.scaled_dot_product_attention(q, k, v, attn_mask=N 阅读全文
posted @ 2023-05-18 13:43 xytpai 阅读(30) 评论(0) 推荐(0) 编辑