neural network - Post Category - chease

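Everything Can Be Seq2Seq

Summary: Reposted from: https://kexue.fm/archives/7867 Taking T5 pre-training as an example, both its unsupervised and supervised parts are cast in Seq2Seq form. The unsupervised part takes cloze-style input: 明月几时有，[M0] 问青天，不知[M1]，今夕是何年？我欲[M2]归去，又恐琼楼玉宇，高处[M3]；起舞[M4

posted @ 2023-03-03 17:46 chease Views (30) Comments (0) Recommended (0)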
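The cloze-style input described in the summary can be illustrated with a minimal sketch. The function name `span_corrupt`, the character-level tokenization, and the choice of masked span are assumptions made for illustration; only the `[M0]`, `[M1]`, ... sentinel naming comes from the summary. The real T5 pipeline samples spans randomly and works on subword tokens, which this sketch does not attempt.

```python
def span_corrupt(tokens, spans):
    """Replace each span with a sentinel and collect the removed text as the target.

    tokens: list of str.
    spans:  sorted, non-overlapping (start, end) index pairs, end exclusive.
    Returns (inputs, targets) as token lists.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"[M{i}]"               # sentinel naming follows the summary
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel)            # the input sees only the sentinel
        targets.append(sentinel)           # the target repeats the sentinel...
        targets.extend(tokens[start:end])  # ...followed by the hidden tokens
        cursor = end
    inputs.extend(tokens[cursor:])
    return inputs, targets


# Hypothetical example: hide characters 5..6 of one line of the poem.
tokens = list("明月几时有把酒问青天")
inputs, targets = span_corrupt(tokens, [(5, 7)])
print("".join(inputs))
print("".join(targets))
```

Both the corrupted input and the target are plain token sequences, which is what lets the unsupervised objective share the same Seq2Seq interface as the supervised tasks.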

prompting learning

Summary: Prompt definition: Prompt is the technique of making better use of the knowledge from the pre-trained model by adding additional texts to the input. Prompt is a…

posted @ 2022-08-05 18:43 chease Views (86) Comments (0) Recommended (0)
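As a rough sketch of "adding additional texts to the input", the snippet below wraps a raw sentence in a template with a mask slot and maps a predicted word back to a label. The template wording, the `[MASK]` placeholder, and the `verbalizer` mapping are all hypothetical choices for illustration, not taken from the post; no model is called here.

```python
def build_prompt(text, template="{text} Overall, it was [MASK]."):
    """Wrap the raw input in a template; a language model would fill [MASK]."""
    return template.format(text=text)


def label_from_word(word, verbalizer):
    """Map the word predicted at [MASK] to a task label (None if unknown)."""
    return verbalizer.get(word)


# Hypothetical verbalizer for a sentiment task.
verbalizer = {"great": "positive", "terrible": "negative"}

prompt = build_prompt("The plot was dull.")
print(prompt)
print(label_from_word("terrible", verbalizer))
```

The task is thereby rephrased as the fill-in-the-blank problem the model saw during pre-training, instead of training a new classification head.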

GPT&T5

Summary: GPT vs. GPT-2: https://zhuanlan.zhihu.com/p/96791725 PyTorch: GPT-2 pre-trained model and text generation: https://www.cnblogs.com/wwj99/p/12503545.html Training objective, GPT-2 illustrated: https://zhuanla

posted @ 2021-07-05 15:44 chease Views (427) Comments (0) Recommended (0)

Semi-supervised Learning

Summary: https://zhuanlan.zhihu.com/p/33196506 (pending) https://www.jiqizhixin.com/graph/technologies/11fab423-a92b-48d5-8212-79b1ecb46551

posted @ 2020-12-31 15:20 chease Views (56) Comments (0) Recommended (0)

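Self-supervised Learning

Summary: Reposted from: https://zhuanlan.zhihu.com/p/108906502 1. What is self-supervised learning? Self-supervised learning uses auxiliary (pretext) tasks to mine supervisory signals from large-scale unlabeled data itself, and trains the network on this constructed supervision so that it learns representations that are valuable for downstream tasks. 2. How to evaluate self-supervised learning…

posted @ 2020-11-17 19:54 chease Views (2268) Comments (0) Recommended (0)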
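To make "mining supervision from the data itself" concrete, here is a small sketch of one well-known pretext task, rotation prediction: each unlabeled image is rotated by 0, 90, 180, and 270 degrees, and the rotation index serves as a free label. The choice of this particular pretext task, the helper names, and the list-of-lists image format are assumptions for illustration; the summary does not name a specific task.

```python
def rotate90(grid):
    """Rotate a 2-D list clockwise by 90 degrees."""
    return [list(row) for row in zip(*grid[::-1])]


def rotation_pretext(grid):
    """Build (rotated_grid, label) pairs; label k means a rotation of k * 90 degrees."""
    samples = []
    current = [list(row) for row in grid]  # copy so the input is not mutated
    for k in range(4):
        samples.append((current, k))
        current = rotate90(current)
    return samples


# A tiny unlabeled "image"; no human annotation is needed to get labels.
image = [[1, 2],
         [3, 4]]
for rotated, label in rotation_pretext(image):
    print(label, rotated)
```

A network trained to predict `label` from `rotated` has to pick up on object orientation and structure, and those learned features are what get reused downstream.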

Normalization in Deep Learning - Batch Normalization, Layer Normalization, Weight Normalization

Summary: Reposted from: https://zhuanlan.zhihu.com/p/33173246 https://zhuanlan.zhihu.com/p/55102378 Contents: 1. Why Normalization is needed: the Internal Covariate Shift problem in deep learning and its effects 2.

posted @ 2020-10-20 14:44 chease Views (307) Comments (0) Recommended (0)

TCN

Summary: Paper: https://arxiv.org/pdf/1803.01271.pdf GitHub: https://github.com/locuslab/TCN Reposted from: http://nooverfit.com/wp/%E6%97%B6%E9%97%B4%E5%8D%B7%E7%A7%AF%E7%BD%

posted @ 2020-10-12 20:04 chease Views (958) Comments (0) Recommended (0)

On User Behavior Sequences

Summary: Reposted from: https://coladrill.github.io/2020/06/01/%E7%94%A8%E6%88%B7%E8%A1%8C%E4%B8%BA%E5%BA%8F%E5%88%97%E5%BB%BA%E6%A8%A1/ https://zhuanlan.zhihu.com/p/13813

posted @ 2020-09-30 16:45 chease Views (722) Comments (0) Recommended (0)

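On Loss Functions

Summary: Focal loss. Reference: https://zhuanlan.zhihu.com/p/49981234 Goal: reduce the weight of easy-to-classify samples so that the model focuses more on hard-to-classify samples during training. Formula: FL(p_t) = -α_t (1 - p_t)^γ log(p_t). In the paper's experiments, γ=2 and α=0.25 gave the best results.

posted @ 2020-09-24 17:55 chease Views (174) Comments (0) Recommended (0)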
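The down-weighting of easy samples can be checked numerically with a minimal per-sample sketch of binary focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), using the γ=2 and α=0.25 values quoted in the summary as defaults. The function name and the scalar, pure-Python form are illustrative assumptions; a training implementation would operate on tensors and clamp probabilities away from 0.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single sample.

    p: predicted probability of the positive class, strictly between 0 and 1.
    y: ground-truth label, 1 or 0.
    """
    p_t = p if y == 1 else 1.0 - p               # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # confident and correct: heavily down-weighted
hard = focal_loss(0.1, 1)   # confident and wrong: keeps most of its loss
print(easy, hard)
```

With γ=0 the modulating factor (1 - p_t)^γ becomes 1 and the expression reduces to α-weighted cross-entropy, which is a quick sanity check for any implementation.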

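Points to Watch When Making a Neural Network Work

Summary: Reposted from: https://zhuanlan.zhihu.com/p/29247151 1. Forgetting to normalize your data. What? When using neural networks, it is very important to think carefully about how to normalize your data. This step cannot be skipped: unless normalization is done correctly and carefully, your network will not work properly. Because this important step of normalizing data…

posted @ 2020-09-21 20:01 chease Views (284) Comments (0) Recommended (0)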
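One common way to normalize input data is z-score standardization of each feature column, sketched below. This is only one of several possible schemes, and the summary does not say which one the original article recommends; the function name and the zero-variance fallback are assumptions for illustration.

```python
import statistics


def standardize(column):
    """Rescale a list of numbers to zero mean and unit (population) variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant feature carries no information; map it to zeros
        # instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


ages = [20.0, 30.0, 40.0]           # hypothetical raw feature values
scaled = standardize(ages)
print(scaled)
```

In practice the mean and standard deviation are computed on the training set only and then reused unchanged for validation and test data.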

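Activation Functions

Summary: 1. Common activation functions: Rectified Linear Unit (ReLU) - used for hidden-layer neuron outputs; Sigmoid - used for hidden-layer neuron outputs; Softmax - used for the output of multi-class classification networks; Linear - used for the output of regression networks (or binary classification problems). For an introduction to common activation functions, see: https://blog.csdn.n

posted @ 2020-09-15 20:50 chease Views (190) Comments (0) Recommended (0)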
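The four activation functions listed in the summary can be written in a few lines of plain Python. This is a scalar, standard-library sketch for illustration; the max-subtraction in `softmax` and the branch in `sigmoid` are standard tricks added here to avoid floating-point overflow and are not from the post.

```python
import math


def relu(x):
    """Rectified Linear Unit: pass positives through, clip negatives to 0."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)                        # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x):
    """Identity activation, typically used for regression outputs."""
    return x


print(relu(-2.0), sigmoid(0.0), softmax([1.0, 2.0, 3.0]), linear(1.5))
```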

Models for High-order Feature Crosses - wide&deep Autoint Simclr

Summary: Cross product (feature crosses): https://segmentfault.com/a/1190000014799038 , https://www.cnblogs.com/lightmare/p/10398788.html https://zhuanlan.zhihu.com/p/16435

posted @ 2020-09-07 20:03 chease Views (256) Comments (0) Recommended (0)

Class Imbalance Problem (pending)

Summary: https://zhuanlan.zhihu.com/p/56882616 https://blog.csdn.net/ZesenChen/article/details/85057641 https://zhuanlan.zhihu.com/p/40464595 https://zhuanlan.

posted @ 2020-09-07 20:00 chease Views (104) Comments (0) Recommended (0)

ESIM: A Powerful Tool for Short-text Matching

Summary: Reposted from: https://zhuanlan.zhihu.com/p/141622985 Paper: "Enhanced LSTM for Natural Language Inference" GitHub: https://github.com/JasonForJoy/ESIM-NLI , https:/

posted @ 2020-09-03 11:17 chease Views (544) Comments (0) Recommended (0)

Attention Mechanism

Summary: Reposted from: https://easyai.tech/ai-definition/attention/ https://www.zhihu.com/question/68482809 https://zhuanlan.zhihu.com/p/46313756 Paper: "NEURAL MACHINE T

posted @ 2020-06-28 15:00 chease Views (1872) Comments (0) Recommended (0)

Few-shot Learning

Summary: References: https://zhuanlan.zhihu.com/p/61215293 https://www.zmonster.me/2019/12/08/few-shot-learning.html https://zhuanlan.zhihu.com/p/136975128 Papers: 1. Me

posted @ 2020-04-01 19:24 chease Views (7629) Comments (0) Recommended (3)

TensorFlow Eager Mode

Summary: https://zhuanlan.zhihu.com/p/47201474

posted @ 2019-11-07 19:26 chease Views (189) Comments (0) Recommended (0)

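Survey Readings

Summary: 1. Strengths and weaknesses of deep learning. Gary Marcus once said that deep learning is greedy, brittle, opaque, and shallow. These systems are greedy because they require huge amounts of training data; they are brittle because a neural network applied to unfamiliar scenarios, ones that differ from the examples used in training, does not perform the task well; they are opaque because, unlike traditional debuggable code, neural net…

posted @ 2019-10-28 17:28 chease Views (209) Comments (0) Recommended (0)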

GNN

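Summary: Understanding graph neural networks (GNN) and two advanced algorithms, "DeepWalk" + "GraphSage": https://easyai.tech/blog/gnn-deepwalk-graphsage/ II. GCN 2.1 A thorough explanation of the concepts: https://zhuanlan.zhihu.com/p/51990489 2.2 …

posted @ 2019-08-23 17:43 chease Views (442) Comments (0) Recommended (0)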

Paper Reading Index

Summary: I. Multi-task learning 1.1 "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (2018, Google) Other references: https://www.cnblogs.com/arachis

posted @ 2019-08-22 20:12 chease Views (252) Comments (0) Recommended (0)
