Junfei_Wang - 博客园 (cnblogs)

August 24, 2018

[Reading Notes] MIT Linear Algebra (4): Independence, Basis and Dimension

Summary: Independence: The columns of A are independent when the nullspace N(A) contains only the zero vector. Example 1: 1. If three vectors are not in the sa…

posted @ 2018-08-24 08:06 Junfei_Wang Views (504) Comments (0) Recommendations (0)
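
The independence criterion quoted in the summary above can be checked mechanically: N(A) contains only the zero vector exactly when the rank of A equals its number of columns. The following is a minimal sketch of that check, not code from the original post. The helper names `rank` and `columns_independent` and the example matrices `A` and `B` are assumptions made for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (given as a list of rows) via Gaussian elimination.

    Exact fractions are used so that no floating-point tolerance is needed.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here: column c is a free column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def columns_independent(matrix):
    """True when N(A) = {0}, i.e. when rank(A) equals the number of columns."""
    return rank(matrix) == len(matrix[0])


# Hypothetical examples (not taken from the post):
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity: independent columns
B = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]  # column 3 = column 1 + column 2

print(columns_independent(A))
print(columns_independent(B))
```

In `B` the third column is the sum of the first two, so the vector (1, 1, -1) lies in the nullspace and the columns are dependent.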

August 21, 2018

[Reading Notes] MIT Linear Algebra (3): Special Solution, Rank and RREF

Summary: Special Solutions: Notice what is special about s1 and s2. They have ones and zeros in the last two components. Those components are "free" and we ch…

posted @ 2018-08-21 20:30 Junfei_Wang Views (1161) Comments (0) Recommendations (0)

[Reading Notes] MIT Linear Algebra (2): Vector Spaces and Subspaces

Summary: Vector Space: R1, R2, R3, R4, .... Each space Rn consists of a whole collection of vectors. R5 contains all column vectors with five components. This…

posted @ 2018-08-21 07:51 Junfei_Wang Views (505) Comments (0) Recommendations (0)

August 10, 2018

[Reading Notes] MIT Linear Algebra (1): Linear Combinations

Summary: 1. Linear Combination Two linear operations of vectors: Linear combination: 2. Geometric Explanations 2D case 3D case: for 3 vectors u, v, w, the importan…

posted @ 2018-08-10 13:34 Junfei_Wang Views (411) Comments (0) Recommendations (0)

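July 13, 2018

Adam Optimization Algorithm

Summary: I have often seen people say that when choosing an optimizer you should simply default to Adam. That advice is rather unsatisfying: anyone with a bit of scientific curiosity will want to ask why and to understand what is going on, which is one of the reasons I started this optimizer series. Earlier posts covered Momentum and RMSProp; Adam is essentially the combination of the two, plus bias correction (Bi…

posted @ 2018-07-13 20:24 Junfei_Wang Views (764) Comments (0) Recommendations (0)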
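
The summary above describes Adam as Momentum combined with RMSProp, plus bias correction. The following is a minimal scalar sketch of that combination, not the post's own code. The function name `adam_minimize`, the test objective f(x) = (x - 3)^2, and the hyperparameter values (learning rate 0.1, beta1 0.9, beta2 0.999, eps 1e-8, 500 steps) are all assumptions chosen for illustration.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a one-dimensional function given its gradient, using Adam.

    All hyperparameter defaults here are illustrative assumptions.
    """
    x = x0
    m = 0.0  # first moment: moving average of gradients (the Momentum part)
    v = 0.0  # second moment: moving average of squared gradients (the RMSProp part)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction: m and v start at zero, so early averages are too small.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical objective: f(x) = (x - 3)^2, with gradient 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

The same update applies element-wise to a parameter vector; the scalar form is used here only to keep the sketch short.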

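July 11, 2018

AdaGrad Algorithm and RMSProp

Summary: AdaGrad, short for Adaptive Gradient Algorithm, is another variant of standard gradient descent. The update rule of standard gradient descent is: where the learning rate α is the same for every feature of the cost function, yet it is almost impossible for a single α, across the…

posted @ 2018-07-11 15:52 Junfei_Wang Views (1114) Comments (0) Recommendations (0)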

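July 9, 2018

Gradient Descent with Momentum and Nesterov Momentum

Summary: In batch gradient descent, mini-batch gradient descent, and stochastic gradient descent (SGD), each optimization step is independent of the steps before it. At the start of every iteration, the algorithm computes the gradient from the updated cost function and uses that gradient…

posted @ 2018-07-09 20:15 Junfei_Wang Views (645) Comments (0) Recommendations (0)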

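July 4, 2018

Parameter Initializations in Deep Learning

Summary: The problem with all-zero initialization: in linear regression, parameters are commonly initialized to all zeros, because during gradient descent each parameter is updated separately along its own input component. The update rule is: In a neural network (deep learning), however, when we initialize all the parameters to zer…

posted @ 2018-07-04 18:39 Junfei_Wang Views (336) Comments (0) Recommendations (0)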

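June 30, 2018

L2 Regularization for Neural Networks

Summary: L2 regularization is one of the remedies for the variance (overfitting) problem; in neural networks, other common ones include dropout and L1 regularization. Whichever method is used, the core idea is to make the model simpler, thereby balancing a perfect fit to the training set and…

posted @ 2018-06-30 18:47 Junfei_Wang Views (183) Comments (0) Recommendations (0)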

June 15, 2018

Activation Functions and Their Derivatives

Summary: 1. Sigmoid Function: when z=0, g'(z)=0.25 2. tanh Function: when x=0, tanh'(x)=1 3. ReLU…

posted @ 2018-06-15 18:58 Junfei_Wang Views (169) Comments (0) Recommendations (0)
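
The two derivative values quoted in the summary above can be reproduced from the standard closed forms g'(z) = g(z)(1 - g(z)) for the sigmoid and tanh'(x) = 1 - tanh(x)^2. The following is a minimal sketch using those textbook definitions, not code from the original post. The function names are assumptions, and the value returned for the ReLU derivative at exactly 0 is a convention chosen here, since the summary is cut off before the ReLU section.

```python
import math


def sigmoid(z):
    """Logistic sigmoid g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    """Derivative of the sigmoid: g'(z) = g(z) * (1 - g(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_prime(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    """ReLU: max(0, x)."""
    return max(0.0, x)


def relu_prime(x):
    """Derivative of ReLU; the value at x == 0 is set to 0 by convention (an assumption)."""
    return 1.0 if x > 0 else 0.0


print(sigmoid_prime(0.0))  # 0.25, matching the summary
print(tanh_prime(0.0))     # 1.0, matching the summary
```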
