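Summary: This section covers how to use data to validate a model sensibly. It first rules out selecting a model by Ein or by Etest. (1) The more complex the model, the better Ein gets, but Eout does not necessarily follow (see overfitting in the previous section). (2) Selecting by Etest amounts to peeking at the test set, so it does not work either. The discussion then focuses on splitting the available dataset into train data and test dat...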
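Summary: Regularization is introduced to address overfitting. Take Linear Regression as an example: a low-order polynomial fit may outperform a high-order one. Recall the slides on nonlinear transform from two sections back: the point there is that the hypothesis spaces of polynomial fitting are nested hypotheses; therefore, ...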
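Summary: First, Overfitting is defined. Then a driving analogy explains why it occurs. There are three causes: (1) dvc is too high, so the model is too complex (driving too fast); (2) the data is too noisy (the road is too bumpy); (3) the data size N is too small (too few known routes). Of these, (1) is the precondition, an overly complex model: (1) the more complex the model, the more it captures the train da...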
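Summary: Covers programming homework questions Q18~Q20, all of which relate to Logistic Regression. Q18~19 implement Logistic Regression with full-batch gradient descent; Q20 asks for stochastic gradient descent. The code for all three questions is combined in a single py file. Personally, I think the program design for this problem, ...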
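To make the difference between the two update rules concrete, here is a minimal pure-Python sketch. It is not the homework code (which is not shown in this summary); the function names, the learning rate `eta`, the step count, and the in-order sample cycling in the stochastic version are all assumptions chosen for illustration.

```python
import math


def sigmoid(s):
    """Logistic function theta(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def sample_gradient(w, x, y):
    """Gradient of the per-sample error ln(1 + exp(-y * w.x)) with respect to w."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y * sigmoid(-y * score)
    return [coeff * xi for xi in x]


def train_gd(X, Y, eta=0.1, steps=1000):
    """Full-batch gradient descent: average the gradient over all samples."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(X, Y):
            g = sample_gradient(w, x, y)
            grad = [a + b / n for a, b in zip(grad, g)]
        w = [wi - eta * gi for wi, gi in zip(w, grad)]
    return w


def train_sgd(X, Y, eta=0.1, steps=1000):
    """Stochastic variant: one sample per update (cycled in order here)."""
    w = [0.0] * len(X[0])
    for t in range(steps):
        i = t % len(X)
        g = sample_gradient(w, X[i], Y[i])
        w = [wi - eta * gi for wi, gi in zip(w, g)]
    return w


def predict(w, x):
    """Classify as +1 when the score is positive, otherwise -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1
```

Each input vector is expected to carry a leading constant 1 as the bias feature, and labels are +1 or -1.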
Summary: Problem: I missed this one on my first pass. Given a sorted linked list, delete all duplicates such that each element appears only once. For example, given 1->1->2, return 1->2. Given 1-...
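One possible single-pass approach is sketched below. The `ListNode` class and the `to_list` helper are assumptions made so the example is self-contained; they are not taken from the original post.

```python
class ListNode:
    """Minimal singly linked list node (hypothetical definition for this sketch)."""

    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def delete_duplicates(head):
    """Remove duplicates from a sorted list in place and return its head."""
    node = head
    while node and node.next:
        if node.next.val == node.val:
            node.next = node.next.next  # unlink the duplicate
        else:
            node = node.next
    return head


def to_list(head):
    """Collect node values into a Python list for easy inspection."""
    values = []
    while head:
        values.append(head.val)
        head = head.next
    return values
```

Because the list is sorted, duplicates are always adjacent, so comparing each node with its successor is enough.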
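Summary: First reviews what several Linear Models have in common: each computes a score and then applies some transformation to it. Since Linear Models have various advantages (training time, simple formulas), how can Linear Regression be applied to Classification problems? Can it be transferred at all? Summarizes the following kinds of Linear Mode...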
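Summary: Logistic Regression is introduced here from the angle of Soft Binary Classification. The output is restricted to between 0 and 1 and represents the probability of a positive outcome. Concretely, a Logistic Function is applied on top of Linear Regression to bound the output. Having completed the hy...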
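The construction described above, a linear score passed through a logistic function, can be sketched as follows. The names `logistic` and `hypothesis` are illustrative assumptions, not identifiers from the post.

```python
import math


def logistic(s):
    """Map any real score s into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def hypothesis(w, x):
    """Soft binary classifier: logistic function applied to the linear score w.x."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return logistic(score)
```

A score of 0 maps to 0.5, large positive scores approach 1, and large negative scores approach 0.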
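Summary: This section begins the basic Linear Regression algorithm. (1) The output of a Linear Regression hypothesis is now real-valued. (2) The goal of Linear Regression is to find the line (hyperplane) that makes the residuals smaller. Now the core part: the optimization objective of Linear Regression is to minimize Ein(W). In order to...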
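The summary is cut off before the derivation. For reference, the standard squared-error objective and its closed-form minimizer are written out below; the matrix notation (data matrix $X$ with one sample per row, label vector $y$) is an assumption about the notation, and the closed form holds when $X^{T}X$ is invertible.

```latex
E_{in}(w) = \frac{1}{N}\sum_{n=1}^{N}\left(w^{T}x_{n} - y_{n}\right)^{2}
          = \frac{1}{N}\left\lVert Xw - y \right\rVert^{2}

\nabla E_{in}(w) = \frac{2}{N}\left(X^{T}Xw - X^{T}y\right) = 0
\quad\Longrightarrow\quad
w_{LIN} = \left(X^{T}X\right)^{-1}X^{T}y
```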
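Summary: Homework 1 was scraped by bubuko; it would be better if they had credited the source (http://bubuko.com/infodetail-916604.html). For homework 2, the focus is on the coding questions Q16~Q20. Q16 took me a while to understand because the problem statement is not detailed enough. Once I understood what it was asking, it turned out to be quite simple. The key to understanding the problem is what the 's' in the problem statement refers to.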
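Summary: http://beader.me/mlnotebook/section2/noise-and-error.html The post above already summarizes this well. The content of this chapter is easier to appreciate later, through the cost functions of specific algorithms. There is no need to dwell on it.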
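Summary: First reviews the VC Bound concept introduced at the end of the last lecture, and what VC dimension theory is actually good for in machine learning. Three points: 1. If a Break Point exists, the hypothesis set is a good one. 2. If N is large enough, Ein and Eout will be close. 3. If algorithm A picks a good enough g (small Ein), then something has probably been learned from the data ====...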
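Summary: Continues from the Break Point of H in the previous lecture. There is a very intuitive result: if k is a break point, then k+1, k+2, ... are all break points. What else can we learn beyond that? Some examples are given here, and the core message is the following. In short, if H has Break Point k, then...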
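Summary: Picks up the cliffhanger from the last lecture: how the feasibility of machine learning relates to the size M of the hypothesis set H. Two key points for feasibility: 1. whether Ein(g) is small enough (good performance on the training set); 2. whether Eout(g) is close enough to Ein(g) (whether performance on the training set transfers to the test set). (1) If the hypothesis set is small (small M), then after the union bound, ...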
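Summary: The core of this section is how Hoeffding's inequality connects to the feasibility of machine learning. PAC is a vivid and accurate name: it describes "probably approximately correct", i.e. an upper bound on some probability. Hoeffding's connection to machine learning is this: if the sample size is large enough, the performance achieved on the training set carries over to the test set. That is, as follows; what is guaranteed here is only "the training set...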
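To get a feel for how the bound shrinks with sample size, the sketch below evaluates the standard form of Hoeffding's inequality, P[|nu - mu| > epsilon] <= 2 exp(-2 epsilon^2 N). The function name and parameter names are assumptions made for this example.

```python
import math


def hoeffding_bound(epsilon, n):
    """Upper bound on P[|nu - mu| > epsilon] for a sample of size n."""
    return 2.0 * math.exp(-2.0 * epsilon ** 2 * n)


# The bound tightens quickly as the sample size grows.
for n in (100, 1000, 10000):
    print(n, hoeffding_bound(0.1, n))
```

Note that for small samples the bound can exceed 1, in which case it says nothing useful.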
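Summary: For the homework, I am focusing only on the programming questions for now, written in python. Q15~Q17 use the classic PLA algorithm, and the given dataset is guaranteed to be linearly separable. The code just needs to implement a simple PLA in which the speed in “W = W + speed*yX” is configurable (i.e. the learning rate). Code: #encoding=utf8 import ...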
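The post's own code is truncated here, so the following is only a minimal sketch of a PLA with a configurable `speed`, applying the update W = W + speed*yX on each misclassified sample. The function names, the `max_passes` safety limit, and the convention that each input vector starts with a constant 1 are assumptions.

```python
def sign(value):
    """Treat a score of exactly 0 as -1 (an arbitrary convention for this sketch)."""
    return 1 if value > 0 else -1


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def pla(X, Y, speed=1.0, max_passes=1000):
    """Perceptron Learning Algorithm with learning rate `speed`.

    Sweeps through the data repeatedly and stops after a full pass
    with no mistakes, or after max_passes sweeps.
    """
    w = [0.0] * len(X[0])
    for _ in range(max_passes):
        mistakes = 0
        for x, y in zip(X, Y):
            if sign(dot(w, x)) != y:
                # W = W + speed * y * X
                w = [wi + speed * y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w
```

On linearly separable data the loop halts with a mistake-free pass; `max_passes` only guards against non-separable input.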
Summary: Problem: Follow up for "Search in Rotated Sorted Array": What if duplicates are allowed? Would this affect the run-time complexity? How and why? Write a function...
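Summary: Skipping the first lecture. Starting from lecture 2, Perceptron, here are a few points that left an impression: 1. My intuition never handled this kind of plot well; I kept reading it as x and y. a) Each axis of such a plot represents a feature, and feature values have physical meaning. b) The circles and crosses mark different samples (positive and negative), i.e. the label; ...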
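Summary: Two days ago I finished the 150 leetcode problems, which marks a milestone. From now on I will review a few each day to stay sharp. As for machine learning, I have used it in production projects and know the various algorithms. But when an interviewer asks me to derive SVM without a single slip, I really cannot do it; my foundations are not solid. I have decided to follow the machine learning course by Hsuan-Tien Lin (林轩田) of National Taiwan University to keep strengthening the basics in preparation for job hunting. After each chapter...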
Summary: Problem: Given n points on a 2D plane, find the maximum number of points that lie on the same straight line. Code: /** * Definition for a point. * struct Point { ...
Summary: Problem: Given an array of words and a length L, format the text such that each line has exactly L characters and is fully (left and right) justified. You shoul...
Summary: Problem: Divide two integers without using multiplication, division and mod operator. If it is overflow, return MAX_INT. Code: class Solution {public: int div...
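The post's C++ solution is truncated, so the following is a separate Python sketch of one common technique: repeatedly subtracting the largest shifted multiple of the divisor. Treating division by zero as overflow and assuming 32-bit signed limits for MAX_INT are assumptions of this example, not details from the post.

```python
MAX_INT = 2 ** 31 - 1   # assumed 32-bit signed limits
MIN_INT = -2 ** 31


def divide(dividend, divisor):
    """Integer division truncating toward zero, using only shifts, adds, and subtracts."""
    # Assumption: division by zero is treated like overflow.
    if divisor == 0 or (dividend == MIN_INT and divisor == -1):
        return MAX_INT
    negative = (dividend < 0) != (divisor < 0)
    a, b = abs(dividend), abs(divisor)
    quotient = 0
    while a >= b:
        # Find the largest shift such that (b << shift) still fits in a.
        shift = 0
        while a >= (b << (shift + 1)):
            shift += 1
        a -= b << shift
        quotient += 1 << shift
    return -quotient if negative else quotient
```

Doubling the divisor with shifts keeps the number of subtractions logarithmic in the quotient rather than linear.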
Summary: Problem: The string "PAYPALISHIRING" is written in a zigzag pattern on a given number of rows like this: (you may want to display this pattern in a fixed font...
Summary: Problem: Given an integer n, generate a square matrix filled with elements from 1 to n^2 in spiral order. For example, given n = 3, you should return the following ma...
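One way to approach this is to fill the matrix ring by ring while shrinking four boundaries. The sketch below is an illustrative Python version, not the post's solution; the function name is an assumption.

```python
def generate_matrix(n):
    """Fill an n x n matrix with 1..n*n in clockwise spiral order."""
    matrix = [[0] * n for _ in range(n)]
    top, bottom, left, right = 0, n - 1, 0, n - 1
    value = 1
    while top <= bottom and left <= right:
        for col in range(left, right + 1):        # left to right along the top row
            matrix[top][col] = value
            value += 1
        top += 1
        for row in range(top, bottom + 1):        # top to bottom along the right column
            matrix[row][right] = value
            value += 1
        right -= 1
        for col in range(right, left - 1, -1):    # right to left along the bottom row
            matrix[bottom][col] = value
            value += 1
        bottom -= 1
        for row in range(bottom, top - 1, -1):    # bottom to top along the left column
            matrix[row][left] = value
            value += 1
        left += 1
    return matrix
```

Because the matrix is square, the boundaries shrink symmetrically and no extra guards are needed on the last two passes.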
Summary: Problem: Given a matrix of m x n elements (m rows, n columns), return all elements of the matrix in spiral order. For example, given the following matrix: [ [ 1, 2, 3...
Summary: Problem: Given an index k, return the kth row of the Pascal's triangle. For example, given k = 3, return [1,3,3,1]. Note: Could you optimize your algorithm to use onl...
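The note is cut off, but assuming it asks for limited extra space, one option is to update a single row in place from right to left. This Python sketch is illustrative only, and the function name is an assumption.

```python
def get_row(k):
    """Return row k (0-indexed) of Pascal's triangle using one list of length k + 1."""
    row = [1] * (k + 1)
    for i in range(2, k + 1):
        # Walk right to left so row[j - 1] still holds the previous row's value.
        for j in range(i - 1, 0, -1):
            row[j] += row[j - 1]
    return row
```

Updating from the right is what makes the in-place version work; going left to right would overwrite values that are still needed.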
Summary: Problem: Given numRows, generate the first numRows of Pascal's triangle. For example, given numRows = 5, return [ [1], [1,1], [1,2,1], [1,3,3,1], [1,4,6,4...
Summary: Problem: You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concat...
Summary: Problem: Given two numbers represented as strings, return multiplication of the numbers as a string. Note: The numbers can be arbitrarily large and are non-n...
Summary: Problem: Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n). For example, S = "ADOBECOD...
Summary: Problem: Given a collection of intervals, merge all overlapping intervals. For example, given [1,3],[2,6],[8,10],[15,18], return [1,6],[8,10],[15,18]. Code: /** * De...
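The post's C++ code is truncated; as an independent illustration, a sort-then-sweep approach can be sketched in Python. Representing each interval as a two-element list is an assumption of this example (the original uses an Interval struct whose definition is cut off).

```python
def merge(intervals):
    """Merge overlapping [start, end] intervals and return them sorted by start."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

After sorting by start, each interval can only overlap the most recently merged one, so a single pass is enough.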
Summary: Problem: Given a set of non-overlapping intervals, insert a new interval into the intervals (merge if necessary). You may assume that the intervals were initia...
Summary: Problem: Determine whether an integer is a palindrome. Do this without extra space. Some hints: Could negative integers be palindromes?...
Summary: Problem: Reverse digits of an integer. Example 1: x = 123, return 321. Example 2: x = -123, return -321. Have you thought about this? Here are ...
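A digit-by-digit reversal can be sketched as follows. The hints in the summary are truncated, so the handling of results outside the 32-bit signed range (returning 0) is an assumption of this example rather than something stated in the post.

```python
def reverse(x):
    """Reverse the decimal digits of x, keeping its sign."""
    sign = -1 if x < 0 else 1
    digits = abs(x)
    result = 0
    while digits:
        result = result * 10 + digits % 10  # append the last digit
        digits //= 10                       # drop the last digit
    result *= sign
    # Assumption: results outside the 32-bit signed range are reported as 0.
    if result < -2 ** 31 or result > 2 ** 31 - 1:
        return 0
    return result
```

Trailing zeros disappear naturally, so an input such as 120 reverses to 21.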
Summary: Problem: Clone an undirected graph. Each node in the graph contains a label and a list of its neighbors. OJ's undirected graph serialization: Nodes are labeled u...
Summary: Problem: Given a string s and a dictionary of words dict, add spaces in s to construct a sentence where each word is a valid dictionary word. Return all such poss...
Summary: Problem: Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words. For exa...
Summary: Problem: Given a string S and a string T, count the number of distinct subsequences of T in S. A subsequence of a string is a new string which is formed from the o...
Summary: Problem: A message containing letters from A-Z is being encoded to numbers using the following mapping: 'A' -> 1, 'B' -> 2, ..., 'Z' -> 26. Given an encoded message co...
Summary: Problem: Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.) You have the ...
Summary: Problem: Given a m x n grid filled with non-negative numbers, find a path from top left to bottom right which minimizes the sum of all numbers along its path. Note...
Summary: Problem: Given a string s1, we may represent it as a binary tree by partitioning it to two non-empty substrings recursively. Below is one possible representat...
Summary: Problem: Given s1, s2, s3, find whether s3 is formed by the interleaving of s1 and s2. For example, given: s1 = "aabcc", s2 = "dbbca", when s3 = "aadbbcbcac", return true. When s...
Summary: Problem: Say you have an array for which the ith element is the price of a given stock on day i. Design an algorithm to find the maximum profit. You may complet...
Summary: Problem: Given a 2D binary matrix filled with 0's and 1's, find the largest rectangle containing all ones and return its area. Code: class Solution {public: ...
Summary: Problem: Given a string s, partition s such that every substring of the partition is a palindrome. Return the minimum cuts needed for a palindrome partitioning ...