
Supervised Learning

  • The process of supervised learning

  Given a dataset with labels, we learn the hypothesis/prediction function h: x → y through a learning algorithm. When the target variable we try to predict is continuous, we call it a regression problem. When the target variable we try to predict is discrete, we call it a classification problem.

 

 

  •  Linear regression

e.g., predicting house prices given living area x1 and number of bedrooms x2:

hθ(x) = θ0 + θ1x1 + θ2x2

The θi's are the parameters/weights. For simplicity, introducing the convention x0 = 1 (the intercept term), we can write this in vector form as

h(x) = Σi θixi = θTx

To fit the parameters θ, we define the cost function:

J(θ) = (1/2)·Σi (hθ(x(i)) − y(i))²
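A minimal NumPy sketch of the hypothesis hθ(x) = θTx and the cost J(θ) may make this concrete; the feature matrix X, the targets y, and the function names here are illustrative assumptions, with values echoing the classic housing example:

```python
import numpy as np

# Toy dataset: each row of X is [x0 = 1 (intercept), living area, #bedrooms].
X = np.array([[1.0, 2104.0, 3.0],
              [1.0, 1600.0, 3.0],
              [1.0, 2400.0, 3.0],
              [1.0, 1416.0, 2.0],
              [1.0, 3000.0, 4.0]])
y = np.array([400.0, 330.0, 369.0, 232.0, 540.0])  # prices (example values)

def h(theta, X):
    """Hypothesis hθ(x) = θᵀx, evaluated for every row of X."""
    return X @ theta

def J(theta, X, y):
    """Least-squares cost J(θ) = (1/2)·Σi (hθ(x(i)) − y(i))²."""
    residuals = h(theta, X) - y
    return 0.5 * residuals @ residuals
```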

    • LMS algorithm (Least Mean Squares)
      • Consider the gradient descent algorithm (which finds a local minimum): start with some initial θ and repeatedly perform the following update until convergence (a runnable sketch follows this item):

        θj := θj − α·∂J(θ)/∂θj

        where, for a single training example, ∂J(θ)/∂θj = (hθ(x) − y)·xj
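A minimal sketch of batch gradient descent with this update, reusing the hypothetical X, y, and h from the snippet above; the learning rate and iteration count are arbitrary example values (with unscaled features like these, α must be tiny and convergence is slow; in practice one would scale the features first):

```python
def gradient_descent(X, y, alpha=5e-8, n_iters=100_000):
    """Batch gradient descent: θ := θ − α·Xᵀ(Xθ − y), i.e. for each j,
    θj := θj − α·Σi (hθ(x(i)) − y(i))·xj(i), summed over all examples."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (h(theta, X) - y)  # gradient of J(θ)
        theta = theta - alpha * grad
    return theta

theta_gd = gradient_descent(X, y)
```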

    • The normal equations
      • Alternatively, J(θ) can be minimized in closed form: setting its gradient to zero gives the normal equations XTXθ = XTy, so θ = (XTX)−1XTy (see the sketch below).
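A one-function sketch of this closed form, again reusing the hypothetical X and y from above; np.linalg.solve is used instead of an explicit matrix inverse for numerical stability:

```python
def normal_equation(X, y):
    """Solve the normal equations XᵀXθ = Xᵀy for θ."""
    return np.linalg.solve(X.T @ X, X.T @ y)

theta_star = normal_equation(X, y)  # exact least-squares minimizer
```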

    • Probabilistic interpretation (of least squares)
      • Assume that the target variables and the inputs are related via the equation y(i) = θTx(i) + ε(i), where ε(i) is an error term that captures either unmodeled effects (such as features very pertinent to predicting housing prices that we left out of the regression) or random noise.
      • Further assume that the ε(i) are distributed IID (independently and identically distributed) according to a Gaussian distribution (also called the Normal distribution). I.e., the density of ε(i) is given by

        p(ε(i)) = (1/(√(2π)σ))·exp(−(ε(i))²/(2σ²))

        which implies p(y(i) | x(i); θ) = (1/(√(2π)σ))·exp(−(y(i) − θTx(i))²/(2σ²))

      • Then, construct the likelihood function:

        L(θ) = ∏i p(y(i) | x(i); θ) = ∏i (1/(√(2π)σ))·exp(−(y(i) − θTx(i))²/(2σ²))

      • Next, take the log likelihood ℓ(θ):

        ℓ(θ) = log L(θ) = m·log(1/(√(2π)σ)) − (1/σ²)·(1/2)·Σi (y(i) − θTx(i))²

      • Hence, maximizing ℓ(θ) gives the same answer as minimizing J(θ) below, the original least-squares cost function (the first term and the factor 1/σ² do not depend on θ):

        J(θ) = (1/2)·Σi (y(i) − θTx(i))²
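As a small numerical check of this equivalence, the sketch below (reusing the hypothetical X, y, J, and normal_equation from the earlier snippets; sigma is an arbitrary assumed noise level) confirms that the θ minimizing J(θ) attains a higher Gaussian log-likelihood than a perturbed θ:

```python
def log_likelihood(theta, X, y, sigma=1.0):
    """Gaussian log-likelihood ℓ(θ) = m·log(1/(√(2π)σ)) − J(θ)/σ²."""
    m = len(y)
    return m * np.log(1.0 / (np.sqrt(2.0 * np.pi) * sigma)) - J(theta, X, y) / sigma**2

theta_star = normal_equation(X, y)                    # minimizes J(θ)
theta_off = theta_star + np.array([0.0, 0.01, 0.0])   # slightly perturbed θ

# The least-squares minimizer has the larger log-likelihood:
assert log_likelihood(theta_star, X, y) > log_likelihood(theta_off, X, y)
```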


posted @ 2020-05-17 16:02  ArkiWang