Generalized Linear Models
1. Introduction
So far, we’ve seen a regression example and a classification example. In the regression example, we had y|x; θ ∼ N(μ, σ2), and in the classification one, y|x; θ ∼ Bernoulli(φ), for appropriate definitions of μ and φ as functions of x and θ (namely, μ = θTx and φ = g(θTx)).
2. The exponential family
We say that a class of distributions is in the exponential family if it can be written in the form:
p(y; η) = b(y) exp(ηTT(y) − a(η))
Here, η is called the natural parameter (also called the canonical parameter) of the distribution; T(y) is the sufficient statistic (for the distributions we consider, it will often be the case that T(y) = y); and a(η) is the log partition function. The quantity e−a(η) essentially plays the role of a normalization constant, making sure the distribution p(y; η) sums/integrates over y to 1.
A fixed choice of T, a and b defines a family (or set) of distributions that is parameterized by η; as we vary η, we then get different distributions within this family.
3. Examples
a. Bernoulli distribution (we consider η to be a real number):
We can write the Bernoulli pmf as
p(y; φ) = φy(1 − φ)1−y = exp(y log(φ/(1 − φ)) + log(1 − φ)),
which matches the exponential-family form.
Thus, the natural parameter is given by η = log(φ/(1 − φ)). Interestingly, if we invert this definition for η by solving for φ in terms of η, we obtain φ = 1/(1 + e−η). This is the familiar sigmoid function! This will come up again when we derive logistic regression as a GLM:
η = log(φ/(1 − φ))
T(y) = y
a(η) = −log(1 − φ)= log(1 + eη)
b(y) = 1
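As a quick numerical sanity check (a minimal sketch; the helper names below are ours, not from the notes), we can verify that b(y) exp(ηT(y) − a(η)) with these choices reproduces the Bernoulli pmf, and that the sigmoid inverts the natural parameter:

```python
import math

def bernoulli_pmf(y, phi):
    # Standard Bernoulli pmf: phi^y * (1 - phi)^(1 - y)
    return phi**y * (1 - phi)**(1 - y)

def bernoulli_exp_family(y, phi):
    # Exponential-family form with eta = log(phi / (1 - phi)),
    # T(y) = y, a(eta) = log(1 + e^eta), b(y) = 1
    eta = math.log(phi / (1 - phi))
    a = math.log(1 + math.exp(eta))
    return 1 * math.exp(eta * y - a)

phi = 0.3
for y in (0, 1):
    assert abs(bernoulli_pmf(y, phi) - bernoulli_exp_family(y, phi)) < 1e-12

# Inverting eta = log(phi / (1 - phi)) via the sigmoid recovers phi:
eta = math.log(phi / (1 - phi))
assert abs(1 / (1 + math.exp(-eta)) - phi) < 1e-12
```

The assertions pass because exp(ηy − log(1 + eη)) equals φ when y = 1 and 1 − φ when y = 0.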
b. Gaussian distribution (for simplicity, set σ2 = 1):
p(y; μ) = (1/√2π) exp(−(y − μ)2/2) = (1/√2π) exp(−y2/2) exp(μy − μ2/2)
Thus, we see that the Gaussian is in the exponential family:
η = μ
T(y) = y
a(η) = μ2/2 = η2/2
b(y) = (1/√2π) exp(−y2/2).
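The same kind of check works for the Gaussian (again a minimal sketch with our own helper names, assuming σ2 = 1): the factorization b(y) exp(ηT(y) − a(η)) should agree with the N(μ, 1) density.

```python
import math

def gaussian_pdf(y, mu):
    # N(mu, 1) density: (1 / sqrt(2*pi)) * exp(-(y - mu)^2 / 2)
    return (1 / math.sqrt(2 * math.pi)) * math.exp(-(y - mu)**2 / 2)

def gaussian_exp_family(y, mu):
    # Exponential-family form with eta = mu, T(y) = y,
    # a(eta) = eta^2 / 2, b(y) = (1 / sqrt(2*pi)) * exp(-y^2 / 2)
    eta = mu
    a = eta**2 / 2
    b = (1 / math.sqrt(2 * math.pi)) * math.exp(-y**2 / 2)
    return b * math.exp(eta * y - a)

for y in (-1.0, 0.0, 2.5):
    assert abs(gaussian_pdf(y, 1.7) - gaussian_exp_family(y, 1.7)) < 1e-12
```

Expanding the exponent −(y − μ)2/2 = −y2/2 + μy − μ2/2 shows why the two functions agree term by term.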