Explanation of logistic regression cost function

\[\begin{array}{c}
\hat{y} = \sigma(w^Tx+b)\quad \text{where}\;\sigma(z) = \frac{1}{1+e^{-z}}\\
\text{interpret}\quad\hat{y} =P(y=1\mid x)\\
\text{if}\quad y=1:\;P(y\mid x)=\hat{y}\\
\text{if}\quad y=0:\;P(y\mid x)=1-\hat{y}\\
\text{since}\;y\in\{0,1\}\;\text{(binary classification), the two cases combine into a single expression:}\\
P(y\mid x)=\hat{y}^{y}\cdot(1-\hat{y})^{1-y}\\
\log P(y\mid x)=y\cdot \log\hat{y}+(1-y)\cdot \log(1-\hat{y})\\
\text{so the loss for a single example is}\;\mathcal{L}(\hat{y},y)=-\log P(y\mid x)\\
\text{because minimizing the loss is equivalent to maximizing}\;P(y\mid x)
\end{array} \]
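The single-example formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original post; the function names `sigmoid`, `likelihood`, and `loss` are chosen here for illustration. It shows that the combined expression reduces to the two cases, and that the loss is the negative log of that probability.

```python
import math


def sigmoid(z: float) -> float:
    """sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def likelihood(y_hat: float, y: int) -> float:
    """P(y | x) = y_hat^y * (1 - y_hat)^(1 - y), for y in {0, 1}."""
    return y_hat ** y * (1.0 - y_hat) ** (1 - y)


def loss(y_hat: float, y: int) -> float:
    """Single-example loss: -log P(y | x)."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))


# The combined expression reduces to the two cases from the derivation.
assert math.isclose(likelihood(0.8, 1), 0.8)        # y = 1 gives y_hat
assert math.isclose(likelihood(0.8, 0), 1.0 - 0.8)  # y = 0 gives 1 - y_hat

# The loss is the negative log of that probability.
assert math.isclose(loss(0.8, 1), -math.log(likelihood(0.8, 1)))
assert math.isclose(loss(0.8, 0), -math.log(likelihood(0.8, 0)))
```

A confident correct prediction (for example `y_hat = 0.99` with `y = 1`) gives a loss near zero, while a confident wrong one gives a large loss, which is why minimizing the loss pushes `P(y | x)` up.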

For a dataset of m independent examples:

\[P(\text{labels in training set})=\prod^{m}_{i=1}P(y^{(i)}\mid x^{(i)}) \]

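We want to find the set of parameters that maximizes this probability (maximum likelihood estimation). Taking the logarithm, which is monotonic and therefore does not change the maximizer: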

\[\begin{aligned}
\log P(\text{labels in training set})&=\sum^{m}_{i=1}\log P(y^{(i)}\mid x^{(i)})\\
&=\sum^{m}_{i=1}\left(- \mathcal{L}(\hat{y}^{(i)},y^{(i)})\right)\\
&=-\sum^{m}_{i=1} \mathcal{L}(\hat{y}^{(i)},y^{(i)})
\end{aligned} \]
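As a quick sanity check of this identity, here is a small Python sketch with made-up predictions and labels (the values in `y_hats` and `ys` are illustrative assumptions, not data from the post). It compares the log of the product of per-example probabilities with the negated sum of per-example losses.

```python
import math


def loss(y_hat: float, y: int) -> float:
    """Single-example loss: -(y log(y_hat) + (1 - y) log(1 - y_hat))."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))


# Hypothetical predictions and labels for m = 3 independent examples.
y_hats = [0.9, 0.2, 0.7]
ys = [1, 0, 1]

# Left-hand side: log of the product of per-example probabilities.
likelihood = math.prod(
    yh ** y * (1.0 - yh) ** (1 - y) for yh, y in zip(y_hats, ys)
)
log_likelihood = math.log(likelihood)

# Right-hand side: negated sum of per-example losses.
neg_total_loss = -sum(loss(yh, y) for yh, y in zip(y_hats, ys))

assert math.isclose(log_likelihood, neg_total_loss)
```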

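Maximizing the log-likelihood is therefore the same as minimizing the sum of the per-example losses. Scaling that sum by 1/m, which does not change the minimizer, gives the cost function: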

\[\text{Cost:}\quad J(w,b)=\frac{1}{m}\sum^{m}_{i=1} \mathcal{L}(\hat{y}^{(i)},y^{(i)}) \]
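A minimal pure-Python sketch of this cost function is shown below. It is an illustration under stated assumptions rather than a reference implementation: the function name `cost`, the list-of-lists layout for `X`, and the toy data are all chosen here for the example.

```python
import math


def sigmoid(z: float) -> float:
    """sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def cost(w: list[float], b: float, X: list[list[float]], Y: list[int]) -> float:
    """J(w, b) = (1/m) * sum of the per-example losses L(y_hat, y)."""
    m = len(X)
    total = 0.0
    for x, y in zip(X, Y):
        z = sum(wj * xj for wj, xj in zip(w, x)) + b   # w^T x + b
        y_hat = sigmoid(z)                             # P(y = 1 | x)
        total += -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))
    return total / m


# Toy data (hypothetical): three examples with two features each.
X = [[1.0, 2.0], [-1.0, 0.5], [0.0, -1.5]]
Y = [1, 0, 1]

# With all-zero parameters every prediction is 0.5, so J = log(2).
assert math.isclose(cost([0.0, 0.0], 0.0, X, Y), math.log(2.0))
```

Note that for very large |z| the value of `y_hat` can round to exactly 0 or 1, at which point `math.log` raises an error; practical implementations usually work with a numerically stable form of the loss instead.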

posted @ 2022-03-29 21:49  Link_kingdom