Loss Function of Linear Classifier and Optimization
Multiclass SVM Loss:
Given an example \((x_i, y_i)\), where \(x_i\) is the image and \(y_i\) is the (integer) label, and using the shorthand \(s = f(x_i, W)\) for the vector of class scores, then:
the SVM loss has the form: \(L_i = \sum\limits_{j \neq y_i} \max(0, s_j - s_{y_i} + 1)\),
code format:
import numpy as np

def L_i_vectorized(x, y, W):
    # compute all class scores with a single matrix-vector product
    scores = W.dot(x)
    # hinge margins against the correct-class score (delta = 1)
    margins = np.maximum(0, scores - scores[y] + 1)
    # the correct class should not contribute to the loss
    margins[y] = 0
    loss_i = np.sum(margins)
    return loss_i
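A minimal usage sketch of the function above; the input vector, label, and weight values here are made up for illustration:

import numpy as np

x = np.array([1.0, 2.0, 0.5])      # one example with 3 features (hypothetical values)
y = 1                              # index of the correct class
W = np.random.randn(3, 3) * 0.01   # small random weights: 3 classes x 3 features
print(L_i_vectorized(x, y, W))     # per-example multiclass SVM loss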
Adding Regularization:
L = \(\frac{1}{N}\sum\limits_{i=1}^{N}\sum\limits_{j \neq y_i}\max(0, f(x_i; W)_j - f(x_i; W)_{y_i} + 1) + \lambda R(W)\), where the L2 regularization term is \(R(W) = \sum\limits_k\sum\limits_l W_{k,l}^2\).
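A rough sketch of computing this full regularized loss over a dataset; the array shapes and the value of the regularization strength are assumptions for illustration, not part of the notes:

import numpy as np

def full_loss(X, y, W, lam):
    # X: (N, D) data matrix, y: (N,) integer labels, W: (C, D) weights
    scores = X.dot(W.T)                                 # (N, C) class scores
    correct = scores[np.arange(len(y)), y][:, None]     # (N, 1) correct-class scores
    margins = np.maximum(0, scores - correct + 1)       # hinge margins, delta = 1
    margins[np.arange(len(y)), y] = 0                   # correct class does not contribute
    data_loss = margins.sum() / len(y)                  # average data loss over N examples
    reg_loss = lam * np.sum(W * W)                      # L2 regularization term lambda * R(W)
    return data_loss + reg_loss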
Example: how regularization expresses a preference between weights:
x = [1 1 1 1]
W1 = [1 0 0 0]
W2 = [0.25 0.25 0.25 0.25]
Without the regularization term, both weight vectors produce the same score (W1·x = W2·x = 1), even though W2 is intuitively better because it spreads its influence across all input dimensions rather than relying on a single one. Once the L2 regularization term is added, the two losses differ (the penalty is 1.0 for W1 but only 0.25 for W2), so the loss can distinguish them and prefers W2.
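A quick numerical check of this example, using the L2 penalty \(\sum_k\sum_l W_{k,l}^2\):

import numpy as np

x  = np.array([1.0, 1.0, 1.0, 1.0])
W1 = np.array([1.0, 0.0, 0.0, 0.0])
W2 = np.array([0.25, 0.25, 0.25, 0.25])

print(W1.dot(x), W2.dot(x))           # both scores are 1.0
print(np.sum(W1**2), np.sum(W2**2))   # L2 penalties: 1.0 vs 0.25, so W2 is preferred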
Other Regularization Methods: Elastic net (L1 + L2), max norm regularization, dropout.
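For reference, a sketch of how the L1, L2, and elastic net penalty terms could be computed for a weight matrix; the mixing coefficient below is a hypothetical choice, not a value from the notes (max norm and dropout are not penalty terms, so they are not shown):

import numpy as np

def l2_penalty(W):
    return np.sum(W * W)          # sum of squared weights

def l1_penalty(W):
    return np.sum(np.abs(W))      # sum of absolute weights

def elastic_net_penalty(W, beta=0.5):
    # weighted combination of L1 and L2 (beta is a hypothetical mixing coefficient)
    return beta * l1_penalty(W) + (1 - beta) * l2_penalty(W)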