These two blog posts explain why hyperparameters in machine learning are hard to tune, and what workable solutions exist. Worth a read; I plan to find an example and implement it next week (after the 2021 Dragon Boat Festival), once things are less busy.
Key points
- We often optimize a linear combination of losses, hoping to reduce both \(L_1\) and \(L_0\) simultaneously, but this linear combination is actually precarious and treacherous.
- The authors show that when the Pareto front is concave, linear scalarization recovers only one of the two endpoint solutions (a single loss dominates); the linear combination behaves as intended only when the Pareto front is convex.
- In practice, we usually cannot determine the shape of the Pareto front in advance.
- We can instead reformulate the weighted-sum loss as a constrained optimization problem, e.g., minimizing \(L_1\) subject to \(L_0 \leq \epsilon\).
- From this reformulation, the authors give three possible solutions.
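The concave-front failure mode above can be sketched numerically. This is a toy illustration of my own, not code from the blog posts: two hypothetical Pareto fronts in loss space, one concave (a quarter circle bulging away from the origin) and one convex, are sampled as attainable \((L_0, L_1)\) pairs, and linear scalarization \(w L_0 + (1-w) L_1\) is minimized over each for a sweep of weights.

```python
import numpy as np

# Toy fronts (my own construction, not from the blog posts).
t = np.linspace(0.0, 1.0, 201)

# Concave front: quarter circle, bulges away from the origin.
concave = np.stack([np.cos(t * np.pi / 2), np.sin(t * np.pi / 2)], axis=1)

# Convex front: L1 = (1 - L0)^2.
convex = np.stack([t, (1.0 - t) ** 2], axis=1)

def scalarize(front, weights):
    """For each weight w, return the index of the front point
    minimizing the linear combination w*L0 + (1-w)*L1."""
    picks = []
    for w in weights:
        scores = w * front[:, 0] + (1.0 - w) * front[:, 1]
        picks.append(int(np.argmin(scores)))
    return picks

weights = np.linspace(0.05, 0.95, 19)
concave_picks = set(scalarize(concave, weights))
convex_picks = set(scalarize(convex, weights))

# On the concave front, every weight collapses to one of the two
# endpoints; on the convex front, sweeping the weight traces out many
# intermediate trade-offs.
print(sorted(concave_picks))  # only the two endpoint indices
print(len(convex_picks))      # many distinct front points
```

The reason is that on a concave front the scalarized objective is concave along the front, so its minimum always sits at an endpoint, no matter how carefully the weight is tuned; this is exactly why the constrained reformulation \(L_0 \leq \epsilon\) is attractive, since it can target any point on the front.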
References