Summary:
Problem with Large Weights: Large weights in a neural network are a sign of overfitting. A network with large weights has very likely learned the stati... Read more
Summary:
Paper [1]: In a white-box neural network attack, adversaries have full access to the model. Using gradient descent to go back and update the input so that r... Read more
Summary:
In this post, we are going to compare two types of machine learning models, generative and discriminative, whose underlying ideas are... Read more
Summary:
In the previous posts, we used different techniques to build and keep updating State-Action tables. But it is impossible to do the same thing when the... Read more
Summary:
SARSA: The SARSA algorithm also estimates Action-Value functions rather than State-Value functions. The difference between SARSA and Monte Carlo is: SARSA do... Read more
Summary:
In Monte Carlo Learning, we obtained an estimate of the value function: Gt is the episode return from time t, which can be calculated by: Please recall,... Read more
Summary:
Problem of State-Value Function: Similar to Policy Iteration in Model-Based Learning, Generalized Policy Iteration will be used in Monte Carlo Control. ... Read more
Summary:
Model-Based and Model-Free: In the previous several posts, we mainly talked about Model-Based Reinforcement Learning. The biggest assumption for Model-... Read more
Summary:
Value-Iteration Algorithm: For each iteration k+1: a. calculate the optimal state-value function for all s∈S; b. until the algorithm converges. end up wi... Read more
Summary:
From the last post, we know how to evaluate a policy. But that's not enough, because the purpose of policy evaluation is to improve policies so that f... Read more