Post archive for May 26, 2019 - JASONlee3

Reinforcement Learning_PolicyGradient (Policy Gradient)_Code Walkthrough

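Summary: Using policy gradient to solve a problem with a discrete action space. I. Import the packages and define the hyperparameters. II. Constructor of the PolicyGradient agent: 1. set the dimensions of the problem's state space and action space; 2. the storage structure for the sampled sequences; 3. call the function that creates the neural network used to approximate the policy function, tensorflow's se...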

posted @ 2019-05-26 16:37 JASONlee3 Views (2201) Comments (0) Recommendations (0)
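
The summary of the policy gradient post mentions storing a sampled sequence and training a policy network over a discrete action space. As a rough illustration of the quantity such an agent typically minimizes, here is a minimal REINFORCE-style sketch in plain Python. It is not the post's TensorFlow code: the function names (`discounted_returns`, `softmax`, `reinforce_loss`), the discount factor `gamma`, and the use of raw logits as the policy output are all assumptions made for this example.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_loss(logits_per_step, actions, rewards, gamma=0.99):
    """Mean of -log pi(a_t | s_t) * G_t over one sampled episode.

    logits_per_step: hypothetical policy-network outputs, one list per step.
    actions: index of the discrete action actually taken at each step.
    """
    returns = discounted_returns(rewards, gamma)
    loss = 0.0
    for logits, action, g in zip(logits_per_step, actions, returns):
        loss -= math.log(softmax(logits)[action]) * g
    return loss / len(actions)
```

In a real agent the logits would come from the network and the loss would be handed to an optimizer; here they are plain lists so the arithmetic can be checked by hand.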

leetcode_1053. Previous Permutation With One Swap

Summary: 1053. Previous Permutation With One Swap https://leetcode.com/problems/previous-permutation-with-one-swap/ Problem: Given an array A of positive integers (n ...

posted @ 2019-05-26 14:15 JASONlee3 Views (309) Comments (0) Recommendations (0)
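
The summary for problem 1053 is cut off, so the following is only a sketch based on the public LeetCode statement (find the lexicographically largest permutation smaller than the input that is reachable with exactly one swap, or return the input unchanged if none exists). It uses a common greedy approach and is not necessarily the approach taken in the post; the function name `prev_perm_one_swap` is hypothetical.

```python
def prev_perm_one_swap(arr):
    """Greedy sketch for LeetCode 1053; returns a new list."""
    a = list(arr)
    n = len(a)

    # Find the rightmost position i where a[i] > a[i + 1].
    i = n - 2
    while i >= 0 and a[i] <= a[i + 1]:
        i -= 1
    if i < 0:
        # Already the smallest permutation: nothing to swap.
        return a

    # To the right of i, pick the largest value strictly below a[i];
    # among equal values, prefer the leftmost copy.
    j = n - 1
    while a[j] >= a[i] or a[j] == a[j - 1]:
        j -= 1

    a[i], a[j] = a[j], a[i]
    return a
```

The suffix after position `i` is non-decreasing, which is why scanning `j` from the right finds the largest candidate first.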

leetcode_1052. Grumpy Bookstore Owner

Summary: 1052. Grumpy Bookstore Owner https://leetcode.com/problems/grumpy-bookstore-owner/ Problem: at each moment i, customer[i] customers enter the store, and the owner's mood at moment i is grumpy[i]; if grumpy[i]==1, the owner is grumpy...

posted @ 2019-05-26 14:08 JASONlee3 Views (212) Comments (0) Recommendations (0)
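
The summary for problem 1052 is truncated before it reaches the rest of the statement. Going by the public LeetCode statement, the owner can suppress the grumpy mood for one window of consecutive minutes, and the goal is to maximize the number of satisfied customers. A sliding-window sketch under that assumption is shown below; the function name `max_satisfied` and the parameter name `minutes` are chosen for this example and may differ from the post's code.

```python
def max_satisfied(customers, grumpy, minutes):
    """Sliding-window sketch for LeetCode 1052."""
    # Customers who are satisfied regardless of the technique.
    base = sum(c for c, g in zip(customers, grumpy) if g == 0)

    extra = 0  # grumpy-minute customers inside the current window
    best = 0   # best extra seen over all windows
    for i, (c, g) in enumerate(zip(customers, grumpy)):
        if g == 1:
            extra += c
        # Drop the minute that just slid out of the window.
        if i >= minutes and grumpy[i - minutes] == 1:
            extra -= customers[i - minutes]
        best = max(best, extra)

    return base + best
```

Each minute is added to and removed from the window at most once, so the sketch runs in linear time.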
