LEARNING TO NAVIGATE IN COMPLEX ENVIRONMENTS

The task is navigation within a map: the agent must travel from a starting point to a specified goal location.

The approach combines supervised learning, reinforcement learning (RL), and an LSTM.

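Supervised learning is used as an auxiliary training signal to speed up RL training, and the LSTM serves as memory. The experiments show that the auxiliary depth prediction task (reconstructing depth from the visual input) is especially helpful. The method works on both fixed maps and random maps.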
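
To make the "supervised learning as an auxiliary signal" idea concrete, here is a minimal sketch of how an auxiliary depth loss could be added to an RL loss. It is not the paper's implementation. The names `depth_cross_entropy`, `total_loss`, and the weight `beta` are hypothetical, and treating depth as classification over a few quantized bins is an assumption made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def depth_cross_entropy(logits_per_pixel, target_bins):
    """Mean cross-entropy between predicted depth-bin logits and true bins.

    logits_per_pixel: one list of logits per pixel (one logit per depth bin).
    target_bins: the index of the true depth bin for each pixel.
    Assumption: depth is quantized into bins and predicted as a class.
    """
    losses = []
    for logits, target in zip(logits_per_pixel, target_bins):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


def total_loss(rl_loss, depth_loss, beta=1.0):
    """Combine the RL loss with the auxiliary supervised loss.

    beta (hypothetical name) controls how strongly the auxiliary
    task shapes the shared representation.
    """
    return rl_loss + beta * depth_loss


# Two pixels, four depth bins each; the second pixel is predicted confidently.
logits = [[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]
targets = [2, 0]
aux = depth_cross_entropy(logits, targets)
loss = total_loss(rl_loss=0.5, depth_loss=aux, beta=0.25)
```

Because both losses backpropagate through the same network, the depth target gives a dense gradient at every step, while the navigation reward is sparse. That is one plausible reading of why the auxiliary task speeds up RL training.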