LZ_Jaja

Paper: Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge (Link: https://arxiv.org/abs/1609.06647)

A Related Paper: Show and Tell: A Neural Image Caption Generator (Link: https://arxiv.org/abs/1411.4555)

Main Points (Improvements over the CVPR 2015 Model):

  1. Image Model Improvement: GoogLeNet (22 layers) -> the Batch Normalization model.
  2. Image Model Fine-Tuning: fine-tuning the image model must be carried out only after the LSTM parameters have settled on a good language model; otherwise the noisy early gradients from the untrained LSTM can corrupt the pretrained image model.
  3. Scheduled Sampling: a fully guided scheme that always feeds the true previous word -> a less guided scheme that mostly feeds the model-generated word instead.
  4. Ensembling
  5. Beam Size Reduction: the best beam size turned out to be small: 3.
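
Points 3 and 5 can be illustrated with a minimal Python sketch. It is not the paper's implementation: `choose_previous_word`, `beam_search`, the `sampling_prob` parameter, the `step_fn` callback, and the toy word table are all hypothetical names and data introduced here for illustration only.

```python
import math
import random


def choose_previous_word(true_word, model_word, sampling_prob, rng=random):
    """Scheduled sampling (sketch): with probability sampling_prob feed the
    model-generated word as the previous word, otherwise the true word.
    sampling_prob is an assumed knob; 0.0 is the fully guided scheme."""
    return model_word if rng.random() < sampling_prob else true_word


def beam_search(step_fn, start_token, end_token, beam_size=3, max_len=20):
    """Keep the beam_size best partial captions at every decoding step.

    step_fn(prefix) is a hypothetical callback returning a dict
    {next_word: probability} for the words generated so far.
    """
    beams = [(0.0, [start_token])]  # (log-probability, word sequence)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0.0:
                    candidates.append((score + math.log(prob), seq + [token]))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            # Captions that emitted the end token leave the beam.
            if seq[-1] == end_token:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished captions if max_len was reached
    return max(finished, key=lambda c: c[0])[1]


# Toy next-word table standing in for the LSTM (made-up numbers).
TOY_MODEL = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.55, "cat": 0.45},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
}


def toy_step(prefix):
    """Look up the next-word distribution from the last generated word."""
    return TOY_MODEL[prefix[-1]]


greedy_caption = beam_search(toy_step, "<s>", "</s>", beam_size=1)
beam3_caption = beam_search(toy_step, "<s>", "</s>", beam_size=3)
```

On this toy table, a beam of size 1 commits to "a" and never sees the more probable caption starting with "the", while a beam of size 3 recovers it. The toy only shows the mechanics of the search; it does not reproduce the paper's finding about which beam size works best.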
Posted on 2018-08-14 18:21 by LZ_Jaja