bilingual evaluation understudy

BLEU is designed to approximate human judgement at a corpus level, and performs badly if used to evaluate the quality of individual sentences.

https://en.wikipedia.org/wiki/BLEU

To produce a score for the whole corpus, the modified precision scores for the segments are combined using the geometric mean, multiplied by a brevity penalty to prevent very short candidates from receiving too high a score.
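As a rough illustration of that combination, here is a minimal sketch in Python. It assumes pre-tokenized input, exactly one reference per candidate, uniform weights over 1- to 4-grams, and no smoothing. The names `ngrams` and `corpus_bleu` are made up for this example and do not come from any particular library. Clipped n-gram counts are pooled over the whole corpus before the precisions are taken, which is the usual corpus-level formulation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Illustrative corpus-level BLEU (one reference per candidate, no smoothing)."""
    clipped = [0] * max_n  # clipped n-gram matches, pooled over the corpus
    total = [0] * max_n    # candidate n-gram counts, pooled over the corpus
    cand_len = ref_len = 0

    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Modified precision: each candidate n-gram is credited at most
            # as many times as it occurs in the reference.
            clipped[n - 1] += sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
            total[n - 1] += sum(cand_counts.values())

    # Without smoothing, a single zero precision makes the geometric mean zero.
    if cand_len == 0 or min(clipped) == 0:
        return 0.0

    # Geometric mean of the modified precisions, computed in log space.
    log_mean = sum(math.log(c / t) for c, t in zip(clipped, total)) / max_n

    # Brevity penalty: 1 if the candidates are longer than the references,
    # otherwise exp(1 - r/c), which shrinks the score for short output.
    bp = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return bp * math.exp(log_mean)


reference = "the cat sat on the mat".split()
full_score = corpus_bleu([reference], [reference])
short_score = corpus_bleu(["the cat sat on".split()], [reference])
```

In the second call every n-gram of the short candidate appears in the reference, so all modified precisions are perfect and only the brevity penalty lowers the score. That is the case the penalty exists for.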

posted @ 2017-09-19 18:36 papering
