How to switch from computing BLEU with nltk to computing BLEU, CIDEr, METEOR, ROUGE, and SPICE with CocoEval

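Looking only at how the scores are computed (BLEU in nltk; CIDEr and the other metrics in Coco), nltk and Coco expect differently structured inputs.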

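nltk only requires that references and hypotheses have the same length. references is a three-level nested list (one entry per image, each entry holding one or more reference captions, each caption a list of tokens), and hypotheses is a two-level list (one token list per image).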
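
As a minimal sketch of that nesting, the check below uses a hypothetical helper, `validate_bleu_inputs`, which is not part of nltk. The token ids in the sample data are made up for illustration.

```python
def validate_bleu_inputs(references, hypotheses):
    """Check the nesting described above for nltk's corpus_bleu (hypothetical helper)."""
    if len(references) != len(hypotheses):
        raise ValueError("need one group of references per hypothesis")
    for refs, hyp in zip(references, hypotheses):
        # A hypothesis is a flat list of tokens (word ids or strings).
        if not all(isinstance(tok, (int, str)) for tok in hyp):
            raise ValueError("each hypothesis must be a flat token list")
        # A reference group is a non-empty list of token lists.
        if not refs or not all(isinstance(r, list) for r in refs):
            raise ValueError("each reference group must be a list of token lists")
    return True


references = [
    [[5, 8, 3], [5, 9, 3]],  # image 0: two reference captions
    [[7, 2, 3]],             # image 1: one reference caption
]
hypotheses = [
    [5, 8, 3],               # image 0: the single predicted caption
    [7, 4, 3],               # image 1
]
ok = validate_bleu_inputs(references, hypotheses)
```

Passing the two arguments in the wrong order raises `ValueError`, which makes it a cheap guard to run before scoring.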

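Coco is different. It also requires the same number of references and hypotheses, but both must be dictionaries keyed by an ID, with each caption stored as a plain string: references = { "id": [ "text 1", "text 2", ... ] } and hypotheses = { "id": [ "text" ] }.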
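
The dictionary layout can be sketched as follows, assuming the captions are already plain strings. `to_coco_dicts` is a hypothetical helper name, and the sample captions are invented.

```python
def to_coco_dicts(ref_texts, hyp_texts):
    """Build id-keyed dicts: several reference strings, one hypothesis string per id."""
    if len(ref_texts) != len(hyp_texts):
        raise ValueError("need one group of references per hypothesis")
    refs = {str(i): list(group) for i, group in enumerate(ref_texts)}
    hyps = {str(i): [text] for i, text in enumerate(hyp_texts)}
    return refs, hyps


refs, hyps = to_coco_dicts(
    [["a dog runs on grass", "a dog is running"], ["two cats sleep"]],
    ["a dog runs", "two cats are sleeping"],
)
print(refs["0"])  # the reference captions for id "0"
print(hyps["0"])  # a list holding the single predicted caption for id "0"
```

Using the same string key in both dictionaries is what ties each hypothesis to its references.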

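So the model output has to be converted back into real strings, which is where the word map built earlier comes in. The code for each case is given below.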
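
The decoding step on its own might look like the sketch below. The tiny `word_map` is invented for illustration, `decode_caption` is a hypothetical helper, and dropping all four special tokens in one pass is an assumption rather than exactly what the code in this post does.

```python
SPECIAL_TOKENS = {"<start>", "<end>", "<pad>", "<unk>"}


def decode_caption(indices, rev_word_map, skip=SPECIAL_TOKENS):
    """Turn a list of word ids into one space-separated string, dropping special tokens."""
    words = (rev_word_map[i] for i in indices)
    return " ".join(w for w in words if w not in skip)


# Invented word map (word -> index); a real one would be loaded from JSON.
word_map = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3, "a": 4, "dog": 5, "runs": 6}
rev_word_map = {ix: word for word, ix in word_map.items()}  # index -> word

caption = decode_caption([1, 4, 5, 6, 2, 0, 0], rev_word_map)
print(caption)
```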

For nltk:

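            # References
            allcaps = allcaps[sort_ind]  # ground-truth captions for this batch, reordered to match the model output
            # Strip special tokens and store the result in the overall references list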
            for j in range(allcaps.shape[0]):
                img_caps = allcaps[j].tolist()
                img_captions = list(
                    map(lambda c: [w for w in c if w not in {word_map['<start>'], word_map['<pad>']}],
                        img_caps))  # remove <start> and pads
                references.append(img_captions)

            # Hypotheses
            # Predictions produced by the model for this batch (the hypotheses), post-processed below
            _, preds = torch.max(scores_copy, dim=2)
            preds = preds.tolist()
            temp_preds = list()
            for j, p in enumerate(preds):
                temp_preds.append(p[:decode_lengths[j]])  # remove pads
            preds = temp_preds
            hypotheses.extend(preds)
            assert len(references) == len(hypotheses)

After that, BLEU can be computed with nltk:

from nltk.translate.bleu_score import corpus_bleu        
# Calculate BLEU-4 scores
bleu4 = corpus_bleu(references, hypotheses)

Next, the references and hypotheses built above need only light processing before they can be passed straight to the Coco scorers:

        # Load the word map (word2ix); this needs `import json` at the top of the file
        with open(args.word_map, 'r') as j:
            word_map = json.load(j)
        rev_word_map = {v: k for k, v in word_map.items()}  # ix2word
        
        hyp_list = hypotheses
        ref_list = references
        
        hyps = {}
        refs = {}        
        
        # Hypotheses: one decoded caption string per sample, keyed by its index as a string
        for iidx, hypid in enumerate(hyp_list):
            words = [rev_word_map[ind] for ind in hypid]
            temp = ''
            wordlist = []
            for word in words:
                if word != '<end>' and word != '<unk>':
                    temp += word + ' '
            wordlist.append(temp)
            hyps[str(iidx)] = wordlist
        
        # References: every reference caption of a sample decoded to a string, under the same key
        for iidx, refid in enumerate(ref_list):
            wordlist = []     
            for ref in refid:
                temp = ''
                words = [rev_word_map[ind] for ind in ref]
                for word in words:
                    if word != '<end>' and word != '<unk>':
                        temp += word + ' '
                wordlist.append(temp)
            refs[str(iidx)] = wordlist
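        # cider, bleu, rouge, spice and meteor are scoring helpers defined outside this
        # snippet; each takes the reference dict first and the hypothesis dict second
        tem_cider = cider(refs, hyps)
        tem_bleu = bleu(refs, hyps)
        tem_rouge = rouge(refs, hyps)
        tem_spice = spice(refs, hyps)
        tem_meteor = meteor(refs, hyps)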
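
The scoring helpers themselves are not shown in this post. One possible way to drive several scorers through a single loop is sketched below, assuming each scorer exposes `compute_score(refs, hyps)` and returns a `(corpus_score, per_caption_scores)` pair, as the scorers in the `pycocoevalcap` package do. `run_scorers` and the toy `ExactMatch` class are hypothetical; `ExactMatch` is not a real metric and is only there so the sketch runs without extra packages.

```python
def run_scorers(scorers, refs, hyps):
    """Call compute_score(refs, hyps) on each named scorer and keep the corpus-level score."""
    if refs.keys() != hyps.keys():
        raise ValueError("refs and hyps must share the same ids")
    results = {}
    for name, scorer in scorers.items():
        corpus_score, _per_caption = scorer.compute_score(refs, hyps)
        results[name] = corpus_score
    return results


class ExactMatch:
    """Toy stand-in scorer: share of hypotheses identical to one of their references."""

    def compute_score(self, refs, hyps):
        per_caption = [float(hyps[k][0] in refs[k]) for k in refs]
        return sum(per_caption) / len(per_caption), per_caption


refs = {"0": ["a dog runs", "a dog is running"], "1": ["two cats sleep"]}
hyps = {"0": ["a dog runs"], "1": ["two cats nap"]}
results = run_scorers({"exact": ExactMatch()}, refs, hyps)
print(results)

# With pycocoevalcap installed, real scorers could be passed in instead, e.g.:
# from pycocoevalcap.bleu.bleu import Bleu
# from pycocoevalcap.cider.cider import Cider
# results = run_scorers({"BLEU": Bleu(4), "CIDEr": Cider()}, refs, hyps)
```

In `pycocoevalcap`, `Bleu(4)` reports a list of four scores (BLEU-1 to BLEU-4) as its corpus-level result, and the METEOR and SPICE scorers additionally need a Java runtime.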

References:

https://github.com/wangleihitcs/CaptionMetrics

https://blog.csdn.net/xiyou__/article/details/121494013

Resources that did not help:

https://github.com/Maluuba/nlg-eval

posted @ 2022-04-13 17:18 TheBigSeven
