A major pitfall in latexdiff: character encoding
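I have recently been writing a paper in LaTeX and needed a tracked-changes view, so I used the latexdiff command to generate a marked-up PDF. This should be a simple and convenient approach, but it hides a character-encoding problem that can cause unexpected and annoying trouble the first time you use it. For example, the generated .tex file was not UTF-8 but UTF-16, so LaTeX could not compile it (my setup is Windows 10 + TeX Live 2019 + VS Code). After I converted it to UTF-8 with Notepad, strange characters appeared in the file and produced errors like these: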
Package inputenc: Unicode character 贸 (U+8D38)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 铆 (U+94C6)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 啪 (U+556A)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 谩 (U+8C29)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 酶 (U+9176)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 氓 (U+6C13)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 鈥 (U+9225)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 擳 (U+64F3)
(inputenc) not set up for use with LaTeX.
Package inputenc: Unicode character 枚 (U+679A)
(inputenc) not set up for use with LaTeX.
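The garbled characters in this log follow a recognisable pattern. One likely explanation, not verified on the original files, is that UTF-8 bytes were at some point read as GBK (the default legacy code page on Chinese Windows): the two UTF-8 bytes of an accented Latin letter happen to form one valid GBK character. The Python sketch below reverses that misreading for the single characters in the log; `undo_mojibake` is a hypothetical helper name. The entry 鈥 is different: it is consistent with the first two bytes of a three-byte UTF-8 punctuation mark such as a curly apostrophe, so it cannot be reversed on its own and is left out.

```python
def undo_mojibake(garbled: str) -> str:
    """Re-encode the text as GBK, then decode those bytes as UTF-8.

    Assumption: the text was produced by reading UTF-8 bytes as GBK.
    Raises UnicodeError if the input does not fit that pattern.
    """
    return garbled.encode("gbk").decode("utf-8")


# Characters copied from the error log above.
for ch in "贸铆啪谩酶氓枚":
    print(ch, "->", undo_mojibake(ch))
```

If the assumption holds, each Chinese character maps back to an accented or Nordic letter such as ó, í, ž, á, ø, å or ö, which is the kind of letter that appears in author names.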
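After careful checking, all of these turned out to come from author names in the bibliography that contain Greek letters, Nordic letters and other unusual letters, plus special character combinations such as "-T" and "r's". The LaTeX source is very long, so the errors cannot be found by eye, and VS Code does not mark the garbled spots in red. They can still be tracked down as follows: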
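1. Convert the .tex file to UTF-8 and compile it as it is, so that the compiler reports the errors.
2. Look at the error messages in VS Code, which list every garbled character. Copy them out one at a time, as shown above, search for each one in the .tex file and fix it, until all the errors are gone.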
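The two steps above can also be scripted instead of searching one character at a time. The following is a minimal Python sketch, assuming the diff file either carries a byte-order mark (as UTF-16 files written on Windows normally do) or is already UTF-8; `decode_tex`, `find_non_ascii` and `scan_file` are hypothetical helper names, not part of latexdiff.

```python
import codecs
from pathlib import Path


def decode_tex(raw: bytes) -> str:
    """Decode .tex bytes, honouring a UTF-16 or UTF-8 byte-order mark."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")      # this codec reads and drops the BOM
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig")   # drops the UTF-8 BOM
    return raw.decode("utf-8")           # assumption: no BOM means UTF-8


def find_non_ascii(text: str) -> list:
    """Return (line, column, char, code point) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, f"U+{ord(ch):04X}"))
    return hits


def scan_file(path: str) -> list:
    """Rewrite the file at `path` as UTF-8 without a BOM and list suspects."""
    text = decode_tex(Path(path).read_bytes())
    Path(path).write_bytes(text.encode("utf-8"))
    return find_non_ascii(text)
```

Calling `scan_file("diff.tex")` (the file name is a placeholder) overwrites the file in place, so keep a copy first. It returns every position that holds a non-ASCII character. In a document that legitimately contains non-ASCII text the list will include those characters too, so it is most useful for a source that is otherwise plain ASCII.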
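Apart from garbled and special characters, markup of changed figures does not seem to work either: it fails to compile, so I simply removed it. A crude fix, but that is what I did.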
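Both problems might be avoidable at the point where latexdiff is run, although this was not tested on the setup described here. Two assumptions go into the sketch below. First, that the UTF-16 file came from shell redirection rather than from latexdiff itself: in Windows PowerShell, `>` writes UTF-16LE by default, so the sketch captures latexdiff's output as raw bytes and writes them to disk unchanged. Second, that the installed latexdiff supports the `--encoding` and `--graphics-markup` options (check `latexdiff --help`). The function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def latexdiff_command(old: str, new: str) -> list:
    """Build a latexdiff call that reads UTF-8 and skips figure markup."""
    return [
        "latexdiff",
        "--encoding=utf8",          # treat both inputs as UTF-8
        "--graphics-markup=none",   # do not mark up \includegraphics changes
        old,
        new,
    ]


def run_latexdiff(old: str, new: str, out: str) -> None:
    """Run latexdiff and save its stdout byte for byte, with no shell involved."""
    result = subprocess.run(
        latexdiff_command(old, new), capture_output=True, check=True
    )
    Path(out).write_bytes(result.stdout)
```

For example, `run_latexdiff("old.tex", "new.tex", "diff.tex")` would write the marked-up source to `diff.tex`. If the figures still break compilation, removing their markup by hand as described above remains the fallback.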