Installing Kaldi

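Kaldi is a speech recognition toolkit written in C++ and released under the Apache License v2.0. It is one of the most popular ASR toolkits today. This post walks through installing Kaldi on Ubuntu 18.04 LTS.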

Kaldi

First, following the instructions on the official site, clone the Kaldi project to your machine:

~$ git clone https://github.com/kaldi-asr/kaldi.git kaldi-trunk --origin golden

Enter kaldi-trunk:

~$ cd kaldi-trunk
~/kaldi-trunk$

Read INSTALL:

~/kaldi-trunk$ cat INSTALL
This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[for native Windows install, see windows/INSTALL]

(1)
go to tools/  and follow INSTALL instructions there.

(2)
go to src/ and follow INSTALL instructions there.

So the order is: follow the instructions in the tools directory first, then those in the src directory.

Enter the tools directory and read its INSTALL:

~/kaldi-trunk$ cd tools
~/kaldi-trunk/tools$ cat INSTALL
To check the prerequisites for Kaldi, first run

  extras/check_dependencies.sh

and see if there are any system-level installations you need to do. Check the
output carefully. There are some things that will make your life a lot easier
if you fix them at this stage. If your system default C++ compiler is not
supported, you can do the check with another compiler by setting the CXX
environment variable, e.g.

  CXX=g++-4.8 extras/check_dependencies.sh

Then run

  make

which by default will install ATLAS headers, OpenFst, SCTK and sph2pipe.
OpenFst requires a relatively recent C++ compiler with C++11 support, e.g.
g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3. If your system default
compiler does not have adequate support for C++11, you can specify a C++11
compliant compiler as a command argument, e.g.

  make CXX=g++-4.8

If you have multiple CPUs and want to speed things up, you can do a parallel
build by supplying the "-j" option to make, e.g. to use 4 CPUs

  make -j 4

In extras/, there are also various scripts to install extra bits and pieces that
are used by individual example scripts.  If an example script needs you to run
one of those scripts, it will tell you what to do.

So the first step is to run the check_dependencies.sh script in the extras directory to check whether all dependencies are installed.

Enter extras and run check_dependencies.sh:

~/kaldi-trunk/tools$ cd extras/
~/kaldi-trunk/tools/extras$ ./check_dependencies.sh
./check_dependencies.sh: all OK.

If check_dependencies.sh reports that any library is missing, fix it as the message instructs and run the script again. Repeat until it prints "./check_dependencies.sh: all OK." as shown above.
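
The fix-and-re-run loop above can also be checked mechanically. The following is a minimal sketch, assuming it is run from ~/kaldi-trunk/tools. The helper name deps_ok is made up for illustration, and it relies only on the "all OK." message shown in the transcript above.

```shell
# deps_ok TEXT  (hypothetical helper, not part of Kaldi)
# Succeeds only if TEXT contains the "all OK." message that
# check_dependencies.sh prints when nothing is missing.
deps_ok() {
  printf '%s\n' "$1" | grep -q 'all OK\.'
}

# Run the real check only if the script exists under the current directory,
# so the snippet does nothing when executed elsewhere.
if [ -x extras/check_dependencies.sh ]; then
  output=$(extras/check_dependencies.sh 2>&1) || true
  if deps_ok "$output"; then
    echo "dependencies satisfied, safe to run make"
  else
    echo "fix the items below, then run the check again:"
    printf '%s\n' "$output"
  fi
fi
```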

Then go back up one level and build:

~/kaldi-trunk/tools/extras$ cd ..
~/kaldi-trunk/tools$ make

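On a virtual machine, use plain make rather than make -j 4. A parallel build can easily run out of memory and fail. The same applies to the build in the src directory later.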
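
If you do want a parallel build, one way to stay within memory is to derive the job count from both the core count and the installed RAM. The sketch below is only an illustration: pick_jobs is a made-up helper, and the limit of one job per 2 GB of RAM is an assumed rule of thumb, not a figure from the Kaldi documentation.

```shell
# pick_jobs CORES MEM_GB  (hypothetical helper)
# Prints a job count for "make -j": at most one job per core and, as an
# assumed rule of thumb, at most one job per 2 GB of RAM, never below 1.
pick_jobs() {
  pj_cores=$1
  pj_by_mem=$(( $2 / 2 ))
  if [ "$pj_cores" -lt "$pj_by_mem" ]; then
    pj_jobs=$pj_cores
  else
    pj_jobs=$pj_by_mem
  fi
  if [ "$pj_jobs" -lt 1 ]; then
    pj_jobs=1
  fi
  echo "$pj_jobs"
}

# On Linux, read the real values; MemTotal in /proc/meminfo is in kB.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  cores=$(nproc)
  mem_gb=$(awk '/^MemTotal:/ { print int($2 / 1048576) }' /proc/meminfo)
  echo "suggested command: make -j $(pick_jobs "$cores" "$mem_gb")"
fi
```

With 4 cores and 2 GB of RAM this suggests a single job, which matches the advice above for small virtual machines.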

When make finishes, it may report that irstlm is not installed. Ignore this for now and finish the rest of the Kaldi installation first.

Enter the src directory and read its INSTALL:

~/kaldi-trunk/tools$ cd ../src
~/kaldi-trunk/src$ cat INSTALL

These instructions are valid for UNIX-like systems (these steps have
been run on various Linux distributions; Darwin; Cygwin).  For native Windows
compilation, see ../windows/INSTALL.

You must first have completed the installation steps in ../tools/INSTALL
(compiling OpenFst; getting ATLAS and CLAPACK headers).

The installation instructions are

  ./configure --shared
  make depend -j 8
  make -j 8

Note that we added the "-j 8" to run in parallel because "make" takes a long
time.  8 jobs might be too many for a laptop or small desktop machine with not
many cores.

For more information, see documentation at http://kaldi-asr.org/doc/
and click on "The build process (how Kaldi is compiled)".

Run configure, but without the "--shared" option:

~/kaldi-trunk/src$ ./configure
Configuring ...
Backing up kaldi.mk to kaldi.mk.bak ...
Checking compiler g++ ...
Checking OpenFst library in /home/zillyrex/kaldi-trunk/tools/openfst ...
Doing OS specific configurations ...
On Linux: Checking for linear algebra header files ...
Using ATLAS as the linear algebra library.
Atlas found in /usr/lib/x86_64-linux-gnu
Validating presence of ATLAS libs in /usr/lib/x86_64-linux-gnu
Using library /usr/lib/x86_64-linux-gnu/liblapack.so as ATLAS's CLAPACK library.
CUDA will not be used! If you have already installed cuda drivers 
and cuda toolkit, try using --cudatk-dir=... option.  Note: this is
only relevant for neural net experiments
Info: configuring Kaldi not to link with Speex (don't worry, it's only needed if you
intend to use 'compress-uncompress-speex', which is very unlikely)
Successfully configured for Linux [dynamic libraries] with ATLASLIBS =/usr/lib/x86_64-linux-gnu/liblapack.so /usr/lib/x86_64-linux-gnu/libcblas.so /usr/lib/x86_64-linux-gnu/libatlas.so /usr/lib/x86_64-linux-gnu/libf77blas.so
SUCCESS
To compile: make clean -j; make depend -j; make -j
 ... or e.g. -j 10, instead of -j, to use a specified number of CPUs

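Read the configure output carefully. It may differ from what is shown above: it lists anything that is not installed correctly and explains how to fix it. Follow that guidance to install the missing dependencies, and re-run configure until the output ends with "SUCCESS" followed by "To compile: ..." as shown above. Only then move on to the next step. Otherwise make will run for a long time and then fail.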
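
To avoid starting a long build after a failed configuration, the check for "SUCCESS" can gate the build. This is a minimal sketch, assuming it is run from ~/kaldi-trunk/src. The helper configure_succeeded and the log file name configure.log are made up for illustration, and the check relies only on the "SUCCESS" line shown in the transcript above.

```shell
# configure_succeeded LOGFILE  (hypothetical helper)
# Succeeds only if the saved configure output has a line starting with
# "SUCCESS" (upper case, so "Successfully configured ..." does not count).
configure_succeeded() {
  grep -q '^SUCCESS' "$1"
}

# Only act inside what looks like a Kaldi src directory.
if [ -x ./configure ] && [ -d ../egs/yesno ]; then
  ./configure > configure.log 2>&1 || true
  cat configure.log
  if configure_succeeded configure.log; then
    echo "configure reported SUCCESS, continue with make depend and make"
  else
    echo "configure did not report SUCCESS, read configure.log first" >&2
  fi
fi
```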

Now for the final step, compiling the Kaldi source:

~/kaldi-trunk/src$ make depend
...
...
~/kaldi-trunk/src$ make
...
...
...
Done

make takes a while, roughly 30 to 60 minutes. If no error (printed in red) appears during compilation and the output ends with "Done", the build succeeded.
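
Watching a build of this length for red text is easy to get wrong, so it can help to save the output and rely on the exit status, which make sets to non-zero when a step fails. The following is a small sketch. run_logged is a made-up helper, not part of Kaldi, and the log file name in the example is arbitrary.

```shell
# run_logged LOGFILE COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND, saves its stdout and stderr to LOGFILE, prints how long
# it took, and returns COMMAND's own exit status.
run_logged() {
  rl_log=$1
  shift
  rl_start=$(date +%s)
  "$@" > "$rl_log" 2>&1 && rl_status=0 || rl_status=$?
  rl_end=$(date +%s)
  echo "exit status $rl_status after $(( rl_end - rl_start )) s, log in $rl_log"
  return "$rl_status"
}

# Example, from ~/kaldi-trunk/src (shows the end of the log on failure):
#   run_logged make.log make || tail -n 20 make.log
```

Because the output goes to a file, progress can be followed from a second terminal with tail -f make.log.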

Finally, run one of the example recipes to verify the installation: run.sh in the egs/yesno/s5 directory.

~/kaldi-trunk/src$ cd ../egs/yesno/s5/
~/kaldi-trunk/egs/yesno/s5$ ./run.sh
Preparing train and test data
Dictionary preparation succeeded
utils/prepare_lang.sh --position-dependent-phones false data/local/dict <SIL> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK
...
...
...
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0

If you see output like the above, Kaldi is installed successfully.
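
To read the result without scanning the whole log, the error rate can be pulled out of the %WER line. This is a minimal sketch based only on the line format shown above. extract_wer is a made-up helper name.

```shell
# extract_wer LINE  (hypothetical helper)
# Prints the number that follows "%WER" in a scoring line like the one
# run.sh prints above; prints nothing for any other line.
extract_wer() {
  printf '%s\n' "$1" | awk '$1 == "%WER" { print $2 }'
}

# Example using the line from the transcript above.
line='%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0'
echo "WER: $(extract_wer "$line")"   # prints "WER: 0.00"
```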

Posted 2019-11-06 01:29 by ZillyRex