Kaldi's three scripts: cmd.sh, path.sh, run.sh
First, the cmd.sh script:
Looking at the script below, you can clearly see three blocks, labeled a), b), and c). Blocks a) and b) are configurations for running on a cluster; c) is the one we need, since we are running locally on a virtual machine. You need to modify this script accordingly.
# "queue.pl" uses qsub. The options to it are # options to qsub. If you have GridEngine installed, # change this to a queue you have access to. # Otherwise, use "run.pl", which will run jobs locally # (make sure your --num-jobs options are no more than # the number of cpus on your machine. #a) JHU cluster options #export train_cmd="queue.pl -l arch=*64" #export decode_cmd="queue.pl -l arch=*64,mem_free=2G,ram_free=2G" #export mkgraph_cmd="queue.pl -l arch=*64,ram_free=4G,mem_free=4G" #export cuda_cmd=run.pl #b) BUT cluster options #export train_cmd="queue.pl -q all.q@@blade -l ram_free=1200M,mem_free=1200M" #export decode_cmd="queue.pl -q all.q@@blade -l ram_free=1700M,mem_free=1700M" #export decodebig_cmd="queue.pl -q all.q@@blade -l ram_free=4G,mem_free=4G" #export cuda_cmd="queue.pl -q long.q@@pco203 -l gpu=1" #export cuda_cmd="queue.pl -q long.q@pcspeech-gpu" #export mkgraph_cmd="queue.pl -q all.q@@servers -l ram_free=4G,mem_free=4G" #c) run it locally... export train_cmd=run.pl export decode_cmd=run.pl export cuda_cmd=run.pl export mkgraph_cmd=run.pl
The contents of path.sh:
Here you usually only need to change export KALDI_ROOT=`pwd`/../../.. so that it points at the directory where you installed Kaldi; sometimes no change is needed, depending on where your recipe sits relative to the Kaldi root.
export KALDI_ROOT=`pwd`/../../..
[ -f $KALDI_ROOT/tools/env.sh ] && . $KALDI_ROOT/tools/env.sh
export PATH=$PWD/utils/:$KALDI_ROOT/tools/openfst/bin:$KALDI_ROOT/tools/irstlm/bin/:$PWD:$PATH
[ ! -f $KALDI_ROOT/tools/config/common_path.sh ] && echo >&2 "The standard file $KALDI_ROOT/tools/config/common_path.sh is not present -> Exit!" && exit 1
. $KALDI_ROOT/tools/config/common_path.sh
export LC_ALL=C
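A quick way to check that KALDI_ROOT resolves correctly (assuming you run this from the recipe directory and have already compiled Kaldi and OpenFst): after sourcing path.sh, the binaries should be found on PATH.

. ./path.sh
which fstinfo              # from $KALDI_ROOT/tools/openfst/bin
which compute-mfcc-feats   # from the Kaldi src/ binaries added by common_path.sh

If either command prints nothing, KALDI_ROOT is pointing at the wrong place or the build is incomplete.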
run.sh

run.sh needs to know where your data lives; the only thing you have to modify is the corpus path, e.g.:
#timit=/export/corpora5/LDC/LDC93S1/timit/TIMIT # @JHU
timit=/mnt/matylda2/data/TIMIT/timit # @BUT
Change this to the path where your copy of TIMIT is located. The other recipes handle their corpora the same way.
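For example, a quick sanity check before launching the run (the path below is hypothetical; substitute your own):

timit=/home/dahu/data/TIMIT   # hypothetical path -- replace with your own
[ -d "$timit" ] || { echo "TIMIT not found at $timit"; exit 1; }
ls "$timit"                   # the LDC release contains TRAIN/ and TEST/ (case may vary)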
In addition, corpora such as voxforge, vystadial_cz, and vystadial_en are freely downloadable; if you don't have access to a licensed corpus, you can experiment with these instead.
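If you go the VoxForge route, the recipe even automates the download. As far as I remember it ships a helper script in the recipe directory (treat the script name and behavior as an assumption and check the recipe's README first):

cd $KALDI_ROOT/egs/voxforge/s5
# assumption: getdata.sh fetches the VoxForge audio archives into the
# data root configured inside the recipe
./getdata.sh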
Finally, let's walk through the run.sh script itself, using the s5 recipe under timit as the example.

Location: /home/dahu/myfile/my_git/kaldi/egs/timit/s5
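Before diving in, it helps to glance at what the recipe directory contains; note that steps/ and utils/ are symlinks into egs/wsj/s5, which is where the scripts shared by all recipes live (listing abridged):

cd /home/dahu/myfile/my_git/kaldi/egs/timit/s5
ls -l
# cmd.sh  conf/  local/  path.sh  RESULTS  run.sh
# steps -> ../../wsj/s5/steps
# utils -> ../../wsj/s5/utils

With that orientation, here is the full run.sh, with the inline notes translated: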
#!/bin/bash
#
# Copyright 2013 Bagher BabaAli,
#           2014-2017 Brno University of Technology (Author: Karel Vesely)
#
# TIMIT, description of the database:
# http://perso.limsi.fr/lamel/TIMIT_NISTIR4930.pdf
#
# Hon and Lee paper on TIMIT, 1988, introduces mapping to 48 training phonemes,
# then re-mapping to 39 phonemes for scoring:
# http://repository.cmu.edu/cgi/viewcontent.cgi?article=2768&context=compsci
#

. ./cmd.sh
[ -f path.sh ] && . ./path.sh  # better double-check that the paths in path.sh are correct
set -e

# Acoustic model parameters (no need to change these for now)
numLeavesTri1=2500
numGaussTri1=15000
numLeavesMLLT=2500
numGaussMLLT=15000
numLeavesSAT=2500
numGaussSAT=15000
numGaussUBM=400
numLeavesSGMM=7000
numGaussSGMM=9000

feats_nj=10
train_nj=30
decode_nj=5  # nj is the number of parallel jobs; generally no more than the number of CPUs

echo ============================================================================
echo "                Data & Lexicon & Language Preparation                     "
echo ============================================================================

#timit=/export/corpora5/LDC/LDC93S1/timit/TIMIT # @JHU
timit=/mnt/matylda2/data/TIMIT/timit # @BUT   # change this to your own TIMIT path

local/timit_data_prep.sh $timit || exit 1

local/timit_prepare_dict.sh

# Caution below: we remove optional silence by setting "--sil-prob 0.0",
# in TIMIT the silence appears also as a word in the dictionary and is scored.
utils/prepare_lang.sh --sil-prob 0.0 --position-dependent-phones false --num-sil-states 3 \
 data/local/dict "sil" data/local/lang_tmp data/lang

local/timit_format_data.sh

echo ============================================================================
echo "         MFCC Feature Extraction & CMVN for Training and Test set         "
echo ============================================================================

# Now make MFCC features. This part does the feature extraction.
mfccdir=mfcc

for x in train dev test; do
  steps/make_mfcc.sh --cmd "$train_cmd" --nj $feats_nj data/$x exp/make_mfcc/$x $mfccdir
  steps/compute_cmvn_stats.sh data/$x exp/make_mfcc/$x $mfccdir
done

echo ============================================================================
echo "                     MonoPhone Training & Decoding                        "
echo ============================================================================

# Monophone training and decoding -- the most fundamental part of speech
# recognition!! Worth studying in detail.
steps/train_mono.sh --nj "$train_nj" --cmd "$train_cmd" data/train data/lang exp/mono

utils/mkgraph.sh data/lang_test_bg exp/mono exp/mono/graph

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/mono/graph data/dev exp/mono/decode_dev

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/mono/graph data/test exp/mono/decode_test
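# ---------------------------------------------------------------------------
# (Illustrative aside, not part of the original run.sh.) Once a decode
# directory exists you can already read off its error rate; assuming the
# monophone decode above has finished:
grep WER exp/mono/decode_test/wer_* | utils/best_wer.sh
# utils/best_wer.sh picks the best WER over the scoring parameters tried.
# ---------------------------------------------------------------------------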
echo ============================================================================
echo "           tri1 : Deltas + Delta-Deltas Training & Decoding               "
echo ============================================================================

# Triphone training and decoding.
steps/align_si.sh --boost-silence 1.25 --nj "$train_nj" --cmd "$train_cmd" \
 data/train data/lang exp/mono exp/mono_ali

# Train tri1, which is deltas + delta-deltas, on train data.
steps/train_deltas.sh --cmd "$train_cmd" \
 $numLeavesTri1 $numGaussTri1 data/train data/lang exp/mono_ali exp/tri1

utils/mkgraph.sh data/lang_test_bg exp/tri1 exp/tri1/graph

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri1/graph data/dev exp/tri1/decode_dev

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri1/graph data/test exp/tri1/decode_test

echo ============================================================================
echo "                 tri2 : LDA + MLLT Training & Decoding                    "
echo ============================================================================

# LDA + MLLT transforms on top of the triphone model.
steps/align_si.sh --nj "$train_nj" --cmd "$train_cmd" \
 data/train data/lang exp/tri1 exp/tri1_ali

steps/train_lda_mllt.sh --cmd "$train_cmd" \
 --splice-opts "--left-context=3 --right-context=3" \
 $numLeavesMLLT $numGaussMLLT data/train data/lang exp/tri1_ali exp/tri2

utils/mkgraph.sh data/lang_test_bg exp/tri2 exp/tri2/graph

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri2/graph data/dev exp/tri2/decode_dev

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri2/graph data/test exp/tri2/decode_test

echo ============================================================================
echo "              tri3 : LDA + MLLT + SAT Training & Decoding                 "
echo ============================================================================

# LDA + MLLT + SAT on top of the triphone model.
# Align tri2 system with train data.
steps/align_si.sh --nj "$train_nj" --cmd "$train_cmd" \
 --use-graphs true data/train data/lang exp/tri2 exp/tri2_ali

# From tri2 system, train tri3 which is LDA + MLLT + SAT.
steps/train_sat.sh --cmd "$train_cmd" \
 $numLeavesSAT $numGaussSAT data/train data/lang exp/tri2_ali exp/tri3

utils/mkgraph.sh data/lang_test_bg exp/tri3 exp/tri3/graph

steps/decode_fmllr.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri3/graph data/dev exp/tri3/decode_dev

steps/decode_fmllr.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri3/graph data/test exp/tri3/decode_test

echo ============================================================================
echo "                        SGMM2 Training & Decoding                         "
echo ============================================================================

# SGMM2 on top of the tri3 (SAT) model.
steps/align_fmllr.sh --nj "$train_nj" --cmd "$train_cmd" \
 data/train data/lang exp/tri3 exp/tri3_ali

exit 0 # From this point you can run Karel's DNN : local/nnet/run_dnn.sh

steps/train_ubm.sh --cmd "$train_cmd" \
 $numGaussUBM data/train data/lang exp/tri3_ali exp/ubm4

steps/train_sgmm2.sh --cmd "$train_cmd" $numLeavesSGMM $numGaussSGMM \
 data/train data/lang exp/tri3_ali exp/ubm4/final.ubm exp/sgmm2_4

utils/mkgraph.sh data/lang_test_bg exp/sgmm2_4 exp/sgmm2_4/graph

steps/decode_sgmm2.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 --transform-dir exp/tri3/decode_dev exp/sgmm2_4/graph data/dev \
 exp/sgmm2_4/decode_dev

steps/decode_sgmm2.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 --transform-dir exp/tri3/decode_test exp/sgmm2_4/graph data/test \
 exp/sgmm2_4/decode_test

echo ============================================================================
echo "                    MMI + SGMM2 Training & Decoding                       "
echo ============================================================================

# MMI discriminative training on top of SGMM2.
steps/align_sgmm2.sh --nj "$train_nj" --cmd "$train_cmd" \
 --transform-dir exp/tri3_ali --use-graphs true --use-gselect true \
 data/train data/lang exp/sgmm2_4 exp/sgmm2_4_ali

steps/make_denlats_sgmm2.sh --nj "$train_nj" --sub-split "$train_nj" \
 --acwt 0.2 --lattice-beam 10.0 --beam 18.0 \
 --cmd "$decode_cmd" --transform-dir exp/tri3_ali \
 data/train data/lang exp/sgmm2_4_ali exp/sgmm2_4_denlats

steps/train_mmi_sgmm2.sh --acwt 0.2 --cmd "$decode_cmd" \
 --transform-dir exp/tri3_ali --boost 0.1 --drop-frames true \
 data/train data/lang exp/sgmm2_4_ali exp/sgmm2_4_denlats exp/sgmm2_4_mmi_b0.1

for iter in 1 2 3 4; do
  steps/decode_sgmm2_rescore.sh --cmd "$decode_cmd" --iter $iter \
   --transform-dir exp/tri3/decode_dev data/lang_test_bg data/dev \
   exp/sgmm2_4/decode_dev exp/sgmm2_4_mmi_b0.1/decode_dev_it$iter

  steps/decode_sgmm2_rescore.sh --cmd "$decode_cmd" --iter $iter \
   --transform-dir exp/tri3/decode_test data/lang_test_bg data/test \
   exp/sgmm2_4/decode_test exp/sgmm2_4_mmi_b0.1/decode_test_it$iter
done
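# ---------------------------------------------------------------------------
# (Illustrative aside, not part of the original run.sh.) The *_ali directories
# hold frame-level alignments in compressed Kaldi archives; a quick way to peek
# at them as phone-level CTM entries, assuming exp/tri3_ali exists:
ali-to-phones --ctm-output exp/tri3_ali/final.mdl \
 "ark:gunzip -c exp/tri3_ali/ali.1.gz|" - | head
# ---------------------------------------------------------------------------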
echo ============================================================================
echo "                    DNN Hybrid Training & Decoding                        "
echo ============================================================================

# This is Povey's (nnet2) DNN recipe; the tutorial advises against using it here.
# DNN hybrid system training parameters
dnn_mem_reqs="--mem 1G"
dnn_extra_opts="--num_epochs 20 --num-epochs-extra 10 --add-layers-period 1 --shrink-interval 3"

steps/nnet2/train_tanh.sh --mix-up 5000 --initial-learning-rate 0.015 \
 --final-learning-rate 0.002 --num-hidden-layers 2 \
 --num-jobs-nnet "$train_nj" --cmd "$train_cmd" "${dnn_train_extra_opts[@]}" \
 data/train data/lang exp/tri3_ali exp/tri4_nnet

[ ! -d exp/tri4_nnet/decode_dev ] && mkdir -p exp/tri4_nnet/decode_dev
decode_extra_opts=(--num-threads 6)
steps/nnet2/decode.sh --cmd "$decode_cmd" --nj "$decode_nj" "${decode_extra_opts[@]}" \
 --transform-dir exp/tri3/decode_dev exp/tri3/graph data/dev \
 exp/tri4_nnet/decode_dev | tee exp/tri4_nnet/decode_dev/decode.log

[ ! -d exp/tri4_nnet/decode_test ] && mkdir -p exp/tri4_nnet/decode_test
steps/nnet2/decode.sh --cmd "$decode_cmd" --nj "$decode_nj" "${decode_extra_opts[@]}" \
 --transform-dir exp/tri3/decode_test exp/tri3/graph data/test \
 exp/tri4_nnet/decode_test | tee exp/tri4_nnet/decode_test/decode.log

echo ============================================================================
echo "                    System Combination (DNN+SGMM)                         "
echo ============================================================================

# DNN + SGMM system combination.
for iter in 1 2 3 4; do
  local/score_combine.sh --cmd "$decode_cmd" \
   data/dev data/lang_test_bg exp/tri4_nnet/decode_dev \
   exp/sgmm2_4_mmi_b0.1/decode_dev_it$iter exp/combine_2/decode_dev_it$iter

  local/score_combine.sh --cmd "$decode_cmd" \
   data/test data/lang_test_bg exp/tri4_nnet/decode_test \
   exp/sgmm2_4_mmi_b0.1/decode_test_it$iter exp/combine_2/decode_test_it$iter
done

echo ============================================================================
echo "               DNN Hybrid Training & Decoding (Karel's recipe)            "
echo ============================================================================

# Karel's (nnet1) DNN recipe -- a general-purpose deep learning framework!!
local/nnet/run_dnn.sh
#local/nnet/run_autoencoder.sh : an example, not used to build any system,

echo ============================================================================
echo "                    Getting Results [see RESULTS file]                    "
echo ============================================================================

# Print the final recognition results of all the systems above.
bash RESULTS dev
bash RESULTS test

echo ============================================================================
echo "Finished successfully on" `date`
echo ============================================================================

exit 0
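When the run finishes, the RESULTS script prints the scores, but it is handy to know that it is essentially just a grep over the decode directories. A minimal sketch you can run yourself (assuming the decode directories exist and path.sh has been sourced; recipes whose scoring scripts name their output differently may need adjusting):

for d in exp/*/decode_test*; do
  [ -d "$d" ] || continue
  echo -n "$d: "
  # best_wer.sh picks the lowest WER over the scoring parameters tried
  grep WER "$d"/wer_* 2>/dev/null | utils/best_wer.sh
done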
Having read through these three basic scripts and gotten a rough idea of what each one does, I'm now downloading the TIMIT data and will run the recipe next.
For downloading the TIMIT dataset, see the post "kaldi timit 实例运行全过程" (a full walkthrough of running the Kaldi TIMIT example).