Speech Classification with Various Classification Algorithms (Age-Group Recognition) (continued)

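Corpus extraction, followed by classification with several algorithms.

Corpus extraction and labeling

TIMIT/DOC/SPKRINFO.TXT holds the speaker information, which serves as the basis for the class labels.
Define the function initspeakerinfo(speakerinfo), which builds a speaker-to-age dictionary: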

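def initspeakerinfo(speakerinfo):
    # Build a {sex + speaker ID: age in years} dictionary from SPKRINFO.TXT.
    ages = {}
    with open(speakerinfo, 'r') as f:
        for line in f:
            linelist = line.strip().split('  ')
            # Skip comment/header lines (assumed to start with ';') and short records.
            if line.startswith(';') or len(linelist) < 6:
                continue
            recorddate = linelist[4].strip().split('/')  # MM/DD/YY
            birthdate = linelist[5].strip().split('/')   # MM/DD/YY
            if recorddate[2] == "??" or birthdate[2] == "??":
                age = 0  # 0 marks an unknown age
            else:
                # Approximate day counts: 365-day years and 30-day months.
                recorddays = int(recorddate[2])*365 + int(recorddate[0])*30 + int(recorddate[1])
                birthdays = int(birthdate[2])*365 + int(birthdate[0])*30 + int(birthdate[1])
                # Subtract the whole birth date, not only its year term.
                age = (recorddays - birthdays) / 365.0
            ages[linelist[1] + linelist[0]] = age
    return ages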
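
The age above is approximated with 365-day years and 30-day months. If exact ages are preferred, a minimal sketch using the standard datetime module could look like this. The helper name exact_age, the assumption that two-digit years belong to the 1900s, and the example dates are all illustrative and not part of the original code.

```python
from datetime import date

def exact_age(recorddate, birthdate, century=1900):
    """Age in years between two MM/DD/YY strings (century is an assumption)."""
    def parse(text):
        month, day, year = (int(part) for part in text.strip().split('/'))
        return date(century + year, month, day)
    days = (parse(recorddate) - parse(birthdate)).days
    return days / 365.25

age = exact_age("03/03/86", "06/17/60")  # made-up example dates
print(round(age, 2))
```

Records with "??" fields would still need the same unknown-age check as in initspeakerinfo before calling such a helper.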

For example, for three-class (or two-class) classification:

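def getclass(filename, ages):
    # Map a speaker key (as built by initspeakerinfo) to a class label.
    age = ages[filename]
    if age == 0:
        return "0"    # unknown age: falls into the middle class
    if age <= 25:
        return "-1"   # 25 or younger
    elif age <= 45:
        return "0"    # older than 25, up to 45
    else:
        return "+1"   # older than 45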
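
The function above covers the three-class case. For the two-class case, a minimal sketch might look like the following. The name getclass2, the 35-year threshold, and the speaker keys are illustrative assumptions, not values from the original experiment. Unlike getclass, this sketch returns None for unknown ages so that the caller can skip those speakers.

```python
def getclass2(filename, ages, threshold=35):
    """Two-class variant (hypothetical): "-1" up to threshold, "+1" above it."""
    age = ages[filename]
    if age == 0:
        return None  # 0 marks an unknown age in the speaker dictionary
    return "-1" if age <= threshold else "+1"

ages = {"MABC0": 25.7, "FXYZ0": 52.3, "MUNK0": 0}  # made-up keys and ages
labels = [getclass2(name, ages) for name in ("MABC0", "FXYZ0", "MUNK0")]
print(labels)
```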

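Feature representation

MFCC and i-vector features were extracted earlier. The MFCCs form a 38*n matrix, where 38 is the MFCC dimension and n is the number of frames in an utterance; each i-vector is a 1*200 matrix. Before classification the MFCCs have to be reduced to a fixed-length vector; the simplest approach is to average the 38*n matrix over its n frames and then normalize the result.
Define the function initavgmfcc(avgmfccname,mfccpath), which reads the MFCC files under mfccpath, averages and normalizes each one, and writes the results to a single file: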

import os

def initavgmfcc(avgmfccname, mfccpath):
    # Average each MFCC file over its frames, min-max normalize the mean
    # vector, and write one line per file to avgmfccname.
    dimen = 13  # only the first 13 coefficients of each frame are used
    with open(avgmfccname, 'w') as f:
        for filename in os.listdir(mfccpath):
            avgmfcc = [0.0]*dimen
            length = 0  # frame count; starting at 1 would bias the mean
            with open(os.path.join(mfccpath, filename), 'r') as fo:
                for line in fo:
                    linelist = line.strip().split(' ')
                    for i in range(dimen):
                        avgmfcc[i] = avgmfcc[i] + float(linelist[i])
                    length = length + 1
            if length == 0:
                continue  # skip empty files
            for i in range(dimen):
                avgmfcc[i] = avgmfcc[i]/length
            listmin = min(avgmfcc)
            listmax = max(avgmfcc)
            for i in range(dimen):
                avgmfcc[i] = str((avgmfcc[i]-listmin)/(listmax-listmin))
            f.write(filename+" "+" ".join(avgmfcc)+"\n")
            print(filename+" avg over")
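
To make the averaging and normalization step concrete, here is a small in-memory sketch of the same idea on a toy input. The helper name mean_minmax and the two example frames are made up for illustration; it works on a list of frames instead of files.

```python
def mean_minmax(frames):
    """Average a list of frames column-wise, then min-max scale the mean vector."""
    count = len(frames)
    dimen = len(frames[0])
    mean = [sum(frame[i] for frame in frames) / float(count) for i in range(dimen)]
    low, high = min(mean), max(mean)
    if high == low:
        return [0.0] * dimen  # constant vector: avoid dividing by zero
    return [(value - low) / (high - low) for value in mean]

frames = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]  # two made-up 3-dimensional frames
print(mean_minmax(frames))  # → [0.0, 0.5, 1.0]
```

The mean of the two frames is [2.0, 3.0, 4.0], and scaling by its own minimum and maximum maps it to the range [0, 1].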

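Define the function initiv(ivname,ivpath), which reads the i-vector files under ivpath and writes them to a single file, plus a min-max normalized copy in ivname+"avg":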

import os

def initiv(ivname, ivpath):
    # Collect all i-vectors into ivname, and a min-max normalized copy
    # of each vector into ivname + "avg".
    dimen = 200
    with open(ivname, 'w') as f, open(ivname+"avg", 'w') as avgf:
        for filename in os.listdir(ivpath):
            with open(os.path.join(ivpath, filename), 'r') as fo:
                for line in fo:
                    linelist = line.strip().split(' ')
                    if len(linelist) != dimen:
                        continue  # keep only complete 200-dimensional vectors
                    f.write(filename+" "+" ".join(linelist)+"\n")
                    values = [float(v) for v in linelist]  # float() instead of eval()
                    listmin = min(values)
                    listmax = max(values)
                    normiv = [str((v-listmin)/(listmax-listmin)) for v in values]
                    avgf.write(filename+" "+" ".join(normiv)+"\n")

PS: see https://www.zhihu.com/question/20455227 for an explanation of normalization.
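
The functions above scale each vector by its own minimum and maximum. Another common choice, and the one taken by LIBSVM's svm-scale tool, is to scale each feature across all samples. A minimal pure-Python sketch of that alternative follows; scale_columns is a hypothetical helper and the input rows are made up.

```python
def scale_columns(rows):
    """Min-max scale each feature (column) to [0, 1] across all rows."""
    dimen = len(rows[0])
    lows = [min(row[i] for row in rows) for i in range(dimen)]
    highs = [max(row[i] for row in rows) for i in range(dimen)]
    scaled = []
    for row in rows:
        scaled.append([
            (row[i] - lows[i]) / float(highs[i] - lows[i]) if highs[i] > lows[i] else 0.0
            for i in range(dimen)
        ])
    return scaled

rows = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]  # three made-up 2-dimensional vectors
print(scale_columns(rows))  # → [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

With this scheme the minimum and maximum of each feature would normally be taken from the training set and reused unchanged for the test set.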

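Classification with LIBSVM

Installation

See http://blog.csdn.net/lqhbupt/article/details/8599295 for how to install LIBSVM.
PS: the 64-bit build takes a little more work, but it can also be handled with nmake.

LIBSVM format

http://blog.csdn.net/kobesdu/article/details/8944851 describes the LIBSVM format and how to generate it.
In short, each line is a label followed by index:value pairs: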

+1 1:0.533355514244 2:0.225956771932 3:0.551555751325 4:0.448831840291 5:0.732958158188 6:0.516967914119 ...
-1 1:0.723092649707 2:0.352547706883 3:0.524416372722 4:0.683881004712 5:0.464490812227 6:0.70279542324 ...
...

In fact, a few lines of Python are enough to generate it.
Finally, the function initFormat(formatname,avgmfccname,dict,dimen) produces the following files in LIBSVM format:

  • FormatData-iv-train
  • FormatData-iv-test
  • FormatData-mfcc-train
  • FormatData-mfcc-test
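
The original initFormat is not shown, so the following is only a minimal sketch of how such a conversion could look. It assumes each input line has the form "filename v1 v2 ..." (as written by initavgmfcc) and that the label is looked up separately; to_libsvm_line is a hypothetical helper name and the sample line is made up.

```python
def to_libsvm_line(label, values):
    """Format one sample as 'label 1:v1 2:v2 ...' (feature indices start at 1)."""
    pairs = ["%d:%s" % (index, value) for index, value in enumerate(values, 1)]
    return label + " " + " ".join(pairs)

# A made-up averaged-MFCC line: file name followed by the feature values.
line = "MABC0.mfcc 0.5 0.25 1.0"
fields = line.strip().split(' ')
print(to_libsvm_line("+1", fields[1:]))  # → +1 1:0.5 2:0.25 3:1.0
```

In a full conversion the label would come from something like getclass, keyed by the speaker part of the file name.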

Parameter search

The parameter search can be run with libsvm-3.21/tools/grid.py:

E:\libsvm-3.21\tools>grid.py
Usage: grid.py [grid_options] [svm_options] dataset

grid_options :
-log2c {begin,end,step | "null"} : set the range of c (default -5,15,2)
    begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end}
    "null"         -- do not grid with c
-log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2)
    begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end}
    "null"         -- do not grid with g
-v n : n-fold cross validation (default 5)
-svmtrain pathname : set svm executable path and name
-gnuplot {pathname | "null"} :
    pathname -- set gnuplot executable path and name
    "null"   -- do not plot
-out {pathname | "null"} : (default dataset.out)
    pathname -- set output file path and name
    "null"   -- do not output file
-png pathname : set graphic output file path and name (default dataset.png)
-resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)
    This is experimental. Try this option only if some parameters have been checked for the SAME data.

The options are listed above.
grid.py is used to find the parameters C and gamma; see http://m.blog.csdn.net/article/details?id=46386201
The search is based on cross-validation: -v n splits the data into n parts, and each part in turn serves as the test set while the other n-1 parts form the training set. C and gamma are taken from the ranges set by -log2c and -log2g (defaults -5,15,2 and 3,-15,-2, as shown in the usage text above).
The search is a plain enumeration over these ranges. If the C range contains numC values and the gamma range contains numG values, numC*numG*n train/test runs are performed in total. The cross-validation accuracy is printed for every (C, gamma) pair, and the pair with the highest accuracy is the result of the search.
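
As a worked example of the count numC*numG*n, the sketch below enumerates the default grid.py ranges (-log2c -5,15,2 and -log2g 3,-15,-2) with the default 5 folds. The helper log2_range is a hypothetical name written for this illustration; it only counts the runs and does not train anything.

```python
def log2_range(begin, end, step):
    """Exponents begin, begin+step, ... up to and including end."""
    values = []
    current = begin
    while (step > 0 and current <= end) or (step < 0 and current >= end):
        values.append(current)
        current += step
    return values

c_exponents = log2_range(-5, 15, 2)    # grid.py default for -log2c
g_exponents = log2_range(3, -15, -2)   # grid.py default for -log2g
folds = 5                              # grid.py default for -v

# Every (C, gamma) pair on the grid, as actual parameter values.
grid = [(2.0 ** c, 2.0 ** g) for c in c_exponents for g in g_exponents]
total_runs = len(grid) * folds
print("%d x %d x %d = %d runs" % (len(c_exponents), len(g_exponents), folds, total_runs))
```

With the defaults this gives 11 values of C, 10 values of gamma, and therefore 550 train/test runs.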

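Training and prediction with LIBSVM

from svmutil import *

# Load the LIBSVM-format training and test sets.
train_y, train_x = svm_read_problem('../FormatData-train')
test_y, test_x = svm_read_problem('../FormatData-test')

# Train with the chosen C and gamma, then predict on the test set.
model = svm_train(train_y,train_x,'-c 112.0 -g 0.000125')
p_label, p_acc, p_val = svm_predict(test_y,test_x, model)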

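Classification with scikit-learn

scikit-learn is a third-party Python library.
It offers many classification methods behind a simple interface; some familiarity with the classification methods, Python, and numpy is needed beforehand.

LDA/PLDA/PCA processing

scikit-learn also provides LDA, so the LIBSVM pipeline above can be extended as follows: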

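from svmutil import *
from sklearn.datasets import load_svmlight_file
# LDA lives in sklearn.discriminant_analysis in current scikit-learn (formerly sklearn.lda)
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

# read the data (mfcc/i-vector/LDA-i-vector) as dense arrays, which LDA requires
train_x, train_y = load_svmlight_file('../FormatData-mfcc-train')
test_x, test_y = load_svmlight_file('../FormatData-mfcc-test', n_features=train_x.shape[1])
train_x, test_x = train_x.toarray(), test_x.toarray()

# LDA yields at most (number of classes - 1) components, so n_components keeps its default
clf = LDA(solver='eigen')
clf.fit(train_x, train_y)  # fit on the training set only
train_x2 = clf.transform(train_x)
test_x2 = clf.transform(test_x)

# svm_train accepts plain Python lists
model = svm_train(train_y.tolist(), train_x2.tolist(), '-c 8192.0 -g 0.05')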

scikit-learn classifiers

#!/usr/bin/env python
#-*- coding: utf-8 -*-
import sys
import os
import time
from sklearn import metrics
import numpy as np
try:
    import cPickle as pickle   # Python 2
except ImportError:
    import pickle              # Python 3
from sklearn.datasets import load_svmlight_file
# LDA lives in sklearn.discriminant_analysis in current scikit-learn (formerly sklearn.lda)
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.decomposition import PCA

# Multinomial Naive Bayes Classifier
def naive_bayes_classifier(train_x, train_y):
    from sklearn.naive_bayes import MultinomialNB
    model = MultinomialNB(alpha=0.01)
    model.fit(train_x, train_y)
    return model


# KNN Classifier
def knn_classifier(train_x, train_y):
    from sklearn.neighbors import KNeighborsClassifier
    model = KNeighborsClassifier()
    model.fit(train_x, train_y)
    return model


# Logistic Regression Classifier
def logistic_regression_classifier(train_x, train_y):
    from sklearn.linear_model import LogisticRegression
    model = LogisticRegression(penalty='l2')
    model.fit(train_x, train_y)
    return model


# Random Forest Classifier
def random_forest_classifier(train_x, train_y):
    from sklearn.ensemble import RandomForestClassifier
    model = RandomForestClassifier(n_estimators=8)
    model.fit(train_x, train_y)
    return model


# Decision Tree Classifier
def decision_tree_classifier(train_x, train_y):
    from sklearn import tree
    model = tree.DecisionTreeClassifier()
    model.fit(train_x, train_y)
    return model


# GBDT(Gradient Boosting Decision Tree) Classifier
def gradient_boosting_classifier(train_x, train_y):
    from sklearn.ensemble import GradientBoostingClassifier
    model = GradientBoostingClassifier(n_estimators=200)
    model.fit(train_x, train_y)
    return model


# SVM Classifier
def svm_classifier(train_x, train_y):
    from sklearn.svm import SVC
    model = SVC(kernel='rbf', probability=True)
    model.fit(train_x, train_y)
    return model

# SVM Classifier using cross validation
def svm_cross_validation(train_x, train_y):
    # GridSearchCV lives in sklearn.model_selection in current scikit-learn (formerly sklearn.grid_search)
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC
    model = SVC(kernel='rbf', probability=True)
    param_grid = {'C': [1e-3, 1e-2, 1e-1, 1, 10, 100, 1000], 'gamma': [0.001, 0.0001]}
    grid_search = GridSearchCV(model, param_grid, n_jobs = 1, verbose=1)
    grid_search.fit(train_x, train_y)
    best_parameters = grid_search.best_estimator_.get_params()
    for para, val in best_parameters.items():
        print('%s %s' % (para, val))
    # Retrain on the full training set with the best C and gamma
    model = SVC(kernel='rbf', C=best_parameters['C'], gamma=best_parameters['gamma'], probability=True)
    model.fit(train_x, train_y)
    return model

def load_split(path):
    # Each line is "label v1 v2 ..."; parse with float() rather than eval().
    x = []
    y = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue  # ignore blank lines
            values = [float(v) for v in line.strip().split(' ')]
            x.append(values[1:])
            y.append(int(values[0]))
    return np.array(x), np.array(y)

def read_data(data_file):
    # Read data_file + "-train" and data_file + "-test".
    train_x, train_y = load_split(data_file+"-train")
    test_x, test_y = load_split(data_file+"-test")
    return train_x, train_y, test_x, test_y


if __name__ == '__main__':
    data_file = "./data/FormatData-mfcc"
    thresh = 0.5
    model_save_file = None
    model_save = {}

    test_classifiers = ['KNN', 'LR', 'RF', 'DT', 'SVM', 'GBDT']
    classifiers = {#'NB':naive_bayes_classifier,
                  'KNN':knn_classifier,
                   'LR':logistic_regression_classifier,
                   'RF':random_forest_classifier,
                   'DT':decision_tree_classifier,
                  'SVM':svm_classifier,
                'SVMCV':svm_cross_validation,
                 'GBDT':gradient_boosting_classifier
    }

    print('reading training and testing data...')
    train_x, train_y, test_x, test_y = read_data(data_file)
    num_train, num_feat = train_x.shape
    num_test, num_feat = test_x.shape
    is_binary_class = (len(np.unique(train_y)) == 2)
    print(is_binary_class)
    print('******************** Data Info *********************')
    print('#training data: %d, #testing_data: %d, dimension: %d' % (num_train, num_test, num_feat))

    for classifier in test_classifiers:
        print('******************* %s ********************' % classifier)
        start_time = time.time()
        model = classifiers[classifier](train_x, train_y)
        print('training took %fs!' % (time.time() - start_time))
        predict = model.predict(test_x)
        if model_save_file is not None:
            model_save[classifier] = model
        if is_binary_class:
            # precision and recall are only reported for the two-class case
            precision = metrics.precision_score(test_y, predict)
            recall = metrics.recall_score(test_y, predict)
            print('precision: %.2f%%, recall: %.2f%%' % (100 * precision, 100 * recall))
        accuracy = metrics.accuracy_score(test_y, predict)
        print('accuracy: %.2f%%' % (100 * accuracy))

    if model_save_file is not None:
        with open(model_save_file, 'wb') as fout:
            pickle.dump(model_save, fout)
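
To make the reported numbers concrete, here is a small pure-Python sketch of accuracy, precision and recall for a two-class problem with labels -1 and +1. It treats +1 as the positive class, which matches scikit-learn's default pos_label=1. The helper name binary_metrics and the label lists are made up for illustration.

```python
def binary_metrics(truth, predicted, positive=1):
    """Return (accuracy, precision, recall) for the given positive label."""
    pairs = list(zip(truth, predicted))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    pred_pos = sum(1 for _, p in pairs if p == positive)
    real_pos = sum(1 for t, _ in pairs if t == positive)
    accuracy = correct / float(len(pairs))
    precision = true_pos / float(pred_pos) if pred_pos else 0.0
    recall = true_pos / float(real_pos) if real_pos else 0.0
    return accuracy, precision, recall

truth = [1, 1, -1, -1, 1]       # made-up reference labels
predicted = [1, -1, -1, 1, 1]   # made-up predictions
accuracy, precision, recall = binary_metrics(truth, predicted)
print('accuracy: %.2f%%, precision: %.2f%%, recall: %.2f%%'
      % (100 * accuracy, 100 * precision, 100 * recall))
```

Here three of five predictions are correct, and two of the three predicted positives (and two of the three actual positives) are true positives.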


posted @ 2016-08-19 10:12  Dystopia