[speech] Computing the total duration of audio files

There are two ways to do this: call the sox command-line tool, or use Python's wave library.

sox

On the command line, run sox wavfile -n stat:

-bash-4.2$ sox arctic_a0001.wav -n stat  
Samples read:             53680
Length (seconds):      3.355000
Scaled by:         2147483647.0
Maximum amplitude:     0.628510
Minimum amplitude:    -0.649933
Midline amplitude:    -0.010712
Mean    norm:          0.069802
Mean    amplitude:    -0.000027
RMS     amplitude:     0.114387
Maximum delta:         0.332764
Minimum delta:         0.000000
Mean    delta:         0.019452
RMS     delta:         0.033908
Rough   frequency:          754
Volume adjustment:        1.539

The "Length (seconds)" line is the duration of the file, in seconds.
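To pull that value out in a script, one option is to search the captured stat text for the Length line instead of relying on its position in the output. A minimal sketch, assuming the output has the format shown above; parse_sox_length is a hypothetical helper name, not part of sox or Python:

```python
import re

def parse_sox_length(stat_text):
    """Return the 'Length (seconds)' value from `sox ... -n stat` output, or None."""
    # Hypothetical helper: matches a line such as "Length (seconds):      3.355000"
    match = re.search(r"^Length \(seconds\):\s*([0-9.]+)", stat_text, re.MULTILINE)
    return float(match.group(1)) if match else None

# First lines of the stat output shown above, used as sample input.
sample = """Samples read:             53680
Length (seconds):      3.355000
Scaled by:         2147483647.0
"""
print(parse_sox_length(sample))
```

Returning None when the line is missing lets the caller decide how to handle files that sox could not read.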

Code

import os
import math
import csv

wavdir = 'wav'
wavlst = os.listdir(wavdir)
total = 0

def get_wav_duration(wav_id):
    # sox prints its statistics to stderr, so merge stderr into stdout
    cmd = 'sox "{}" -n stat 2>&1'.format(os.path.join(wavdir, wav_id))
    with os.popen(cmd) as pipe:
        lines = pipe.readlines()
    # look up the "Length (seconds):" line instead of assuming it is line 2
    dur_line = [line for line in lines if line.startswith('Length')][0].split()
    # truncate to 0.1 s
    dur = math.floor(float(dur_line[2]) * 10) / 10
    global total
    total = total + dur
    return str(dur)

# Python 3: the csv module wants a text-mode file opened with newline=''
with open('text.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for wav_id in wavlst:
        utt_id = wav_id.split('.')[0]
        duration = get_wav_duration(wav_id)
        # one row per utterance: id, duration in seconds
        writer.writerow([utt_id, duration])
print(total)

Notes on os.popen

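To run an external shell command from Python, the two basic options are os.system(cmd) and os.popen(cmd). The difference is that the former returns the command's exit status, while the latter gives access to whatever the command prints while it runs. The Python documentation appears to steer users away from os.popen and toward the subprocess module (subprocess.Popen), which is slightly more complex to use.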
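For comparison, a minimal sketch of the subprocess route, assuming Python 3.7 or later (for capture_output). run_cmd is a hypothetical helper name; it returns the exit status and the output from a single call:

```python
import subprocess
import sys

def run_cmd(args):
    """Run a command given as an argument list; return (exit_code, stdout, stderr)."""
    # No shell is involved, so file names with spaces need no quoting.
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr

# Use the current Python interpreter as a stand-in external command.
code, out, err = run_cmd([sys.executable, "-c", "print('haha')"])
print(code, out.strip())
```

For sox, the argument list would look like ['sox', path, '-n', 'stat'], with the statistics arriving in the stderr string.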
os.system

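Take a test script, test.sh:

#!/bin/bash
echo "hello world!"
exit 3

On Unix, os.system(cmd) returns a 16-bit value: the low byte holds the number of the signal that killed the script, and the high byte holds the script's exit status (so the high byte is usually the part you want).

>>> import os
>>> n = os.system('./test.sh')
hello world!
>>> n >> 8
3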
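The bit layout can be unpacked by hand. A minimal sketch, assuming the Unix wait-status encoding described above; decode_status is a hypothetical helper name:

```python
def decode_status(status):
    """Split a Unix os.system() return value into (exit_code, signal_number)."""
    exit_code = status >> 8         # high byte: the script's exit status
    signal_number = status & 0x7F   # low 7 bits: signal that killed it (0 if none)
    return exit_code, signal_number

# A script that ends with "exit 3" makes os.system() return 3 << 8 == 768.
print(decode_status(768))
```

On Unix, os.WIFEXITED and os.WEXITSTATUS do the same job without manual bit shifting.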

os.popen
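This call works through a pipe. The function returns a file-like object whose content is the command's standard output (roughly, whatever echo would print). To capture anything else, such as stderr, add the redirection 2>&1. Note the redirection: sox writes its stat report to stderr, which is why the sox command above ends with 2>&1.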

import os
cmd = 'echo haha'
tmp = os.popen(cmd).readlines()
print(tmp)
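The effect of the redirection can be checked with a command that writes only to stderr. A minimal sketch, assuming a POSIX shell; capture is a hypothetical helper name:

```python
import os

def capture(cmd, merge_stderr=False):
    """Return what os.popen() reads from cmd, optionally including stderr."""
    if merge_stderr:
        # Run cmd in a subshell and send the subshell's stderr into the pipe.
        cmd = "( {} ) 2>&1".format(cmd)
    with os.popen(cmd) as pipe:
        return pipe.read()

stderr_only = "echo oops 1>&2"  # writes to stderr, nothing to stdout
print(repr(capture(stderr_only)))                     # the pipe sees no output
print(repr(capture(stderr_only, merge_stderr=True)))  # the pipe now sees the message
```

The subshell parentheses matter here because the demo command carries its own redirection; for a plain command such as sox, appending 2>&1 directly is enough.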

wave lib

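This version was used to measure the speech in The World English Bible: 12 GB of files in a two-level directory tree, with a total duration of 263965.387755 s (about 73 h).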

import os
import wave
import contextlib
def get_wav_duration(fname, print_flag=0):
    with contextlib.closing(wave.open(fname,'r')) as f:
        frames = f.getnframes()
        rate = f.getframerate()
        wav_duration = frames / float(rate)
        if str(print_flag) != '0':
            print('wav time: {}'.format(wav_duration))
        return wav_duration

wavdirdir = 'WEB'
wavdirlst = os.listdir(wavdirdir)
wavlst = []
for lst in wavdirlst:
    wavdir = os.path.join(wavdirdir,lst)
    wavpath = os.listdir(wavdir)
    for wav in wavpath:
        wavlst.append(os.path.join(wavdir,wav))
total = 0 
for lst in wavlst:
    total += get_wav_duration(lst)
print(total)
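The nested loops above assume exactly two directory levels and that every file is a WAV file. A minimal sketch of a variant that handles any depth with os.walk and skips other file types; total_wav_duration and write_silence are hypothetical helper names, and the demo files are synthetic silence created only for illustration:

```python
import os
import tempfile
import wave

def total_wav_duration(root):
    """Sum the durations (in seconds) of all .wav files under root."""
    total = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".wav"):
                with wave.open(os.path.join(dirpath, name), "rb") as f:
                    total += f.getnframes() / float(f.getframerate())
    return total

def write_silence(path, seconds, rate=8000):
    """Write a mono 16-bit WAV file of silence (demo data only)."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(rate)
        f.writeframes(b"\x00\x00" * int(seconds * rate))

# Build a small throwaway tree: one file at the top, one two levels down.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "book", "chapter"))
write_silence(os.path.join(demo_root, "a.wav"), 1)
write_silence(os.path.join(demo_root, "book", "chapter", "b.wav"), 2)
demo_total = total_wav_duration(demo_root)
print(demo_total)
```

Dividing the result by 3600 gives hours, which is how the figure of about 73 h follows from 263965 s.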

posted @ 2017-11-21 15:16  战侠歌1994