Reading microphone array signals with PyAudio

1. Determine the microphone's device index

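host_api_index selects the host API, that is, the audio backend PortAudio talks to (for example ALSA, MME, or Core Audio). The two relevant methods are: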

get_host_api_info_by_index(host_api_index)

get_device_info_by_host_api_device_index(host_api_index, host_api_device_index)

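import pyaudio

p = pyaudio.PyAudio()
# Host API 0 is used here; info['deviceCount'] is the number of devices it exposes
info = p.get_host_api_info_by_index(0)
numdevices = info.get('deviceCount')

for i in range(numdevices):
    dev = p.get_device_info_by_host_api_device_index(0, i)
    # Only devices with at least one input channel can record
    if dev.get('maxInputChannels') > 0:
        # dev['index'] is the global device index that input_device_index expects
        print('Input Device id', dev.get('index'), '-', dev.get('name'))

p.terminate()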
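Scanning the printed list by eye gets tedious when many devices are present. As a minimal sketch, the filter below picks input devices by a substring of their name. The function name find_input_devices and the sample device list are hypothetical; only the 'index', 'name' and 'maxInputChannels' keys of PyAudio's device-info dicts are assumed.

```python
def find_input_devices(devices, name_part=""):
    """Return (index, name) pairs for devices that can record audio.

    `devices` is a list of dicts shaped like PyAudio's device-info dicts;
    only the 'index', 'name' and 'maxInputChannels' keys are used.
    """
    matches = []
    for info in devices:
        if info.get("maxInputChannels", 0) <= 0:
            continue  # output-only device, cannot record
        if name_part.lower() in info.get("name", "").lower():
            matches.append((info["index"], info["name"]))
    return matches


# Hypothetical device list, for illustration only.
sample_devices = [
    {"index": 0, "name": "Built-in Output", "maxInputChannels": 0},
    {"index": 1, "name": "Built-in Microphone", "maxInputChannels": 2},
    {"index": 6, "name": "ReSpeaker 4 Mic Array", "maxInputChannels": 6},
]
print(find_input_devices(sample_devices, "respeaker"))
# prints [(6, 'ReSpeaker 4 Mic Array')]
```

With a real PyAudio instance p, the list could be built as [p.get_device_info_by_index(i) for i in range(p.get_device_count())].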

2. Record audio

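import pyaudio as py
import wave
import time

# Record audio in callback mode

# Number of frames per buffer
CHUNK = 1024
# Sample width in bytes (2 bytes = 16-bit quantization)
RESPEAKER_WIDTH = 2
# Number of channels. Channel 1 is the audio processed for ASR, channels 2-5 are the raw
# single-channel signals from the individual microphones, and channel 6 is the merged audio
# (channel 6 has never shown up so far)
RESPEAKER_CHANNELS = 6
# Speech occupies a fairly narrow frequency range (mostly below 8000 Hz),
# so a 16000 Hz sample rate is the usual choice
RESPEAKER_RATE = 16000
# Recording duration in seconds
RECORD_SECONDS = 5
# Output file name
WAVE_OUTPUT_FILENAME = "output1.wav"
# Input device index (the device id found above)
RESPEAKER_INDEX = 6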

# Initialize PyAudio; this sets up the PortAudio system
p = py.PyAudio()
c = 0

# To record or play audio, open a stream on the desired device with the desired parameters.
# The callback has the signature callback(<input_data>, <frame_count>, <time_info>, <status_flag>)
# and must return a tuple containing frame_count frames of audio data and a flag signifying
# whether there are more frames to play/record.
# The callback is invoked on a separate thread and receives frame_count frames of audio data.
# In callback mode, stream.read() must not be called.


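frames = []
frames_captured = 0  # total number of frames received so far

def callback(input_data, frame_count, time_info, status):
    global c, frames_captured
    print("got chunk {} with {} frames of input_data".format(c, frame_count))
    c += 1
    frames.append(input_data)
    frames_captured += frame_count
    # Stop once RECORD_SECONDS worth of frames has been captured.
    # For an input-only stream the returned data is ignored, so return None.
    if frames_captured >= RESPEAKER_RATE * RECORD_SECONDS:
        return (None, py.paComplete)
    return (None, py.paContinue)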


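stream = p.open(
    # Sample rate
    rate=RESPEAKER_RATE,
    # Sample format (derived from the sample width)
    format=p.get_format_from_width(RESPEAKER_WIDTH),
    # Number of audio channels
    channels=RESPEAKER_CHANNELS,
    # Open the stream for recording
    input=True,
    # Which input device to use
    input_device_index=RESPEAKER_INDEX,
    # Deliver CHUNK frames per callback invocation
    frames_per_buffer=CHUNK,
    stream_callback=callback
)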
# start the stream (4)
stream.start_stream()

# wait for stream to finish (5)
while stream.is_active():
    time.sleep(0.1)

# stop stream (6)
stream.stop_stream()
stream.close()

# write the recording to a WAV file
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(RESPEAKER_CHANNELS)
wf.setsampwidth(p.get_sample_size(p.get_format_from_width(RESPEAKER_WIDTH)))
wf.setframerate(RESPEAKER_RATE)
wf.writeframes(b''.join(frames))
wf.close()

# close PyAudio (7), after its helpers are no longer needed
p.terminate()
print("ok")
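The recorded buffer is interleaved: each frame holds one 16-bit sample per channel, in channel order. To look at a single microphone, the channels have to be separated first. The following is a minimal sketch using only the standard library; split_channels is a hypothetical helper, and it assumes little-endian 16-bit samples.

```python
import struct


def split_channels(raw, n_channels, sample_width=2):
    """Split interleaved 16-bit PCM bytes into one list of samples per channel."""
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit samples")
    frame_size = sample_width * n_channels
    if len(raw) % frame_size:
        raise ValueError("buffer does not contain a whole number of frames")
    # Decode every sample, then take every n_channels-th one per channel.
    samples = struct.unpack("<%dh" % (len(raw) // sample_width), raw)
    return [list(samples[ch::n_channels]) for ch in range(n_channels)]


# Two frames of made-up 3-channel data: (1, 2, 3) then (4, 5, 6).
raw = struct.pack("<6h", 1, 2, 3, 4, 5, 6)
print(split_channels(raw, 3))
# prints [[1, 4], [2, 5], [3, 6]]
```

For the recording above, the call would be split_channels(b''.join(frames), RESPEAKER_CHANNELS).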

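Notes on the code:

This code was written mainly to test recording in callback mode, i.e. non-blocking mode, and it cleared up a few points of confusion: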

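Why non-blocking mode: the audio data has to be processed in real time, so blocking mode is not suitable; the processing of the audio signal is done inside the callback instead.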
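As an illustration of the kind of lightweight per-chunk processing that could run inside the callback, here is a minimal sketch that computes the RMS level of one buffer. chunk_rms is a hypothetical helper, not part of PyAudio, and it assumes little-endian 16-bit samples with all channels mixed together.

```python
import math
import struct


def chunk_rms(raw):
    """Root-mean-square level of a buffer of little-endian 16-bit samples."""
    count = len(raw) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


silence = struct.pack("<4h", 0, 0, 0, 0)
tone = struct.pack("<4h", 1000, -1000, 1000, -1000)
print(chunk_rms(silence), chunk_rms(tone))
# prints 0.0 1000.0
```

Inside the callback this would be called as chunk_rms(input_data). Anything much heavier is probably better handed off to another thread, since the callback has to return before the next buffer arrives.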

The callback does not run on the main thread; it is invoked on a separate thread.

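Processing logic: the stream continuously captures audio from the microphone. Each time a new buffer is available, the callback is invoked with frame_count frames of audio data. The callback processes them and returns a data buffer together with a flag saying whether to continue. If the flag is paContinue and the stream is still active, the callback is invoked again with the next frame_count frames from the stream. In other words, the stream keeps flowing, and each callback invocation handles only the current slice of it.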
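The call-and-flag cycle described above can be tried without any audio hardware. In the sketch below, fake_stream is a hypothetical driver loop that stands in for PortAudio, the two PA_* constants are local stand-ins for pyaudio.paContinue and pyaudio.paComplete, and the callback decides when to stop by counting frames (an assumption of this sketch).

```python
PA_CONTINUE = 0  # stand-in for pyaudio.paContinue
PA_COMPLETE = 1  # stand-in for pyaudio.paComplete

RATE = 16000
CHUNK = 1024
RECORD_SECONDS = 5
BYTES_PER_FRAME = 2 * 6  # 16-bit samples, 6 channels

frames = []
frames_captured = 0


def callback(in_data, frame_count, time_info, status):
    """Store one chunk and tell the caller whether to keep going."""
    global frames_captured
    frames.append(in_data)
    frames_captured += frame_count
    if frames_captured >= RATE * RECORD_SECONDS:
        return (None, PA_COMPLETE)
    return (None, PA_CONTINUE)


def fake_stream(cb):
    """Hypothetical driver loop: feeds silent chunks until cb asks to stop."""
    calls = 0
    flag = PA_CONTINUE
    while flag == PA_CONTINUE:
        chunk = bytes(CHUNK * BYTES_PER_FRAME)  # one buffer of silence
        _, flag = cb(chunk, CHUNK, None, 0)
        calls += 1
    return calls


calls = fake_stream(callback)
print(calls, len(b"".join(frames)))
# prints 79 970752
```

Five seconds at 16000 Hz is 80000 frames, which is not a multiple of 1024, so the sketch stops after the first chunk that reaches that total: 79 chunks, or 80896 frames.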

To keep the main thread from exiting while the stream is active, the main thread sleeps in a loop.
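An alternative to the sleep loop is to let the callback signal completion through a threading.Event that the main thread waits on. The sketch below is a hedged illustration only: audio_thread is a hypothetical stand-in for the thread that PortAudio runs the callback on, the PA_* constants mirror pyaudio.paContinue and pyaudio.paComplete, and stopping after three chunks is arbitrary.

```python
import threading

PA_CONTINUE = 0  # stand-in for pyaudio.paContinue
PA_COMPLETE = 1  # stand-in for pyaudio.paComplete

done = threading.Event()
received = []


def callback(in_data, frame_count, time_info, status):
    """Collect chunks and signal the main thread after the third one."""
    received.append(in_data)
    if len(received) >= 3:
        done.set()
        return (None, PA_COMPLETE)
    return (None, PA_CONTINUE)


def audio_thread():
    """Hypothetical stand-in for the thread PortAudio runs the callback on."""
    flag = PA_CONTINUE
    while flag == PA_CONTINUE:
        _, flag = callback(b"\x00\x00" * 4, 4, None, 0)


worker = threading.Thread(target=audio_thread)
worker.start()
finished = done.wait(timeout=5)  # blocks without a polling loop
worker.join()
print(finished, len(received))
# prints True 3
```

The timeout keeps the main thread from hanging forever if the callback never signals.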

posted @ 2022-01-06 12:29 longRookie
