Getting Started with TorchAudio
1. Reading and Writing
1.1 Backend
The choice of backend affects how audio files are read and written.
- Windows defaults to SoundFile
  pip3 install PySoundFile
- Mac/Linux defaults to SoX
import torchaudio

# List the available backends
print(torchaudio.list_audio_backends())
# Select a backend
torchaudio.set_audio_backend("sox_io")  # or "soundfile"
1.2 Reading and Writing Audio
Everything below assumes the SoundFile backend.
- Get the audio metadata: torchaudio.info()
  metadata = torchaudio.info(SAMPLE_WAV_PATH)
  Attributes:
  - metadata.sample_rate: the sampling rate (samples per second); the duration in seconds is num_frames / sample_rate
  - metadata.num_channels: the number of channels
  - metadata.num_frames: the number of frames per channel
  - metadata.bits_per_sample: the bit depth
  - metadata.encoding: the sample coding format
- Load audio as a Tensor: torchaudio.load()
  waveform, sample_rate = torchaudio.load(SAMPLE_WAV_PATH)  # waveform is a torch.Tensor
- Save a Tensor as audio: torchaudio.save()
  torchaudio.save(filepath, src, sample_rate)
