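Notes on using Coqui TTS, an open-source speech synthesis library
1 Introduction
Features: voice cloning and voice conversion, with multilingual support.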
GitHub https://github.com/coqui-ai/TTS
Online demo (results are not as good as the local demo) https://huggingface.co/spaces/coqui/xtts
2 Setting up a local demo
Set up the environment
conda create -n coqui python=3.10
conda activate coqui
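pip install TTS (this installs the required dependencies automatically; alternatively, install them one by one from requirements.txt)
If any other package turns out to be missing at runtime, just pip install it (there seemed to be only one).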
Download the source code and model
GitHub https://github.com/coqui-ai/TTS version dbf1a08
Model files https://huggingface.co/coqui/XTTS-v2/tree/main
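Before running anything, it can help to confirm that the downloaded model directory is complete. The following is a minimal sketch, not part of the library: the helper name missing_model_files is hypothetical, and the file list only covers the two files that the test script in these notes loads (model.pth and config.json). Extend the tuple if your setup needs more files.

```python
from pathlib import Path

# Assumed file names: model.pth and config.json are the two files the test
# script in these notes points at. Adjust the tuple if your download differs.
REQUIRED_FILES = ("model.pth", "config.json")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in model_dir."""
    base = Path(model_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    # Directory used in these notes; change it to wherever you saved the model.
    missing = missing_model_files("/home/ze/coqui/mypath/models")
    print("missing files:", missing or "none")
```

An empty list means both files are present in the directory.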
Test script
import torch
from TTS.api import TTS
## List the available models
# for name in TTS().list_models().list_models():
#     print(name)
## Init TTS: pass in the model and config file paths
device = "cuda" if torch.cuda.is_available() else "cpu" # Get device
tts = TTS(model_path="/home/ze/coqui/mypath/models/model.pth",
          config_path="/home/ze/coqui/mypath/models/config.json",
          progress_bar=True).to(device)
## Text to speech to a file
# ## English
# tts.tts_to_file(text="A short story is a piece of prose fiction. It can typically be read in a single sitting and focuses on a self-contained incident or series of linked incidents, with the intent of evoking a single effect or mood.",
#                 speaker_wav="mypath/audio/samples_en_sample.wav",
#                 language="en",
#                 file_path="output.wav")
# ## Chinese
# tts.tts_to_file(text="龙能大能小,能升能隐;大则兴云吐雾,小则隐介藏形;升则飞腾于宇宙之间,隐则潜伏于波涛之内。方今春深,龙乘时变化,犹人得志而纵横四海。",
#                 speaker_wav="mypath/audio/samples_zh-cn-sample.wav",
#                 language="zh-cn",
#                 file_path="output.wav")
## Use a Chinese reference voice and output English speech
tts.tts_to_file(text="A short story is a piece of prose fiction. It can typically be read in a single sitting and focuses on a self-contained incident or series of linked incidents, with the intent of evoking a single effect or mood.",
                speaker_wav="mypath/audio/dragon.wav",
                language="en",
                file_path="output.wav")
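All three cases in the script write to output.wav, so running more than one of them overwrites the earlier result. A small wrapper can run several cases in one go and give each its own output file. This is only a sketch: synthesize_cases and the layout of the cases dictionary are hypothetical, and the only library call it relies on is tts.tts_to_file with the same keyword arguments used in the script.

```python
import os


def synthesize_cases(tts, cases, out_dir="."):
    """Run every case through tts.tts_to_file and return the output paths.

    `tts` is any object exposing tts_to_file(text=, speaker_wav=, language=,
    file_path=), for example the TTS instance created in the test script.
    `cases` maps a short case name to a dict with the keys "text",
    "speaker_wav" and "language" (a layout chosen for this sketch).
    """
    outputs = []
    for name, case in cases.items():
        # One file per case, so results are not overwritten.
        file_path = os.path.join(out_dir, f"output_{name}.wav")
        tts.tts_to_file(text=case["text"],
                        speaker_wav=case["speaker_wav"],
                        language=case["language"],
                        file_path=file_path)
        outputs.append(file_path)
    return outputs


# Example call, reusing paths from the script:
# cases = {
#     "en": {"text": "A short story is a piece of prose fiction.",
#            "speaker_wav": "mypath/audio/samples_en_sample.wav",
#            "language": "en"},
#     "zh_voice_en_text": {"text": "A short story is a piece of prose fiction.",
#                          "speaker_wav": "mypath/audio/dragon.wav",
#                          "language": "en"},
# }
# synthesize_cases(tts, cases)
```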
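Problems encountered
Error: NotADirectoryError: [Errno 20] Not a directory: '/home/ze/coqui/mypath/models/model.pth/model.pth'
Cause: a bug in the code interface. When the model is loaded at line 192 of /home/ze/coqui/TTS-dev/TTS/utils/synthesizer.py, the call does not follow the interface definition. As the doubled path in the error shows, the checkpoint file path is treated as a directory and model.pth is appended to it.
Fix: at line 192 of /home/ze/coqui/TTS-dev/TTS/utils/synthesizer.py, in the call self.tts_model.load_checkpoint(), change the tts_checkpoint argument to the directory that contains the model, for example "/home/ze/coqui/mypath/models".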
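The doubled path in the error can be reproduced with plain path handling. The sketch below is not the library's code; it assumes, as the error message suggests, that the loader joins whatever it receives with the file name model.pth. Both function names are hypothetical, and posixpath is used so the result matches the Linux paths in these notes on any platform.

```python
import posixpath


def joined_checkpoint_path(checkpoint_dir, filename="model.pth"):
    """Mimic a loader that expects a directory and appends the file name."""
    return posixpath.join(checkpoint_dir, filename)


def model_dir_from(path):
    """Return the model directory, stripping a trailing .pth file name if present."""
    return posixpath.dirname(path) if path.endswith(".pth") else path


if __name__ == "__main__":
    file_path = "/home/ze/coqui/mypath/models/model.pth"
    # Passing the file path reproduces the path from the error message.
    print(joined_checkpoint_path(file_path))
    # Passing the containing directory yields the real checkpoint location.
    print(joined_checkpoint_path(model_dir_from(file_path)))
```

This is why passing the model directory instead of the model.pth file path resolves the error.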