Post archive for November 8, 2019 - Skye_Zhao

November 8, 2019

Utterance-level Aggregation for Speaker Recognition in The Wild

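Abstract: The paper [1] addresses speaker recognition on utterances of variable length that contain irrelevant signals. The key design choices for the deep network are the type of trunk (frame-level) network and the method of temporal aggregation. The paper…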

posted @ 2019-11-08 11:50 Skye_Zhao Views (1691) Comments (4) Recommendations (0)
