最近在折腾语音方面的东西,所以看了下GOOGL的语音识别,经过GOOGLE发现Google的语音引擎是通过http来请求的,并且已经获取到http的地址,中文的调用Url为:http://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=zh-CN 其实一看就知道替换zh-CN 成其它语言也是可以的.那么在C#中如何调用呢,直接看代码:
/// <summary> /// 调用GOOLE语音识别引擎 /// </summary> /// <returns></returns> private string GoogleSTT() { string result = string.Empty; try { string inFile = "audio.wav"; FileStream fs = new FileStream(inFile, FileMode.Open); byte[] voice = new byte[fs.Length]; fs.Read(voice, 0, voice.Length); fs.Close(); HttpWebRequest request = null; string url = "http://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=zh-CN"; Uri uri = new Uri(url); request = (HttpWebRequest)WebRequest.Create(uri); request.Method = "POST"; request.ContentType = "audio/x-flac; rate=16000"; request.ContentLength = voice.Length; using (Stream writeStream = request.GetRequestStream()) { writeStream.Write(voice, 0, voice.Length); } using (HttpWebResponse response = (HttpWebResponse)request.GetResponse()) { using (Stream responseStream = response.GetResponseStream()) { using (StreamReader readStream = new StreamReader(responseStream, Encoding.UTF8)) { result = readStream.ReadToEnd(); } } } } catch (Exception ex) { Console.WriteLine(ex.StackTrace); } return result; }
其实就是直接通过http请求,然后获取返回的识别结果,不过有需要注意的地方 request.ContentType = "audio/x-flac; rate=16000;这句写的不对会有影响的,google默认的是audio/x-flac的,支持flac格式,要想支持wav、mp3等格式,在网上查之后说只能通过格式转换,如果是直接提交wav文件是无法识别的;后来我在讯飞的语音识别引擎的时候发现他们的也是http的,但是他们的是可以这样写的audio/L16,然后我就尝试着把google的也改成这样,不出所料可以支持wav格式了,这只说明google是有支持的,可能只是有些参数由于google对api的封闭,我们不清楚而已。