Misusing StreamWriter with UTF-8 encoding can cause BOM (Byte Order Mark) problems and garbled output (repost)
Question:
I was using HttpWebRequest to try a rest api in ASP.NET Core MVC.
Here is my HttpWebRequest client code:
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://localhost:55161/Home/Testing");
    string data;
    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
    using (StreamReader reader = new StreamReader(resp.GetResponseStream(), System.Text.Encoding.UTF8))
    {
        data = reader.ReadToEnd();
    }
If I used StreamWriter to write a message to Response.Body in an ASP.NET Core controller, everything is fine:
    using (var streamWriter = new StreamWriter(Response.Body, System.Text.Encoding.UTF8))
    {
        streamWriter.Write("hello");
        streamWriter.Flush();
    }
But if I used Response.Body.Write embedded in a StreamWriter block to write the same message, there will be a weird 65279 character in the end of the string "hello" when I got it from my client code.
    using (var streamWriter = new StreamWriter(Response.Body, System.Text.Encoding.UTF8))
    {
        byte[] data = System.Text.Encoding.UTF8.GetBytes("hello");
        Response.Body.Write(data, 0, data.Length);
    }
I want to know whether this is a bug, or whether some mechanism causes this problem.
I didn't use UseBrowserLink in startup, and my ASP.NET Core version is 2.1.
Answer:
Quoting the question: there will be a weird 65279 character in the end of the string "hello" when I got it from my client code
You mean something like this?
hello[U+FEFF]
This is expected based on the code you provided. Why are you wrapping the stream in a StreamWriter, then writing to the stream directly?
The StreamWriter has a buffer that it will flush to the output when you close it. This will cause a lot of problems if you've been writing to the stream directly. Specifically, what's happening here is this:
- You wrap the Stream in the StreamWriter
- You write hello directly to the stream
- The StreamWriter is closed, so it flushes its (empty) buffer.
- Since you are using the Encoding.UTF8 encoding, the StreamWriter writes a UTF-8 Byte Order Mark to the stream as part of that flush. The BOM is the byte sequence 0xEF 0xBB 0xBF, which decodes to U+FEFF (the character 65279 you are seeing) and is only treated as a BOM when it sits at the very beginning of the stream. Since you've already written hello, the BOM lands after your hello, causing the rendering glitch above.
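The sequence described in the list can be observed outside ASP.NET Core. The following is a minimal sketch, not the original poster's code: it assumes a MemoryStream is an acceptable stand-in for Response.Body, and the class and method names (BomReproduction, WriteDirectlyInsideWriter) are made up for illustration. It repeats the failing pattern and dumps the resulting bytes, so you can check where the EF-BB-BF sequence ends up relative to the five bytes of hello.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomReproduction
{
    // Hypothetical helper: mirrors the failing controller code, with a
    // MemoryStream standing in for Response.Body (an assumption).
    public static byte[] WriteDirectlyInsideWriter()
    {
        var body = new MemoryStream();
        using (var writer = new StreamWriter(body, Encoding.UTF8))
        {
            // Written straight to the underlying stream, bypassing the writer.
            byte[] data = Encoding.UTF8.GetBytes("hello");
            body.Write(data, 0, data.Length);
        } // Disposing the writer flushes it; the encoding's preamble is emitted here.

        // MemoryStream.ToArray still works after the stream has been closed.
        return body.ToArray();
    }

    public static void Main()
    {
        // Prints the bytes in hex so the position of EF-BB-BF is visible.
        Console.WriteLine(BitConverter.ToString(WriteDirectlyInsideWriter()));
    }
}
```

If the explanation above holds, the three BOM bytes should show up after 68-65-6C-6C-6F rather than before it.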
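So when you use a StreamWriter, never also write data directly to the StreamWriter's underlying Stream object (Response.Body in this example). Doing so is very likely to make the StreamWriter append the UTF-8 BOM (Byte Order Mark) after the data you wrote. A UTF-8 BOM is only recognized correctly at the very beginning of a stream; anywhere else it is read as a garbage character, as with the hello in this example.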
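One way to follow this advice is sketched below, under the same assumption that a MemoryStream stands in for Response.Body. The class and method names are hypothetical. All data goes through the writer, and the writer is given a UTF8Encoding constructed with encoderShouldEmitUTF8Identifier set to false, so it has no preamble to emit at all. Whether you want the BOM suppressed or merely kept at the front of the stream depends on what the client expects; this sketch shows the BOM-free variant.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomFreeWriting
{
    // Hypothetical helper: writes only through the StreamWriter and uses a
    // UTF-8 encoding that does not emit a byte order mark.
    public static byte[] WriteThroughWriterWithoutBom()
    {
        var body = new MemoryStream(); // stand-in for Response.Body (assumption)
        var utf8NoBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
        using (var writer = new StreamWriter(body, utf8NoBom))
        {
            writer.Write("hello"); // nothing is written to the stream directly
        }
        return body.ToArray();
    }

    public static void Main()
    {
        // Prints the bytes in hex; only the five bytes of "hello" are expected.
        Console.WriteLine(BitConverter.ToString(WriteThroughWriterWithoutBom()));
    }
}
```

If the bytes are already in hand, as in the question, the simpler alternative is to drop the StreamWriter entirely and call Response.Body.Write on its own.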