C++ string在unicode下

VS2008编译环境下
string 不管是在unicode还是多字节字符集下。都是单字节,数字字母占一个字节,汉字占2个字节。如果想用宽字符 请用std::wstring,这个和THCAR的效果相同。当然也可以用微软的CString更方便些。

I have written before about How to use Unicode with Python, but I've never figured out how to use Unicode in Standard C before. I managed to find an extremely helpful UTF-8 and Unicode FAQ which answers most of the questions, particularly the section beginning with C Support for Unicode and UTF-8. Additionally, Tim Bray's Characters vs. Bytes provides a very readable overview of Unicode encodings.

The good news is that if you use wchar_t* strings and the family of functions related to them such as wprintf, wcslen, and wcslcat, you are dealing with Unicode values. In the C++ world, you can use std::wstring to provide a friendly interface. My only complaint is that these are 32-bit (4 byte) characters, so they are memory hogs for all languages. The reason for this choice is that it guarantees each possible character can be represented by one value. Contrary to popular belief, it is possible for a Unicode character to require multiple 16-bit values. While these characters are rare, it is dangerous to hard code that belief into your software, particularly as computers spread through more of the world, and as people create new characters.

posted @   奥雷连诺  阅读(692)  评论(0编辑  收藏  举报
编辑推荐:
· AI与.NET技术实操系列:基于图像分类模型对图像进行分类
· go语言实现终端里的倒计时
· 如何编写易于单元测试的代码
· 10年+ .NET Coder 心语,封装的思维:从隐藏、稳定开始理解其本质意义
· .NET Core 中如何实现缓存的预热?
阅读排行:
· 分享一个免费、快速、无限量使用的满血 DeepSeek R1 模型,支持深度思考和联网搜索!
· 25岁的心里话
· 基于 Docker 搭建 FRP 内网穿透开源项目(很简单哒)
· ollama系列01:轻松3步本地部署deepseek,普通电脑可用
· 闲置电脑爆改个人服务器(超详细) #公网映射 #Vmware虚拟网络编辑器
点击右上角即可分享
微信分享提示