UTF-8

http://code.alexreisner.com/articles/character-encoding.html

For each byte in UTF-8 encoded data, look at its leading bits:

if it starts with 0 it’s an ASCII character
if it starts with 10 it’s a continuation of a multi-byte character
if it starts with 110 it’s the first byte of a 2-byte character
if it starts with 1110 it’s the first byte of a 3-byte character
if it starts with 11110 it’s the first byte of a 4-byte character
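
The rules above can be sketched as a small Python function. This is only an illustration: the names `classify_utf8_byte` and `classify_bytes` and the returned labels are made up for this example, and the "invalid" case (bytes starting with 11111) is an assumption not covered by the list. It only inspects leading bits and does not validate full sequences.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify a single byte value (0-255) by its leading bits."""
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    if (b & 0b1000_0000) == 0b0000_0000:   # 0xxxxxxx
        return "ascii"
    if (b & 0b1100_0000) == 0b1000_0000:   # 10xxxxxx
        return "continuation"
    if (b & 0b1110_0000) == 0b1100_0000:   # 110xxxxx
        return "lead-2"
    if (b & 0b1111_0000) == 0b1110_0000:   # 1110xxxx
        return "lead-3"
    if (b & 0b1111_1000) == 0b1111_0000:   # 11110xxx
        return "lead-4"
    # Assumption: anything else (11111xxx) is not a valid UTF-8 byte.
    return "invalid"


def classify_bytes(data: bytes) -> list:
    """Classify every byte of an encoded string, in order."""
    return [classify_utf8_byte(b) for b in data]


if __name__ == "__main__":
    # Print each byte of a short sample in binary next to its label.
    for byte in "Aé€".encode("utf-8"):
        print(f"{byte:08b}  {classify_utf8_byte(byte)}")
```

Running the script prints one line per byte, so the bit prefix of each byte can be compared directly against the list.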
Posted on 2011-04-18 15:56 by lbsx