UTF-8 Encoding Rules

/* UTF-8 encoding
00000000-01111111  00-7F  US-ASCII (single byte)
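10000000-10111111  80-BF  Continuation byte (second, third, or fourth byte of a multi-byte sequence)
11000000-11011111  C0-DF  Start of 2-byte sequence (C0 and C1 are never valid: they would only encode overlong forms)
11100000-11101111  E0-EF  Start of 3-byte sequence
11110000-11110100  F0-F4  Start of 4-byte sequence
11110101-11111111  F5-FF  Never valid in UTF-8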
*/
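
The byte ranges above can be turned into a small classifier. The following is only a sketch: the function name `utf8_lead_length` and its return convention are assumptions made for illustration, not part of the original post. It is also slightly stricter than the bit patterns alone, since it rejects C0 and C1 (which can only begin overlong encodings) and F5-FF, none of which appear in well-formed UTF-8.

```c
/* Hypothetical helper, for illustration only.
 * Returns:
 *   1..4  the total sequence length announced by a lead byte
 *   0     the byte is a continuation byte (80-BF)
 *  -1     the byte never appears in well-formed UTF-8
 */
int utf8_lead_length(unsigned char b)
{
    if (b <= 0x7F) return 1;   /* 00-7F: US-ASCII, single byte        */
    if (b <= 0xBF) return 0;   /* 80-BF: continuation byte            */
    if (b <= 0xC1) return -1;  /* C0-C1: would start an overlong form */
    if (b <= 0xDF) return 2;   /* C2-DF: start of 2-byte sequence     */
    if (b <= 0xEF) return 3;   /* E0-EF: start of 3-byte sequence     */
    if (b <= 0xF4) return 4;   /* F0-F4: start of 4-byte sequence     */
    return -1;                 /* F5-FF: never valid                  */
}
```

For example, `utf8_lead_length(0xE4)` yields 3, because E4 begins a 3-byte sequence. Note that this checks only the first byte; a full validator would also have to check the continuation bytes and the restricted second-byte ranges after E0, ED, F0, and F4.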

posted @ 2012-06-01 12:52  super119