mysql 字符集研究
一、创建一个测试数据库 及一个测试用的表。均使用默认的编码方式。
show variables like 'char%':
mysql> show variables like 'char%'; +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+ 8 rows in set (0.00 sec)
create database dbtest
create table utf8Latin1Table(column_utf8 varchar(30),column_latin1 varchar(30));
查看库及表的编码方式:均为 latin1
mysql> show create database dbtest; +----------+-------------------------------------------------------------------+ | Database | Create Database | +----------+-------------------------------------------------------------------+ | dbtest | CREATE DATABASE `dbtest` /*!40100 DEFAULT CHARACTER SET latin1 */ | +----------+-------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> show create table test; +-------+---------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+---------------------------------------------------------------------------------------------------------------------------------+ | test | CREATE TABLE `test` ( `id` varchar(10) DEFAULT NULL, `name` varchar(30) DEFAULT NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1 | +-------+---------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
查看表中字段的编码:
mysql> show full columns from utf8Latin1Table; +---------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+ | Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment | +---------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+ | column_utf8 | varchar(30) | latin1_swedish_ci | YES | | NULL | | select,insert,update,references | | | column_latin1 | varchar(30) | latin1_swedish_ci | YES | | NULL | | select,insert,update,references | | +---------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+ 2 rows in set (0.00 sec)
修改column_utf8 的编码方式 :由latin1 ------ utf8 形式
alter table utf8Latin1Table change column_utf8 column_utf8 varchar(30) character set utf8;
修改完字段的字符集后:
mysql> show full columns from utf8Latin1Table;
+---------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+---------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| column_utf8 | varchar(30) | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| column_latin1 | varchar(30) | latin1_swedish_ci | YES | | NULL | | select,insert,update,references | |
+---------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
二、 向数据库中存入测试数据。(数据库连接工具是 UTF8 编码方式)
测试 数据为 带重音的 e : é
é 在 latin1中的编码为:E9 在utf8字符集中编码为:C3A9
mysql> insert into utf8Latin1Table values('é','é'); Query OK, 1 row affected (0.00 sec) mysql> select * from utf8Latin1Table; +-------------+---------------+ | column_utf8 | column_latin1 | +-------------+---------------+ | é | é | +-------------+---------------+ 1 row in set (0.00 sec)
查看存储的二进制内容:
mysql> select hex(column_utf8),column_utf8,hex(column_latin1),column_latin1 from utf8Latin1Table; +------------------+-------------+--------------------+---------------+ | hex(column_utf8) | column_utf8 | hex(column_latin1) | column_latin1 | +------------------+-------------+--------------------+---------------+ | C383C2A9 | é | C3A9 | é | +------------------+-------------+--------------------+---------------+ 1 row in set (0.00 sec)
虽然存入的是一样的字符,但是存入的二进制信息不同。
三、分析Insert过程 、select 过程(未写)
涉及应用程序段字符编码(数据库查询工具,网页端等等),character_set_client ,character_set_connection,character_set_results ,及数据库编码(列编码)
insert 过程: