MySQL字符集
1.查看数据库支持那些字符集
mysql> show character set; +----------+---------------------------------+---------------------+--------+ | Charset | Description | Default collation | Maxlen | +----------+---------------------------------+---------------------+--------+ | big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 | | dec8 | DEC West European | dec8_swedish_ci | 1 | | cp850 | DOS West European | cp850_general_ci | 1 | | hp8 | HP West European | hp8_english_ci | 1 | | koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 | | latin1 | cp1252 West European | latin1_swedish_ci | 1 | | latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 | | swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
2.GB2312主要包含简体中文字符及常用的符号,这种字符集对于中文字符采用双字节编码格式,也就是,一个汉字字符在存储是会占两个字节。
3.GBK:包含有中、日、韩字符集的大字符集,GB2312是GBK的一个子集。
4.UTF8:它对于英文字符使用的一个字节编码,、;
5.UTF8MB4:是UTF8的超集;
6.校对规则:一个字符集至少拥有一个校对规则,显示字符集的校对规则;
mysql> show collation like 'latin1%'; +-------------------+---------+----+---------+----------+---------+ | Collation | Charset | Id | Default | Compiled | Sortlen | +-------------------+---------+----+---------+----------+---------+ | latin1_german1_ci | latin1 | 5 | | Yes | 1 | | latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 | | latin1_danish_ci | latin1 | 15 | | Yes | 1 | | latin1_german2_ci | latin1 | 31 | | Yes | 2 | | latin1_bin | latin1 | 47 | | Yes | 1 | | latin1_general_ci | latin1 | 48 | | Yes | 1 | | latin1_general_cs | latin1 | 49 | | Yes | 1 | | latin1_spanish_ci | latin1 | 94 | | Yes | 1 | +-------------------+---------+----+---------+----------+---------+ 8 rows in set (0.01 sec)
7.MySQL服务器启动过程中,字符集设定就已经生效,系统会确定默认所使用的字符集和校对规则,字符集对整个MySQL服务的全局有效。
查看客户端的字符集:
mysql> show variables like 'character_set_client'; +----------------------+-------+ | Variable_name | Value | +----------------------+-------+ | character_set_client | utf8 | +----------------------+-------+ 1 row in set (0.01 sec)
8.测试字符串问题
mysql> insert into test values('你好'); ERROR 1366 (HY000): Incorrect string value: '\xE4\xBD\xA0\xE5\xA5\xBD' for column 'te' at row 1
查看表格默认的字符格式:
mysql> show create table test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`te` varchar(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
将表格字符集改为uft8
mysql> alter table test charset utf8;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show create table test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`te` varchar(20) CHARACTER SET latin1 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
继续插入还是有问题:
mysql> insert into test values('你好');
ERROR 1366 (HY000): Incorrect string value: '\xE4\xBD\xA0\xE5\xA5\xBD' for column 'te' at row 1
继续修改列的字符类型:
mysql> alter table test modify te varchar(20) character set utf8 default null;
Query OK, 0 rows affected (0.16 sec)
Records: 0 Duplicates: 0 Warnings: 0
插入成功:
mysql> insert into test values('你好');
Query OK, 1 row affected (0.01 sec)
查看成乱码:
mysql> select * from test;
+------+
| te |
+------+
| ţº` |
+------+
1 row in set (0.00 sec)
修改字符集的一般操作:
mysql> set character_set_connection=gbk;
Query OK, 0 rows affected (0.00 sec)
mysql> set character_set_client=gbk;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like '%char%';
+--------------------------------------+----------------------------+
| Variable_name | Value |
+--------------------------------------+----------------------------+
| character_set_client | gbk |
| character_set_connection | gbk |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
| validate_password_special_char_count | 1 |
+--------------------------------------+----------------------------+
9.MySQ响应客户端操作的过程
(1)客户端发出的SQL语句,所使用的字符集有系统变量character_set_client来指定;
(2)MySQL服务端收到语句后,会用到character_set_connection和collation_connection两个系统变量中的设置,并且将客户端发送的语句字符集由character_set_client转换到character_set_collection;对于语句的字符串的比较或者排序,还需要用到collation_connection中指定的校对规则处理,而对语句中指定列的比较则无关collation_connection的设置。
10.一次搞定与连接相关的字符串字符集变量设置 set names 字符集,第二种方法 set character set 字符集;
mysql> show variables like '%char%'; +--------------------------------------+----------------------------+ | Variable_name | Value | +--------------------------------------+----------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | | validate_password_special_char_count | 1 | +--------------------------------------+----------------------------+ 9 rows in set (0.02 sec) mysql> set names gbk; Query OK, 0 rows affected (0.00 sec) mysql> show variables like '%char%'; +--------------------------------------+----------------------------+ | Variable_name | Value | +--------------------------------------+----------------------------+ | character_set_client | gbk | | character_set_connection | gbk | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | gbk | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | | validate_password_special_char_count | 1 | +--------------------------------------+----------------------------+ 9 rows in set (0.01 sec)
第二种方法 set character set latin1
mysql> show variables like '%char%'; +--------------------------------------+----------------------------+ | Variable_name | Value | +--------------------------------------+----------------------------+ | character_set_client | gbk | | character_set_connection | gbk | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | gbk | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | | validate_password_special_char_count | 1 | +--------------------------------------+----------------------------+ 9 rows in set (0.00 sec) mysql> set character set latin1; Query OK, 0 rows affected (0.00 sec)
mysql> show variables like '%char%';
+--------------------------------------+----------------------------+
| Variable_name | Value |
+--------------------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
| validate_password_special_char_count | 1 |
+--------------------------------------+----------------------------+
9 rows in set (0.02 sec)
11.固话连接时候的字符集设置
[mysql] default-character-set=gbk
12.字符集的指定
字符集的指定有四种粒度,server级别,数据库级别,表级别,列级别,依次递增。
server级别指定
在数据库安装或者启动时候,或者修改配置文件,或者运行期间都可以设置,其控制参数主要为:
--character_set_server 指定全局粒度的默认字符集
--collation_server 指定全局粒度的校对规则
在配置文件中指定
character-set-server=utf8 collation-server=utf8_general_ci
数据库级别指定:
可以在创建数据库时候指定,也可以在创建后指定;
mysql> create database yanbu charset latin1; Query OK, 1 row affected (0.00 sec) mysql> show create database yanbu; +----------+------------------------------------------------------------------+ | Database | Create Database | +----------+------------------------------------------------------------------+ | yanbu | CREATE DATABASE `yanbu` /*!40100 DEFAULT CHARACTER SET latin1 */ | +----------+------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> alter database yanbu charset utf8; Query OK, 1 row affected (0.00 sec) mysql> show create database yanbu; +----------+----------------------------------------------------------------+ | Database | Create Database | +----------+----------------------------------------------------------------+ | yanbu | CREATE DATABASE `yanbu` /*!40100 DEFAULT CHARACTER SET utf8 */ | +----------+----------------------------------------------------------------+ 1 row in set (0.01 sec)
表级别指定
在创建表的时候也可以给表指定字符集及校对规则,如果没有指定,则会集成数据库级别的字符集
mysql> show variables like 'char%'; +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+ 8 rows in set (0.04 sec) mysql> create table t3(id int); Query OK, 0 rows affected (0.03 sec) mysql> create table t4(id int) charset utf8; Query OK, 0 rows affected (0.03 sec) mysql> show create table t3; +-------+----------------------------------------------------------------------------------------+ | Table | Create Table | +-------+----------------------------------------------------------------------------------------+ | t3 | CREATE TABLE `t3` ( `id` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | +-------+----------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> show create table t4; +-------+--------------------------------------------------------------------------------------+ | Table | Create Table | +-------+--------------------------------------------------------------------------------------+ | t4 | CREATE TABLE `t4` ( `id` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +-------+--------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
列级别指定字符集:
列的字符类型可以在创建列是指定,也可以用alter table来修改;列级别的字符集转换有可能会丢数据。
mysql> create table t5(v1 varchar(20) charset latin1,v2 varchar(20) charset gbk,v3 varchar(20) charset utf8); Query OK, 0 rows affected (0.04 sec) mysql> show create table t5; +-------+--------------------------------------------------------------------------------------------------------------------------- | Table | Create Table | +-------+--------------------------------------------------------------------------------------------------------------------------- `v1` varchar(20) DEFAULT NULL, `v2` varchar(20) CHARACTER SET gbk DEFAULT NULL, `v3` varchar(20) CHARACTER SET utf8 DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | +-------+--------------------------------------------------------------------------------------------------------------------------- 1 row in set (0.00 sec)