MySQL字符集那些事
1.、查看mysql当前使用的字符集。
登录mysql 在mysql 里输入 show variables like 'character_set%'
mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec)
说明:show variables 是查看mysql的变量 like 是模糊查询 后面的%号是指任意字符 相当于linux里的*,以上语句表示查看mysql 里的以character_set开头变量的值。从上面的表中我们可以看到mysql 客户端是用的latin1 mysql连接也是latin1 还有mysql返回也是latin1。接下来我们看下我们要测试的库和表的字符集。
mysql> show create database qiuhom; +----------+-----------------------------------------------------------------+ | Database | Create Database | +----------+-----------------------------------------------------------------+ | qiuhom | CREATE DATABASE `qiuhom` /*!40100 DEFAULT CHARACTER SET utf8 */ | +----------+-----------------------------------------------------------------+ 1 row in set (0.00 sec)
说明:以上语句可以看出我们qiuhom库是用的utf8字符集
mysql> use qiuhom Database changed mysql> show tables; +------------------+ | Tables_in_qiuhom | +------------------+ | test | +------------------+ 1 row in set (0.00 sec) mysql> show create table test; +-------+-------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+-------------------------------------------------------------------------------------------------------------------------------------------------------+ | test | CREATE TABLE `test` ( `id` int(4) NOT NULL AUTO_INCREMENT, `name` char(8) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +-------+-------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
说明:以上语句可以看出我们test表用的是utf8字符集。
现在我们往test里面插入数据看看是什么情况呢
mysql> insert into test values(1,'a1'),(2,'a2'); Query OK, 2 rows affected (0.01 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> select * from test; +----+------+ | id | name | +----+------+ | 1 | a1 | | 2 | a2 | +----+------+ 2 rows in set (0.00 sec)
可以看出 当我们用latin1的字符集往utf8的库表插入非中文字符数可以正常显示的。
那我们现在来插入中文字符来看看会是什么结果呢
mysql> insert into test values(3,"小明"),(4,"小红"); Query OK, 2 rows affected (0.01 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> select * from test; +----+--------+ | id | name | +----+--------+ | 1 | a1 | | 2 | a2 | | 3 | 小明 | | 4 | 小红 | +----+--------+ 4 rows in set (0.00 sec)
恩?怎么中文也可以显示呢,我们这样退出mysql重新进入查看数呢?
mysql> select * from test; +----+--------+ | id | name | +----+--------+ | 1 | a1 | | 2 | a2 | | 3 | å°æ˜Ž | | 4 | å°çº¢ | +----+--------+ 4 rows in set (0.00 sec)
为什么我们退出了mysql重新登录后就乱码了呢?
我们在来看看系统的字符集
mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec)
可以看出现在系统的字符集变成了utf8,以上例子说明了一点就是,当库表的字符集和你当前数据库的字符集不同时 我们插入数据(特别是中文)时一定要调整字符集,一般是调整当前字符集和库表一致,否则会导致乱码。
2、说说设置字符集的方法
2.1临时设置方法 set names +字符集
mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec) mysql> set names latin1; Query OK, 0 rows affected (0.00 sec) mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec)
set names 是临时生效的方式设置,当我们退出mysql后重新登录字符集又会变成设置前的字符集。当然临时设置的方式有很多,但是不管以怎么的方式 最后都是以执行set names +字符集生效的
比如 我们可以在导入数据的时候指定字符集 mysql -uroot -padmin --default-character-set=utf8 < /work/backup.sql
我们首先我们用mysqldump把qiuhom库给逻辑备份出来,并指定字符集
[root@qiuhom ~]# mysqldump -uroot -padmin --default-character-set=latin1 -B qiuhom > /work/backup.sql
如果执行以上命令没有报错就表示备份成功,简单说下mysqldump命令吧,它作用就是把mysql里面的数据以sql语句的形式给备份出来,备份出来的不是实际的数据文件,而是一堆sql语句,其中-B的作用是指定要备份的数据库,可以同时备份多个库。
以上语句我们指定以latin1的字符集导出数据是防止我们到处的数据是乱码。我们可以vim 查看导出的SQL语句
[root@qiuhom ~]# vim /work/backup.sql 1 -- MySQL dump 10.13 Distrib 5.5.32, for Linux (x86_64) 2 -- 3 -- Host: localhost Database: qiuhom 4 -- ------------------------------------------------------ 5 -- Server version 5.5.32-log 6 7 /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */; 8 /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */; 9 /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */; 10 /*!40101 SET NAMES latin1 */; 11 /*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */; 12 /*!40103 SET TIME_ZONE='+00:00' */; 13 /*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */; 14 /*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */; 15 /*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */; 16 /*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */; 17 18 -- 19 -- Current Database: `qiuhom` 20 -- 21 23 24 USE `qiuhom`; 25 26 -- 27 -- Table structure for table `test` 28 -- 29 30 DROP TABLE IF EXISTS `test`; 31 /*!40101 SET @saved_cs_client = @@character_set_client */; 32 /*!40101 SET character_set_client = utf8 */; 33 CREATE TABLE `test` ( 34 `id` int(4) NOT NULL AUTO_INCREMENT, 35 `name` char(8) DEFAULT NULL, 36 PRIMARY KEY (`id`) 37 ) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8; 38 /*!40101 SET character_set_client = @saved_cs_client */; 39 40 -- 41 -- Dumping data for table `test` 42 -- 43 44 LOCK TABLES `test` WRITE; 45 /*!40000 ALTER TABLE `test` DISABLE KEYS */; 46 INSERT INTO `test` VALUES (1,'a1'),(2,'a2'),(3,'小明'),(4,'小红'); 47 /*!40000 ALTER TABLE `test` ENABLE KEYS */; 48 UNLOCK TABLES; 49 /*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */; 50 51 /*!40101 SET SQL_MODE=@OLD_SQL_MODE */; 52 /*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */; 53 /*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */; 54 /*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */; 55 /*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */; 56 /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */; 57 /*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */; 58 59 -- Dump completed on 2018-10-05 11:05:28
可以看出我们导出的sql语句插入语句的数据是不乱码的。如果我们不指定字符集,导出来就是乱码的,它默认以库表的字符集导出数据。
然后我们把qiuhom库给drop掉。然后在通过备份还原。
mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | oldboy | | performance_schema | | qiuhom | +--------------------+ 5 rows in set (0.00 sec) mysql> drop database qiuhom; Query OK, 1 row affected (0.01 sec) mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | oldboy | | performance_schema | +--------------------+ 4 rows in set (0.00 sec)
可以看出现在qiuhom库已经给删除掉了。接下来我们用刚才备份的sql文件,以指定的字符集导入数据。
指定字符集latin1导入数据
[root@qiuhom ~]# mysql -uroot -padmin --default-character-set=latin1 < /work/backup.sql [root@qiuhom ~]# mysql -uroot -padmin -e "use qiuhom;select * from test;" +----+--------------+ | id | name | +----+--------------+ | 1 | a1 | | 2 | a2 | | 3 | å°æ˜Ž | | 4 | å°çº¢ | +----+--------------+
查看数据还是乱码,这个乱码的原因上面已经解释过了,是因为插入数据以latin1的字符集插入,而系统本身是utf8的字符集,所以乱码,我们可以指定字符集查看
[root@qiuhom ~]# mysql -uroot -padmin --default-character-set=latin1 -e "use qiuhom;select * from test;" +----+--------+ | id | name | +----+--------+ | 1 | a1 | | 2 | a2 | | 3 | 小明 | | 4 | 小红 | +----+--------+ [root@qiuhom ~]# mysql -uroot -padmin -e "show variables like 'character_set%';" +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+
这时就不乱码,是因为我们指定了临时的字符集为latin1,系统本身还是utf8。所以 --default-character-set 是指定临时字符集的方式之一。
说明:我们这里容易产生一个误区就是说,字符集和库表的字符相同后,库表里的内容就不乱码了,不是这样的。库表里的内容乱不乱码在于你插入数据时指定的字符集和你查看数据是指定的字符集是否一样,所以不管你以什么样的方式导入和导出 我们要查看数据是指定插入数据时的字符集就不乱码,否则怎么都乱码。所以在往别人的库表里插入数据时一定要注意看库表的默认字符集是否和系统默认字符集是否一致,不一致用set names 调成一致在插入数据,这样就不乱码,比如库表的字符集是latin1 而我们系统默认字符集是utf8 那么我们先把系统字符用set names latin1 临时更换成latin1,然后在插入数据,这样插入的数据字符集才是和库表字符集一致。
以上用set names 的方式和 --default-character-set 的方式都是临时生效,当我们退出mysql登录时,在进就失效了,反反复复很麻烦有没有什么办法搞成永久的呢?答案是肯定有,我们在mysql的配置文件更改字符集这样我们每次登录进入mysql就是配置文件的字符集了
[client] default-character-set=latin1 port = 3306 socket = /application/mysql-5.5.32/tmp/mysql.sock
我们在/etc/my.cnf里找到[client] 在[client]下加入default-character-set=latin1,表示mysql客户端的字符集为latin1
[mysqld] character-set-server=latin1
找到[mysqld] 在其下面加入character-set-server=latin1 ,表示mysql服务端的字符集为latin1。
说明:以上就是永久更改字符集配置文件需要修改的地方,值得注意的是修改client 不需要重启服务,退出数据库重新进入字符集就发生了变化,但是服务端必须要重启服务,还有就是mysql5.1之前在服务端更字符集应该写成default-character-set=latin1,本人是用的mysql5.5 。
修改好配置文件 重新启动mysql服务 我们查看字符集
[root@qiuhom ~]# mysql -uroot -padmin -e "show variables like 'character_set%';" +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+
这样我们以后每次登录进mysql就都是latin1的字符集了。
3、建库建表指定字符集
3.1建库指定字符集。
mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | qiuhom | +--------------------+ 4 rows in set (0.00 sec) mysql> create database qiuhom_gbk default character set gbk collate gbk_chinese_ci; Query OK, 1 row affected (0.00 sec) mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | qiuhom | | qiuhom_gbk | +--------------------+ 5 rows in set (0.00 sec) mysql> show create database qiuhom_gbk; +------------+--------------------------------------------------------------------+ | Database | Create Database | +------------+--------------------------------------------------------------------+ | qiuhom_gbk | CREATE DATABASE `qiuhom_gbk` /*!40100 DEFAULT CHARACTER SET gbk */ | +------------+--------------------------------------------------------------------+ 1 row in set (0.00 sec)
提示:查看校对规则 可以用show character set;语句在mysql里查字符集对应校对规则。
3.2建表指定字符集
mysql> create table test( -> id int not null, -> name char not null, -> primary key(id) -> )engine=innodb default charset=latin1; Query OK, 0 rows affected (0.00 sec) mysql> show tables; +----------------------+ | Tables_in_qiuhom_gbk | +----------------------+ | test | +----------------------+ 1 row in set (0.00 sec) mysql> show create table test; +-------+---------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+---------------------------------------------------------------------------------------------------------------------------------------+ | test | CREATE TABLE `test` ( `id` int(11) NOT NULL, `name` char(1) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | +-------+---------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
提示:创建表的后面可以不用写engine=innodb,因为mysql5.5是默认引擎是innodb
4、总结
关于mysql字符集的问题 我们要有一个思想就是要统一字符集思想,怎么统一字符集呢,如果有中英字符混杂的环境建议用utf8字符集。统一字符集我们从linux系统 到mysql 系统 再到应用程序。具体设置方式如下:
1.调整linux服务端系统字符集
[root@qiuhom ~]# cat /etc/sysconfig/i18n LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 [root@qiuhom ~]# echo $LANG en_US.UTF-8
如果是echo $LANG返回的是我们设置的字符集,说明linux服务端的字符集设置好了
2.调整linux客户端字符集,所谓Linux客户端就是我们用的xshell和SecureCRT远程连接软件。
2.1SecureCRT设置方法和位置
2.2xshell设置位置
3.调整mysql客户端的字符集
3.1临时设置方式
方法一:
mysql> set names latin1; Query OK, 0 rows affected (0.00 sec) mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec)
方法二:
[root@qiuhom ~]# mysql -uroot -padmin --default-character-set=latin1 Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 5.5.32-log Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec)
3.2永久设置编辑/etc/my.cnf找到[client]
[client] default-character-set=latin1
然后再进mysql字符集就变成latin1了
4.设置mysql服务端字符集
4.1临时设置
mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec) mysql> set character_set_server=latin1; Query OK, 0 rows affected (0.00 sec) mysql> show variables like 'character_set%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /application/mysql-5.5.32/share/charsets/ | +--------------------------+-------------------------------------------+ 8 rows in set (0.00 sec)
4.2永久设置,在my.cnf里找到[mysqld]
[mysqld] character-set-server=latin1
说明:mysql5.1之前 是 default-character-set=latin1
5.创建库设置字符集
mysql> create database qiuhom_gbk default character set gbk collate gbk_chinese_ci; Query OK, 1 row affected (0.00 sec) mysql> show create database qiuhom_gbk; +------------+--------------------------------------------------------------------+ | Database | Create Database | +------------+--------------------------------------------------------------------+ | qiuhom_gbk | CREATE DATABASE `qiuhom_gbk` /*!40100 DEFAULT CHARACTER SET gbk */ | +------------+--------------------------------------------------------------------+ 1 row in set (0.00 sec)
6.创建表设置字符集
mysql> create table test( -> id int not null, -> name char not null, -> primary key(id) -> )engine=innodb default charset=latin1; Query OK, 0 rows affected (0.00 sec) mysql> show create table test; +-------+---------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+---------------------------------------------------------------------------------------------------------------------------------------+ | test | CREATE TABLE `test` ( `id` int(11) NOT NULL, `name` char(1) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | +-------+---------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
统一以上6步的字符集就不会出现乱码。