Collation name suffixes: _cs, _ci, or _bin

High Performance MySQL, Third Edition
by Baron Schwartz, Peter Zaitsev, and Vadim Tkachenko
 
http://dev.mysql.com/doc/refman/5.7/en/charset-general.html
DROP TABLE IF EXISTS `w_ci_bin_cs`;
CREATE TABLE `w_ci_bin_cs` (
  `pkey` int(11) NOT NULL AUTO_INCREMENT,
  -- `w` and `w_ci` inherit the table default: utf8 with utf8_general_ci (case-insensitive)
  `w` char(255) NOT NULL DEFAULT 'W',
  `w_ci` char(255) NOT NULL DEFAULT 'a',
  -- the remaining columns are declared with the binary collation utf8_bin (case-sensitive)
  `w_bin` char(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'bin',
  `w_ci_bin` char(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'ci_bin',
  `w__bin` char(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '__bin',
  PRIMARY KEY (`pkey`)
) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of w_ci_bin_cs
-- ----------------------------
INSERT INTO `w_ci_bin_cs` VALUES ('1', 'w', 'a', 'bin', 'ci_bin', '__bin');
INSERT INTO `w_ci_bin_cs` VALUES ('2', 'W', 'A', 'BIN', 'CI_BIN', 'BIN');

 

mysql> SELECT * FROM w_ci_bin_cs;
+------+---+------+-------+----------+--------+
| pkey | w | w_ci | w_bin | w_ci_bin | w__bin |
+------+---+------+-------+----------+--------+
|    1 | w | a    | bin   | ci_bin   | __bin  |
|    2 | W | A    | BIN   | CI_BIN   | BIN    |
+------+---+------+-------+----------+--------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM w_ci_bin_cs WHERE w='w';
+------+---+------+-------+----------+--------+
| pkey | w | w_ci | w_bin | w_ci_bin | w__bin |
+------+---+------+-------+----------+--------+
|    1 | w | a    | bin   | ci_bin   | __bin  |
|    2 | W | A    | BIN   | CI_BIN   | BIN    |
+------+---+------+-------+----------+--------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM w_ci_bin_cs WHERE w_ci='a';
+------+---+------+-------+----------+--------+
| pkey | w | w_ci | w_bin | w_ci_bin | w__bin |
+------+---+------+-------+----------+--------+
|    1 | w | a    | bin   | ci_bin   | __bin  |
|    2 | W | A    | BIN   | CI_BIN   | BIN    |
+------+---+------+-------+----------+--------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM w_ci_bin_cs WHERE w_bin='BIN';
+------+---+------+-------+----------+--------+
| pkey | w | w_ci | w_bin | w_ci_bin | w__bin |
+------+---+------+-------+----------+--------+
|    2 | W | A    | BIN   | CI_BIN   | BIN    |
+------+---+------+-------+----------+--------+
1 row in set (0.00 sec)

 

 
 
 

11.1.1 Character Sets and Collations in General

A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set.

Suppose that we have an alphabet with four letters: A, B, a, b. We give each letter a number: A = 0, B = 1, a = 2, b = 3. The letter A is a symbol, the number 0 is the encoding for A, and the combination of all four letters and their encodings is a character set.

Suppose that we want to compare two string values, A and B. The simplest way to do this is to look at the encodings: 0 for A and 1 for B. Because 0 is less than 1, we say A is less than B. What we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in this case): “compare the encodings.” We call this simplest of all possible collations a binary collation.

But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters a and b as equivalent to A and B; (2) then compare the encodings. We call this a case-insensitive collation. It is a little more complex than a binary collation.
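The two kinds of collation described above can be tried directly on string literals. This is a minimal sketch, assuming a MySQL 5.7 server (the version of the manual linked at the top) where the utf8 character set and its utf8_bin and utf8_general_ci collations are available. The column aliases are made up for illustration.

```sql
-- Binary collation: compare the encodings, so 'a' (0x61) and 'A' (0x41) differ.
SELECT _utf8'a' = _utf8'A' COLLATE utf8_bin AS equal_bin;        -- 0

-- Case-insensitive collation: treat lettercase as equivalent, then compare.
SELECT _utf8'a' = _utf8'A' COLLATE utf8_general_ci AS equal_ci;  -- 1

-- Ordering changes as well: by encoding, 'B' (0x42) sorts before 'a' (0x61);
-- ignoring lettercase, 'B' sorts after 'a'.
SELECT _utf8'B' < _utf8'a' COLLATE utf8_bin        AS b_before_a_bin,  -- 1
       _utf8'B' < _utf8'a' COLLATE utf8_general_ci AS b_before_a_ci;   -- 0
```

The _utf8 introducer is used so that the result does not depend on the connection character set.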

In real life, most character sets have many characters: not just A and B but whole alphabets, sometimes multiple alphabets or eastern writing systems with thousands of characters, along with many special symbols and punctuation marks. Also in real life, most collations have many rules, not just for whether to distinguish lettercase, but also for whether to distinguish accents (an “accent” is a mark attached to a character as in German Ö), and for multiple-character mappings (such as the rule that Ö = OE in one of the two German collations).
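The German rule mentioned above can be checked with the two latin1 German collations that MySQL ships, latin1_german1_ci and latin1_german2_ci. This is a sketch under the assumption that both collations are compiled into the server; the column aliases are illustrative only.

```sql
-- X'D6' is the latin1 encoding of Ö; writing it in hex avoids any
-- dependence on the encoding used by the client.
SELECT
  -- dictionary order (german1): the manual documents Ö as equal to O
  _latin1 X'D6' = _latin1'O'  COLLATE latin1_german1_ci AS o_umlaut_equals_o,
  -- phone-book order (german2): the manual documents Ö as equal to OE
  _latin1 X'D6' = _latin1'OE' COLLATE latin1_german2_ci AS o_umlaut_equals_oe;
```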

MySQL can do these things for you:

  • Store strings using a variety of character sets.

  • Compare strings using a variety of collations.

  • Mix strings with different character sets or collations in the same server, the same database, or even the same table.

  • Enable specification of character set and collation at any level.
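As a rough illustration of the last two points, the sketch below mixes character sets and collations inside one table and then overrides the collation for a single comparison. The table mixed_collations and its columns are hypothetical; the second statement reuses the w_ci_bin_cs table defined earlier in these notes.

```sql
-- One table, two character sets, three collations (hypothetical table).
CREATE TABLE mixed_collations (
  name_default VARCHAR(50),  -- inherits utf8 / utf8_general_ci from the table default
  name_bin     VARCHAR(50) CHARACTER SET utf8   COLLATE utf8_bin,
  name_german  VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_german2_ci
) DEFAULT CHARSET=utf8;

-- Collation can also be chosen per expression: force a binary comparison
-- on the otherwise case-insensitive column `w`.
SELECT pkey, w
FROM w_ci_bin_cs
WHERE w COLLATE utf8_bin = 'w';  -- matches only the row with pkey = 1
```

Compare this with the plain WHERE w='w' query in the session above, which returns both rows.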

To use these features effectively, you must know what character sets and collations are available, how to change the defaults, and how they affect the behavior of string operators and functions.
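A few statements that may help with that inventory, assuming MySQL 5.7 and the w_ci_bin_cs table from earlier in these notes. The output differs from server to server, so none is shown here.

```sql
-- Which character sets exist, and what is the default collation of each?
SHOW CHARACTER SET LIKE 'utf8%';

-- Which collations are available for one character set?
SHOW COLLATION WHERE Charset = 'utf8';

-- What are the current server, database, and connection defaults?
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

-- Which collation did each column of a table end up with?
SHOW FULL COLUMNS FROM w_ci_bin_cs;

-- Change the defaults for the current connection only.
SET NAMES utf8 COLLATE utf8_general_ci;
```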

 

// The minimalist principle: KEEP IT SIMPLE

For sanity’s sake, it’s best to choose sensible defaults on the server level, and perhaps on the database level. Then you can deal with special exceptions on a case-by-case basis, probably at the column level.
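One way this advice might look in practice: set the default once at the database level and override it for a single column that needs exact matching. The database, table, and column names below are hypothetical. The server-level default is normally set in the server configuration through character_set_server and collation_server, so it is not shown.

```sql
-- Sensible default, declared once at the database level.
CREATE DATABASE IF NOT EXISTS keep_it_simple
  DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE TABLE keep_it_simple.accounts (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  email     VARCHAR(255) NOT NULL,  -- inherits utf8_general_ci from the database
  -- the special exception, handled at the column level: tokens must match byte for byte
  api_token CHAR(40) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
);

-- Confirm which collation each column received.
SHOW FULL COLUMNS FROM keep_it_simple.accounts;
```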
 
 
posted @ 2016-09-03 23:34  papering