MySQL 8.0 Reference Manual (Reading Notes, Part 41: Data Types (3))

1. Data Type Default Values

Data type specifications can have explicit or implicit default values.

A DEFAULT value clause in a data type specification explicitly indicates a default value for a column. Examples:

CREATE TABLE t1 (
 i INT DEFAULT -1,
 c VARCHAR(10) DEFAULT '',
 price DOUBLE(16,2) DEFAULT 0.00
);

SERIAL DEFAULT VALUE is a special case. In the definition of an integer column, it is an alias for NOT NULL AUTO_INCREMENT UNIQUE.

Some aspects of explicit DEFAULT clause handling are version dependent, as described in the following sections.

1.1 Explicit Default Handling as of MySQL 8.0.13

The default value specified in a DEFAULT clause can be a literal constant or an expression. With one exception, enclose expression default values within parentheses to distinguish them from literal constant default values. Examples:

CREATE TABLE t1 (
 -- literal defaults
 i INT DEFAULT 0,
 c VARCHAR(10) DEFAULT '',
 -- expression defaults
 f FLOAT DEFAULT (RAND() * RAND()),
 b BINARY(16) DEFAULT (UUID_TO_BIN(UUID())),
 d DATE DEFAULT (CURRENT_DATE + INTERVAL 1 YEAR),
 p POINT DEFAULT (Point(0,0)),
 j JSON DEFAULT (JSON_ARRAY())
);

The exception is that, for TIMESTAMP and DATETIME columns, you can specify the CURRENT_TIMESTAMP function as the default, without enclosing parentheses.

The BLOB, TEXT, GEOMETRY, and JSON data types can be assigned a default value only if the value is written as an expression, even if the expression value is a literal:

• This is permitted (literal default specified as expression):

CREATE TABLE t2 (b BLOB DEFAULT ('abc'));

• This produces an error (literal default not specified as expression):

CREATE TABLE t2 (b BLOB DEFAULT 'abc');

Expression default values must adhere to the following rules. An error occurs if an expression contains disallowed constructs.

• Literals, built-in functions (both deterministic and nondeterministic), and operators are permitted.

• Subqueries, parameters, variables, stored functions, and loadable functions are not permitted.

• An expression default value cannot depend on a column that has the AUTO_INCREMENT attribute.

• An expression default value for one column can refer to other table columns, with the exception that references to generated columns or columns with expression default values must be to columns that occur earlier in the table definition. That is, expression default values cannot contain forward references to generated columns or columns with expression default values.

The ordering constraint also applies to the use of ALTER TABLE to reorder table columns. If the resulting table would have an expression default value that contains a forward reference to a generated column or column with an expression default value, the statement fails.

Note: If any component of an expression default value depends on the SQL mode, different results may occur for different uses of the table unless the SQL mode is the same during all uses.

For CREATE TABLE ... LIKE and CREATE TABLE ... SELECT, the destination table preserves expression default values from the original table.

If an expression default value refers to a nondeterministic function, any statement that causes the expression to be evaluated is unsafe for statement-based replication. This includes statements such as INSERT and UPDATE. In this situation, if binary logging is disabled, the statement is executed as normal. If binary logging is enabled and binlog_format is set to STATEMENT, the statement is logged and executed but a warning message is written to the error log, because replication slaves might diverge. When binlog_format is set to MIXED or ROW, the statement is executed as normal.

When inserting a new row, the default value for a column with an expression default can be inserted either by omitting the column name or by specifying the column as DEFAULT (just as for columns with literal defaults):

mysql> CREATE TABLE t4 (uid BINARY(16) DEFAULT (UUID_TO_BIN(UUID())));
mysql> INSERT INTO t4 () VALUES();
mysql> INSERT INTO t4 () VALUES(DEFAULT);
mysql> SELECT BIN_TO_UUID(uid) AS uid FROM t4;
+--------------------------------------+
| uid                                  |
+--------------------------------------+
| f1109174-94c9-11e8-971d-3bf1095aa633 |
| f110cf9a-94c9-11e8-971d-3bf1095aa633 |
+--------------------------------------+

However, the use of DEFAULT(col_name) to specify the default value for a named column is permitted only for columns that have a literal default value, not for columns that have an expression default value.

Not all storage engines permit expression default values. For those that do not, an ER_UNSUPPORTED_ACTION_ON_DEFAULT_VAL_GENERATED error occurs.

If a default value evaluates to a data type that differs from the declared column type, implicit coercion to the declared type occurs according to the usual MySQL type-conversion rules.

1.2 Explicit Default Handling Prior to MySQL 8.0.13

With one exception, the default value specified in a DEFAULT clause must be a literal constant; it cannot be a function or an expression. This means, for example, that you cannot set the default for a date column to be the value of a function such as NOW() or CURRENT_DATE. The exception is that, for TIMESTAMP and DATETIME columns, you can specify CURRENT_TIMESTAMP as the default.

The BLOB, TEXT, GEOMETRY, and JSON data types cannot be assigned a default value.

If a default value evaluates to a data type that differs from the declared column type, implicit coercion to the declared type occurs according to the usual MySQL type-conversion rules.

1.3 Implicit Default Handling

If a data type specification includes no explicit DEFAULT value, MySQL determines the default value as follows:

• If the column can take NULL as a value, the column is defined with an explicit DEFAULT NULL clause.

• If the column cannot take NULL as a value, MySQL defines the column with no explicit DEFAULT clause.

For data entry into a NOT NULL column that has no explicit DEFAULT clause, if an INSERT or REPLACE statement includes no value for the column, or an UPDATE statement sets the column to NULL, MySQL handles the column according to the SQL mode in effect at the time:

• If strict SQL mode is enabled, an error occurs for transactional tables and the statement is rolled back. For nontransactional tables, an error occurs, but if this happens for the second or subsequent row of a multiple-row statement, the preceding rows are inserted.

• If strict mode is not enabled, MySQL sets the column to the implicit default value for the column data type.

Suppose that a table t is defined as follows:

CREATE TABLE t (i INT NOT NULL);

In this case, i has no explicit default, so in strict mode each of the following statements produces an error and no row is inserted. When not using strict mode, only the third statement produces an error; the implicit default is inserted for the first two statements, but the third fails because DEFAULT(i) cannot produce a value:

INSERT INTO t VALUES();
INSERT INTO t VALUES(DEFAULT);
INSERT INTO t VALUES(DEFAULT(i));

For a given table, the SHOW CREATE TABLE statement displays which columns have an explicit DEFAULT clause.

Implicit defaults are defined as follows:

• For numeric types, the default is 0, with the exception that for integer or floating-point types declared with the AUTO_INCREMENT attribute, the default is the next value in the sequence.

• For date and time types other than TIMESTAMP, the default is the appropriate “zero” value for the type. This is also true for TIMESTAMP if the explicit_defaults_for_timestamp system variable is enabled. Otherwise, for the first TIMESTAMP column in a table, the default value is the current date and time.  

• For string types other than ENUM, the default value is the empty string. For ENUM, the default is the first enumeration value.

2. Data Type Storage Requirements

The storage requirements for table data on disk depend on several factors. Different storage engines represent data types and store raw data differently. Table data might be compressed, either for a column or an entire row, complicating the calculation of storage requirements for a table or column.

Despite differences in storage layout on disk, the internal MySQL APIs that communicate and exchange information about table rows use a consistent data structure that applies across all storage engines.

This section includes guidelines and information for the storage requirements for each data type supported by MySQL, including the internal format and size for storage engines that use a fixed-size representation for data types. Information is listed by category or storage engine.

The internal representation of a table has a maximum row size of 65,535 bytes, even if the storage engine is capable of supporting larger rows. This figure excludes BLOB or TEXT columns, which contribute only 9 to 12 bytes toward this size. For BLOB and TEXT data, the information is stored internally in a different area of memory than the row buffer. Different storage engines handle the allocation and storage of this data in different ways, according to the method they use for handling the corresponding types.

2.1 InnoDB Table Storage Requirements

See https://www.cnblogs.com/xuliuzai/p/18102704 for information about storage requirements for InnoDB tables.

2.2 NDB Table Storage Requirements

NDB tables use 4-byte alignment; all NDB data storage is done in multiples of 4 bytes. Thus, a column value that would typically take 15 bytes requires 16 bytes in an NDB table. For example, in NDB tables, the TINYINT, SMALLINT, MEDIUMINT, and INTEGER (INT) column types each require 4 bytes storage per record due to the alignment factor. Each BIT(M) column takes M bits of storage space. Although an individual BIT column is not 4-byte aligned, NDB reserves 4 bytes (32 bits) per row for the first 1-32 bits needed for BIT columns, then another 4 bytes for bits 33-64, and so on. While a NULL itself does not require any storage space, NDB reserves 4 bytes per row if the table definition contains any columns allowing NULL, up to 32 NULL columns. (If an NDB Cluster table is defined with more than 32 NULL columns up to 64 NULL columns, then 8 bytes per row are reserved.)
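The alignment rules above can be sketched as a few small helpers. This is only an illustrative estimate, not NDB's actual implementation: the function names are hypothetical, and extending the NULL-flag rule beyond 64 nullable columns is an assumption the text does not confirm.

```python
def ndb_aligned_size(raw_bytes: int) -> int:
    """Round a byte count up to the next multiple of 4 (NDB's 4-byte alignment)."""
    return (raw_bytes + 3) // 4 * 4


def ndb_bit_overhead(total_bits: int) -> int:
    """Bytes reserved per row for BIT columns: 4 bytes per started group of 32 bits."""
    return (total_bits + 31) // 32 * 4


def ndb_null_overhead(nullable_columns: int) -> int:
    """Bytes reserved per row for NULL flags: 4 bytes per started group of 32
    nullable columns (assumed to continue in the same way past 64 columns)."""
    return (nullable_columns + 31) // 32 * 4


print(ndb_aligned_size(15))   # 16, the 15-byte example from the text
print(ndb_aligned_size(1))    # 4, e.g. a TINYINT
print(ndb_bit_overhead(33))   # 8, bits 33-64 need a second 4-byte group
print(ndb_null_overhead(40))  # 8, more than 32 nullable columns
```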

Every table using the NDB storage engine requires a primary key; if you do not define a primary key, a “hidden” primary key is created by NDB. This hidden primary key consumes 31-35 bytes per table record.

You can use the ndb_size.pl Perl script to estimate NDB storage requirements. It connects to a current MySQL (not NDB Cluster) database and creates a report on how much space that database would require if it used the NDB storage engine.

2.3 Numeric Type Storage Requirements

Values for DECIMAL (and NUMERIC) columns are represented using a binary format that packs nine decimal (base 10) digits into four bytes. Storage for the integer and fractional parts of each value is determined separately. Each multiple of nine digits requires four bytes, and the “leftover” digits require some fraction of four bytes. The storage required for the leftover digits is: 0 digits, 0 bytes; 1-2 digits, 1 byte; 3-4 digits, 2 bytes; 5-6 digits, 3 bytes; 7-9 digits, 4 bytes.
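As a worked example of the rule above, the following sketch computes the storage for a DECIMAL(precision, scale) column. The function names are hypothetical; the leftover-digit byte counts (0, 1-2, 3-4, 5-6, 7-9 digits needing 0, 1, 2, 3, 4 bytes) are taken from the MySQL manual's table.

```python
# Bytes needed for 0..9 leftover digits (index = number of leftover digits).
LEFTOVER_BYTES = [0, 1, 1, 2, 2, 3, 3, 4, 4, 4]


def _part_bytes(digits: int) -> int:
    """Bytes for one part (integer or fractional) of a DECIMAL value."""
    full_groups, leftover = divmod(digits, 9)
    return full_groups * 4 + LEFTOVER_BYTES[leftover]


def decimal_storage_bytes(precision: int, scale: int) -> int:
    """Storage for DECIMAL(precision, scale); the two parts are sized separately."""
    if not 0 <= scale <= precision:
        raise ValueError("scale must be between 0 and precision")
    return _part_bytes(precision - scale) + _part_bytes(scale)


print(decimal_storage_bytes(18, 9))  # 8: nine digits on each side, 4 + 4
print(decimal_storage_bytes(20, 6))  # 10: 14 integer digits (4 + 3) plus 6 fractional (3)
print(decimal_storage_bytes(10, 0))  # 5: 10 integer digits (4 + 1)
```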

2.4 Date and Time Type Storage Requirements

For TIME, DATETIME, and TIMESTAMP columns, the storage required for tables created before MySQL 5.6.4 differs from tables created from 5.6.4 on. This is due to a change in 5.6.4 that permits these types to have a fractional part, which requires from 0 to 3 bytes.

As of MySQL 5.6.4, storage for YEAR and DATE remains unchanged. However, TIME, DATETIME, and TIMESTAMP are represented differently. DATETIME is packed more efficiently, requiring 5 rather than 8 bytes for the nonfractional part, and all three types have a fractional part that requires from 0 to 3 bytes, depending on the fractional seconds precision of stored values.

For example, TIME(0), TIME(2), TIME(4), and TIME(6) use 3, 4, 5, and 6 bytes, respectively. TIME and TIME(0) are equivalent and require the same storage.
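The TIME example can be reproduced with a short calculation. This is a sketch with hypothetical names; the nonfractional sizes for TIME (3 bytes) and TIMESTAMP (4 bytes) come from the MySQL manual's storage table, which these notes do not reproduce, while the 5 bytes for DATETIME is stated above.

```python
# Nonfractional storage in bytes for tables created in MySQL 5.6.4 or later.
BASE_BYTES = {"TIME": 3, "DATETIME": 5, "TIMESTAMP": 4}


def temporal_storage_bytes(type_name: str, fsp: int = 0) -> int:
    """Total bytes for a TIME/DATETIME/TIMESTAMP column with the given
    fractional seconds precision (fsp, 0 to 6)."""
    if not 0 <= fsp <= 6:
        raise ValueError("fsp must be between 0 and 6")
    # fsp 0 -> 0 bytes, 1-2 -> 1 byte, 3-4 -> 2 bytes, 5-6 -> 3 bytes
    fractional_bytes = (fsp + 1) // 2
    return BASE_BYTES[type_name] + fractional_bytes


print([temporal_storage_bytes("TIME", fsp) for fsp in (0, 2, 4, 6)])  # [3, 4, 5, 6]
print(temporal_storage_bytes("DATETIME", 6))                          # 8
```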

2.5 String Type Storage Requirements 

In the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value.

Variable-length string types are stored using a length prefix plus data. The length prefix requires from one to four bytes depending on the data type, and the value of the prefix is L (the byte length of the string). For example, storage for a MEDIUMTEXT value requires L bytes to store the value plus three bytes to store the length of the value.

To calculate the number of bytes used to store a particular CHAR, VARCHAR, or TEXT column value, you must take into account the character set used for that column and whether the value contains multibyte characters. In particular, when using a UTF-8 Unicode character set, you must keep in mind that not all characters use the same number of bytes. utf8mb3 and utf8mb4 character sets can require up to three and four bytes per character, respectively.
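The "length prefix plus data" layout and the effect of multibyte characters can be illustrated as follows. This is a rough sketch: the helper name is hypothetical, the prefix sizes for the TEXT family are taken from the MySQL manual's table, and Python's "utf-8" codec is used as a stand-in for utf8mb4 because it yields the same byte counts.

```python
# Length-prefix size in bytes; the BLOB family uses the same sizes.
PREFIX_BYTES = {"TINYTEXT": 1, "TEXT": 2, "MEDIUMTEXT": 3, "LONGTEXT": 4}


def text_storage_bytes(type_name: str, value: str, encoding: str = "utf-8") -> int:
    """Return L + prefix, where L is the encoded byte length of the value."""
    prefix = PREFIX_BYTES[type_name]
    data_len = len(value.encode(encoding))
    if data_len > 2 ** (8 * prefix) - 1:
        raise ValueError(f"value too long for {type_name}")
    return data_len + prefix


print(text_storage_bytes("MEDIUMTEXT", "hello"))  # 8: L = 5 plus a 3-byte prefix
print(text_storage_bytes("TEXT", "héllo"))        # 8: 'é' takes 2 bytes, so L = 6
print(len("😀".encode("utf-8")))                   # 4: needs utf8mb4, not utf8mb3
```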

VARCHAR, VARBINARY, and the BLOB and TEXT types are variable-length types. For each, the storage requirements depend on these factors:

• The actual length of the column value

• The column's maximum possible length

• The character set used for the column, because some character sets contain multibyte characters

For example, a VARCHAR(255) column can hold a string with a maximum length of 255 characters. Assuming that the column uses the latin1 character set (one byte per character), the actual storage required is the length of the string (L), plus one byte to record the length of the string. For the string 'abcd', L is 4 and the storage requirement is five bytes. If the same column is instead declared to use the ucs2 double-byte character set, the storage requirement is 10 bytes: the length of 'abcd' is eight bytes and the column requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes).
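The 'abcd' example can be checked with a small calculation. The function below is a hypothetical sketch, not MySQL code; Python's "latin-1" and "utf-16-be" codecs are assumed stand-ins for the latin1 and ucs2 character sets (they give the same byte lengths for this string).

```python
def varchar_storage_bytes(value: str, declared_chars: int,
                          encoding: str, max_bytes_per_char: int) -> int:
    """Bytes for one VARCHAR value: encoded length L plus a 1- or 2-byte prefix."""
    if len(value) > declared_chars:
        raise ValueError("value exceeds the declared column length")
    data_len = len(value.encode(encoding))
    # One prefix byte suffices only if no value can ever exceed 255 bytes.
    max_column_bytes = declared_chars * max_bytes_per_char
    prefix = 1 if max_column_bytes <= 255 else 2
    return data_len + prefix


print(varchar_storage_bytes("abcd", 255, "latin-1", 1))    # 5
print(varchar_storage_bytes("abcd", 255, "utf-16-be", 2))  # 10
```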

The effective maximum number of bytes that can be stored in a VARCHAR or VARBINARY column is subject to the maximum row size of 65,535 bytes, which is shared among all columns. For a VARCHAR column that stores multibyte characters, the effective maximum number of characters is less. For example, utf8mb4 characters can require up to four bytes per character, so a VARCHAR column that uses the utf8mb4 character set can be declared to be a maximum of 16,383 characters.
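The 16,383 figure follows from simple arithmetic. The sketch below assumes the VARCHAR column is the only column in the row and that two bytes are used for its length prefix; the function name and the other_columns_bytes parameter are hypothetical.

```python
MAX_ROW_BYTES = 65535    # row size limit shared among all columns
LENGTH_PREFIX_BYTES = 2  # a long VARCHAR needs a 2-byte length prefix


def max_varchar_chars(max_bytes_per_char: int, other_columns_bytes: int = 0) -> int:
    """Largest declarable VARCHAR length (in characters) that still fits in a row."""
    available = MAX_ROW_BYTES - LENGTH_PREFIX_BYTES - other_columns_bytes
    return max(available // max_bytes_per_char, 0)


print(max_varchar_chars(4))  # 16383 for utf8mb4
print(max_varchar_chars(3))  # 21844 for utf8mb3
```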

InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.

The NDB storage engine supports variable-width columns. This means that a VARCHAR column in an NDB Cluster table requires the same amount of storage as would any other storage engine, with the exception that such values are 4-byte aligned. Thus, the string 'abcd' stored in a VARCHAR(50) column using the latin1 character set requires 8 bytes (rather than 5 bytes for the same column value in a MyISAM table).

TEXT, BLOB, and JSON columns are implemented differently in the NDB storage engine, wherein each row in the column is made up of two separate parts. One of these is of fixed size (256 bytes for TEXT and BLOB, 4000 bytes for JSON), and is actually stored in the original table. The other consists of any data in excess of 256 bytes, which is stored in a hidden blob parts table. The size of the rows in this second table is determined by the exact type of the column, as shown in the following table:

This means that the size of a TEXT column is 256 if size <= 256 (where size represents the size of the row); otherwise, the size is 256 + size + (2000 × (size − 256) % 2000).

No blob parts are stored separately by NDB for TINYBLOB or TINYTEXT column values.

You can increase the size of an NDB blob column's blob part to the maximum of 13948 using NDB_COLUMN in a column comment when creating or altering the parent table. In NDB 8.0.30 and later, it is also possible to set the inline size for a TEXT, BLOB, or JSON column, using NDB_TABLE in a column comment.

The size of an ENUM object is determined by the number of different enumeration values. One byte is used for enumerations with up to 255 possible values. Two bytes are used for enumerations having between 256 and 65,535 possible values.

The size of a SET object is determined by the number of different set members. If the set size is N, the object occupies (N+7)/8 bytes, rounded up to 1, 2, 3, 4, or 8 bytes. A SET can have a maximum of 64 members.
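The ENUM and SET sizing rules can be written out as a short sketch (function names are hypothetical):

```python
def enum_storage_bytes(n_values: int) -> int:
    """1 byte for up to 255 enumeration values, 2 bytes for 256 to 65,535."""
    if not 1 <= n_values <= 65535:
        raise ValueError("ENUM supports 1 to 65,535 values")
    return 1 if n_values <= 255 else 2


def set_storage_bytes(n_members: int) -> int:
    """(N+7)/8 bytes, rounded up to one of the sizes 1, 2, 3, 4, or 8."""
    if not 1 <= n_members <= 64:
        raise ValueError("SET supports 1 to 64 members")
    raw = (n_members + 7) // 8
    for size in (1, 2, 3, 4, 8):
        if raw <= size:
            return size
    raise AssertionError("unreachable for 1..64 members")


print(enum_storage_bytes(255), enum_storage_bytes(256))  # 1 2
print([set_storage_bytes(n) for n in (8, 9, 24, 32, 33, 64)])  # [1, 2, 3, 4, 8, 8]
```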

2.6 Spatial Type Storage Requirements

MySQL stores geometry values using 4 bytes to indicate the SRID followed by the WKB representation of the value. The LENGTH() function returns the space in bytes required for value storage.
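To make the "SRID plus WKB" layout concrete, the sketch below builds the bytes for a POINT value by hand. It is an illustration rather than MySQL's own code: the function name is hypothetical, and the use of little-endian byte order for both the SRID and the WKB body is an assumption based on MySQL's documented internal geometry format.

```python
import struct


def point_internal_format(x: float, y: float, srid: int = 0) -> bytes:
    """Return a 4-byte SRID followed by the WKB encoding of POINT(x y)."""
    srid_bytes = struct.pack("<I", srid)  # 4-byte unsigned SRID
    # WKB: 1-byte byte-order flag (1 = little-endian), 4-byte geometry type
    # (1 = Point), then two 8-byte IEEE 754 doubles for X and Y.
    wkb = struct.pack("<BIdd", 1, 1, x, y)
    return srid_bytes + wkb


value = point_internal_format(1.0, -1.0)
print(len(value))  # 25: 4 (SRID) + 1 + 4 + 8 + 8 (WKB)
```

Under these assumptions, 25 bytes is also what LENGTH() would be expected to report for a POINT value.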

2.7 JSON Storage Requirements

In general, the storage requirement for a JSON column is approximately the same as for a LONGBLOB or LONGTEXT column; that is, the space consumed by a JSON document is roughly the same as it would be for the document's string representation stored in a column of one of these types. However, there is an overhead imposed by the binary encoding, including metadata and dictionaries needed for lookup, of the individual values stored in the JSON document. For example, a string stored in a JSON document requires 4 to 10 bytes additional storage, depending on the length of the string and the size of the object or array in which it is stored.

In addition, MySQL imposes a limit on the size of any JSON document stored in a JSON column such that it cannot be any larger than the value of max_allowed_packet.

3. Choosing the Right Type for a Column 

For optimum storage, you should try to use the most precise type in all cases. For example, if an integer column is used for values in the range from 1 to 99999, MEDIUMINT UNSIGNED is the best type. Of the types that represent all the required values, this type uses the least amount of storage.
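The selection rule can be sketched as a search over MySQL's integer types from smallest to largest. The function name is hypothetical, and preferring the UNSIGNED variant whenever the range is nonnegative is a design choice of this sketch, not a rule from the manual.

```python
# MySQL integer types and their storage sizes in bytes, smallest first.
INTEGER_TYPES = [("TINYINT", 1), ("SMALLINT", 2), ("MEDIUMINT", 3),
                 ("INT", 4), ("BIGINT", 8)]


def smallest_integer_type(lo: int, hi: int) -> tuple[str, int]:
    """Return (type name, bytes) of the smallest type covering [lo, hi]."""
    for name, size in INTEGER_TYPES:
        bits = 8 * size
        if lo >= 0 and hi <= 2 ** bits - 1:
            return f"{name} UNSIGNED", size
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return name, size
    raise ValueError("range does not fit in any MySQL integer type")


print(smallest_integer_type(1, 99999))  # ('MEDIUMINT UNSIGNED', 3)
print(smallest_integer_type(-5, 200))   # ('SMALLINT', 2)
```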

All basic calculations (+, -, *, and /) with DECIMAL columns are done with precision of 65 decimal (base 10) digits.

If accuracy is not too important or if speed is the highest priority, the DOUBLE type may be good enough. For high precision, you can always convert to a fixed-point type stored in a BIGINT. This enables you to do all calculations with 64-bit integers and then convert results back to floating-point values as necessary.
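The fixed-point idea can be illustrated outside the database. In this sketch, values are scaled to integers (as they would be before being stored in a BIGINT column), all arithmetic is done on the integers, and the result is converted back only for display. SCALE and the helper names are hypothetical choices for the example.

```python
from decimal import Decimal

SCALE = 100                 # keep two decimal places
BIGINT_MAX = 2 ** 63 - 1    # largest signed BIGINT value


def to_fixed(text: str) -> int:
    """Convert a decimal string such as '0.10' to a scaled integer."""
    fixed = int((Decimal(text) * SCALE).to_integral_value())
    if abs(fixed) > BIGINT_MAX:
        raise OverflowError("value does not fit in a signed BIGINT")
    return fixed


def to_float(fixed: int) -> float:
    """Convert a scaled integer back to a floating-point value."""
    return fixed / SCALE


total = to_fixed("0.10") + to_fixed("0.20")
print(total == to_fixed("0.30"))  # True: integer arithmetic is exact
print(0.1 + 0.2 == 0.3)           # False: binary floating point is not
print(to_float(total))            # 0.3
```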

4. Using Data Types from Other Database Engines

To facilitate the use of code written for SQL implementations from other vendors, MySQL maps data types as shown in the following table. These mappings make it easier to import table definitions from other database systems into MySQL.

Data type mapping occurs at table creation time, after which the original type specifications are discarded. If you create a table with types used by other vendors and then issue a DESCRIBE tbl_name statement, MySQL reports the table structure using the equivalent MySQL types. For example:

mysql> CREATE TABLE t (a BOOL, b FLOAT8, c LONG VARCHAR, d NUMERIC);
Query OK, 0 rows affected (0.00 sec)
mysql> DESCRIBE t;
+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| a     | tinyint(1)    | YES  |     | NULL    |       |
| b     | double        | YES  |     | NULL    |       |
| c     | mediumtext    | YES  |     | NULL    |       |
| d     | decimal(10,0) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
4 rows in set (0.01 sec)

 

posted @ 2024-04-17 22:26  东山絮柳仔