Working with Sequences[次序号]

 

Many applications need to use unique numbers for identification purposes. The requirement for unique values occurs in a number of contexts: membership numbers, sample or lot numbering, customer IDs, bug report or trouble ticket tags, and so on.

MySQL's mechanism for providing unique numbers is through AUTO_INCREMENT columns that allow you to generate sequential numbers automatically. However, AUTO_INCREMENT columns are handled somewhat differently for the various table types that MySQL supports, so it's important to understand not only the general concepts underlying the AUTO_INCREMENT mechanism, but also the differences between table types. This section describes how AUTO_INCREMENT columns work so that you can use them effectively without running into the traps that sometimes surprise people. It also describes how you can generate a sequence without using an AUTO_INCREMENT column.

For versions of MySQL up to 3.23, the only table type available is ISAM. After that, additional table types were introduced?the MyISAM and HEAP types first, and the BDB and InnoDB types later. The discussion here indicates how each table type behaves with respect to AUTO_INCREMENT columns. (For more general information about the characteristics of MySQL's table handlers, see Chapter 3.)

AUTO_INCREMENT for ISAM Tables

AUTO_INCREMENT columns in ISAM tables behave as follows:

  • Inserting NULL into an AUTO_INCREMENT column causes MySQL to automatically generate the next sequence number and insert that value into the column. AUTO_INCREMENT sequences begin at 1, so the first record inserted into the table gets a sequence column value of 1 and subsequent records get values of 2, 3, and so forth. Each automatically generated value will be one more than the current maximum value stored in the column.

  •  

    Inserting 0 into an AUTO_INCREMENT column has the same effect as inserting NULL. However, this is not guaranteed to be true indefinitely, so it's better to insert NULL.

  • Inserting a row without specifying an explicit value for the AUTO_INCREMENT column is the same as inserting NULL into the column.

  • If you insert a record and specify a non-NULL, non-zero value for the AUTO_INCREMENT column, one of two things will happen. If a record already exists with that value, an error occurs because values in AUTO_INCREMENT columns must be unique. If a record does not exist with that value, the record is inserted and the sequence continues with the next value after that for subsequent rows. In other words, you can "bump up" the counter by inserting a record with a sequence value greater than the current counter value.

    Bumping up the counter can result in gaps in the sequence, but you can also exploit this behavior to generate a sequence that begins at a value higher than 1. Suppose you create an ISAM table with an AUTO_INCREMENT column, but you want the sequence to begin at 1000 rather than at 1. To achieve this, insert a "fake" record with a value of 999 in the AUTO_INCREMENT column. Records inserted subsequently will be assigned sequence numbers beginning with 1000, after which you can delete the fake record.

    (Why might you want to begin a sequence with a value higher than 1? One reason is to make sequence numbers all have the same number of digits. If you're generating customer ID numbers, and you expect never to have more than a million customers, you could begin the series at 1,000,000. You'll be able to add well over a million customer records before the digit count for customer ID values changes. Other reasons for not beginning a sequence at 1 might have nothing to do with technical considerations. For example, if you were assigning membership numbers, you might want to begin a sequence at a number higher than 1 to forestall political squabbling over who gets to be member number 1?by making sure there isn't any such number. Hey, it happens. Sad, but true.)

  •  

    If you delete the record containing the largest value in an AUTO_INCREMENT column, that value is reused the next time you generate a new value. This is a consequence of the principle that for ISAM tables, each new automatically generated value is one larger than the current maximum value stored in the column. Another consequence is that if you delete all the records in the table, all values are reused, so the sequence starts over beginning at 1.

  • If you use UPDATE to set an AUTO_INCREMENT column to a value that already exists in another row, a "duplicate key" error occurs. If you update the column to a value larger than any existing column value, the sequence continues with the next number after that for subsequent records.

  • If you use REPLACE to update a record based on the value of the AUTO_INCREMENT column, the AUTO_INCREMENT value does not change. If you use REPLACE to update a record based on the value of some other PRIMARY KEY or UNIQUE index, the AUTO_INCREMENT column will be updated with a new sequence number if you set it to NULL.

  • The value of the most recent automatically generated sequence number is available by calling the LAST_INSERT_ID() function. This allows you to reference the AUTO_INCREMENT value in other statements without knowing what the value is. LAST_INSERT_ID() is tied to AUTO_INCREMENT values generated during the current server session; it is not affected by AUTO_INCREMENT activity associated with other clients. If no AUTO_INCREMENT value has been generated during the current session, LAST_INSERT_ID() returns 0.

The AUTO_INCREMENT mechanism for ISAM forms the basis for understanding sequence behavior for the other table types. Those types implement behavior that for the most part is similar to that just described, so keep the preceding discussion in mind as you read on.

AUTO_INCREMENT for MyISAM Tables

MyISAM tables offer the most flexibility for sequence handling. The MyISAM storage format introduces some features that address some of the shortcomings of ISAM tables:

  • With ISAM tables, values deleted from the top of the sequence are reused. If you delete the record with the highest sequence number, the new record added gets the same sequence value as the deleted record. This results in sequences that are not strictly monotonic, which is a problem should you need to guarantee that no record be given a number that has been used before. With MyISAM, the values in an automatically generated series are strictly increasing and are not reused. If the maximum value is 143 and you delete the record containing that value, MySQL still generates the next value as 144.

  • ISAM sequences always begin at 1 unless you use the fake-record technique mentioned earlier to start the sequence at a higher value. With MyISAM tables, you can specify the initial value explicitly by using an AUTO_INCREMENT = n option in the CREATE TABLE statement. The following example creates a MyISAM table with an AUTO_INCREMENT column named seq that begins at 1,000,000:

    CREATE TABLE mytbl 
    (
        seq INT UNSIGNED AUTO_INCREMENT NOT NULL,
        PRIMARY KEY (seq)
    ) TYPE = MYISAM AUTO_INCREMENT = 1000000;
    

    A table can have only one AUTO_INCREMENT column, so there is never any ambiguity about the column to which the terminating AUTO_INCREMENT = n option applies, even if the table has multiple columns (as most tables do).

  • You can change the current sequence counter for an existing MyISAM table with ALTER TABLE. If the sequence currently stands at 1000, the following statement will cause the next number generated to be 2000:

    ALTER TABLE mytbl AUTO_INCREMENT = 2000; 
    

    If you want to reuse values that have been deleted from the top of the sequence, you can do that, too. The following statement will set the counter down as far as possible, causing the next number to be one larger than the current maximum sequence value:

    ALTER TABLE mytbl AUTO_INCREMENT = 1; 
    

In addition to overcoming the weaknesses of ISAM sequence handling, the MySQL table handler as of MySQL 3.23.5 supports the use of composite (multiple-column) indexes for creating multiple independent sequences within the same table. To use this feature, create a multiple-column PRIMARY KEY or UNIQUE index that includes an AUTO_INCREMENT column as its last column. For each distinct key in the leftmost column or columns of the index, the AUTO_INCREMENT column will generate a separate sequence of values. For example, you might use a table named bugs for tracking bug reports of several software projects, where the table is declared as follows:

CREATE TABLE bugs 
(
    proj_name   VARCHAR(20) NOT NULL,
    bug_id      INT UNSIGNED AUTO_INCREMENT NOT NULL,
    description VARCHAR(100),
    PRIMARY KEY (proj_name, bug_id)
) TYPE = MYISAM;

Here, the proj_name column identifies the project name and the description column contains the bug description. The bug_id column is an AUTO_INCREMENT column; by creating an index that ties it to the proj_name column, you can generate an independent series of sequence numbers for each project. Suppose you enter the following records into the table to register three bugs for SuperBrowser and two for SpamSquisher:

mysql> INSERT INTO bugs (proj_name,description) 
    -> VALUES('SuperBrowser','crashes when displaying complex tables');
mysql> INSERT INTO bugs (proj_name,description)
    -> VALUES('SuperBrowser','image scaling does not work');
mysql> INSERT INTO bugs (proj_name,description)
    -> VALUES('SpamSquisher','fails to block known blacklisted domains');
mysql> INSERT INTO bugs (proj_name,description)
    -> VALUES('SpamSquisher','fails to respect whitelist addresses');
mysql> INSERT INTO bugs (proj_name,description)
    -> VALUES('SuperBrowser','background patterns not displayed');

The resulting table contents look like the this:

mysql> SELECT * FROM bugs ORDER BY proj_name, bug_id; 
+--------------+--------+------------------------------------------+
| proj_name    | bug_id | description                              |
+--------------+--------+------------------------------------------+
| SpamSquisher |      1 | fails to block known blacklisted domains |
| SpamSquisher |      2 | fails to respect whitelist addresses     |
| SuperBrowser |      1 | crashes when displaying complex tables   |
| SuperBrowser |      2 | image scaling does not work              |
| SuperBrowser |      3 | background patterns not displayed        |
+--------------+--------+------------------------------------------+

Note that it does not matter that the order of record entry switches between projects. The table numbers bug_id values for each project separately.

If you use a composite index to create multiple sequences, values deleted from the top of a sequence are reused. This contrasts with the usual MyISAM behavior of not reusing values.

AUTO_INCREMENT for HEAP Tables

HEAP tables do not support the AUTO_INCREMENT mechanism prior to MySQL 4.1. As of 4.1, AUTO_INCREMENT columns are allowed and behave as follows:

  • The initial sequence value can be set with an AUTO_INCREMENT = n option in the CREATE TABLE statement and can be modified after table creation time using that option with ALTER TABLE.

  • Values that are deleted from the top of the sequence are not reused.

  • Composite indexes cannot be used to generate multiple independent sequences within a table.

AUTO_INCREMENT for BDB Tables

The BDB table handler manages AUTO_INCREMENT columns as follows:

  • The initial sequence value cannot be set with an AUTO_INCREMENT = n option in the CREATE TABLE statement. Nor can it be modified using that option with ALTER TABLE.

  • Values that are deleted from the top of the sequence are reused.

  • Composite indexes can be used to generate multiple independent sequences within a table.

AUTO_INCREMENT for InnoDB Tables

The InnoDB table handler manages AUTO_INCREMENT columns as follows:

  • The initial sequence value cannot be set with an AUTO_INCREMENT = n option in the CREATE TABLE statement. Nor can it be modified using that option with ALTER TABLE.

  • Values that are deleted from the top of the sequence are not reused.

  • Composite indexes cannot be used to generate multiple independent sequences within a table.

Issues to Consider with AUTO_INCREMENT

You should keep the following points in mind to avoid being surprised when you use AUTO_INCREMENT columns:

  • AUTO_INCREMENT is not a column type; it's a column type attribute. Furthermore, AUTO_INCREMENT is an attribute intended for use only with integer types. Versions of MySQL earlier than 3.23 are lax in enforcing this constraint and will let you declare a column type such as CHAR with the AUTO_INCREMENT attribute. However, only the integer types work correctly as AUTO_INCREMENT columns.

  • The primary purpose of the AUTO_INCREMENT mechanism is to allow you to generate a sequence of positive integers, so you should declare AUTO_INCREMENT columns to be UNSIGNED. This also has the advantage of giving you twice as many sequence numbers before you hit the upper end of the column type's range.

    It is possible under some circumstances to generate sequences of negative values using an AUTO_INCREMENT column. But this is an unsupported use of AUTO_INCREMENT and the results are not guaranteed. My own experiments indicate somewhat inconsistent behavior between versions with regard to negative sequences, so even if you achieve the results you want with one version of MySQL, that may change if you upgrade to a newer version. (In other words, attempting to use AUTO_INCREMENT for anything but a sequence of positive integers can result in unpredictable behavior. You have been warned!)

  • Don't be fooled into thinking that adding AUTO_INCREMENT to a column declaration is a magic way of getting an unlimited sequence of numbers. It's not; AUTO_INCREMENT sequences are always bound by the range of the underlying column type. For example, if you use a TINYINT column, the maximum sequence number is 127. When you reach that limit, your application will begin to fail with "duplicate key" errors. If you use TINYINT UNSIGNED instead, you'll reach the limit at 255.

  • Clearing a table's contents entirely may reset a sequence to begin again at 1, even for table types that normally to not reuse AUTO_INCREMENT values. This can occur for either of the following statements:

    DELETE FROM tbl_name; 
    TRUNCATE TABLE tbl_name;
    

    The sequence reset occurs due to the way MySQL optimizes a complete table erasure operation: It tosses the data rows and indexes and recreates the table from scratch rather than deleting individual rows. This causes all sequence number information to be lost. If you want to delete all records but preserve the sequence information, you can suppress this optimization by using DELETE with a WHERE clause that is always true:

    DELETE FROM tbl_name WHERE 1; 
    

    This forces MySQL to evaluate the condition for each row and thus delete every row individually.

Forcing Non-Reuse of Sequence Values

What can you do to maintain a strictly increasing series of values for table types that reuse values that are deleted from the top of a sequence? One solution is to maintain a separate table that you use only for generating AUTO_INCREMENT values and from which you never delete records. That way, the values in the table are never reused. When you need to generate a new record in your main table, first insert a NULL into the sequence number table. Then insert the record into your main table using the value of LAST_INSERT_ID() for the column that you want to contain a sequence number:

INSERT INTO ai_tbl SET ai_col = NULL; 
INSERT INTO main_tbl SET id=LAST_INSERT_ID() ... ;

Adding a Sequence Number Column to a Table

Suppose you create a table and put some information into it:

mysql> CREATE TABLE t (c CHAR(10)); 
mysql> INSERT INTO t VALUES('a'),('b'),('c');
mysql> SELECT * FROM t;
+------+
| c    |
+------+
| a    |
| b    |
| c    |
+------+

Then you decide that you want to include a sequence number column in the table. To do this, issue an ALTER TABLE statement to add an AUTO_INCREMENT column, using the same kind of type definition that you'd use with CREATE TABLE:

mysql> ALTER TABLE t ADD i INT AUTO_INCREMENT NOT NULL PRIMARY KEY; 
mysql> SELECT * FROM t;
+------+---+
| c    | i |
+------+---+
| a    | 1 |
| b    | 2 |
| c    | 3 |
+------+---+

Note how MySQL has assigned sequence values to the AUTO_INCREMENT column automatically. You need not do so yourself.

Resequencing an Existing Column

If a table already has an AUTO_INCREMENT column but you want to renumber it to eliminate gaps in the sequence that may have resulted from row deletions, the easiest way to do it is to drop the column and then add it again. When MySQL adds the column, it will assign new sequence numbers automatically, as shown in the previous example.

Suppose a table t looks like the following, where i is the AUTO_INCREMENT column:

mysql> CREATE TABLE t (c CHAR(10), i INT NOT NULL AUTO_INCREMENT PRIMARY KEY); 
mysql> INSERT INTO t (c)
    -> VALUES('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k');
mysql> DELETE FROM t WHERE c IN('a','d','f','g','j');
mysql> SELECT * FROM t;
+------+----+
| c    | i  |
+------+----+
| b    |  2 |
| c    |  3 |
| e    |  5 |
| h    |  8 |
| i    |  9 |
| k    | 11 |
+------+----+

The following ALTER TABLE statement drops the column and then adds it again:

mysql> ALTER TABLE t 
    -> DROP i,
    -> ADD i INT UNSIGNED AUTO_INCREMENT NOT NULL,
    -> AUTO_INCREMENT = 1;
mysql> SELECT * FROM t;
+------+---+
| c    | i |
+------+---+
| b    | 1 |
| c    | 2 |
| e    | 3 |
| h    | 4 |
| i    | 5 |
| k    | 6 |
+------+---+

The AUTO_INCREMENT = 1 clause resets the sequence to begin again at 1. For a MyISAM table (or a HEAP table as of MySQL 4.1), you can use a value other than 1 to begin the sequence at a different value. For other table types, just omit the AUTO_INCREMENT clause, because they do not allow the initial value to be specified this way. The sequence will begin at 1.

Note that although it's easy to resequence a column, there is usually very little reason to do so. MySQL doesn't care whether a sequence has holes in it, nor do you gain any performance efficiencies by resequencing.

Generating Sequences Without AUTO_INCREMENT

Another method for generating sequence numbers doesn't use an AUTO_INCREMENT column at all. Instead, it uses an alternate form of the LAST_INSERT_ID() function that takes an argument. (This form was introduced in MySQL 3.22.9.) If you insert or update a column using LAST_INSERT_ID(expr), the next call to LAST_INSERT_ID() with no argument returns the value of expr. In other words, expr is treated as though it had been generated as an AUTO_INCREMENT value. This allows you to generate a sequence number and then retrieve it later in your session, confident that the value will not have been affected by the activity of other clients.

One way to use this strategy is to create a single-row table containing a value that is updated each time you want the next value in the sequence. For example, you can create and initialize the table as follows:

CREATE TABLE seq_table (seq INT UNSIGNED NOT NULL); 
INSERT INTO seq_table VALUES(0);

These statements set up seq_table with a single row containing a seq value of 0. To use the table, generate the next sequence number and retrieve it as follows:

UPDATE seq_table SET seq = LAST_INSERT_ID(seq+1); 
SELECT LAST_INSERT_ID();

The UPDATE statement retrieves the current value of the seq column and increments it by 1 to produce the next value in the sequence. Generating the new value using LAST_INSERT_ID(seq+1) causes it to be treated like an AUTO_INCREMENT value, which allows it to be retrieved by calling LAST_INSERT_ID() without an argument. LAST_INSERT_ID() is client-specific, so you get the correct value even if other clients have generated other sequence numbers in the interval between the UPDATE and the SELECT.

Other uses for this method are to generate sequence values that increment by a value other than 1 or that are negative. For example, the following statement can be executed repeatedly to generate a sequence of numbers that increase by 100 each time:

UPDATE seq_table SET seq = LAST_INSERT_ID(seq+100); 

Repeating the following statement will generate a sequence of decreasing numbers:

UPDATE seq_table SET seq = LAST_INSERT_ID(seq-1); 

You can also use this technique to generate a sequence that begins at an arbitrary value by setting the seq column to an appropriate initial value.

The preceding discussion describes how to set up a counter using a table with a single row. That's okay for a single counter, but if you want several of them, creating one table per counter leads to needless multiplication of tables. For example, suppose you have a Web site and you want to put some "this page has been accessed n times" counters in several pages. You probably don't want to set up a separate counter table for every page that has a counter.

One way to avoid creating multiple counter tables is to create a single table with two columns. One column holds a counter value; the other holds a name that uniquely identifies each counter. You can still use the LAST_INSERT_ID() function, but you determine which row it applies to by using the counter name. The table looks like this:

CREATE TABLE counter 
(
    name  VARCHAR(255) BINARY NOT NULL,
    PRIMARY KEY (name),
    value INT UNSIGNED
);

The name column is a string so that you can name a counter whatever you want, and it's declared as a PRIMARY KEY to prevent duplicate names. This assumes that applications using the table agree on the names they'll be using. For Web counters, uniqueness of counter names is ensured simply by using the pathname of each page within the document tree as its counter name. The BINARY attribute causes pathname values to be treated as case sensitive. (Omit it if your system has pathnames that are not case sensitive.)

To use the counter table, insert a row corresponding to each page for which you need a counter. For example, to set up a new counter for the site's home page, do the following:

INSERT INTO counter (name,value) VALUES('index.html',0); 

That initializes a counter named 'index.html' with a value of zero. To generate the next sequence value for the page, use its pathname to look up the correct counter value and increment it with LAST_INSERT_ID(expr) and then retrieve the value with LAST_INSERT_ID():

UPDATE counter SET value = LAST_INSERT_ID(value+1) WHERE name = 'index.html'; 
SELECT LAST_INSERT_ID();

An alternative approach is to increment the counter without using LAST_INSERT_ID():

UPDATE counter SET value = value+1 WHERE name = 'index.html'; 
SELECT value FROM counter WHERE name = 'index.html';

However, that doesn't work correctly if another client increments the counter after you issue the UPDATE and before you issue the SELECT. You could solve that problem by using a transaction or by putting LOCK TABLES and UNLOCK TABLES around the two statements to block other clients while you're using the counter. But the LAST_INSERT_ID() method accomplishes the same thing more easily. Because its value is client-specific, you always get the value you inserted, not the one from some other client, and you don't have to complicate the code with transactions or locks to keep other clients out.

posted on 2011-11-30 09:41  kingang  阅读(193)  评论(0编辑  收藏  举报

导航