Sharding & IDs at Instagram, Flickr ID generation

Instagram: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

Flickr: http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/

 

Ticket Server based on database (auto-increment in DB, ACID guarantee)

 

Instagram:

id should be chronologically sorted;

Sharding: several thousands "logical" shard, and hosted in far fewer physical shards => makes it much easier to move logical shards out to new physical shards.

Id genration : PL/PGSQL stored procedure to generate next_id() in database (transactionally), which contains : 41bits for current time in mililliseconds, 13 bits for logical shards, 10 bits for auto-incrementing sequence % 1024 => at most 1024 IDs per shard per millisecond;

Given the ID also encodes the shard, it's easy to find out shard for a primary key id

CREATE OR REPLACE FUNCTION insta5.next_id(OUT result bigint) AS $$
DECLARE
    our_epoch bigint := 1314220021721;
    seq_id bigint;
    now_millis bigint;
    shard_id int := 5;
BEGIN
    SELECT nextval('insta5.table_id_seq') %% 1024 INTO seq_id;

    SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
    result := (now_millis - our_epoch) << 23;
    result := result | (shard_id << 10);
    result := result | (seq_id);
END;
$$ LANGUAGE PLPGSQL;


CREATE TABLE insta5.our_table (
    "id" bigint NOT NULL DEFAULT insta5.next_id(),
    ...rest of table schema...
)

 

Flickr : http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/

/* The Tickets64 schema looks like: */

CREATE TABLE `Tickets64` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `stub` char(1) NOT NULL default '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM
SELECT * from Tickets64 returns a single row that looks something like: +-------------------+------+ | id | stub | +-------------------+------+ | 72157623227190423 | a | +-------------------+------+
/* When I need a new globally unique 64-bit ID I issue the following SQL: */ REPLACE INTO Tickets64 (stub) VALUES ('a'); SELECT LAST_INSERT_ID();

In order not to make it a single-point-of-failure:

TicketServer1:
auto-increment-increment = 2
auto-increment-offset = 1

TicketServer2:
auto-increment-increment = 2
auto-increment-offset = 2

 

presentation on this topic:

http://media.postgresql.org/sfpug/instagram_sfpug.pdf 

 

posted @ 2016-10-10 22:14  qsort  阅读(633)  评论(0编辑  收藏  举报