Disk management and the ext filesystem on Linux, seen through a virtual machine disk expansion

1. Expanding the virtual machine disk

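The virtual machine was originally created with a 20G disk. After running for a while it ran out of space, so the disk had to grow. The simplest fix I could think of was to add capacity, from 20G to 30G. There are two ways to do that: enlarge the existing disk, or attach a new disk device. I took the first route and enlarged the disk the virtual machine was already using. After the resize, however, the system did not pick up the extra space by itself: the filesystems had been created on the space that existed at the time, so the newly added space was invisible to them. The way forward was to create a new partition in the added space, which calls for a partitioning tool. My first choice was fdisk, but after several attempts adding the new partition kept failing, so I gave up on it. Later I noticed that the fdisk man page mentions cfdisk, tried it, and by simply following its prompts was able to add the new logical partition.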

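fdisk presumably stands for "format disk", and the c prefix of cfdisk comes from curses: the tool is built on the ncurses library, so it can show menus and accept arrow keys and shortcut keys, which makes the interface friendlier. What fdisk does is manage the partitions of one physical disk. Partitioning sits at a more generic layer than any operating system, and its consumer is the BIOS. The visible effect is that it really does span operating systems: we can install Linux on a disk and also install Windows, and the two coexist on the same disk. Different operating systems usually use their own "official" filesystems, such as NTFS on Windows and the extN family on Linux, so the partitioning layer does not need to understand any filesystem. A partition can hold any filesystem, or none at all.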

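I looked through the cfdisk source and found its comments less clear than the kernel's, even though the two do similar work, so the partition table format below is explained with kernel code. cfdisk is a user-space program, though, so its output is freer in form and it is easier to invoke, which makes its friendlier output worth a look. Below is the output of cfdisk /dev/sda in my virtual machine. Look at the last two partitions: sda5 is about 19G, which is most of the disk's original 20G, and the final 10G is the logical partition added with cfdisk after the resize. What interests me here is how cfdisk knows the filesystem type of each partition, for example how it recognizes Linux filesystem types such as ext4 and swap.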

cfdisk (util-linux 2.24)

Disk Drive: /dev/sda
Size: 32212254720 bytes, 32.2 GB
Heads: 255   Sectors per Track: 63   Cylinders: 3916

Name   Flags   Part Type   FS Type    [Label]   Size (MB)
----------------------------------------------------------
                           Unusable                 1.05 *
sda1           Primary     Linux                    1.05 *
sda2   Boot    Primary     ext4                   314.58 *
sda3           Primary     swap                  2147.49 *
sda5   NC      Logical     ext4                 19010.69 *
sda6           Logical     ext4                 10737.42 *

[ Help ] [ Print ] [ Quit ] [ Units ] [ Write ]

No more partitions
Print help screen

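2. How cfdisk identifies filesystem contents

Use hexdump to look at the last 256 bytes of the disk's first sector (the dump starts at offset 0x100 and runs on into the following sector):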

tsecer@harry: hexdump -Cs 0x100 /dev/sda | more

00000100 f4 40 89 44 08 0f b6 c2 c0 e8 02 66 89 04 66 a1 |.@.D.......f..f.|

00000110 60 7c 66 09 c0 75 4e 66 a1 5c 7c 66 31 d2 66 f7 |`|f..uNf.\|f1.f.|

00000120 34 88 d1 31 d2 66 f7 74 04 3b 44 08 7d 37 fe c1 |4..1.f.t.;D.}7..|

00000130 88 c5 30 c0 c1 e8 02 08 c1 88 d0 5a 88 c6 bb 00 |..0........Z....|

00000140 70 8e c3 31 db b8 01 02 cd 13 72 1e 8c c3 60 1e |p..1......r...`.|

00000150 b9 00 01 8e db 31 f6 bf 00 80 8e c6 fc f3 a5 1f |.....1..........|

00000160 61 ff 26 5a 7c be 80 7d eb 03 be 8f 7d e8 34 00 |a.&Z|..}....}.4.|

00000170 be 94 7d e8 2e 00 cd 18 eb fe 47 52 55 42 20 00 |..}.......GRUB .|

00000180 47 65 6f 6d 00 48 61 72 64 20 44 69 73 6b 00 52 |Geom.Hard Disk.R|

00000190 65 61 64 00 20 45 72 72 6f 72 0d 0a 00 bb 01 00 |ead. Error......|

000001a0 b4 0e cd 10 ac 3c 00 75 f4 c3 00 00 00 00 00 00 |.....<.u........|

000001b0 00 00 00 00 00 00 00 00 90 b8 00 00 00 00 00 20 |............... |

000001c0 21 00 83 41 01 00 00 08 00 00 00 08 00 00 80 41 |!..A...........A|

000001d0 02 00 83 7f 19 26 00 10 00 00 00 60 09 00 00 7f |.....&.....`....|

000001e0 1a 26 82 94 69 2b 00 70 09 00 00 00 40 00 00 94 |.&..i+.p....@...|

000001f0 6a 2b 05 fe ff ff 00 70 49 00 00 90 76 03 55 aa |j+.....pI...v.U.|

00000200 52 bf f4 81 66 8b 2d 83 7d 08 00 0f 84 e2 00 80 |R...f.-.}.......|

00000210 7c ff 00 74 46 66 8b 1d 66 8b 4d 04 66 31 c0 b0 ||..tFf..f.M.f1..|

00000220 7f 39 45 08 7f 03 8b 45 08 29 45 08 66 01 05 66 |.9E....E.)E.f..f|

00000230 83 55 04 00 c7 04 10 00 89 44 02 66 89 5c 08 66 |.U.......D.f.\.f|

00000240 89 4c 0c c7 44 06 00 70 50 c7 44 04 00 00 b4 42 |.L..D..pP.D....B|

00000250 cd 13 0f 82 af 00 bb 00 70 eb 66 66 8b 45 04 66 |........p.ff.E.f|

00000260 09 c0 0f 85 97 00 66 8b 05 66 31 d2 66 f7 34 88 |......f..f1.f.4.|

00000270 54 0a 66 31 d2 66 f7 74 04 88 54 0b 89 44 0c 3b |T.f1.f.t..T..D.;|

tsecer@harry:

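The last two bytes of the 000001f0 line are the DOS signature "55 aa". Immediately before the signature sits an array of four partition structures, each 16 bytes long, so the table starts at offset 0x1be: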

struct partition {

unsigned char boot_ind; /* 0x80 - active */

unsigned char head; /* starting head */

unsigned char sector; /* starting sector */

unsigned char cyl; /* starting cylinder */

unsigned char sys_ind; /* What partition type */

unsigned char end_head; /* end head */

unsigned char end_sector; /* end sector */

unsigned char end_cyl; /* end cylinder */

unsigned char start4[4]; /* starting sector counting from 0 */

unsigned char size4[4]; /* nr of sectors in partition */

};
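To make the layout concrete, here is a minimal sketch that decodes one 16-byte entry from a copy of the first sector. The names mbr_entry, le32, mbr_decode and chs_to_lba are made up for this illustration and do not come from util-linux or the kernel; only the field offsets follow the struct partition definition above.

```c
#include <stdint.h>

/* Decoded view of one 16-byte MBR partition entry (illustrative type). */
struct mbr_entry {
    uint8_t  boot_ind;    /* 0x80 = active */
    uint8_t  sys_ind;     /* partition type, e.g. 0x83 Linux, 0x82 swap */
    uint32_t start_lba;   /* first sector, counting from 0 */
    uint32_t nr_sectors;  /* length in sectors */
};

/* Assemble a little-endian 32-bit value one byte at a time. */
uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode slot 0..3 of the table at offset 0x1be of the first sector. */
struct mbr_entry mbr_decode(const unsigned char *sector0, int slot)
{
    const unsigned char *e = sector0 + 0x1be + 16 * slot;
    struct mbr_entry out;

    out.boot_ind   = e[0];
    out.sys_ind    = e[4];
    out.start_lba  = le32(e + 8);   /* start4[] */
    out.nr_sectors = le32(e + 12);  /* size4[] */
    return out;
}

/*
 * Turn the packed CHS bytes into an LBA. The top two bits of the
 * sector byte are bits 8-9 of the cylinder; sectors count from 1.
 */
uint32_t chs_to_lba(unsigned head, unsigned sector_byte, unsigned cyl_byte,
                    unsigned heads, unsigned sectors_per_track)
{
    unsigned sector = sector_byte & 0x3f;
    unsigned cyl    = cyl_byte | ((sector_byte & 0xc0) << 2);

    return (cyl * heads + head) * sectors_per_track + (sector - 1);
}
```

Fed the bytes from the hexdump above, slot 2 decodes to sys_ind 0x82 with start_lba 618496 and nr_sectors 4194304, which is exactly 2 GiB of 512-byte sectors. With the 255-head, 63-sector geometry from the cfdisk header, chs_to_lba(0x7f, 0x1a, 0x26, 255, 63) yields the same 618496, so the CHS and LBA fields of that entry agree.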

For the swap partition, the sys_ind byte at offset 000001e2 is 0x82, which corresponds to this table in util-linux-ng-2.16.2\fdisk\i386_sys_types.c:

struct systypes i386_sys_types[] = {

{0x00, N_("Empty")},

……

{0x81, N_("Minix / old Linux")},/* Minix 1.4b and later */

{0x82, N_("Linux swap / Solaris")},

{0x83, N_("Linux")},

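For the partition flagged Boot, the sys_ind byte at 000001d2 is 0x83. That value only says "Linux"; it does not say whether the filesystem is ext2, ext4, or something else. To find out, get_linux_label(int i) reads the partition's superblock and decides the concrete filesystem type from it. The util-linux version I have at hand does not handle ext4, but that does not change the idea: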

static void

get_linux_label(int i) {

#define EXT2LABELSZ 16

#define EXT2_SUPER_MAGIC 0xEF53

#define EXT3_FEATURE_COMPAT_HAS_JOURNAL 0x0004

struct ext2_super_block {

char s_dummy0[56];

unsigned char s_magic[2];

char s_dummy1[34];

unsigned char s_feature_compat[4];

char s_dummy2[24];

char s_volume_name[EXT2LABELSZ];

char s_last_mounted[64];

char s_dummy3[824];

} e2fsb;

offset = (p_info[i].first_sector + p_info[i].offset) * SECTOR_SIZE

+ 1024;

if (lseek(fd, offset, SEEK_SET) == offset

&& read(fd, &e2fsb, sizeof(e2fsb)) == sizeof(e2fsb)

&& e2fsb.s_magic[0] + (e2fsb.s_magic[1]<<8) == EXT2_SUPER_MAGIC) {

label = e2fsb.s_volume_name;

for(j=0; j<EXT2LABELSZ && j<LABELSZ && isprint(label[j]); j++)

p_info[i].volume_label[j] = label[j];

p_info[i].volume_label[j] = 0;

/* ext2 or ext3? */

if (e2fsb.s_feature_compat[0]&EXT3_FEATURE_COMPAT_HAS_JOURNAL)

strncpy(p_info[i].fstype, "ext3", FSTYPESZ);

else

strncpy(p_info[i].fstype, "ext2", FSTYPESZ);

return;

}

……
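Since the excerpt above stops at ext2 and ext3, the following sketch shows one way the same superblock check could be extended to ext4. It is a simplified heuristic under stated assumptions, not the code of any util-linux release: the function name ext_flavor is invented here, it takes the extents bit (0x0040 in s_feature_incompat, the 32-bit field right after s_feature_compat) as the sign of ext4, and real probers look at more feature bits than this.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT2_SUPER_MAGIC            0xEF53
#define FEATURE_COMPAT_HAS_JOURNAL  0x0004  /* in s_feature_compat   (offset 92) */
#define FEATURE_INCOMPAT_EXTENTS    0x0040  /* in s_feature_incompat (offset 96) */

/* Assemble a little-endian 32-bit value one byte at a time. */
static uint32_t sb_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * sb points at the 1024-byte superblock, i.e. the bytes found
 * 1024 bytes into the partition. Returns NULL when the magic
 * number does not match, otherwise a best-guess flavor name.
 */
const char *ext_flavor(const unsigned char *sb)
{
    unsigned magic = sb[56] | (sb[57] << 8);
    uint32_t compat, incompat;

    if (magic != EXT2_SUPER_MAGIC)
        return NULL;

    compat   = sb_le32(sb + 92);
    incompat = sb_le32(sb + 96);

    if (incompat & FEATURE_INCOMPAT_EXTENTS)
        return "ext4";                      /* assumption: extents => ext4 */
    if (compat & FEATURE_COMPAT_HAS_JOURNAL)
        return "ext3";
    return "ext2";
}
```

The offsets 56, 92 and 96 fall out of the padded ext2_super_block above: 56 bytes of s_dummy0, the 2-byte magic, 34 bytes of s_dummy1, then s_feature_compat, with s_feature_incompat occupying the first 4 bytes of s_dummy2.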

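3. How the operating system recognizes and handles disk partitions

This overlaps with part of what cfdisk does, but the kernel code is more complete. The kernel's partition handling follows.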

linux-3.12.6\block\partitions\check.c

static int (*check_part[])(struct parsed_partitions *) = {

/*
 * Probe partition formats with tables at disk address 0
 * that also have an ADFS boot block at 0xdc0.
 */

……

#ifdef CONFIG_EFI_PARTITION

efi_partition, /* this must come before msdos */

#endif

……

#ifdef CONFIG_MSDOS_PARTITION

msdos_partition,

#endif

……

}

struct parsed_partitions *

check_partition(struct gendisk *hd, struct block_device *bdev)

{

……

i = res = err = 0;

while (!res && check_part[i]) {

memset(state->parts, 0, state->limit * sizeof(state->parts[0]));

res = check_part[i++](state);

if (res < 0) {

/* We have hit an I/O error which we don't report now.
 * But record it, and let the others do their job.
 */

err = res;

res = 0;

}

……

}

linux-3.12.6\block\partitions\msdos.c

#define MSDOS_LABEL_MAGIC1 0x55

#define MSDOS_LABEL_MAGIC2 0xAA

static inline int

msdos_magic_present(unsigned char *p)

{

return (p[0] == MSDOS_LABEL_MAGIC1 && p[1] == MSDOS_LABEL_MAGIC2);

}

int msdos_partition(struct parsed_partitions *state)

{

sector_t sector_size = bdev_logical_block_size(state->bdev) / 512;

Sector sect;

unsigned char *data;

struct partition *p;

struct fat_boot_sector *fb;

int slot;

u32 disksig;

data = read_part_sector(state, 0, &sect);

……

if (!msdos_magic_present(data + 510)) {

put_dev_sector(sect);

return 0;

}

/*
 * Now that the 55aa signature is present, this is probably
 * either the boot sector of a FAT filesystem or a DOS-type
 * partition table. Reject this in case the boot indicator
 * is not 0 or 0x80.
 */

p = (struct partition *) (data + 0x1be);

for (slot = 1; slot <= 4; slot++, p++) {

if (p->boot_ind != 0 && p->boot_ind != 0x80) {

/*
 * Even without a valid boot inidicator value
 * its still possible this is valid FAT filesystem
 * without a partition table.
 */

fb = (struct fat_boot_sector *) data;

if (slot == 1 && fb->reserved && fb->fats

&& fat_valid_media(fb->media)) {

strlcat(state->pp_buf, "\n", PAGE_SIZE);

put_dev_sector(sect);

return 1;

} else {

put_dev_sector(sect);

return 0;

}

……

The corresponding output in the system log, produced by the code in msdos_partition:

tsecer@harry: dmesg | grep sd

[ 4.372603] sd 2:0:0:0: [sda] 62914560 512-byte logical blocks: (32.2 GB/30.0 GiB)

[ 4.372924] sd 2:0:0:0: [sda] Write Protect is off

[ 4.372948] sd 2:0:0:0: [sda] Mode Sense: 61 00 00 00

[ 4.373310] sd 2:0:0:0: [sda] Cache data unavailable

[ 4.373336] sd 2:0:0:0: [sda] Assuming drive cache: write through

[ 4.381242] sd 2:0:0:0: Attached scsi generic sg1 type 0

[ 4.382969] sd 2:0:0:0: [sda] Cache data unavailable

[ 4.382977] sd 2:0:0:0: [sda] Assuming drive cache: write through

[ 4.517016] sda: sda1 sda2 sda3 sda4 < sda5 sda6 >

How the operating system parses the extended partition, from linux-2.6.21\fs\partitions\msdos.c:

/*
 * Create devices for each logical partition in an extended partition.
 * The logical partitions form a linked list, with each entry being
 * a partition table with two entries. The first entry
 * is the real data partition (with a start relative to the partition
 * table start). The second is a pointer to the next logical partition
 * (with a start relative to the entire extended partition).
 * We do not create a Linux partition for the partition tables, but
 * only for the actual data partitions.
 */

static void

parse_extended(struct parsed_partitions *state, struct block_device *bdev,

u32 first_sector, u32 first_size)

{

……

while (1) {

if (++loopct > 100)

return;

……

/*
 * Usually, the first entry is the real data partition,
 * the 2nd entry is the next extended partition, or empty,
 * and the 3rd and 4th entries are unused.
 * However, DRDOS sometimes has the extended partition as
 * the first entry (when the data partition is empty),
 * and OS/2 seems to use all four entries.
 */

/*
 * First process the data partition(s)
 */

……

/*
 * Next, process the (first) extended partition, if present.
 * (So far, there seems to be no reason to make
 * parse_extended() recursive and allow a tree
 * of extended partitions.)
 * It should be a link to the next logical partition.
 */

p -= 4;

for (i=0; i<4; i++, p++)

if (NR_SECTS(p) && is_extended_partition(p))

break;

if (i == 4)

goto done; /* nothing left to do */

this_sector = first_sector + START_SECT(p) * sector_size;

this_size = NR_SECTS(p) * sector_size;

put_dev_sector(sect);

}
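The linked-list walk described in the kernel comment can be sketched in user space over an in-memory disk image. This is an illustration under simplifying assumptions, not the kernel's implementation: walk_extended, logical_part and the other names are invented here, the sector size is fixed at 512 bytes, and the image is a plain byte array rather than a block device. The two rules it keeps are the ones the comment states: a data entry starts relative to its own table, and a link entry starts relative to the whole extended partition.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* One logical partition found in the chain (illustrative type). */
struct logical_part {
    uint32_t start_lba;     /* absolute first sector */
    uint32_t nr_sectors;
    unsigned char sys_ind;
};

static uint32_t ebr_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* DOS (0x05), Win98 (0x0f) and Linux (0x85) extended partition types. */
static int is_extended_type(unsigned char t)
{
    return t == 0x05 || t == 0x0f || t == 0x85;
}

/*
 * Walk the chain that starts at sector ext_start of the image and
 * store up to max logical partitions in out. Returns how many were
 * stored. The loop cap of 100 mirrors the loopct guard in the excerpt.
 */
size_t walk_extended(const unsigned char *disk, uint32_t disk_sectors,
                     uint32_t ext_start, struct logical_part *out, size_t max)
{
    uint32_t this_sector = ext_start;
    size_t found = 0;
    int loops;

    for (loops = 0; loops < 100; loops++) {
        const unsigned char *s, *link = NULL;
        int i;

        if (this_sector >= disk_sectors)
            break;
        s = disk + (size_t)this_sector * SECTOR_SIZE;
        if (s[510] != 0x55 || s[511] != 0xaa)
            break;                       /* not a partition table */

        for (i = 0; i < 4; i++) {
            const unsigned char *e = s + 0x1be + 16 * i;

            if (ebr_le32(e + 12) == 0)
                continue;                /* empty slot */
            if (is_extended_type(e[4])) {
                if (!link)
                    link = e;            /* remember the first link only */
                continue;
            }
            if (found < max) {
                /* data partition: start is relative to this table */
                out[found].start_lba  = this_sector + ebr_le32(e + 8);
                out[found].nr_sectors = ebr_le32(e + 12);
                out[found].sys_ind    = e[4];
                found++;
            }
        }
        if (!link)
            break;
        /* link: start is relative to the whole extended partition */
        this_sector = ext_start + ebr_le32(link + 8);
    }
    return found;
}
```

On the disk in this article the chain would have two links, matching the "sda4 < sda5 sda6 >" line in the dmesg output.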

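4. Creating an ext-family filesystem on a partition

I do not have the e2fsprogs source at hand, so I use the mkfs code built into busybox to illustrate. The key quantity in busybox-1.19.4\util-linux\mkfs_ext2.c is the size of a block. A block is the basic unit in which the operating system handles the disk. It is not the disk's own basic unit, the sector, but a whole multiple of sectors, which reduces fragmentation, cuts seek time, and makes it easy to line up with page structures in memory. The block size is usually 4K, matching the size of a physical memory page.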

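ext2 then divides inodes and blocks into groups, using the block as the unit. The size of a group is set by the number of bits in one block, because within a group the used/free state of every inode and every block is recorded as a single bit, 1 or 0, in a bitmap that occupies one block. With the usual 4K block, one group therefore manages 4K*8=32768 blocks. That is where the Blocks per group: 32768 value in the dumpe2fs output further down comes from, and it is what #define blocks_per_group (8 * blocksize) in mkfs_ext2.c expresses.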
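The arithmetic above is small enough to write down directly. The two helpers below are a sketch with invented names (only the 8 * blocksize rule comes from mkfs_ext2.c); group_count simply rounds the block count up to whole groups.

```c
#include <stdint.h>

/* One bitmap block has 8 * blocksize bits, one bit per managed block. */
uint32_t blocks_per_group(uint32_t blocksize)
{
    return 8 * blocksize;
}

/* Number of groups needed to cover every block, rounding up. */
uint32_t group_count(uint64_t block_count, uint32_t first_block,
                     uint32_t per_group)
{
    return (uint32_t)((block_count - first_block + per_group - 1) / per_group);
}
```

For the /dev/sda6 filesystem dumped later in this section (Block count 2621432, First block 0, Block size 4096), blocks_per_group gives 32768 and group_count gives 80 groups, the last of which is 8 blocks short of full.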

Next comes the number of inodes to reserve. It is controlled by the -i option of mkfs.ext2, which the man page describes as follows:

-i bytes-per-inode
    Specify the bytes/inode ratio. mke2fs creates an inode for every bytes-per-inode bytes of space on the disk. The larger the bytes-per-inode ratio, the fewer inodes will be created. This value generally shouldn't be smaller than the blocksize of the filesystem, since in that case more inodes would be made than can ever be used. Be warned that it is not possible to expand the number of inodes on a filesystem after it is created, so be careful deciding the correct value for this parameter.

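In other words, the option states how much disk space one inode is expected to manage or, more plainly, the expected size of a file. Once this parameter is fixed, the overall layout of the filesystem is settled. The total number of inodes is known, and since the size of each group is known, so is the number of inodes per group. The inode structure has a fixed size, so the disk space needed for inodes follows, and whatever remains is available for file contents. There are further details, which I will not go into here.

Below is the filesystem on my newly added /dev/sda6 logical partition: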
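As a rough sketch of that chain of reasoning, the helpers below derive the per-group inode count and the size of each group's inode table from the bytes-per-inode ratio. The function names are invented, the rounding is a simplification of what real mkfs tools do, and the ratio of 16384 used in the note afterwards is an assumption: it is not shown anywhere in this article, it is merely the value that reproduces the dumpe2fs numbers for /dev/sda6.

```c
#include <stdint.h>

/*
 * Inodes per group: spread fs_bytes / bytes_per_inode inodes over
 * the groups, round up so the inode table fills whole blocks, and
 * cap at the number of bits in one bitmap block.
 */
uint32_t inodes_per_group_sketch(uint64_t fs_bytes, uint32_t bytes_per_inode,
                                 uint32_t groups, uint32_t blocksize,
                                 uint32_t inode_size)
{
    uint64_t wanted    = fs_bytes / bytes_per_inode;
    uint64_t per_group = (wanted + groups - 1) / groups;
    uint32_t per_block = blocksize / inode_size;
    uint64_t cap       = (uint64_t)8 * blocksize;

    per_group = (per_group + per_block - 1) / per_block * per_block;
    if (per_group > cap)
        per_group = cap;
    return (uint32_t)per_group;
}

/* Blocks needed for one group's inode table. */
uint32_t inode_table_blocks(uint32_t inodes_per_group, uint32_t inode_size,
                            uint32_t blocksize)
{
    return (uint32_t)(((uint64_t)inodes_per_group * inode_size
                       + blocksize - 1) / blocksize);
}
```

With 2621432 blocks of 4096 bytes, 80 groups, 256-byte inodes and the assumed ratio of 16384, the first helper returns 8192 and the second 512. Those match Inodes per group: 8192 and Inode blocks per group: 512 in the dump, and 80 * 8192 gives the reported Inode count of 655360.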

tsecer@harry: dumpe2fs /dev/sda6

dumpe2fs 1.42.8 (20-Jun-2013)

Filesystem volume name: <none>

Last mounted on: /home/tsecer/sda6

Filesystem UUID: 056210ed-db41-4e69-ad11-1f08802a4afa

Filesystem magic number: 0xEF53

Filesystem revision #: 1 (dynamic)

Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize

Filesystem flags: signed_directory_hash

Default mount options: user_xattr acl

Filesystem state: clean

Errors behavior: Continue

Filesystem OS type: Linux

Inode count: 655360

Block count: 2621432

Reserved block count: 131071

Free blocks: 2505146

Free inodes: 651183

First block: 0

Block size: 4096

Fragment size: 4096

Reserved GDT blocks: 639

Blocks per group: 32768

Fragments per group: 32768

Inodes per group: 8192

Inode blocks per group: 512

Flex block group size: 16

Filesystem created: Sat Apr 30 00:58:10 2016

Last mount time: Sat May 14 09:54:26 2016

Last write time: Sat May 14 09:54:26 2016

Mount count: 9

Maximum mount count: -1

Last checked: Sat Apr 30 00:58:10 2016

Check interval: 0 (<none>)

Lifetime writes: 1168 MB

Reserved blocks uid: 0 (user root)

Reserved blocks gid: 0 (group root)

First inode: 11

Inode size: 256

Required extra isize: 28

Desired extra isize: 28

Journal inode: 8

Default directory hash: half_md4

Directory Hash Seed: b65a034c-de24-49ba-be45-1c1761cad713

Journal backup: inode blocks

Journal features: journal_incompat_revoke

Journal size: 128M

Journal length: 32768

Journal sequence: 0x00001208

Journal start: 1

Group 0: (Blocks 0-32767) [ITABLE_ZEROED]

Checksum 0x22ce, unused inodes 8181

Primary superblock at 0, Group descriptors at 1-1

Reserved GDT blocks at 2-640

Block bitmap at 641 (+641), Inode bitmap at 657 (+657)

Inode table at 673-1184 (+673)

23877 free blocks, 8181 free inodes, 2 directories, 8181 unused inodes

Free blocks: 8891-32767

Free inodes: 12-8192

Group 1: (Blocks 32768-65535) [INODE_UNINIT, ITABLE_ZEROED]

Checksum 0x1e6c, unused inodes 8192

Backup superblock at 32768, Group descriptors at 32769-32769

Reserved GDT blocks at 32770-33408

Block bitmap at 642 (bg #0 + 642), Inode bitmap at 658 (bg #0 + 658)

Inode table at 1185-1696 (bg #0 + 1185)

31728 free blocks, 8192 free inodes, 0 directories, 8192 unused inodes

Free blocks: 33409-33439, 33463-33535, 33679-35839, 35951-35967, 36049-36223, 36244-36255, 36277-65535

Free inodes: 8193-16384
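The per-group figures in the dump can be cross-checked with a little arithmetic. The sketch below maps an inode number to its group and to its slot inside that group's inode table. The struct and function names are invented for this illustration; the only facts it relies on are that inode numbers start at 1 and that each group owns a contiguous run of Inodes per group numbers.

```c
#include <stdint.h>

/* Where an inode lives (illustrative type). */
struct inode_loc {
    uint32_t group;        /* block group number */
    uint32_t index;        /* 0-based index inside the group */
    uint32_t table_block;  /* block offset inside the group's inode table */
    uint32_t byte_offset;  /* byte offset inside that block */
};

struct inode_loc locate_inode(uint32_t ino, uint32_t inodes_per_group,
                              uint32_t inode_size, uint32_t blocksize)
{
    uint32_t per_block = blocksize / inode_size;
    struct inode_loc loc;

    loc.group       = (ino - 1) / inodes_per_group;   /* inodes start at 1 */
    loc.index       = (ino - 1) % inodes_per_group;
    loc.table_block = loc.index / per_block;
    loc.byte_offset = (loc.index % per_block) * inode_size;
    return loc;
}

/* Block group that owns a given block number. */
uint32_t block_group_of(uint64_t block, uint32_t first_block,
                        uint32_t blocks_per_group)
{
    return (uint32_t)((block - first_block) / blocks_per_group);
}
```

With the values from the dump (8192 inodes per group, 256-byte inodes, 4096-byte blocks), inodes 8193 and 16384 both land in group 1, which agrees with "Free inodes: 8193-16384" under Group 1. Inode 12, the first free inode of group 0, has index 11 and therefore sits in the first block of that group's inode table, block 673, at byte offset 2816.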

5. inode allocation

linux-2.6.21\fs\ext2\ialloc.c

struct inode *ext2_new_inode(struct inode *dir, int mode)

{

……

if (S_ISDIR(mode)) {

if (test_opt(sb, OLDALLOC))

group = find_group_dir(sb, dir);

else

group = find_group_orlov(sb, dir);

} else

group = find_group_other(sb, dir);

……

for (i = 0; i < sbi->s_groups_count; i++) {

gdp = ext2_get_group_desc(sb, group, &bh2);

brelse(bitmap_bh);

bitmap_bh = read_inode_bitmap(sb, group);

if (!bitmap_bh) {

err = -EIO;

goto fail;

}

ino = 0;

repeat_in_this_group:

ino = ext2_find_next_zero_bit((unsigned long *)bitmap_bh->b_data,

EXT2_INODES_PER_GROUP(sb), ino);

if (ino >= EXT2_INODES_PER_GROUP(sb)) {

/*
 * Rare race: find_group_xx() decided that there were
 * free inodes in this group, but by the time we tried
 * to allocate one, they're all gone. This can also
 * occur because the counters which find_group_orlov()
 * uses are approximate. So just go and search the
 * next block group.
 */

if (++group == sbi->s_groups_count)

group = 0;

continue;

}

if (ext2_set_bit_atomic(sb_bgl_lock(sbi, group),

ino, bitmap_bh->b_data)) {

/* we lost this inode */

if (++ino >= EXT2_INODES_PER_GROUP(sb)) {

/* this group is exhausted, try next group */

if (++group == sbi->s_groups_count)

group = 0;

continue;

}

/* try to find free inode in the same group */

goto repeat_in_this_group;

}

goto got;

}

……

}

The part that matters for inode reuse is this:

ino = 0;

repeat_in_this_group:

ino = ext2_find_next_zero_bit((unsigned long *)bitmap_bh->b_data,

EXT2_INODES_PER_GROUP(sb), ino);

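The bitmap scan starts from bit 0 by default. This means that after frequent deletes and creates in the same directory, a newly allocated inode number is very likely to coincide with one that was just freed. Worse, the filesystem divides the disk into groups, a new file is allocated preferentially from the group of its parent directory, and a group usually holds only 8K inodes (see Inodes per group: 8192 in the dump above). Shared memory is traditionally set up by calling ftok to obtain the key, and ftok derives that key from the file's inode number. If inodes are recycled frequently, files that are deleted and recreated can easily produce a key that is already in use, and the users of the shared memory then collide.

Here is a quick check. The newly created files reuse the inode numbers of the files that were just deleted: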
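The reuse pattern follows directly from scanning the bitmap from bit 0. Here is a toy model of one group's inode bitmap, assuming a plain first-fit scan with no locking and no group selection; the type and function names are invented and only imitate the role of ext2_find_next_zero_bit in the excerpt.

```c
#define INODES_PER_GROUP 8192

/* One bit per inode in the group: 1 = in use, 0 = free. */
struct inode_bitmap {
    unsigned char bits[INODES_PER_GROUP / 8];
};

static int bit_is_set(const struct inode_bitmap *bm, int n)
{
    return (bm->bits[n / 8] >> (n % 8)) & 1;
}

/* First clear bit at or after start, or INODES_PER_GROUP if none. */
int find_next_zero_bit(const struct inode_bitmap *bm, int start)
{
    int n;

    for (n = start; n < INODES_PER_GROUP; n++)
        if (!bit_is_set(bm, n))
            return n;
    return INODES_PER_GROUP;
}

/* First-fit allocation: always scan from bit 0. Returns -1 when full. */
int alloc_inode(struct inode_bitmap *bm)
{
    int n = find_next_zero_bit(bm, 0);

    if (n >= INODES_PER_GROUP)
        return -1;
    bm->bits[n / 8] = (unsigned char)(bm->bits[n / 8] | (1u << (n % 8)));
    return n;
}

void free_inode(struct inode_bitmap *bm, int n)
{
    bm->bits[n / 8] = (unsigned char)(bm->bits[n / 8] & ~(1u << (n % 8)));
}
```

Allocate three inodes, free them, and allocate three more: the second batch receives exactly the same bit positions as the first, because the scan restarts at 0 and the freed bits are the lowest free ones.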

tsecer@harry: cat inoderecycle.sh

# Uses bash's for ((...)) loop; on Fedora, sh is bash.
echo create new files
for ((i = 0; i < 10; i++))
do
    touch old$i
    ls -i old$i
done

echo remove old files
for ((i = 0; i < 10; i++))
do
    rm -f old$i
done

echo create new files
for ((i = 0; i < 10; i++))
do
    touch new$i
    ls -i new$i
done

tsecer@harry: sh inoderecycle.sh

create new files

524291 old0

524292 old1

524293 old2

524294 old3

524295 old4

524296 old5

524297 old6

524298 old7

524299 old8

524300 old9

remove old files

create new files

524291 new0

524292 new1

524293 new2

524294 new3

524295 new4

524296 new5

524297 new6

524298 new7

524299 new8

524300 new9

tsecer@harry:
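To see why a reused inode number turns into a reused IPC key, here is a sketch of how glibc's ftok composes the key: the low 16 bits of the inode number, the low 8 bits of the device number, and the low 8 bits of the project id. The function name ftok_like is invented, and the device number 0x806 used in the note below is an assumed value (major 8, minor 6, as /dev/sda6 would typically have), not something shown in the output above.

```c
#include <stdint.h>

/*
 * Key layout as in glibc's ftok():
 *   bits  0-15  low 16 bits of st_ino
 *   bits 16-23  low  8 bits of st_dev
 *   bits 24-31  low  8 bits of proj_id
 */
uint32_t ftok_like(uint64_t st_ino, uint64_t st_dev, int proj_id)
{
    return (uint32_t)(st_ino & 0xffff)
         | ((uint32_t)(st_dev & 0xff) << 16)
         | (((uint32_t)proj_id & 0xff) << 24);
}
```

For inode 524291 from the run above, device 0x806 and project id 1, the key is 0x01060003. Any file that later receives inode 524291 on the same device gets the same key, and so does a file whose inode number differs by a multiple of 65536, since only the low 16 bits of the inode number take part.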

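6. Mounting disk partitions automatically

This is really a user-space mechanism supported by the operating system. Put simply, you list the filesystems to be mounted at boot in /etc/fstab, and there is not much to dig into. One thing to note is that on Fedora the entries usually name a UUID rather than a disk device name. The UUID is not mysterious either: as the busybox code shows, it is a random value generated according to fixed rules when the filesystem is created, and then written to a fixed position in the filesystem's super block: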

tsecer@harry: blkid

/dev/sr0: UUID="2013-12-12-14-06-50-00" LABEL="Fedora 20 x86_64" TYPE="iso9660" PTUUID="69e9fb83" PTTYPE="dos"

/dev/sda2: UUID="42cd7a7f-caf9-49cf-8b39-03391e7fd5d6" TYPE="ext4" PARTUUID="0000b890-02"

/dev/sda3: UUID="b820a18b-06a3-42d1-8db3-e05241417a6b" TYPE="swap" PARTUUID="0000b890-03"

/dev/sda5: UUID="91ec1fb0-f629-4a7f-9b5b-bc311ab85bf2" TYPE="ext4" PARTUUID="0000b890-05"

/dev/sda6: UUID="056210ed-db41-4e69-ad11-1f08802a4afa" TYPE="ext4" PARTUUID="0000b890-06"

/dev/sda1: PARTUUID="0000b890-01"

tsecer@harry: cat /etc/fstab

# /etc/fstab

# Created by anaconda on Sun Oct 18 15:14:41 2015

# Accessible filesystems, by reference, are maintained under '/dev/disk'

# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info

UUID=91ec1fb0-f629-4a7f-9b5b-bc311ab85bf2 / ext4 defaults 1 1

UUID=42cd7a7f-caf9-49cf-8b39-03391e7fd5d6 /boot ext4 defaults 1 2

UUID=b820a18b-06a3-42d1-8db3-e05241417a6b swap swap defaults 0 0

UUID=056210ed-db41-4e69-ad11-1f08802a4afa /home/tsecer/sda6 ext4 defaults 0 0

tsecer@harry:

How the busybox tool handles it:

int mkfs_ext2_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;

generate_uuid(sb->s_uuid);

void FAST_FUNC generate_uuid(uint8_t *buf)

{

/* http://www.ietf.org/rfc/rfc4122.txt
 *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |                          time_low                             |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |       time_mid                |         time_hi_and_version   |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |clk_seq_and_variant            |         node (0-1)            |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |                         node (2-5)                            |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * IOW, uuid has this layout:
 * uint32_t time_low (big endian)
 * uint16_t time_mid (big endian)
 * uint16_t time_hi_and_version (big endian)
 *  version is a 4-bit field:
 *   1 Time-based
 *   2 DCE Security, with embedded POSIX UIDs
 *   3 Name-based (MD5)
 *   4 Randomly generated
 *   5 Name-based (SHA-1)
 * uint16_t clk_seq_and_variant (big endian)
 *  variant is a 3-bit field:
 *   0xx Reserved, NCS backward compatibility
 *   10x The variant specified in rfc4122
 *   110 Reserved, Microsoft backward compatibility
 *   111 Reserved for future definition
 * uint8_t node[6]
 *
 * For version 4, these bits are set/cleared:
 * time_hi_and_version & 0x0fff | 0x4000
 * clk_seq_and_variant & 0x3fff | 0x8000
 */

pid_t pid;

int i;

i = open("/dev/urandom", O_RDONLY);

if (i >= 0) {

read(i, buf, 16);

close(i);

}

/* Paranoia. /dev/urandom may be missing.

* rand() is guaranteed to generate at least [0, 2^15) range,

* but lowest bits in some libc are not so "random". */

srand(monotonic_us()); /* pulls in printf */

pid = getpid();

while (1) {

for (i = 0; i < 16; i++)

buf[i] ^= rand() >> 5;

if (pid == 0)

break;

srand(pid);

pid = 0;

}

/* version = 4 */

buf[4 + 2 ] = (buf[4 + 2 ] & 0x0f) | 0x40;

/* variant = 10x */

buf[4 + 2 + 2] = (buf[4 + 2 + 2] & 0x3f) | 0x80;

}
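The last two assignments in generate_uuid are what make the value recognizable as a version 4 UUID. The sketch below isolates that step and adds a formatter for the usual 8-4-4-4-12 text form; both function names are invented for this illustration and are not busybox functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Stamp version 4 and the RFC 4122 variant onto 16 random bytes. */
void uuid_set_v4_bits(uint8_t buf[16])
{
    buf[6] = (uint8_t)((buf[6] & 0x0f) | 0x40);  /* version = 4   */
    buf[8] = (uint8_t)((buf[8] & 0x3f) | 0x80);  /* variant = 10x */
}

/* Format as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (36 chars + NUL). */
void uuid_format(const uint8_t buf[16], char out[37])
{
    snprintf(out, 37,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x%02x%02x%02x%02x",
             buf[0], buf[1], buf[2], buf[3], buf[4], buf[5],
             buf[6], buf[7], buf[8], buf[9], buf[10], buf[11],
             buf[12], buf[13], buf[14], buf[15]);
}
```

The effect is visible in the blkid output earlier in this section: in every ext4 and swap UUID the third group starts with 4 (for example 4e69, 49cf, 42d1, 4a7f) and the fourth group starts with a digit between 8 and b (ad11, 8b39, 8db3, 9b5b).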

posted on 2019-03-07 09:54 tsecer
