Disk management and the ext file systems on Linux, seen through resizing a virtual machine disk

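1. Resizing the virtual machine disk
The virtual machine was originally created with a 20G disk. After it had been running for a while the disk ran out of space and needed more capacity. The simplest fix I could think of was to grow it from 20G to 30G. There are two ways to do that: enlarge the existing disk, or add a new disk device. I used the first, that is, I enlarged the disk the virtual machine was already using. After the resize, however, the system did not pick up the extra space automatically. The file system had been created using only the space that existed at the time, so the newly added space was invisible to it. In this situation a new partition has to be created on the added space, which calls for a partitioning tool. My first choice was fdisk, but I tried it several times and adding the new partition kept failing, so I gave up on it. Later I noticed that the fdisk man page mentions cfdisk, so I tried that; with cfdisk, following the prompts is enough to add a new logical partition.
fdisk here presumably stands for "format disk" (it is also commonly explained as "fixed disk"), and the c prefix of cfdisk comes from the curses in the ncurses library. With that library the user gets menus, arrow keys and shortcut keys, so the interface is friendlier. What fdisk does is manage the partitions of one physical disk. Partitioning sits at a more generic layer whose consumer is the BIOS; the intuitive point is that it really is independent of any operating system. For example, we can install Linux on a disk and also install Windows, and the two can coexist on the same disk. Because each operating system usually uses its own "official" file system, such as ntfs on Windows and the extN family on Linux, the partitioning layer does not need to understand file systems at all. A partition can hold any file system, or no file system.
I looked at the cfdisk code and found its comments less clear than the kernel's, even though the two do similar things, so the partition table format is analyzed below using the kernel code. cfdisk is, however, a user-space program, so its output is more flexible and it is easier to invoke; it is worth looking at its relatively friendly output. Below is the output of cfdisk /dev/sda on my virtual machine. Look at the last two partitions: the first of them (sda5) takes up most of the original 20G disk, and the last 10G is the logical partition added with cfdisk after the disk was enlarged. What interests me here is how cfdisk knows the file system type of each partition; for example, it recognizes Linux file system types such as ext4 and swap.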
                                               cfdisk (util-linux 2.24)
 
                                                 Disk Drive: /dev/sda
                                           Size: 32212254720 bytes, 32.2 GB
                                 Heads: 255   Sectors per Track: 63   Cylinders: 3916
 
     Name              Flags            Part Type FS Type                 [Label]               Size (MB)
 --------------------------------------------------------------------------------------------------------------------
                                                        Unusable                                           1.05      *
     sda1                                Primary        Linux                                              1.05      *
     sda2              Boot              Primary        ext4                                             314.58      *
     sda3                                Primary        swap                                            2147.49      *
     sda5              NC                Logical        ext4                                           19010.69      *
     sda6                                Logical        ext4                                           10737.42      *
 
 
 
 
 
 
 
       [   Help   ]  [  Print   ]  [   Quit   ]  [  Units   ]  [  Write   ]
 
                                                  No more partitions
                                                  Print help screen
2. How cfdisk identifies file system types
Use hexdump to look at the last 256 bytes of the disk's first sector (the dump starts at offset 0x100):
tsecer@harry: hexdump -Cs 0x100 /dev/sda | more
00000100  f4 40 89 44 08 0f b6 c2  c0 e8 02 66 89 04 66 a1  |.@.D.......f..f.|
00000110  60 7c 66 09 c0 75 4e 66  a1 5c 7c 66 31 d2 66 f7  |`|f..uNf.\|f1.f.|
00000120  34 88 d1 31 d2 66 f7 74  04 3b 44 08 7d 37 fe c1  |4..1.f.t.;D.}7..|
00000130  88 c5 30 c0 c1 e8 02 08  c1 88 d0 5a 88 c6 bb 00  |..0........Z....|
00000140  70 8e c3 31 db b8 01 02  cd 13 72 1e 8c c3 60 1e  |p..1......r...`.|
00000150  b9 00 01 8e db 31 f6 bf  00 80 8e c6 fc f3 a5 1f  |.....1..........|
00000160  61 ff 26 5a 7c be 80 7d  eb 03 be 8f 7d e8 34 00  |a.&Z|..}....}.4.|
00000170  be 94 7d e8 2e 00 cd 18  eb fe 47 52 55 42 20 00  |..}.......GRUB .|
00000180  47 65 6f 6d 00 48 61 72  64 20 44 69 73 6b 00 52  |Geom.Hard Disk.R|
00000190  65 61 64 00 20 45 72 72  6f 72 0d 0a 00 bb 01 00  |ead. Error......|
000001a0  b4 0e cd 10 ac 3c 00 75  f4 c3 00 00 00 00 00 00  |.....<.u........|
000001b0  00 00 00 00 00 00 00 00  90 b8 00 00 00 00 00 20  |............... |
000001c0  21 00 83 41 01 00 00 08  00 00 00 08 00 00 80 41  |!..A...........A|
000001d0  02 00 83 7f 19 26 00 10  00 00 00 60 09 00 00 7f  |.....&.....`....|
000001e0  1a 26 82 94 69 2b 00 70  09 00 00 00 40 00 00 94  |.&..i+.p....@...|
000001f0  6a 2b 05 fe ff ff 00 70  49 00 00 90 76 03 55 aa  |j+.....pI...v.U.|
00000200  52 bf f4 81 66 8b 2d 83  7d 08 00 0f 84 e2 00 80  |R...f.-.}.......|
00000210  7c ff 00 74 46 66 8b 1d  66 8b 4d 04 66 31 c0 b0  ||..tFf..f.M.f1..|
00000220  7f 39 45 08 7f 03 8b 45  08 29 45 08 66 01 05 66  |.9E....E.)E.f..f|
00000230  83 55 04 00 c7 04 10 00  89 44 02 66 89 5c 08 66  |.U.......D.f.\.f|
00000240  89 4c 0c c7 44 06 00 70  50 c7 44 04 00 00 b4 42  |.L..D..pP.D....B|
00000250  cd 13 0f 82 af 00 bb 00  70 eb 66 66 8b 45 04 66  |........p.ff.E.f|
00000260  09 c0 0f 85 97 00 66 8b  05 66 31 d2 66 f7 34 88  |......f..f1.f.4.|
00000270  54 0a 66 31 d2 66 f7 74  04 88 54 0b 89 44 0c 3b  |T.f1.f.t..T..D.;|
tsecer@harry: 
At the end of line 000001f0 you can see the two-byte DOS signature "55 aa". Immediately before the signature is an array of four partition structures, each 16 bytes long:
struct partition {
        unsigned char boot_ind;         /* 0x80 - active */
        unsigned char head;             /* starting head */
        unsigned char sector;           /* starting sector */
        unsigned char cyl;              /* starting cylinder */
        unsigned char sys_ind;          /* What partition type */
        unsigned char end_head;         /* end head */
        unsigned char end_sector;       /* end sector */
        unsigned char end_cyl;          /* end cylinder */
        unsigned char start4[4];        /* starting sector counting from 0 */
        unsigned char size4[4];         /* nr of sectors in partition */
};
For the swap partition, the sys_ind value at offset 000001e2 is 0x82, which corresponds to this table in util-linux-ng-2.16.2\fdisk\i386_sys_types.c:
struct systypes i386_sys_types[] = {
{0x00, N_("Empty")},
        ……
{0x81, N_("Minix / old Linux")},/* Minix 1.4b and later */
{0x82, N_("Linux swap / Solaris")},
{0x83, N_("Linux")},
        
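For the boot partition, the sys_ind at 000001d2 is 0x83. This value does not say whether the file system is ext2, ext4, and so on; for that, get_linux_label(int i) reads the partition's superblock to determine the specific file system type. The util-linux version I have at hand does not handle ext4, but the approach is the same: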
static void
get_linux_label(int i) {

#define EXT2LABELSZ 16
#define EXT2_SUPER_MAGIC 0xEF53
#define EXT3_FEATURE_COMPAT_HAS_JOURNAL 0x0004
    struct ext2_super_block {
        char  s_dummy0[56];
        unsigned char  s_magic[2];
        char  s_dummy1[34];
        unsigned char  s_feature_compat[4];
        char  s_dummy2[24];
        char  s_volume_name[EXT2LABELSZ];
        char  s_last_mounted[64];
        char  s_dummy3[824];
    } e2fsb;
    offset = (p_info[i].first_sector + p_info[i].offset) * SECTOR_SIZE
        + 1024;
    if (lseek(fd, offset, SEEK_SET) == offset
        && read(fd, &e2fsb, sizeof(e2fsb)) == sizeof(e2fsb)
        && e2fsb.s_magic[0] + (e2fsb.s_magic[1]<<8) == EXT2_SUPER_MAGIC) {
        label = e2fsb.s_volume_name;
        for(j=0; j<EXT2LABELSZ && j<LABELSZ && isprint(label[j]); j++)
            p_info[i].volume_label[j] = label[j];
        p_info[i].volume_label[j] = 0;
        /* ext2 or ext3? */
        if (e2fsb.s_feature_compat[0]&EXT3_FEATURE_COMPAT_HAS_JOURNAL)
            strncpy(p_info[i].fstype, "ext3", FSTYPESZ);
        else
            strncpy(p_info[i].fstype, "ext2", FSTYPESZ);
        return;
    }
    ……
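3. How the operating system identifies and handles disk partitions
This overlaps with part of what cfdisk does, but the kernel code is more complete. The kernel's partition handling: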
linux-3.12.6\block\partitions\check.c
static int (*check_part[])(struct parsed_partitions *) = {
    /*
     * Probe partition formats with tables at disk address 0
     * that also have an ADFS boot block at 0xdc0.
     */
    ……
#ifdef CONFIG_EFI_PARTITION
    efi_partition,      /* this must come before msdos */
#endif
    ……
#ifdef CONFIG_MSDOS_PARTITION
    msdos_partition,
#endif
    ……
}

struct parsed_partitions *
check_partition(struct gendisk *hd, struct block_device *bdev)
{
    ……
    i = res = err = 0;
    while (!res && check_part[i]) {
        memset(state->parts, 0, state->limit * sizeof(state->parts[0]));
        res = check_part[i++](state);
        if (res < 0) {
            /* We have hit an I/O error which we don't report now.
             * But record it, and let the others do their job.
             */
            err = res;
            res = 0;
        }

    }
    ……
}
 
linux-3.12.6\block\partitions\msdos.c
#define MSDOS_LABEL_MAGIC1 0x55
#define MSDOS_LABEL_MAGIC2 0xAA
 
static inline int
msdos_magic_present(unsigned char *p)
{
    return (p[0] == MSDOS_LABEL_MAGIC1 && p[1] == MSDOS_LABEL_MAGIC2);
}

int msdos_partition(struct parsed_partitions *state)
{
    sector_t sector_size = bdev_logical_block_size(state->bdev) / 512;
    Sector sect;
    unsigned char *data;
    struct partition *p;
    struct fat_boot_sector *fb;
    int slot;
    u32 disksig;

    data = read_part_sector(state, 0, &sect);
    ……
    if (!msdos_magic_present(data + 510)) {
        put_dev_sector(sect);
        return 0;
    }

    /*
     * Now that the 55aa signature is present, this is probably
     * either the boot sector of a FAT filesystem or a DOS-type
     * partition table. Reject this in case the boot indicator
     * is not 0 or 0x80.
     */
    p = (struct partition *) (data + 0x1be);
    for (slot = 1; slot <= 4; slot++, p++) {
        if (p->boot_ind != 0 && p->boot_ind != 0x80) {
            /*
             * Even without a valid boot inidicator value
             * its still possible this is valid FAT filesystem
             * without a partition table.
             */
            fb = (struct fat_boot_sector *) data;
            if (slot == 1 && fb->reserved && fb->fats
                && fat_valid_media(fb->media)) {
                strlcat(state->pp_buf, "\n", PAGE_SIZE);
                put_dev_sector(sect);
                return 1;
            } else {
                put_dev_sector(sect);
                return 0;
            }
        }
    }
    ……
        
The output in the system log that corresponds to the code in msdos_partition:
tsecer@harry: dmesg | grep sd 
[    4.372603] sd 2:0:0:0: [sda] 62914560 512-byte logical blocks: (32.2 GB/30.0 GiB)
[    4.372924] sd 2:0:0:0: [sda] Write Protect is off
[    4.372948] sd 2:0:0:0: [sda] Mode Sense: 61 00 00 00
[    4.373310] sd 2:0:0:0: [sda] Cache data unavailable
[    4.373336] sd 2:0:0:0: [sda] Assuming drive cache: write through
[    4.381242] sd 2:0:0:0: Attached scsi generic sg1 type 0
[    4.382969] sd 2:0:0:0: [sda] Cache data unavailable
[    4.382977] sd 2:0:0:0: [sda] Assuming drive cache: write through
[    4.517016]  sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
 
The kernel's parsing of extended partitions, from linux-2.6.21\fs\partitions\msdos.c:
/*
 * Create devices for each logical partition in an extended partition.
 * The logical partitions form a linked list, with each entry being
 * a partition table with two entries.  The first entry
 * is the real data partition (with a start relative to the partition
 * table start).  The second is a pointer to the next logical partition
 * (with a start relative to the entire extended partition).
 * We do not create a Linux partition for the partition tables, but
 * only for the actual data partitions.
 */
 
static void
parse_extended(struct parsed_partitions *state, struct block_device *bdev,
               u32 first_sector, u32 first_size)
{
    ……
    while (1) {
        if (++loopct > 100)
            return;
        ……
        /*
         * Usually, the first entry is the real data partition,
         * the 2nd entry is the next extended partition, or empty,
         * and the 3rd and 4th entries are unused.
         * However, DRDOS sometimes has the extended partition as
         * the first entry (when the data partition is empty),
         * and OS/2 seems to use all four entries.
         */

        /*
         * First process the data partition(s)
         */
        ……
        /*
         * Next, process the (first) extended partition, if present.
         * (So far, there seems to be no reason to make
         *  parse_extended()  recursive and allow a tree
         *  of extended partitions.)
         * It should be a link to the next logical partition.
         */
        p -= 4;
        for (i=0; i<4; i++, p++)
            if (NR_SECTS(p) && is_extended_partition(p))
                break;
        if (i == 4)
            goto done;  /* nothing left to do */

        this_sector = first_sector + START_SECT(p) * sector_size;
        this_size = NR_SECTS(p) * sector_size;
        put_dev_sector(sect);
    }
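4. Creating an ext-family file system on a partition
I do not have the e2fsprogs source at hand, so I use the mkfs code built into busybox to illustrate. In busybox-1.19.4\util-linux\mkfs_ext2.c the key parameter is the size of each block. A block is the basic unit in which the operating system handles the disk; it is not the disk's own basic unit, the sector, but an integer multiple of a sector. This reduces fragmentation and seek time and makes it easy to align with page structures in memory. The block size is usually 4K, matching the size of a physical memory page.
ext2 then divides the inodes and blocks into groups for management. The size of a group is determined by the number of bits in one block, because the used/free state of every inode and block in a group is recorded as a single 1/0 bit in a one-block bitmap. So with the usual 4K block size, one group manages 4K*8=32768 blocks. This is where the Blocks per group:         32768 value in the dump below comes from, and it is what #define blocks_per_group (8 * blocksize) in mkfs_ext2.c expresses.
Next comes the number of inodes to reserve. This is controlled by the -i option of mkfs.ext2, which the man page describes as: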
       -i bytes-per-inode
              Specify the bytes/inode ratio.  mke2fs creates an inode for every bytes-per-inode bytes of  space  on
              the disk.  The larger the bytes-per-inode ratio, the fewer inodes will be created.  This value gener‐
              ally shouldn't be smaller than the blocksize of the filesystem, since in that case more inodes  would
              be made than can ever be used.  Be warned that it is not possible to expand the number of inodes on a
              filesystem after it is created, so be careful deciding the correct value for this parameter.
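In other words, it says how much disk space one inode is expected to manage or, more plainly, the expected size of a file. Once this parameter is given, the overall layout of the file system is fixed. It determines the total number of inodes on the disk; since the size of each group is already known, the number of inodes per group follows. And since the size of the inode structure is fixed, the disk space needed for inodes is fixed too, and the rest can be used for file contents. There are some further details, which I will not go into here.
Below is the file system on my newly added logical partition /dev/sda6: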
tsecer@harry: dumpe2fs /dev/sda6
dumpe2fs 1.42.8 (20-Jun-2013)
Filesystem volume name:   <none>
Last mounted on:          /home/tsecer/sda6
Filesystem UUID:          056210ed-db41-4e69-ad11-1f08802a4afa
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              655360
Block count:              2621432
Reserved block count:     131071
Free blocks:              2505146
Free inodes:              651183
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      639
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Sat Apr 30 00:58:10 2016
Last mount time:          Sat May 14 09:54:26 2016
Last write time:          Sat May 14 09:54:26 2016
Mount count:              9
Maximum mount count:      -1
Last checked:             Sat Apr 30 00:58:10 2016
Check interval:           0 (<none>)
Lifetime writes:          1168 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:           256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      b65a034c-de24-49ba-be45-1c1761cad713
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x00001208
Journal start:            1
 
 
Group 0: (Blocks 0-32767) [ITABLE_ZEROED]
  Checksum 0x22ce, unused inodes 8181
  Primary superblock at 0, Group descriptors at 1-1
  Reserved GDT blocks at 2-640
  Block bitmap at 641 (+641), Inode bitmap at 657 (+657)
  Inode table at 673-1184 (+673)
  23877 free blocks, 8181 free inodes, 2 directories, 8181 unused inodes
  Free blocks: 8891-32767
  Free inodes: 12-8192
Group 1: (Blocks 32768-65535) [INODE_UNINIT, ITABLE_ZEROED]
  Checksum 0x1e6c, unused inodes 8192
  Backup superblock at 32768, Group descriptors at 32769-32769
  Reserved GDT blocks at 32770-33408
  Block bitmap at 642 (bg #0 + 642), Inode bitmap at 658 (bg #0 + 658)
  Inode table at 1185-1696 (bg #0 + 1185)
  31728 free blocks, 8192 free inodes, 0 directories, 8192 unused inodes
  Free blocks: 33409-33439, 33463-33535, 33679-35839, 35951-35967, 36049-36223, 36244-36255, 36277-65535
  Free inodes: 8193-16384
5. Inode allocation
linux-2.6.21\fs\ext2\ialloc.c
struct inode *ext2_new_inode(struct inode *dir, int mode)
{
    ……
    if (S_ISDIR(mode)) {
        if (test_opt(sb, OLDALLOC))
            group = find_group_dir(sb, dir);
        else
            group = find_group_orlov(sb, dir);
    } else
        group = find_group_other(sb, dir);
    ……
    for (i = 0; i < sbi->s_groups_count; i++) {
        gdp = ext2_get_group_desc(sb, group, &bh2);
        brelse(bitmap_bh);
        bitmap_bh = read_inode_bitmap(sb, group);
        if (!bitmap_bh) {
            err = -EIO;
            goto fail;
        }
        ino = 0;

repeat_in_this_group:
        ino = ext2_find_next_zero_bit((unsigned long *)bitmap_bh->b_data,
                                      EXT2_INODES_PER_GROUP(sb), ino);
        if (ino >= EXT2_INODES_PER_GROUP(sb)) {
            /*
             * Rare race: find_group_xx() decided that there were
             * free inodes in this group, but by the time we tried
             * to allocate one, they're all gone.  This can also
             * occur because the counters which find_group_orlov()
             * uses are approximate.  So just go and search the
             * next block group.
             */
            if (++group == sbi->s_groups_count)
                group = 0;
            continue;
        }
        if (ext2_set_bit_atomic(sb_bgl_lock(sbi, group),
                                ino, bitmap_bh->b_data)) {
            /* we lost this inode */
            if (++ino >= EXT2_INODES_PER_GROUP(sb)) {
                /* this group is exhausted, try next group */
                if (++group == sbi->s_groups_count)
                    group = 0;
                continue;
            }
            /* try to find free inode in the same group */
            goto repeat_in_this_group;
        }
        goto got;
    }
    ……
}
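The part we care about, regarding inode reuse, is:
        ino = 0;

repeat_in_this_group:
        ino = ext2_find_next_zero_bit((unsigned long *)bitmap_bh->b_data,
                                      EXT2_INODES_PER_GROUP(sb), ino);
Here the bitmap scan starts from 0 by default. This means that after frequent delete and create operations in the same directory, a newly allocated inode can easily coincide with one that was used before. Worse, the file system divides the disk into groups, and when a file is created its inode is allocated preferentially from the group of the parent directory, while each group usually holds only 8K inodes (see Inodes per group:         8192 in the dump above). Traditionally, a shared memory segment is created with a key obtained from ftok, which on Linux derives the key from the file's inode number. If inodes are recycled frequently, files that are deleted and recreated can easily end up producing the same key, causing conflicts in the use of shared memory.
Below is an example that verifies this; the newly created files reuse the inode numbers of the files that were just deleted: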
tsecer@harry: cat inoderecycle.sh 
echo create new files
 
for ((i = 0; i < 10; i++))
do 
touch old$i
ls -i old$i
done
 
echo remove old files
 
for ((i = 0; i < 10; i++))
do 
rm -f  old$i
done
 
echo create new files
 
for ((i = 0; i < 10; i++))
do 
touch new$i
ls -i new$i
done
 
tsecer@harry: sh inoderecycle.sh 
create new files
524291 old0
524292 old1
524293 old2
524294 old3
524295 old4
524296 old5
524297 old6
524298 old7
524299 old8
524300 old9
remove old files
create new files
524291 new0
524292 new1
524293 new2
524294 new3
524295 new4
524296 new5
524297 new6
524298 new7
524299 new8
524300 new9
tsecer@harry: 
 
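6. Automatic mounting of disk partitions
This is really a user-space mechanism supported by the operating system. Put simply, you list the file systems to be mounted at boot in /etc/fstab; there is not much to dig into. On Fedora Core systems, though, what is usually given there is not the device name but a UUID. This is not mysterious either: as the busybox code shows, it is a random value generated according to certain rules when the file system is created, and then written to a fixed position in the file system's super block: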
tsecer@harry: blkid
/dev/sr0: UUID="2013-12-12-14-06-50-00" LABEL="Fedora 20 x86_64" TYPE="iso9660" PTUUID="69e9fb83" PTTYPE="dos" 
/dev/sda2: UUID="42cd7a7f-caf9-49cf-8b39-03391e7fd5d6" TYPE="ext4" PARTUUID="0000b890-02" 
/dev/sda3: UUID="b820a18b-06a3-42d1-8db3-e05241417a6b" TYPE="swap" PARTUUID="0000b890-03" 
/dev/sda5: UUID="91ec1fb0-f629-4a7f-9b5b-bc311ab85bf2" TYPE="ext4" PARTUUID="0000b890-05" 
/dev/sda6: UUID="056210ed-db41-4e69-ad11-1f08802a4afa" TYPE="ext4" PARTUUID="0000b890-06" 
/dev/sda1: PARTUUID="0000b890-01" 
tsecer@harry: cat /etc/fstab 
 
#
# /etc/fstab
# Created by anaconda on Sun Oct 18 15:14:41 2015
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=91ec1fb0-f629-4a7f-9b5b-bc311ab85bf2 /                       ext4    defaults        1 1
UUID=42cd7a7f-caf9-49cf-8b39-03391e7fd5d6 /boot                   ext4    defaults        1 2
UUID=b820a18b-06a3-42d1-8db3-e05241417a6b swap                    swap    defaults        0 0
UUID=056210ed-db41-4e69-ad11-1f08802a4afa /home/tsecer/sda6                    ext4    defaults        0 0
tsecer@harry: 
How the busybox tool handles this:
int mkfs_ext2_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
generate_uuid(sb->s_uuid);
void FAST_FUNC generate_uuid(uint8_t *buf)
{
/* http://www.ietf.org/rfc/rfc4122.txt
 *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |                          time_low                             |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |       time_mid                |         time_hi_and_version   |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |clk_seq_and_variant            |         node (0-1)            |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |                         node (2-5)                            |
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * IOW, uuid has this layout:
 * uint32_t time_low (big endian)
 * uint16_t time_mid (big endian)
 * uint16_t time_hi_and_version (big endian)
 *  version is a 4-bit field:
 *   1 Time-based
 *   2 DCE Security, with embedded POSIX UIDs
 *   3 Name-based (MD5)
 *   4 Randomly generated
 *   5 Name-based (SHA-1)
 * uint16_t clk_seq_and_variant (big endian)
 *  variant is a 3-bit field:
 *   0xx Reserved, NCS backward compatibility
 *   10x The variant specified in rfc4122
 *   110 Reserved, Microsoft backward compatibility
 *   111 Reserved for future definition
 * uint8_t node[6]
 *
 * For version 4, these bits are set/cleared:
 * time_hi_and_version & 0x0fff | 0x4000
 * clk_seq_and_variant & 0x3fff | 0x8000
 */
    pid_t pid;
    int i;

    i = open("/dev/urandom", O_RDONLY);
    if (i >= 0) {
        read(i, buf, 16);
        close(i);
    }
    /* Paranoia. /dev/urandom may be missing.
     * rand() is guaranteed to generate at least [0, 2^15) range,
     * but lowest bits in some libc are not so "random".  */
    srand(monotonic_us()); /* pulls in printf */
    pid = getpid();
    while (1) {
        for (i = 0; i < 16; i++)
            buf[i] ^= rand() >> 5;
        if (pid == 0)
            break;
        srand(pid);
        pid = 0;
    }

    /* version = 4 */
    buf[4 + 2    ] = (buf[4 + 2    ] & 0x0f) | 0x40;
    /* variant = 10x */
    buf[4 + 2 + 2] = (buf[4 + 2 + 2] & 0x3f) | 0x80;
}

posted on 2019-03-07 09:54  tsecer