MIT6.S081-Lab9 FS [2021Fall]

开始日期:22.07.05

操作系统:Ubuntu20.0.4

Link:Lab file system

my github repository: duilec/MITS6.081-fall2021/tree/fs

Lab Lock

写在前面

此部分不涉及实现细节,可以放心阅读

bigfile主要采用了二级链表实现

symlink需要清楚iput()open()create()unlink()namei()等函数是如何交互的,它们又是怎么让inode在cache中释放存储的。

踩坑

  • symlinktest中使用存储路径名字的方式失败
    笔者一开始采用的是symlink存储路径名字(字符指针),但在测试执行if(write(fd1, buf, sizeof(buf)) != 4)之后,路径名字会直接丢失,无法引用,导致失败。具体原因没有分析出来。
  • iput
    iput的作用是将ip->ref减一,如果ip->ref == 1就直接将其从cache中释放掉
  • fs.c的注释
    做symlinktest前,请务必通读fs.c的大段注释(大约在105 line
  • writebig: panic freeing free block
    如果完成了symlinktest,在执行usertests之前,得先执行一次bigfile。
    否则会导致出错,因为wirtebif要写和MAXFILE一样大的文件,而只有执行bigfile之后我们才有修改后的MAXFILE
  • make clean
    如果发现测试怎么样都通过不了,觉得思路也没错,那可能是之前版本有问题遗留可以尝试make clean

参考材料

实验内容

警告:此部分涉及实现细节

bigfile

实验要求:实现磁盘的二级映射,将磁盘空间由268 blocks扩大为65803 blocks,约为原来的256倍

实现思路:将原来的12个直接映射+1个一级间接映射改为11个直接映射+1个一级间接映射+一个二级映射

具体而言就是链表多链接一层即可,再改一下参数,写之前最好完完全全看懂原版的bmap()函数。

注意在进入二级映射之前,记得减掉一个NINDIRECT,保证获取的中间地址号不会多1

access code

首先是修改一些定义

/* fs.h */
#define NDIRECT 11
#define NINDIRECT (BSIZE / sizeof(uint))
#define MAXFILE (NDIRECT + NINDIRECT + (NINDIRECT*NINDIRECT))

/* file.h */
// in-memory copy of an inode
struct inode {
	...
  uint addrs[NDIRECT + 2]; // 11 direct blocks, 1 singly indirect block and 1 double indirect block
};

然后是具体实现

static uint
bmap(struct inode *ip, uint bn)
{
  uint addr, *a;
  struct buf *bp;

  // 11 direct block[0 ~ 10] => [0 ~ 10] 
  if(bn < NDIRECT){
    // Load direct block, allocating if necessary.
    if((addr = ip->addrs[bn]) == 0)
      ip->addrs[bn] = addr = balloc(ip->dev);
    return addr;
  }
  bn -= NDIRECT;
  
  // NDIRECT == 11
  // 1 singly direct block[11] => [0 ~ 255]
  if(bn < NINDIRECT){
    // Load indirect block, allocating if necessary.
    if((addr = ip->addrs[NDIRECT]) == 0)
      ip->addrs[NDIRECT] = addr = balloc(ip->dev);
    
    bp = bread(ip->dev, addr);
    a = (uint*)bp->data;
    
    if((addr = a[bn]) == 0){
      a[bn] = addr = balloc(ip->dev);
      log_write(bp);
    }
    
    brelse(bp);
    return addr;
  }
  // go to double direct block[12]
  bn -= NINDIRECT;

  // 1 double direct block[12] => 256 * [0 ~ 255]
  if(bn < NINDIRECT * NINDIRECT){
    int mid_addr_num = bn / NINDIRECT; // get mid_addr_num and c-style: divide(/) to down 
    bn = bn % NINDIRECT; // get real bn

    // Load indirect block, allocating if necessary.
    if((addr = ip->addrs[NDIRECT + 1]) == 0)
      ip->addrs[NDIRECT + 1] = addr = balloc(ip->dev);
    
    // get the addr of mid_addr_num
    bp = bread(ip->dev, addr);
    a = (uint*)bp->data;
    if((addr = a[mid_addr_num]) == 0){
      a[mid_addr_num] = addr = balloc(ip->dev);
      log_write(bp);
    }
    brelse(bp);

    // get the addr of real bn
    bp = bread(ip->dev, addr);
    a = (uint*)bp->data;
    if((addr = a[bn]) == 0){
      a[bn] = addr = balloc(ip->dev);
      log_write(bp);
    }
    brelse(bp);

    return addr;
  }
  panic("bmap: out of range");
}

// Truncate inode (discard contents).
// Caller must hold ip->lock.
void
itrunc(struct inode *ip)
{
  int i, j, k;
  struct buf *bp;
  uint *a, *botton_a;

  // 11 direct block[0 ~ 10] => [0 ~ 10] 
  for(i = 0; i < NDIRECT; i++){
    if(ip->addrs[i]){
      bfree(ip->dev, ip->addrs[i]);
      ip->addrs[i] = 0;
    }
  }

  // NDIRECT == 11
  // 1 singly direct block[11] => [0 ~ 255]
  if(ip->addrs[NDIRECT]){
    bp = bread(ip->dev, ip->addrs[NDIRECT]);
    a = (uint*)bp->data;
    for(j = 0; j < NINDIRECT; j++){
      if(a[j])
        bfree(ip->dev, a[j]);
    }
    brelse(bp);
    bfree(ip->dev, ip->addrs[NDIRECT]);
    ip->addrs[NDIRECT] = 0;
  }

  // 1 double direct block[12] => 256 * [0 ~ 255]
  if(ip->addrs[NDIRECT + 1]){
    // get a of mid_addr
    bp = bread(ip->dev, ip->addrs[NDIRECT + 1]);
    a = (uint*)bp->data;
    brelse(bp);
    
    for(j = 0; j < NINDIRECT; j++){
      // get botton_a of botton_addr 
      // exist block?
      if(a[j]){
        bp = bread(ip->dev, a[j]);
        botton_a = (uint*)bp->data;
        brelse(bp);
        // free content of botton_a
        for(k = 0; k < NINDIRECT; k++)
          if(botton_a[k])
            bfree(ip->dev, botton_a[k]);
      }
    }
    bfree(ip->dev, ip->addrs[NDIRECT + 1]);
    ip->addrs[NDIRECT + 1] = 0;
  }

  ip->size = 0;
  iupdate(ip);
}

实验要求:实现软连接(symboil/soft link)
软连接不同于硬连接(hard link),可以在不同的磁盘块下连接起来。比如,我想D盘执行一个程序,但程序文件的位于C盘。那么通过软连接,即可在D盘执行C盘的程序,而硬连接则不行。

实现思路:实现syccall: sys_symlink将目标文件inode指针存储到路径文件的inode结构中,之后即可通过路径文件来访问目标文件

具体实现过程中,需要注意路径文件本身可能存在也可能不存在,所以需要检查,根据情况进行创建。这需要处理两个情况,即如果路径文件本身不存在,创建出来后是带锁的,那么自然不需要上锁;但如果路径文件本身就存在,那么它是没有锁的,那么我们就需要使用if(!holdingsleep(&ip->lock))进行检查。

同时,软连接到一个不存在的目标文件是被允许的,也就是说目标文件和路径文件一样,可能存在也可能不存在,所以需要检查,根据情况进行创建。这就会导致一个情况,就是如果带着O_CREATE参数去open()目标文件时,目标文件可能已经在symlink()被创建,如果再创建就是重复创建了,解决方法是进行检查,如果是T_SYMLINK,则不用新创建,改下参数即可;如果不是,则需要将其释放掉,然后再去创建。

open()函数中循环查找时记得解锁,再上锁,保证原子性,查找结束后,需要检查一下目标文件是否真的存在,存在的情况是nlink大于0

无论是目标文件还是路径文件,只要是T_SYMLINK类型,当被unlink()时,就需要执行iput()将其引用减一,让其有机会被释放掉,否则T_SYMLINK类型的inode就会遗留在cache,无法释放,数量一多,就会导致cache中没有空闲的inode可以存储了。(这点是依据报错和测试程序推测出来的)

注意,我们不需要处理链接对象是direcitory的情况

access code

首先是syccall: sys_symlink的注册流程,我这里就不详细写了,按照提示走即可。

然后是实现sys_symlink

uint64
sys_symlink(void)
{ 
  char target[MAXPATH], path[MAXPATH];
  struct inode *ip, *ip_target;

  // get the parameter of target and path
  if(argstr(0, target, MAXPATH) < 0 || argstr(1, path, MAXPATH) < 0)
    return -1;
  
  begin_op();

  // Symlinking to nonexistent file should succeed
  if((ip_target = namei(target)) == 0){
    if((ip_target = create(target, T_SYMLINK, 0, 0)) == 0){ // diff char* path
      end_op();
      return -1;
    }
  }

  // create ip if not exist
  if((ip = namei(path)) == 0){ // diff char* path
    if((ip = create(path, T_SYMLINK, 0, 0)) == 0){
      end_op();
      return -1;
    }   
  }

  // not have to handle symbolic links to directories for this lab.
  if(ip_target){
    if(ip_target->type == T_DIR){
      end_op();
      return -1;
    }
  }

  // inode(ip) with lock return from create(path) 
  // inode(ip) without lock return from namei(path)
  // so we need to check and maybe ilock it
  if(!holdingsleep(&ip->lock)){
    ilock(ip);
  }
  ip->sym_link = ip_target;
  iupdate(ip);
  iunlock(ip);
  end_op();
  return 0;
}

修改open()

// Modify the open system call to handle the case where the path refers to a symbolic link. 
// If the file does not exist, open must fail. 
// When a process specifies O_NOFOLLOW in the flags to open, 
// open should open the symlink (and not follow the symbolic link).
uint64
sys_open(void)
{
  char path[MAXPATH];
  int fd, omode;
  struct file *f;
  struct inode *ip;
  int n;

  if((n = argstr(0, path, MAXPATH)) < 0 || argint(1, &omode) < 0)
    return -1;

  begin_op();

  if(omode & O_CREATE){
    ip = namei(path);
    if(ip){
      // if it is T_SYMLINK, we just change it type
      if(ip->type == T_SYMLINK){
        ip->type = T_FILE;
        ip->major = ip->minor = 0;
      }
      else{
        // if not T_SYMLINK we need free it, then create a inode
        // becasue it is a new path so we can sure 'ip->ref == 1'
        // 'ip->ref == 1'(namei()=>namex()=>idup()=>ip->ref++=>1)
        iput(ip);
        ip = create(path, T_FILE, 0, 0);
        if(ip == 0){
          end_op();
          return -1;
        }        
      }
    }
    else{
      ip = create(path, T_FILE, 0, 0);
      if(ip == 0){
        end_op();
        return -1;
      }
    }
  } 
  else{
    if((ip = namei(path)) == 0){
      end_op();
      return -1;
    }
    ilock(ip);
    if(ip->type == T_DIR && omode != O_RDONLY){
      iunlockput(ip);
      end_op();
      return -1;
    }
  }

  if(ip->type == T_SYMLINK){
    // if((omode & O_NOFOLLOW)){
    //     // do nothing
    // }

    // FOLLOW => follow symlink
    if(!(omode & O_NOFOLLOW)){
      // we set threshold = 10, so we only check 10 times
      int threshold = 10;
      for(int i = 0; i < threshold; i++){
        iunlockput(ip);
        ip = ip->sym_link;
        ilock(ip);
        if(ip->type != T_SYMLINK){
          break;
        }
      }

      // check: file really exists?
      if(ip->nlink == 0){
        iunlockput(ip);
        end_op();
        return -1;        
      }
    }
  }

  if(ip->type == T_DEVICE && (ip->major < 0 || ip->major >= NDEV)){
    iunlockput(ip);
    end_op();
    return -1;
  }

  if((f = filealloc()) == 0 || (fd = fdalloc(f)) < 0){
    if(f)
      fileclose(f);
    iunlockput(ip);
    end_op();
    return -1;
  }

  if(ip->type == T_DEVICE){
    f->type = FD_DEVICE;
    f->major = ip->major;
  } else {
    f->type = FD_INODE;
    f->off = 0;
  }
  f->ip = ip;
  f->readable = !(omode & O_WRONLY);
  f->writable = (omode & O_WRONLY) || (omode & O_RDWR);
  
  if((omode & O_TRUNC) && ip->type == T_FILE){
    itrunc(ip);
  }

  iunlock(ip);
  end_op();

  return fd;
}

最后是修改unlink()

uint64
sys_unlink(void)
{
  struct inode *ip, *dp;
  struct dirent de;
  char name[DIRSIZ], path[MAXPATH];
  uint off;

  if(argstr(0, path, MAXPATH) < 0)
    return -1;

  begin_op();
  if((dp = nameiparent(path, name)) == 0){
    end_op();
    return -1;
  }

  ilock(dp);

  // Cannot unlink "." or "..".
  if(namecmp(name, ".") == 0 || namecmp(name, "..") == 0)
    goto bad;

  if((ip = dirlookup(dp, name, &off)) == 0)
    goto bad;
  ilock(ip);

  if(ip->nlink < 1)
    panic("unlink: nlink < 1");
  if(ip->type == T_DIR && !isdirempty(ip)){
    iunlockput(ip);
    goto bad;
  }

  memset(&de, 0, sizeof(de));
  if(writei(dp, 0, (uint64)&de, off, sizeof(de)) != sizeof(de))
    panic("unlink: writei");
  if(ip->type == T_DIR){
    dp->nlink--;
    iupdate(dp);
  }
  iunlockput(dp);

  ip->nlink--;
  iupdate(ip);
  iunlockput(ip);

  // we need to ip->ref-- or free symlink when we call unlink 
  if(ip->type == T_SYMLINK){
    iput(ip);
  }

  end_op();

  return 0;

bad:
  iunlockput(dp);
  end_op();
  return -1;
}

result

make[1]: Leaving directory '/home/duile/xv6-labs-2021'
== Test running bigfile ==
$ make qemu-gdb
running bigfile: OK (170.2s)
    (Old xv6.out.bigfile failure log removed)
== Test running symlinktest ==
$ make qemu-gdb
(0.9s)
== Test   symlinktest: symlinks ==
  symlinktest: symlinks: OK
== Test   symlinktest: concurrent symlinks ==
  symlinktest: concurrent symlinks: OK
== Test usertests ==
$ make qemu-gdb
usertests: OK (294.8s)
== Test time ==
time: OK
Score: 100/100

总结

  • 完成日期:22.07.07
  • 耗时15h,也是完全没看任何参考答案完成,bigfile用了3小时,2小时在看材料,看代码,1小时在写代码,实际上可能都没一个小时;symlink用了12小时,3小时看材料,查资料,看代码,9小时在调代码
  • 一开始写symlink的时候和bigfile区别和大,bigfile是我完全知道自己要干什么,非常清晰,而symlink是毫无头绪,查了不少资料也没有完全懂,后面是模仿着link()才写出个大概,然后一直调一直调一直调。不断猜测题目想要我写的是什么。
    从这我得出经验,真正明白编程所运用的知识才能写得流畅
  • 写symlink到四分之一的时候想出两个方案一个是存储路径名字指针,一个是存储路径文件inode指针,一开始用前者,失败了,但没找出原因,然后用后者,但遇到测试open("/testsymlink/4", O_CREATE | O_RDWR)时发现重复创建了文件,出现矛盾,想不出好办法解决。于是,又折回去实现名字指针,这回发现原因了,名字指针丢失,但找不出原理,对于我是个黑箱问题,无法解决。最后,折回去实现inode指针,想办法修改明确有问题的open(),解决了重复创建的问题。
    这个过程告诉我,当编程遇到明显问题的时候,有些看似无法解决,应该先休息,而不是立马放弃,调整好状态再思考。同时,面对黑箱问题和白箱问题,即使前者实现起来更漂亮简洁,还不如后者更实在靠谱
  • open()解决重复创建的那段我觉得我写的不够简洁,或许还能优化下,但加个函数就太做作了。
  • printf真好用,但用多了,就想着或许打断点更方便,只是我不熟悉打断点
  • 最近在听 I Don't Want To Say Goodbye Teddy Thompson
posted @ 2022-07-07 20:33  duile  阅读(451)  评论(0编辑  收藏  举报