kfifo 源码剖析 [转wishfly]

从2.6.10开始,Linux内核提供了一个通用的环形缓存(我喜欢称为环形队列);它的头文件是<linux/kfifo.h>,kfifo.c是实现代码。
在设备驱动中环形缓存出现相当多. 网络适配器, 特别地, 常常使用环形缓存来与处理器交换数据(报文)[LDD3]。
见下面的图“LDD3中描述的队列”。

我们来看下kfifo的数据结构:

struct kfifo {
unsigned char *buffer; /* the buffer holding the data */
unsigned int size; /* the size of the allocated buffer */
unsigned int in; /* data is added at offset (in % size) */
unsigned int out; /* data is extracted from off. (out % size) */
    spinlock_t *lock; /* protects concurrent modifications */
};

如E文注释所解,buffer指向队列空间,对于队列的操作,一般会涉及到读写,如果是多线程的话,就有可能涉及到生产者/消费者,也称读者/写者问题;
size表示空间大小,必须为2^k,如果不是2^k大小,kfifo将会帮助用户扩展到一个合适的大小;
in 表示写者指向的索引,在kfifo体统的put函数中,被处理为一个虚拟索引(我暂这么称呼)。之所以称之为虚拟索引,是因为该索引并不一定真正指向有效的地址空间,而是要通过一定运算才能得到真实的索引。下面的put/get代码分析将会看到内核是kfifo是怎么运算的。
out表示读者指向的索引,out的更新与in一样,都是使用了虚拟索引的概念,out总是小于等于in。
我们直接来看代码(保留了原来的注释,//后面的内容为笔者的解释)

/**
* kfifo_init - allocates a new FIFO using a preallocated buffer
* @buffer: the preallocated buffer to be used.
* @size: the size of the internal buffer, this have to be a power of 2.
* @gfp_mask: get_free_pages mask, passed to kmalloc()
* @lock: the lock to be used to protect the fifo buffer
*
* Do NOT pass the kfifo to kfifo_free() after use! Simply free the
* &struct kfifo with kfree().
*/
struct kfifo *kfifo_init(unsigned char *buffer, unsigned int size,
             gfp_t gfp_mask, spinlock_t *lock)
{
struct kfifo *fifo;
/* size must be a power of 2 */
    BUG_ON(size & (size - 1)); //大小必须为2的k次方(k>0)的目的在于put/get中从虚拟索引计算真实索引,size & (size - 1)是常用判断技巧
    fifo = kmalloc(sizeof(struct kfifo), gfp_mask); //分配kfifo数据结构
if (!fifo)
return ERR_PTR(-ENOMEM);
    fifo->buffer = buffer;
    fifo->size = size;
    fifo->in = fifo->out = 0; //当fifo->in == fifo->out 时,表示空队列
    fifo->lock = lock;
return fifo;
}
/**
* kfifo_alloc - allocates a new FIFO and its internal buffer
* @size: the size of the internal buffer to be allocated.
* @gfp_mask: get_free_pages mask, passed to kmalloc()
* @lock: the lock to be used to protect the fifo buffer
*
* The size will be rounded-up to a power of 2.
*/
//通过调用kfifo_alloc分配队列空间,该函数会调用kfifo_init初始化kfifo结构体,并调整size的大小以适应运算
struct kfifo *kfifo_alloc(unsigned int size, gfp_t gfp_mask, spinlock_t *lock)
{
unsigned char *buffer;
struct kfifo *ret;
/*
     * round up to the next power of 2, since our 'let the indices
     * wrap' tachnique works only in this case.
     */
if (size & (size - 1)) { //如果size不是2的k次方,代码将size调整最近的2^k次方附近
        BUG_ON(size > 0x80000000);
        size = roundup_pow_of_two(size);
}
    buffer = kmalloc(size, gfp_mask);
if (!buffer)
return ERR_PTR(-ENOMEM);
    ret = kfifo_init(buffer, size, gfp_mask, lock);
if (IS_ERR(ret))
        kfree(buffer);
return ret;
}
/**
* __kfifo_put - puts some data into the FIFO, no locking version
* @fifo: the fifo to be used.
* @buffer: the data to be added.
* @len: the length of the data to be added.
*
* This function copies at most @len bytes from the @buffer into
* the FIFO depending on the free space, and returns the number of
* bytes copied.
*
* Note that with only one concurrent reader and one concurrent
* writer, you don't need extra locking to use these functions.
*/
unsigned int __kfifo_put(struct kfifo *fifo,
unsigned char *buffer, unsigned int len)
{
unsigned int l;
//fifo->size - fifo->in + fifo->out,这段代码计算空闲的空间
//in是写索引,out是读索引,而且put与get操作都是分别增加in与out的值来重新计算虚拟索引
//注意,out 始终不会大于 in,(in - out)是有效数据空间大小,size是总空间的大小
//那么空闲的空间大小就是 size - (int - out)
//如果请求的len大于空闲空间,就使len = size - (int - out)
    len = min(len, fifo->size - fifo->in + fifo->out);
/*
     * Ensure that we sample the fifo->out index -before- we
     * start putting bytes into the kfifo.
     */
    smp_mb();
/* first put the data starting from fifo->in to buffer end */
//(fifo->in & (fifo->size - 1))这段代码计算真实的写索引偏移,笔者假设为real_in
//这是因为in在每次调用put之后都会增加一个len的长度
//由于fifo->size必定是2的k次方,而(fifo->size - 1)就是类似0x00FFFFF的值
//(fifo->in & (fifo->size - 1))的操作从数学角度将就是对长度fifo->size的取模运算
//这里能用AND运算代替取模运算得益于前面申请的空间大小为2^k次方
//l = min(空闲空间大小,从real_in开始到缓冲区结尾的空间)
    l = min(len, fifo->size - (fifo->in & (fifo->size - 1)));
//先从buffer中拷贝l字节到缓冲区剩余空间,l<=len,也<=从real_in开始到缓冲区结尾的空间
//所以这个copy可能没拷贝完,但是不会造成缓冲区越界
memcpy(fifo->buffer + (fifo->in & (fifo->size - 1)), buffer, l);
/* then put the rest (if any) at the beginning of the buffer */
//当len > l时,拷贝buffer中剩余的内容,其实地址当然为buffer + l,而剩余的大小为len - l
//当len == l时,下面的memcpy啥都不干,绝对精妙的算法
memcpy(fifo->buffer, buffer + l, len - l);
/*
     * Ensure that we add the bytes to the kfifo -before-
     * we update the fifo->in index.
     */
    smp_wmb();
//更新in(写者)的逻辑索引
    fifo->in += len;
return len;
}
/**
* __kfifo_get - gets some data from the FIFO, no locking version
* @fifo: the fifo to be used.
* @buffer: where the data must be copied.
* @len: the size of the destination buffer.
*
* This function copies at most @len bytes from the FIFO into the
* @buffer and returns the number of copied bytes.
*
* Note that with only one concurrent reader and one concurrent
* writer, you don't need extra locking to use these functions.
*/
unsigned int __kfifo_get(struct kfifo *fifo,
unsigned char *buffer, unsigned int len)
{
unsigned int l;
//读取的大小不能超过有效空间长度
//经过min运算后len <= 请求的空间len, len <= size
    len = min(len, fifo->in - fifo->out);
/*
     * Ensure that we sample the fifo->in index -before- we
     * start removing bytes from the kfifo.
     */
    smp_rmb();
/* first get the data from fifo->out until the end of the buffer */
//同理,fifo->out & (fifo->size - 1)等于out(读者)的虚拟索引计算出来真实索引real_out
//fifo->size - real_out就等于该索引到缓冲区尾部的空间大小
//经过min运算后,l<=len,l<=real_out至缓冲区尾部的空间大小
    l = min(len, fifo->size - (fifo->out & (fifo->size - 1)));
//从real_out开始拷贝l字节内容到buffer中
memcpy(buffer, fifo->buffer + (fifo->out & (fifo->size - 1)), l);
/* then get the rest (if any) from the beginning of the buffer */
//如果l<len,那么从fifo->buffer的首部开始继续拷贝剩下的内容
//如果l == len,memcpy啥都不干
memcpy(buffer + l, fifo->buffer, len - l);
/*
     * Ensure that we remove the bytes from the kfifo -before-
     * we update the fifo->out index.
     */
    smp_mb();
//更新out(读者)的虚拟索引
    fifo->out += len;
return len;
}

      从上面几个重要的函数可以看出一些特性,就是put函数能放入的数据长度永远不会大于缓冲区的长度,而fifo->in - fifo->out永远小于等于size。
get函数得到的数据永远小于等于size(这个是必然的)。
在读者/写者问题中,发生互斥问题时,使用环形队列是很好的解决方案。
fifo 不太好的地方在于,想写完整的话,必须保证空间足够大,否则不能保证一次写完,特别是在中断调用过程中,如果阻塞了,然后让读者读取内容后腾出空间唤醒继续写,将会出现问题,唯一我能想到的方法就是申请大空间,以避免这种“没写完”的情况发生。不知道还有没有其它方法,写的不对的地方,大家讨论讨论,我是菜鸟,写得也仓促,刚写内核代码不久,请多指教。

posted @ 2010-01-07 14:37  napoleon_liu  阅读(1945)  评论(0编辑  收藏  举报