Kernel source analysis: softirqs (based on 3.16-rc4)
1. Data structures related to softirqs:
The softirq_vec array (kernel/softirq.c)
static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
NR_SOFTIRQS is 10, so the kernel supports 10 softirqs, each with one handler function.
The softirq_action structure (include/linux/interrupt.h)
struct softirq_action
{
    void (*action)(struct softirq_action *);
};
action is a function pointer that points to the handler function of one softirq.
The irq_cpustat_t structure (arch/x86/include/asm/hardirq.h)
 1 typedef struct {
 2     unsigned int __softirq_pending;
 3     unsigned int __nmi_count; /* arch dependent */
 4 #ifdef CONFIG_X86_LOCAL_APIC
 5     unsigned int apic_timer_irqs; /* arch dependent */
 6     unsigned int irq_spurious_count;
 7     unsigned int icr_read_retry_count;
 8 #endif
 9 #ifdef CONFIG_HAVE_KVM
10     unsigned int kvm_posted_intr_ipis;
11 #endif
12     unsigned int x86_platform_ipis; /* arch dependent */
13     unsigned int apic_perf_irqs;
14     unsigned int apic_irq_work_irqs;
15 #ifdef CONFIG_SMP
16     unsigned int irq_resched_count;
17     unsigned int irq_call_count;
18     /*
19      * irq_tlb_count is double-counted in irq_call_count, so it must be
20      * subtracted from irq_call_count when displaying irq_call_count
21      */
22     unsigned int irq_tlb_count;
23 #endif
24 #ifdef CONFIG_X86_THERMAL_VECTOR
25     unsigned int irq_thermal_count;
26 #endif
27 #ifdef CONFIG_X86_MCE_THRESHOLD
28     unsigned int irq_threshold_count;
29 #endif
30 #if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
31     unsigned int irq_hv_callback_count;
32 #endif
33 } ____cacheline_aligned irq_cpustat_t;
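Every CPU has one variable of this type. For softirqs, the member that matters is __softirq_pending on line 2, a 32-bit bitmask of pending softirqs. When a softirq is raised (marked as waiting to run), the corresponding bit in this bitmask is set.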
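To make the bitmask idea concrete, here is a minimal user-space sketch, not kernel code. It assumes a single plain 32-bit word in place of the per-CPU __softirq_pending field; the names pending_word, mark_pending and is_pending are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU's __softirq_pending field. */
static uint32_t pending_word;

/* Set bit nr: softirq number nr is now pending (waiting to run). */
static void mark_pending(unsigned int nr)
{
    assert(nr < 32);                 /* the word only has 32 bits */
    pending_word |= (uint32_t)1 << nr;
}

/* Return 1 if softirq number nr is pending, 0 otherwise. */
static int is_pending(unsigned int nr)
{
    assert(nr < 32);
    return (int)((pending_word >> nr) & 1u);
}
```

Calling mark_pending(3) on an empty word leaves only bit 3 set, so is_pending(3) is true while is_pending(2) is still false.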
2. How a softirq gets executed
First, a softirq handler is registered with open_softirq(). The code is as follows (kernel/softirq.c):
void open_softirq(int nr, void (*action)(struct softirq_action *))
{
    softirq_vec[nr].action = action;
}
This stores the handler pointer action in the softirq_vec element at index nr.
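The registration step can be tried outside the kernel with a small model. This is only a sketch under assumptions: slot_vec, action_slot, register_action and demo_handler are hypothetical names that mirror the shape of softirq_vec, softirq_action and open_softirq(), and the bounds check is an addition that the kernel function does not have.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 10   /* mirrors NR_SOFTIRQS == 10 from the text */

/* Same shape as struct softirq_action: one handler pointer per slot. */
struct action_slot {
    void (*action)(struct action_slot *);
};

static struct action_slot slot_vec[NR_SLOTS];

/* Same idea as open_softirq(): store the handler in slot nr. */
static void register_action(int nr, void (*action)(struct action_slot *))
{
    assert(nr >= 0 && nr < NR_SLOTS);   /* added check, not in the kernel */
    slot_vec[nr].action = action;
}

/* Example handler (hypothetical): counts how many times it has run. */
static int demo_runs;

static void demo_handler(struct action_slot *self)
{
    (void)self;
    demo_runs++;
}
```

After register_action(6, demo_handler), the handler is reached through the table as slot_vec[6].action(&slot_vec[6]), which is the same calling convention __do_softirq() uses with h->action(h).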
Next, raise_softirq() activates the softirq. The code is as follows (kernel/softirq.c):
1 void raise_softirq(unsigned int nr)
2 {
3     unsigned long flags;
4
5     local_irq_save(flags);
6     raise_softirq_irqoff(nr);
7     local_irq_restore(flags);
8 }
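Line 5 (local_irq_save) disables local interrupts and saves the previous interrupt state, and line 7 (local_irq_restore) restores that state. Line 6 raises the softirq numbered nr. Next, look at raise_softirq_irqoff() (kernel/softirq.c):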
 1 inline void raise_softirq_irqoff(unsigned int nr)
 2 {
 3     __raise_softirq_irqoff(nr);
 4
 5     /*
 6      * If we're in an interrupt or softirq, we're done
 7      * (this also catches softirq-disabled code). We will
 8      * actually run the softirq once we return from
 9      * the irq or softirq.
10      *
11      * Otherwise we wake up ksoftirqd to make sure we
12      * schedule the softirq soon.
13      */
14     if (!in_interrupt())
15         wakeup_softirqd();
16 }
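On line 3, __raise_softirq_irqoff() sets the corresponding bit in the pending bitmask; its code follows (kernel/softirq.c). Line 14 then checks whether the caller is already in interrupt context, meaning that a hardirq or softirq is being handled or that softirqs are disabled. If so, nothing more is needed, because the pending softirq will run once that context is left. If not, line 15 wakes the kernel thread ksoftirqd so that the softirq runs soon.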
1 void __raise_softirq_irqoff(unsigned int nr)
2 {
3     trace_softirq_raise(nr);
4     or_softirq_pending(1UL << nr);
5 }
Specifically, or_softirq_pending() on line 4 is what sets the bit.
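The two steps of raising a softirq (set the bit, then wake the daemon only when not in interrupt context) can be modelled in user space. This is a simplified sketch, not the kernel implementation: pending_bits, in_interrupt_ctx, daemon_wakeups and model_raise are hypothetical names, and a plain flag stands in for the real in_interrupt() test.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t pending_bits;   /* models __softirq_pending */
static int in_interrupt_ctx;    /* models in_interrupt() being nonzero */
static int daemon_wakeups;      /* counts modelled wakeup_softirqd() calls */

/* Models raise_softirq_irqoff(): mark pending, then maybe wake the daemon. */
static void model_raise(unsigned int nr)
{
    assert(nr < 32);
    pending_bits |= (uint32_t)1 << nr;   /* the or_softirq_pending() step */

    /*
     * In interrupt context the softirq runs when that context is left,
     * so a wakeup is only needed outside of it.
     */
    if (!in_interrupt_ctx)
        daemon_wakeups++;
}
```

Raising softirq 1 from ordinary context sets bit 1 and records one wakeup; raising softirq 3 with in_interrupt_ctx set adds bit 3 but leaves the wakeup count unchanged.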
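3. Next, let's look at the places from which softirq processing can be entered.
The first place is, of course, the one mentioned above: when the kernel thread ksoftirqd is woken up. Let's look at the ksoftirqd thread (kernel/softirq.c).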
DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
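This defines, for every CPU, a pointer to struct task_struct. Clearly, the variable holds the process descriptor of that CPU's ksoftirqd thread. (This also shows that Linux describes threads and processes with the same structure.)
Next, look at the function that the ksoftirqd thread executes (kernel/softirq.c).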
 1 static void run_ksoftirqd(unsigned int cpu)
 2 {
 3     local_irq_disable();
 4     if (local_softirq_pending()) {
 5         /*
 6          * We can safely run softirq on inline stack, as we are not deep
 7          * in the task stack here.
 8          */
 9         __do_softirq();
10         rcu_note_context_switch(cpu);
11         local_irq_enable();
12         cond_resched();
13         return;
14     }
15     local_irq_enable();
16 }
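This function is the body of the ksoftirqd thread. On line 9, __do_softirq() calls the handlers of all pending softirqs.
Now go back to wakeup_softirqd() to see how the ksoftirqd thread is woken up (kernel/softirq.c).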
1 static void wakeup_softirqd(void)
2 {
3     /* Interrupts are disabled: no need to stop preemption */
4     struct task_struct *tsk = __this_cpu_read(ksoftirqd);
5
6     if (tsk && tsk->state != TASK_RUNNING)
7         wake_up_process(tsk);
8 }
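Line 4 reads the descriptor of the local CPU's ksoftirqd thread into tsk. Line 6 checks that the thread exists and is not already in the TASK_RUNNING state; if so, line 7 wakes it.
The second place is when an interrupt handler such as do_IRQ finishes its work and calls irq_exit(). Here is the irq_exit code (kernel/softirq.c).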
 1 void irq_exit(void)
 2 {
 3 #ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
 4     local_irq_disable();
 5 #else
 6     WARN_ON_ONCE(!irqs_disabled());
 7 #endif
 8
 9     account_irq_exit_time(current);
10     preempt_count_sub(HARDIRQ_OFFSET);
11     if (!in_interrupt() && local_softirq_pending())
12         invoke_softirq();
13
14     tick_irq_exit();
15     rcu_irq_exit();
16     trace_hardirq_exit(); /* must be last! */
17 }
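No need for me to point it out: line 12 should jump out at you right away. It is reached only when line 11 finds that we are no longer in interrupt context and that a softirq is pending. Here is invoke_softirq() (kernel/softirq.c).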
 1 static inline void invoke_softirq(void)
 2 {
 3     if (!force_irqthreads) {
 4 #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
 5         /*
 6          * We can safely execute softirq on the current stack if
 7          * it is the irq stack, because it should be near empty
 8          * at this stage.
 9          */
10         __do_softirq();
11 #else
12         /*
13          * Otherwise, irq_exit() is called on the task stack that can
14          * be potentially deep already. So call softirq in its own stack
15          * to prevent from any overrun.
16          */
17         do_softirq_own_stack();
18 #endif
19     } else {
20         wakeup_softirqd();
21     }
22 }
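On line 3, force_irqthreads is 0 by default (it becomes nonzero only when threaded interrupt handling is forced with the threadirqs boot parameter), so this function also ends up running __do_softirq(): either directly on line 10, or through do_softirq_own_stack() on line 17, which runs it on a separate stack. If force_irqthreads is nonzero, line 20 wakes ksoftirqd instead.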
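The decisions made by irq_exit() and invoke_softirq() together can be summarized as one small function. This is a user-space sketch under stated assumptions, not kernel code: the enum and model_irq_exit are hypothetical, a simple nesting counter stands in for the preempt_count test behind in_interrupt() (which in the kernel also covers softirq context), and the compile-time option CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK is modelled as the run-time flag on_irq_stack.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes of the modelled irq_exit() + invoke_softirq() pair. */
enum softirq_path {
    NO_SOFTIRQ_NOW,        /* still nested, or nothing pending */
    RUN_ON_CURRENT_STACK,  /* kernel: __do_softirq() */
    RUN_ON_OWN_STACK,      /* kernel: do_softirq_own_stack() */
    DEFER_TO_KSOFTIRQD     /* kernel: wakeup_softirqd() */
};

static enum softirq_path model_irq_exit(unsigned int nesting, bool pending,
                                        bool force_threads, bool on_irq_stack)
{
    assert(nesting > 0);   /* an exit must pair with an earlier entry */
    nesting--;             /* models preempt_count_sub(HARDIRQ_OFFSET) */

    /* Models: if (!in_interrupt() && local_softirq_pending()) */
    if (nesting != 0 || !pending)
        return NO_SOFTIRQ_NOW;

    /* Models the body of invoke_softirq(). */
    if (force_threads)
        return DEFER_TO_KSOFTIRQD;
    return on_irq_stack ? RUN_ON_CURRENT_STACK : RUN_ON_OWN_STACK;
}
```

For example, model_irq_exit(1, true, false, true) yields RUN_ON_CURRENT_STACK, while a nested exit such as model_irq_exit(2, true, false, true) yields NO_SOFTIRQ_NOW because the outer interrupt is still being handled.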
A few other entry points are not covered for now; I will add them later when I have time.
4. Now let's look at the __do_softirq() function (kernel/softirq.c).
 1 asmlinkage __visible void __do_softirq(void)
 2 {
 3     unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
 4     unsigned long old_flags = current->flags;
 5     int max_restart = MAX_SOFTIRQ_RESTART;
 6     struct softirq_action *h;
 7     bool in_hardirq;
 8     __u32 pending;
 9     int softirq_bit;
10
11     /*
12      * Mask out PF_MEMALLOC s current task context is borrowed for the
13      * softirq. A softirq handled such as network RX might set PF_MEMALLOC
14      * again if the socket is related to swap
15      */
16     current->flags &= ~PF_MEMALLOC;
17
18     pending = local_softirq_pending();
19     account_irq_enter_time(current);
20
21     __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
22     in_hardirq = lockdep_softirq_start();
23
24 restart:
25     /* Reset the pending bitmask before enabling irqs */
26     set_softirq_pending(0);
27
28     local_irq_enable();
29
30     h = softirq_vec;
31
32     while ((softirq_bit = ffs(pending))) {
33         unsigned int vec_nr;
34         int prev_count;
35
36         h += softirq_bit - 1;
37
38         vec_nr = h - softirq_vec;
39         prev_count = preempt_count();
40
41         kstat_incr_softirqs_this_cpu(vec_nr);
42
43         trace_softirq_entry(vec_nr);
44         h->action(h);
45         trace_softirq_exit(vec_nr);
46         if (unlikely(prev_count != preempt_count())) {
47             pr_err("huh, entered softirq %u %s %p with preempt_count %08x, exited with %08x?\n",
48                    vec_nr, softirq_to_name[vec_nr], h->action,
49                    prev_count, preempt_count());
50             preempt_count_set(prev_count);
51         }
52         h++;
53         pending >>= softirq_bit;
54     }
55
56     rcu_bh_qs(smp_processor_id());
57     local_irq_disable();
58
59     pending = local_softirq_pending();
60     if (pending) {
61         if (time_before(jiffies, end) && !need_resched() &&
62             --max_restart)
63             goto restart;
64
65         wakeup_softirqd();
66     }
67
68     lockdep_softirq_end(in_hardirq);
69     account_irq_exit_time(current);
70     __local_bh_enable(SOFTIRQ_OFFSET);
71     WARN_ON_ONCE(in_interrupt());
72     tsk_restore_flags(current, old_flags, PF_MEMALLOC);
73 }
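This function loops over and calls every softirq handler that has been raised. On line 5, MAX_SOFTIRQ_RESTART is 10, meaning that the processing runs at most 10 passes (other processes must not be kept waiting too long). Line 32 finds the lowest set bit in pending, and line 44 runs the handler of that softirq. Line 53 shifts pending right past that bit, and the loop continues until every softirq that was pending in this pass has been handled. Line 59 reads the local CPU's pending bitmask again. If new softirqs are pending, line 61 checks three conditions: the time limit end has not passed, no reschedule has been requested (need_resched() is false), and the budget of 10 passes is not used up. If all three hold, execution jumps back to restart and the while loop runs again. Otherwise, line 65 wakes the ksoftirqd kernel thread to handle the rest, and the function returns.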
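The bit walk and the restart budget are easier to follow with a small user-space model. This is a sketch under assumptions, not the kernel code: first_set_bit is a hypothetical helper written to behave like ffs() (1-based position of the lowest set bit, 0 if none), walk_pending records vector numbers instead of calling handlers, and count_passes ignores the time limit and need_resched() checks so that only the --max_restart logic is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECS     10   /* mirrors NR_SOFTIRQS */
#define MAX_RESTART 10   /* mirrors MAX_SOFTIRQ_RESTART */

static int order[NR_VECS];   /* vector numbers in the order they are handled */
static int order_len;

/* Like ffs(): 1-based position of the lowest set bit, or 0 if word is 0. */
static int first_set_bit(uint32_t word)
{
    int pos = 1;

    if (word == 0)
        return 0;
    while ((word & 1u) == 0) {
        word >>= 1;
        pos++;
    }
    return pos;
}

/* Walk the set bits of pending from lowest to highest, like the while loop. */
static void walk_pending(uint32_t pending)
{
    int idx = 0;   /* plays the role of h - softirq_vec */
    int bit;

    assert((pending >> NR_VECS) == 0);   /* only NR_VECS vectors exist */
    order_len = 0;
    while ((bit = first_set_bit(pending)) != 0) {
        idx += bit - 1;              /* h += softirq_bit - 1 */
        order[order_len++] = idx;    /* the kernel calls h->action(h) here */
        idx++;                       /* h++ */
        pending >>= bit;             /* pending >>= softirq_bit */
    }
}

/* Count passes when handlers keep (or do not keep) re-raising softirqs. */
static int count_passes(int always_reraise, int *deferred)
{
    int max_restart = MAX_RESTART;
    int passes = 0;
    int pending;

    do {
        passes++;                    /* one trip through the handler loop */
        pending = always_reraise;    /* was a softirq raised again? */
    } while (pending && --max_restart);

    *deferred = pending;   /* leftover work would be handed to ksoftirqd */
    return passes;
}
```

With bits 1, 3 and 9 set (pending == 522), walk_pending visits vectors 1, 3 and 9 in that order. If handlers keep re-raising softirqs, count_passes reports 10 passes and leaves work deferred; if nothing is re-raised, it reports a single pass with nothing deferred.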
That completes the analysis of softirq processing.