Analysis of the Linux mips64r2 PCI Interrupt Routing Mechanism
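This article analyzes how PCI device interrupts are routed on mips64r2 and how irq numbers are allocated, and tries to answer the following questions:
- Where does the irq# that a PCI device driver passes to request_irq come from? Is it determined by hardware or by software?
- When an interrupt is raised, how does the CPU obtain this irq#?
The focus is on how the PIC (programmable interrupt controller) works. The PIC is usually integrated into the CPU, and its implementation differs across architectures and CPU vendors. The analysis here is based on kernel 3.10 and a mips64r2 XXX CPU.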
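mips64r2 PCI device interrupt routing
As shown in the figure above, at the hardware level PCI interrupt routing involves three devices: the PCI device, the PIC, and the CPU.
The PIC is the central component. Its main features are:
- 160 32-bit IRT entries, one for each of the 160 hardware interrupt lines;
- 64 interrupt vectors;
- 8 128-bit ITE (Interrupt Thread Enable) registers, each holding the enable bits for the 128 hardware threads of 4 nodes;
- local and global Round-Robin policies for distributing interrupts to the target hardware threads;
- 8 system timers and 2 watchdog timers (configurable as NMI watchdog timers, each with its own IRT entry);
- IPI support.
The three main interrupt signals:
- interrupt pin: the interrupt output signal of the PCI device. A PCI device provides four interrupt outputs (INTA#, INTB#, INTC#, INTD#); the one in use is given by the Interrupt Pin register in the device's PCI configuration space;
- interrupt line: the input signal of the PIC, wired to the PCI device's interrupt output. It is given by the Interrupt Line register in the device's PCI configuration space;
- interrupt vector: the output signal of the PIC, wired to the CPU's interrupt input. This interrupt vector is the irq# that the PCI device driver registers, and it can be read from the CPU's EIRR (extended interrupt request register). Because of a limitation of the mips64r2 architecture, the EIRR is 64 bits wide and each bit represents one vector, so there are at most 64 vectors.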
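The interrupt pin and interrupt line values live at fixed offsets in the standard PCI configuration header (0x3D and 0x3C). The following is a minimal user-space sketch, not kernel code, that decodes both from a raw 64-byte header; the helper names `pci_pin_letter` and `pci_interrupt_line` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard offsets in the PCI configuration header. */
#define PCI_CFG_INTERRUPT_LINE 0x3C
#define PCI_CFG_INTERRUPT_PIN  0x3D
#define PCI_CFG_HEADER_SIZE    64

/* Hypothetical helper: returns 'A'..'D' for INTA#..INTD#, or '\0' when the
 * Interrupt Pin register is 0 (no legacy interrupt) or out of range. */
static char pci_pin_letter(const uint8_t *cfg)
{
    assert(cfg != NULL);
    uint8_t pin = cfg[PCI_CFG_INTERRUPT_PIN];
    if (pin < 1 || pin > 4)
        return '\0';
    return (char)('A' + (pin - 1));
}

/* Hypothetical helper: returns the raw Interrupt Line register value. */
static uint8_t pci_interrupt_line(const uint8_t *cfg)
{
    assert(cfg != NULL);
    return cfg[PCI_CFG_INTERRUPT_LINE];
}
```

A header whose byte 0x3D holds 2 decodes to 'B', i.e. the device signals on INTB#.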
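In software, two table objects are abstracted to manage and process interrupt routing:
- IRT: interrupt redirection table. A hardware table indexed by interrupt line#, with 160 entries. It maintains the mapping between interrupt lines and interrupt vectors, controls such as mask/unmask and enable/disable of each interrupt line, and the CPU affinity of each interrupt;
- IDT: interrupt description table. A software table indexed by interrupt vector#, with 64 entries, used for interrupt handling.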
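The two-level lookup (line to vector through the IRT, vector to handler through the IDT) can be modelled in a few lines. This is only a user-space sketch: the arrays stand in for the hardware IRT and the kernel's descriptor table, and every name in it (`route_line`, `line_to_vector`, `VEC_NONE`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IRT_ENTRIES 160   /* one per hardware interrupt line */
#define IDT_ENTRIES 64    /* one per interrupt vector (EIRR bit) */
#define VEC_NONE    0xFF  /* hypothetical marker: line not routed */

typedef void (*handler_fn)(unsigned int irq);

static uint8_t    irt_vector[IRT_ENTRIES];   /* models the vector stored per line */
static handler_fn idt_handler[IDT_ENTRIES];  /* models the per-vector descriptor */

/* Reset both tables: no line is routed, no vector has a handler. */
static void route_init(void)
{
    for (size_t i = 0; i < IRT_ENTRIES; i++)
        irt_vector[i] = VEC_NONE;
    for (size_t i = 0; i < IDT_ENTRIES; i++)
        idt_handler[i] = NULL;
}

/* Record that interrupt line `line` is delivered as vector `vector`. */
static void route_line(unsigned int line, unsigned int vector)
{
    assert(line < IRT_ENTRIES && vector < IDT_ENTRIES);
    irt_vector[line] = (uint8_t)vector;
}

/* Returns the vector for a line, or -1 if the line is invalid or unrouted. */
static int line_to_vector(unsigned int line)
{
    if (line >= IRT_ENTRIES || irt_vector[line] == VEC_NONE)
        return -1;
    return irt_vector[line];
}
```

Several lines may map to the same vector in this model, which is what makes 160 lines fit into 64 vectors.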
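Interrupt handling flow
As shown in the figure above, the whole process from the generation of an interrupt to its completion is: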
handle_int
-> plat_irq_dispatch
-> do_nlm_common_IRQ
-> do_IRQ
-> generic_handle_irq
-> generic_handle_irq_desc
-> __do_IRQ
-> handle_IRQ_event
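1) A hardware device raises an interrupt (request & pending);
2) Requests from different devices may arrive at the same time; the PIC picks one of them according to its arbitration rules, such as Round-Robin or priority (arbiter);
3) The PIC sets the ACK bit of this interrupt (assert) and delivers the request to the target hardware thread, as configured in the IRT (delivery);
4) The CPU reads the EIRR to obtain the requesting irq#, then clears the bit by writing it back;
5) The IDT is looked up by irq# to get the desc of this interrupt;
6) If the interrupt is edge-triggered, desc->chip->ack is called immediately to write the relevant register (INT_ACK) and clear this interrupt source (de-assert), so that the interrupt line can respond to new interrupts and none are lost;
7) The desc->action->handler chain is walked to handle the interrupt request;
8) If the interrupt is level-triggered, desc->chip->end is called after the handlers have run to write the relevant register (INT_ACK) and clear this interrupt source (de-assert);
9) Go back to step 2) and handle the next interrupt request.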
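The CPU-side part of these steps (read EIRR, clear the bit, look up the descriptor, ack before the handlers for edge-triggered interrupts, end after them for level-triggered ones) can be sketched as a small user-space simulation. Everything here is a model: `sim_eirr` stands in for the register, the trace characters replace real register writes, and servicing the lowest pending bit first is an assumption, not something the platform code is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS 64

enum trigger { TRIG_EDGE, TRIG_LEVEL };

struct sim_desc {
    enum trigger trig;
    int registered;                  /* non-zero once a handler is attached */
};

static uint64_t sim_eirr;            /* stands in for the EIRR register */
static struct sim_desc sim_idt[NR_VECTORS];
static char sim_trace[128];          /* 'A' = ack, 'H' = handler, 'E' = end */
static size_t sim_trace_len;

static void trace_event(char c)
{
    if (sim_trace_len + 1 < sizeof(sim_trace)) {
        sim_trace[sim_trace_len++] = c;
        sim_trace[sim_trace_len] = '\0';
    }
}

/* Service every pending vector, lowest bit first (an assumption of this
 * model); returns the number of interrupts that reached a handler. */
static int sim_dispatch(void)
{
    int serviced = 0;

    while (sim_eirr != 0) {
        unsigned int vec = 0;
        while (!(sim_eirr & (UINT64_C(1) << vec)))
            vec++;
        sim_eirr &= ~(UINT64_C(1) << vec);   /* clear the pending bit */

        struct sim_desc *d = &sim_idt[vec];  /* IDT lookup by irq# */
        if (!d->registered)
            continue;                        /* nothing registered: drop it */
        if (d->trig == TRIG_EDGE)
            trace_event('A');                /* ack before the handlers */
        trace_event('H');                    /* run the handler chain */
        if (d->trig == TRIG_LEVEL)
            trace_event('E');                /* end after the handlers */
        serviced++;
    }
    return serviced;
}
```

With an edge-triggered vector 3 and a level-triggered vector 10 both pending, the trace reads "AHHE": ack then handler for vector 3, handler then end for vector 10.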
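IRT configuration
The IRT is a hardware table of the PIC. It is configured mainly in pic_init and when an interrupt is registered with request_irq. The fields of an IRT entry are:
- EN: enables the IRT entry. Set when the interrupt is registered with request_irq;
- NMI: NMI configuration. Set to non-NMI in pic_init;
- SCH: interrupt scheduling policy. Set to local scheduling in pic_init;
- RVEC: the irq#. Set in pic_init according to irt_irq_table;
- DT/DB/DTE: CPU affinity. Configured when the interrupt is registered with request_irq; it can also be changed through /proc/irq/N/smp_affinity, which is implemented by desc->chip->set_affinity, i.e. pic_set_affinity.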
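To make the field list concrete, here is a sketch of packing and unpacking a 32-bit IRT entry. The bit positions and field widths below are assumptions chosen for illustration (the text above does not give them); the real layout must be taken from the CPU manual. The helper names are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED bit layout of a 32-bit IRT entry, for illustration only. */
#define IRT_EN_SHIFT   31
#define IRT_NMI_SHIFT  29
#define IRT_SCH_SHIFT  28
#define IRT_RVEC_SHIFT 20
#define IRT_RVEC_MASK  0x3Fu    /* 6 bits: vectors 0..63 */
#define IRT_DT_SHIFT   19
#define IRT_DB_SHIFT   16
#define IRT_DB_MASK    0x7u     /* assumed 3-bit field */
#define IRT_DTE_SHIFT  0
#define IRT_DTE_MASK   0xFFFFu  /* assumed 16-bit field */

struct irt_fields {
    unsigned int en, nmi, sch, rvec, dt, db, dte;
};

/* Build the register value from individual fields. */
static uint32_t irt_pack(const struct irt_fields *f)
{
    assert(f != NULL && f->rvec <= IRT_RVEC_MASK);
    return ((uint32_t)(f->en  & 1u) << IRT_EN_SHIFT)
         | ((uint32_t)(f->nmi & 1u) << IRT_NMI_SHIFT)
         | ((uint32_t)(f->sch & 1u) << IRT_SCH_SHIFT)
         | ((uint32_t)(f->rvec & IRT_RVEC_MASK) << IRT_RVEC_SHIFT)
         | ((uint32_t)(f->dt  & 1u) << IRT_DT_SHIFT)
         | ((uint32_t)(f->db  & IRT_DB_MASK) << IRT_DB_SHIFT)
         | ((uint32_t)(f->dte & IRT_DTE_MASK) << IRT_DTE_SHIFT);
}

/* Extract the vector (irq#) stored in an entry. */
static unsigned int irt_rvec(uint32_t entry)
{
    return (entry >> IRT_RVEC_SHIFT) & IRT_RVEC_MASK;
}

static int irt_enabled(uint32_t entry)
{
    return (int)((entry >> IRT_EN_SHIFT) & 1u);
}

/* Set or clear EN without touching the other fields. */
static uint32_t irt_set_enable(uint32_t entry, int on)
{
    uint32_t bit = UINT32_C(1) << IRT_EN_SHIFT;
    return on ? (entry | bit) : (entry & ~bit);
}
```

This mirrors the split described above: an init-time step would pack NMI, SCH and RVEC with EN left clear, and registration would later flip only the EN bit.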
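IDT configuration
The IDT is a software table. In init_IRQ and in request_irq, the fields of the entry indexed by irq# are configured, such as irq, the irqaction handler, and the irq_chip handlers. The two stages differ as follows:
- init_IRQ: sets desc->chip for every entry and sets desc->status to IRQ_NOPROBE; also initializes the IDT entries of interrupts that do not come from the PIC, such as IPIs.
- request_irq: initializes the IDT entries of interrupts that come from the PIC.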
    #define NR_IRQS 64

    struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {
        [0 ... NR_IRQS-1] = {
            .status = IRQ_DISABLED,
            .chip = &no_irq_chip,
            .handle_irq = handle_bad_irq,
            .depth = 1,
            .lock = __SPIN_LOCK_UNLOCKED(irq_desc->lock),
        }
    };

    /**
     * struct irq_desc - interrupt descriptor
     * @irq:                interrupt number for this descriptor
     * @timer_rand_state:   pointer to timer rand state struct
     * @kstat_irqs:         irq stats per cpu
     * @irq_2_iommu:        iommu with this irq
     * @handle_irq:         highlevel irq-events handler [if NULL, __do_IRQ()]
     * @chip:               low level interrupt hardware access
     * @msi_desc:           MSI descriptor
     * @handler_data:       per-IRQ data for the irq_chip methods
     * @chip_data:          platform-specific per-chip private data for the chip
     *                      methods, to allow shared chip implementations
     * @action:             the irq action chain
     * @status:             status information
     * @depth:              disable-depth, for nested irq_disable() calls
     * @wake_depth:         enable depth, for multiple set_irq_wake() callers
     * @irq_count:          stats field to detect stalled irqs
     * @last_unhandled:     aging timer for unhandled count
     * @irqs_unhandled:     stats field for spurious unhandled interrupts
     * @lock:               locking for SMP
     * @affinity:           IRQ affinity on SMP
     * @node:               node index useful for balancing
     * @pending_mask:       pending rebalanced interrupts
     * @threads_active:     number of irqaction threads currently running
     * @wait_for_threads:   wait queue for sync_irq to wait for threaded handlers
     * @dir:                /proc/irq/ procfs entry
     * @name:               flow handler name for /proc/interrupts output
     */
    struct irq_desc {
        unsigned int            irq;
        struct timer_rand_state *timer_rand_state;
        unsigned int            *kstat_irqs;
    #ifdef CONFIG_INTR_REMAP
        struct irq_2_iommu      *irq_2_iommu;
    #endif
        irq_flow_handler_t      handle_irq;
        struct irq_chip         *chip;
        struct msi_desc         *msi_desc;
        void                    *handler_data;
        void                    *chip_data;
        struct irqaction        *action;        /* IRQ action list */
        unsigned int            status;         /* IRQ status */
        unsigned int            depth;          /* nested irq disables */
        unsigned int            wake_depth;     /* nested wake enables */
        unsigned int            irq_count;      /* For detecting broken IRQs */
        unsigned long           last_unhandled; /* Aging timer for unhandled count */
        unsigned int            irqs_unhandled;
        spinlock_t              lock;
    #ifdef CONFIG_SMP
        cpumask_var_t           affinity;
        unsigned int            node;
    #ifdef CONFIG_GENERIC_PENDING_IRQ
        cpumask_var_t           pending_mask;
    #endif
    #endif
        atomic_t                threads_active;
    #ifdef CONFIG_PREEMPT_HARDIRQS
        unsigned long           forced_threads_active;
    #endif
        wait_queue_head_t       wait_for_threads;
    #ifdef CONFIG_PROC_FS
        struct proc_dir_entry   *dir;
    #endif
        const char              *name;
    } ____cacheline_internodealigned_in_smp;

    struct irq_chip {
        const char      *name;
        unsigned int    (*startup)(unsigned int irq);
        void            (*shutdown)(unsigned int irq);
        void            (*enable)(unsigned int irq);
        void            (*disable)(unsigned int irq);
        void            (*ack)(unsigned int irq);
        void            (*mask)(unsigned int irq);
        void            (*mask_ack)(unsigned int irq);
        void            (*unmask)(unsigned int irq);
        void            (*eoi)(unsigned int irq);
        void            (*end)(unsigned int irq);
        int             (*set_affinity)(unsigned int irq, const struct cpumask *dest);
        int             (*retrigger)(unsigned int irq);
        int             (*set_type)(unsigned int irq, unsigned int flow_type);
        int             (*set_wake)(unsigned int irq, unsigned int on);
        void            (*bus_lock)(unsigned int irq);
        void            (*bus_sync_unlock)(unsigned int irq);
        /* Currently used only by UML, might disappear one day. */
    #ifdef CONFIG_IRQ_RELEASE_METHOD
        void            (*release)(unsigned int irq, void *dev_id);
    #endif
        /*
         * For compatibility, ->typename is copied into ->name.
         * Will disappear.
         */
        const char      *typename;
    };

    static struct irq_chip nlm_common_pic = {
        .unmask = pic_unmask,
        .mask = pic_shutdown,
        .ack = pic_ack,
        .end = pic_end,
        .set_affinity = pic_set_affinity
    };
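Using a stripped-down stand-in for the irq_desc structure above, the two configuration stages (init_IRQ for every entry, then request_irq for one irq#) can be modelled in user space. This is a sketch, not the kernel implementation: all `model_*` names and the flag value are hypothetical, and shared IRQs are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_IRQS     64
#define MODEL_IRQ_NOPROBE 0x1u   /* hypothetical flag value */

typedef int (*model_handler_t)(int irq, void *dev_id);

struct model_desc {
    const char      *chip_name;  /* stands in for desc->chip */
    unsigned int     status;     /* stands in for desc->status */
    model_handler_t  handler;    /* stands in for desc->action->handler */
    void            *dev_id;
};

static struct model_desc model_idt[MODEL_NR_IRQS];

/* Stage 1: give every entry its chip and initial status, no handler yet. */
static void model_init_irq(void)
{
    for (int i = 0; i < MODEL_NR_IRQS; i++) {
        model_idt[i].chip_name = "pic";
        model_idt[i].status = MODEL_IRQ_NOPROBE;
        model_idt[i].handler = NULL;
        model_idt[i].dev_id = NULL;
    }
}

/* Stage 2: attach a handler to one irq#. Returns 0 on success, -1 if the
 * irq# is out of range or already taken (this model has no sharing). */
static int model_request_irq(int irq, model_handler_t handler, void *dev_id)
{
    assert(handler != NULL);
    if (irq < 0 || irq >= MODEL_NR_IRQS || model_idt[irq].handler != NULL)
        return -1;
    model_idt[irq].handler = handler;
    model_idt[irq].dev_id = dev_id;
    return 0;
}

/* Deliver one interrupt: returns the handler's result, or 0 if none. */
static int model_raise(int irq)
{
    if (irq < 0 || irq >= MODEL_NR_IRQS || model_idt[irq].handler == NULL)
        return 0;
    return model_idt[irq].handler(irq, model_idt[irq].dev_id);
}

/* Example handler: counts invocations through dev_id. */
static int model_count_handler(int irq, void *dev_id)
{
    (void)irq;
    (*(int *)dev_id)++;
    return 1;
}
```

The irq# passed to `model_request_irq` plays the role of the interrupt vector, so it has to stay below 64, matching the size of the table.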