ChCore-lab4

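Lab 4: Multicore Scheduling and IPC

Best read alongside the IPADS OS Lab Manual!

Multicore boot support: have ChCore wake the other cores through the firmware provided by the Raspberry Pi vendor.
Multicore scheduling: implement round-robin scheduling across multiple cores in ChCore.
IPC: add inter-process communication support to ChCore.
IPC tuning: tune ChCore's IPC for the characteristics of the tests.

Pitfall 1: remember to run git fetch and git pull to update before starting the lab.
Pitfall 2: check that the following files (symlinks) exist: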

Lab4/user/chcore-libc/musl-libc/src/thread/aarch64/__thread_exit.S
Lab4/user/chcore-libc/musl-libc/src/thread/aarch64/__unmapself.S

If they are missing, you can create an empty file (link) by hand at each path. (Why they go missing is unclear for now.)

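Multicore support

Thinking question 1
Read the assembly code kernel/arch/aarch64/boot/raspi3/init/start.S from Lab1. Explain how ChCore selects the primary CPU and blocks the execution of the other CPUs.

Ans: The source is at Lab1/kernel/arch/aarch64/boot/raspi3/init/start.S.
In start.S, the following three assembly instructions run first: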

    mrs x8, mpidr_el1
    and x8, x8, #0xFF
    cbz x8, primary

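The mpidr_el1 system register holds the affinity fields that identify the current CPU; its low 8 bits carry the core ID. The code loads it into a general-purpose register, masks it with 0xFF, and checks whether the result is 0. CPU 0 (the chosen primary CPU) branches to primary, where it sets up its kernel stack and continues with the boot flow. Every other CPU goes to wait_for_bss_clear, waits for the primary CPU to finish initialization and clear the global variables, and only then prepares its own kernel stack. Finally, it waits until the primary CPU sets its start signal (the secondary boot flag; SMP stands for symmetric multiprocessing) and then enters secondary_init_c to bring the core up.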
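The selection logic can be restated in C. This is only a minimal sketch for illustration: cpu_id_from_mpidr and is_primary_cpu are hypothetical helper names, and the real code is the three assembly instructions above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors `and x8, x8, #0xFF`: keep only the low 8 bits of MPIDR_EL1. */
uint64_t cpu_id_from_mpidr(uint64_t mpidr)
{
    return mpidr & 0xFF;
}

/* Mirrors `cbz x8, primary`: the core whose ID is 0 becomes the primary. */
bool is_primary_cpu(uint64_t mpidr)
{
    return cpu_id_from_mpidr(mpidr) == 0;
}
```

Only the masked low byte decides the outcome, so any value in the upper bits of the register is ignored.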

The figure above is from Zhihu, by Linux内核远航者.

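Read the assembly code kernel/arch/aarch64/boot/raspi3/init/start.S, init_c.c, and kernel/arch/aarch64/main.c. Is secondary_boot_flag, which is used to block the other CPU cores, a physical address or a virtual address? How is it passed into enable_smp_cores, and how is it assigned (consider virtual vs. physical addresses)?

Ans:
The source we need to read here is fairly scattered. The relative paths are: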

Lab1/kernel/arch/aarch64/boot/raspi3/init/init_c.c
Lab1/kernel/arch/aarch64/boot/raspi3/init/start.S
Lab2/kernel/arch/aarch64/main.c

Those are the three pieces we can get at. Now let's focus on secondary_boot_flag. In start.S we have:

wait_until_smp_enabled:
    /* CPU ID should be stored in x8 from the first line */
    mov x1, #8
    mul x2, x8, x1
    ldr x1, =secondary_boot_flag
    add x1, x1, x2
    ldr x3, [x1]
    cbz x3, wait_until_smp_enabled

    /* Set CPU id */
    mov x0, x8
    b   secondary_init_c

    /* Should never be here */
    b    .

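This includes ldr x1, =secondary_boot_flag, which loads the address of the label secondary_boot_flag into register x1. The code then adds 8 × cpuid (the CPU ID obtained from mpidr_el1) to reach this CPU's own 8-byte slot, and spins until that slot becomes non-zero, that is, until this CPU is allowed to start.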
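The address arithmetic and the check in that loop can be sketched in C. This is an illustration under assumptions: boot_flags stands in for secondary_boot_flag, NUM_CPUS is assumed to be 4, and boot_flag_slot and smp_enabled_for are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 4 /* assumption: four cores */

/* Stand-in for secondary_boot_flag: one 8-byte slot per CPU. */
volatile uint64_t boot_flags[NUM_CPUS];

/* Mirrors the mov/mul/ldr/add sequence: slot address = base + cpu_id * 8. */
volatile uint64_t *boot_flag_slot(volatile uint64_t *base, uint64_t cpu_id)
{
    return (volatile uint64_t *)((uintptr_t)base + cpu_id * 8);
}

/* Mirrors ldr + cbz: 1 once the slot is non-zero, 0 while still blocked. */
int smp_enabled_for(uint64_t cpu_id)
{
    return *boot_flag_slot(boot_flags, cpu_id) != 0;
}
```

A secondary core would loop on something like while (!smp_enabled_for(id)) { } before moving on; each core looks only at its own slot, so waking one core does not release the others.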

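Question 1: is secondary_boot_flag a virtual address or a physical address?

Answer: in start.S it is a physical address. The secondary cores spin on it in the boot code before their MMU is enabled, so the address produced by ldr x1, =secondary_boot_flag has to be physical. While the primary core is booting, all the other cores stay blocked, waiting for the flag that lets them start. Here the primary CPU and the other cores execute: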

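Note that start_kernel is not defined in init_c.c; it is declared in Lab1/kernel/arch/aarch64/boot/raspi3/include/boot.h. There we find the function's declaration, but:

it is defined in head.S, and we do not have that assembly source, only a binary obj file. What now? Don't panic: we can reverse-engineer our way to the assembly. This is where the disassembler aarch64-linux-gnu-objdump comes in.

From the repository root, run: aarch64-linux-gnu-objdump -D Lab1/kernel/arch/aarch64/head.S.dbg.obj > head.S, and we get the following code:

So start_kernel sets up the page table base register (used later for page table configuration), clears any stale TLB entries, and then enters main, which goes on to call enable_smp_cores (defined in Lab4/kernel/arch/aarch64/machine/smp.c).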

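Now the first question is easy to answer. The address of secondary_boot_flag travels down this call chain as an argument: init_c passes it to start_kernel, which forwards it to main and then to enable_smp_cores. Since it is a physical address and the kernel is now running with the MMU on, enable_smp_cores has to convert it to a kernel virtual address before writing through it. Each secondary core boots once it reads a non-zero flag, and this continues until every core is awake. See the flow chart above.

Through that virtual address, the primary CPU wakes the other CPUs one at a time in a loop. After each write it calls the assembly routine flush_dcache_area so that the new flag value reaches memory where the waiting core can read it, which is what actually wakes that CPU. We can disassemble the routine with: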

aarch64-linux-gnu-objdump -D Lab1/kernel/arch/aarch64/tools.S.dbg.obj > tools.S

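and then read the function.

We can see that ChCore uses the cache type register when flushing, which is how the other cores get to see their boot_flag values (through the mapping).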
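The wake-up loop can be modeled in user space. The sketch below rests on assumptions: it assumes the flag address arrives as a physical address and is converted before use, fake_ram plays the role of physical memory, and phys_to_virt_sketch and enable_smp_cores_sketch are hypothetical names rather than the ChCore functions.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 4 /* assumption: four cores */

/* Pretend physical memory; a "physical address" is a byte offset into it. */
uint64_t fake_ram[16];

/* Hypothetical stand-in for a physical-to-virtual conversion. */
volatile uint64_t *phys_to_virt_sketch(uint64_t paddr)
{
    return (volatile uint64_t *)((unsigned char *)fake_ram + paddr);
}

/* Set every core's flag through the converted (usable) address. */
void enable_smp_cores_sketch(uint64_t boot_flag_paddr)
{
    volatile uint64_t *flags = phys_to_virt_sketch(boot_flag_paddr);

    for (int i = 0; i < NUM_CPUS; i++) {
        flags[i] = 1;
        /* A real kernel would flush the data cache here and then wait
         * until core i reports that it is running. */
    }
}
```

The point of the model is the order of operations: convert the address first, then write one slot per core.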

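Multicore scheduling

Scheduler queue initialization

Exercise 1: complete rr_sched_init in kernel/sched/policy_rr.c to initialize rr_ready_queue_meta. Once that is done, you should see the output “Scheduler metadata is successfully initialized!” and pass the Scheduler metadata initialization test.

First look at the queue_meta struct that we need to initialize.

So for every CPU we initialize its queue_meta struct: set up the queue's list head and lock, and make sure the queue length starts at 0.

For the list_head struct, kernel/include/common/list.h provides the initializer init_list_head. For the lock struct, kernel/include/common/lock.h provides lock_init. The queue length of every CPU is set to 0. The pad field is only there for alignment, so we can ignore it.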

// list_head and lock initialization.
static inline void init_list_head(struct list_head *list);
int lock_init(struct lock *lock);

PLAT_CPU_NUM is defined in kernel/include/arch/aarch64/plat/raspi3/machine.h. So we have:

Ans:
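A minimal, self-contained sketch of this step. The struct layouts, the value of PLAT_CPU_NUM, the field name queue_len, and the pad size are simplified assumptions made for illustration, not the ChCore definitions; rr_sched_init_sketch is a hypothetical name.

```c
#include <assert.h>

#define PLAT_CPU_NUM 4 /* assumed value for this sketch */

struct list_head { struct list_head *prev, *next; };
struct lock { volatile int locked; }; /* simplified stand-in */

struct queue_meta {
    struct list_head queue_head;
    int queue_len;          /* assumed field name */
    struct lock queue_lock;
    char pad[64];           /* alignment padding, size assumed */
};

struct queue_meta rr_ready_queue_meta[PLAT_CPU_NUM];

/* An empty circular list points at itself in both directions. */
static inline void init_list_head(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

int lock_init(struct lock *lock)
{
    lock->locked = 0;
    return 0;
}

/* One ready queue per CPU: empty list, unlocked lock, length 0. */
int rr_sched_init_sketch(void)
{
    for (int i = 0; i < PLAT_CPU_NUM; i++) {
        init_list_head(&rr_ready_queue_meta[i].queue_head);
        lock_init(&rr_ready_queue_meta[i].queue_lock);
        rr_ready_queue_meta[i].queue_len = 0;
    }
    return 0;
}
```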

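Scheduler queue enqueue

Same approach as before: find the right member from the list structure and call list_append. Don't forget to increase the queue length. First, look at (part of) the thread struct: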

struct thread {
        struct list_head node; // link threads in a same cap_group
        struct list_head ready_queue_node; // link threads in a ready queue
        struct list_head notification_queue_node;
        // link threads in a notification waiting queue
        struct thread_ctx *thread_ctx; // thread control block
        ...
};

With that, we can do the following:
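A self-contained sketch of the enqueue step. The list helpers are re-implemented here in simplified form, queue_len is an assumed field name, and rr_enqueue_sketch is a hypothetical name; the real function also has to take the queue lock and check the thread's state.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *prev, *next; };

struct thread {
    struct list_head ready_queue_node; /* links the thread into a ready queue */
};

struct queue_meta {
    struct list_head queue_head;
    int queue_len; /* assumed field name */
};

void init_list_head(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* Insert new_node at the tail of the list headed by head. */
void list_append(struct list_head *new_node, struct list_head *head)
{
    struct list_head *tail = head->prev;

    new_node->prev = tail;
    new_node->next = head;
    tail->next = new_node;
    head->prev = new_node;
}

/* Append the thread's ready_queue_node and bump the queue length. */
int rr_enqueue_sketch(struct queue_meta *q, struct thread *t)
{
    if (t == NULL)
        return -1;
    list_append(&t->ready_queue_node, &q->queue_head);
    q->queue_len++;
    return 0;
}
```

Appending at the tail is what gives round-robin its FIFO order: the thread that was queued first is the first one the scheduler sees.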

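Scheduler queue dequeue

Complete find_runnable_thread in kernel/sched/sched.c so that it finds and returns the first thread in the ready queue that is eligible to run. Complete __rr_sched_dequeue in kernel/sched/policy_rr.c so that it removes the chosen thread from the ready queue.

We do the second part first.
Same recipe: just call list_del. Note that the CPU ID can be found through the thread's context. We have:

We can do the following:

Ans2: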
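A sketch of the dequeue step that shows where the CPU ID comes from. The struct layouts are reduced to the fields used here, queue_len and cpuid are assumed field names, PLAT_CPU_NUM is assumed to be 4, and rr_dequeue_sketch is a hypothetical name; locking is left out.

```c
#include <assert.h>
#include <stddef.h>

#define PLAT_CPU_NUM 4 /* assumed core count */

struct list_head { struct list_head *prev, *next; };
struct thread_ctx { unsigned int cpuid; }; /* assumed field name */

struct thread {
    struct list_head ready_queue_node;
    struct thread_ctx *thread_ctx;
};

struct queue_meta {
    struct list_head queue_head;
    int queue_len; /* assumed field name */
};

struct queue_meta rr_ready_queue_meta[PLAT_CPU_NUM];

/* Unlink node from the list it is currently on. */
void list_del(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
}

/* Pick the queue from the thread's own context, unlink, shrink the length. */
int rr_dequeue_sketch(struct thread *thread)
{
    unsigned int cpuid;

    if (thread == NULL || thread->thread_ctx == NULL)
        return -1;
    cpuid = thread->thread_ctx->cpuid;
    if (cpuid >= PLAT_CPU_NUM || rr_ready_queue_meta[cpuid].queue_len == 0)
        return -1;
    list_del(&thread->ready_queue_node);
    rr_ready_queue_meta[cpuid].queue_len--;
    return 0;
}
```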

Next, complete find_runnable_thread.

First we need to understand how to use the for_each_in_list interface, defined in kernel/include/common/list.h:

#define for_each_in_list(elem, type, field, head) \
    for ((elem) = container_of((head)->next, type, field); \
         &((elem)->field) != (head); \
         (elem) = container_of(((elem)->field).next, type, field))

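It takes four arguments: elem is the loop variable that receives a pointer to each containing struct; type is the struct type to convert to; field is the name of the list_head member inside that struct; head is the list head, which defines where the loop starts and stops.

We need to work out what to pass for each of the four. First, look at how the function struct thread *rr_sched_choose_thread(void) passes in the argument, that is, the list head: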

if (!(thread = find_runnable_thread(
        &(rr_ready_queue_meta[cpuid].queue_head)))) {
        unlock(&(rr_ready_queue_meta[cpuid].queue_lock));
        goto out;
}

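What gets passed in is the head of the thread list in the ready queue. We need to find a thread in it.
Looking at thread again, the member related to the ready queue is struct list_head ready_queue_node. So field is ready_queue_node, and type is the struct thread we are after.
This gives us:

Ans 1:

In effect, each iteration takes one node from the list, converts it to its thread struct, and returns it if it meets the conditions.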
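The iteration just described can be tried out in user space. The macro is copied from above; container_of is written out with offsetof, the runnable flag is a hypothetical stand-in for the real eligibility checks, and find_runnable_thread_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *prev, *next; };

/* Recover the containing struct from a pointer to one of its members. */
#define container_of(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

#define for_each_in_list(elem, type, field, head) \
    for ((elem) = container_of((head)->next, type, field); \
         &((elem)->field) != (head); \
         (elem) = container_of(((elem)->field).next, type, field))

struct thread {
    int runnable; /* hypothetical stand-in for the real checks */
    struct list_head ready_queue_node;
};

/* Return the first runnable thread in the list, or NULL if there is none. */
struct thread *find_runnable_thread_sketch(struct list_head *thread_list)
{
    struct thread *thread;

    for_each_in_list(thread, struct thread, ready_queue_node, thread_list) {
        if (thread->runnable)
            return thread;
    }
    return NULL;
}
```

The loop stops when the member address of elem equals head again, so an empty list (head pointing at itself) runs zero iterations.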

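Cooperative scheduling

Complete the system call sys_yield in kernel/sched/sched.c so that a user-mode program can voluntarily give up its CPU core and trigger thread scheduling. In addition, complete rr_sched in kernel/sched/policy_rr.c so that the currently running thread is put back into the scheduling queue.

First look at the system call we need to implement. The function sys_top() just below it is a system call just like our sys_yield. Together with page 212 of 《操作系统：原理与实现》, we can find all the scheduler operation interfaces in kernel/include/sched/sched.h: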

struct sched_ops {
        int (*sched_init)(void); // Initialize the scheduler.
        int (*sched)(void); // Trigger one scheduling pass on the current CPU core.
        int (*sched_periodic)(void); // Related to RR scheduling: triggers scheduling periodically.
        int (*sched_enqueue)(struct thread *thread);
        int (*sched_dequeue)(struct thread *thread); // Enqueue and dequeue operations.
        /* Debug tools */
        void (*sched_top)(void);
};

So the function we need to call is cur_sched_ops->sched(). This gives us:

Ans1:
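A small model of dispatching through the operations table. The table here has only the sched entry, rr_sched_stub just counts calls, and sys_yield_sketch is a hypothetical name; in the kernel, the yield path would go on to switch context and return to user mode, which is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced to the one entry this sketch needs. */
struct sched_ops {
    int (*sched)(void);
};

int rr_sched_calls; /* counts invocations, for demonstration only */

int rr_sched_stub(void)
{
    rr_sched_calls++;
    return 0;
}

struct sched_ops rr = { .sched = rr_sched_stub };
struct sched_ops *cur_sched_ops = &rr;

/* Thin wrapper so callers do not touch the table directly. */
int sched(void)
{
    return cur_sched_ops->sched();
}

/* Yield = ask the current policy to schedule. */
int sys_yield_sketch(void)
{
    return sched();
}
```

Because the call goes through cur_sched_ops, swapping the policy only means pointing cur_sched_ops at a different table.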

Next, complete our rr_sched() function. We simply enqueue the thread.

Ans2:

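Preemptive scheduling

Following the comments in the code, complete plat_timer_init in kernel/arch/aarch64/plat/raspi3/irq/timer.c to initialize the physical timer. The steps are:

Read the CNTFRQ_EL0 register and assign its value to the global variable cntp_freq.

Using TICK_MS (the timer interrupt interval chosen by ChCore, in ms; by default ChCore raises a timer interrupt every 10 ms) and cntfrq_el0 (the frequency of the physical timer), compute how much the system count grows between two timer interrupts, assign it to the global variable cntp_tval, and write cntp_tval to the CNTP_TVAL_EL0 register.

Configure the control register CNTP_CTL_EL0 as described above.

Let's go through them one at a time.

Read the CNTFRQ_EL0 register and assign its value to the global variable cntp_freq.

Looking at lines 41-42 of plat_timer_init, we can see that they write the value of cntpct_el0 into the global variable cntp_init. Copying that pattern gives us step one: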

asm volatile ("mrs %0, cntfrq_el0":"=r" (cntp_freq));
// For debugging only; the next line is optional.
kdebug("timer init cntfrq_el0 = %lu\n", cntp_freq);

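Now for step two.

Using TICK_MS (the timer interrupt interval chosen by ChCore, in ms; by default ChCore raises a timer interrupt every 10 ms) and cntfrq_el0 (the frequency of the physical timer), compute how much the system count grows between two timer interrupts, assign it to the global variable cntp_tval, and write cntp_tval to the CNTP_TVAL_EL0 register.

We need to compute cntp_tval, which depends on cntp_freq and TICK_MS. First we turn to the architecture documentation:

The timer frequency is given in Hz (counts per second), so we divide it by 1000 to get counts per millisecond, then multiply by TICK_MS to get the number of counter increments between two interrupts, that is, the system count delta. Finally, in the same way as before, we write it to the system register cntp_tval_el0.
This gives us: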

cntp_tval = cntp_freq / 1000 * TICK_MS;
asm volatile ("msr cntp_tval_el0, %0"::"r" (cntp_tval));
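The arithmetic can be checked on its own. In the sketch below, ticks_per_interrupt is a hypothetical helper name, and the 62.5 MHz frequency used in the note is an assumed example value, not one read from the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_MS 10 /* ChCore's default tick interval, per the text above */

/* Counter increments between two timer interrupts:
 * Hz / 1000 gives counts per millisecond, times the tick length in ms. */
uint64_t ticks_per_interrupt(uint64_t cntp_freq_hz, uint64_t tick_ms)
{
    return cntp_freq_hz / 1000 * tick_ms;
}
```

With an assumed frequency of 62,500,000 Hz and a 10 ms tick this gives 62,500 × 10 = 625,000 counts. Dividing first keeps the intermediate value small but truncates: a frequency of 1,500 Hz would yield 10 rather than 15, which only matters when the frequency is not a multiple of 1000.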

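Configure the control register CNTP_CTL_EL0 as described above.

Next we need to compute timer_ctl. Once again we turn to the architecture documentation. We have:

We want the timer enabled and the timer interrupt unmasked, so we set the 64-bit timer_ctl to 1 (ENABLE = 1, IMASK = 0). Finally, write it to the system register cntp_ctl_el0.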

    /* Calculate the value of timer_ctl */
    timer_ctl = 1;
    /* Write timer_ctl to the control register (cntp_ctl_el0) */
    asm volatile ("msr cntp_ctl_el0, %0"::"r" (timer_ctl));

Ans:
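Why the value 1 is the right one becomes clearer when the bits are named. The bit positions below follow the Arm architecture's layout of CNTP_CTL_EL0; the macro names and make_timer_ctl are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define CNTP_CTL_ENABLE  (1ULL << 0) /* timer enabled */
#define CNTP_CTL_IMASK   (1ULL << 1) /* timer interrupt masked when set */
#define CNTP_CTL_ISTATUS (1ULL << 2) /* read-only: timer condition met */

/* Build a control value from the two writable bits. */
uint64_t make_timer_ctl(int enable, int mask_irq)
{
    uint64_t ctl = 0;

    if (enable)
        ctl |= CNTP_CTL_ENABLE;
    if (mask_irq)
        ctl |= CNTP_CTL_IMASK;
    return ctl;
}
```

make_timer_ctl(1, 0) yields 1, the value written above: enabled, with the interrupt left unmasked. A value of 3 would keep the timer running but suppress its interrupt.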

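Physical timer interrupts and preemption

Complete plat_handle_irq in kernel/arch/aarch64/plat/raspi3/irq/irq.c so that it calls handle_timer_irq and returns when the interrupt number irq is INT_SRC_TIMER1 (meaning the interrupt source is the physical timer). Complete handle_timer_irq in kernel/irq/timer.c so that it decrements the time-slice budget of the currently running thread. Complete rr_sched in kernel/sched/policy_rr.c so that, before the currently running thread is put back into the ready queue, its scheduling budget is restored to DEFAULT_BUDGET.

That's three tasks in one!!! QAQ

Step one is to call handle_timer_irq when the interrupt is INT_SRC_TIMER1. This gives us:
Ans1: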
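A sketch of the dispatch. The numeric value of INT_SRC_TIMER1 is an assumption made so that the example compiles, handle_timer_irq is replaced by a counting stub, and plat_handle_irq_sketch is a hypothetical name with a return value added for testing.

```c
#include <assert.h>

#define INT_SRC_TIMER1 0x002 /* assumed value, for this sketch only */

int timer_irq_count; /* counts timer interrupts seen by the stub */

/* Stub standing in for the real timer handler. */
void handle_timer_irq(void)
{
    timer_irq_count++;
}

/* Returns 1 if irq was dispatched to the timer handler, 0 otherwise. */
int plat_handle_irq_sketch(unsigned int irq)
{
    switch (irq) {
    case INT_SRC_TIMER1:
        handle_timer_irq();
        return 1;
    default:
        return 0;
    }
}
```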

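Next we complete handle_timer_irq, which is in kernel/irq/timer.c.

We need to

Decrement the time-slice budget of the currently running thread.
Call the sched function to trigger scheduling.

Following the comments, note that we have to check whether the current thread is NULL. If it is not, we update its budget and then call sched. Note that current_thread is defined in kernel/include/object/thread.h.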

/* Per-CPU variable current_thread is only accessed by its owner CPU. */
#define current_thread (current_threads[smp_get_cpu_id()])
// The resulting type is: struct thread *.

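First find out where budget lives under thread. We need to look at these files:

kernel/include/object/thread.h
kernel/include/sched/context.h

We find: thread -> thread_ctx -> sc -> budget.
Per the comments, we do not need to call sched() inside handle_timer_irq. So we do the following:

Ans2:

Note! Do not call sched inside this function!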
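A sketch of the budget decrement along the thread -> thread_ctx -> sc -> budget path. The structs are cut down to that path, current_thread is a plain global instead of a per-CPU variable, and the guard against decrementing below zero is a defensive choice of this sketch, not something the source confirms.

```c
#include <assert.h>
#include <stddef.h>

struct sched_ctx { unsigned int budget; };
struct thread_ctx { struct sched_ctx *sc; };
struct thread { struct thread_ctx *thread_ctx; };

/* Per-CPU in the kernel; a single global is enough for this sketch. */
struct thread *current_thread;

void handle_timer_irq_sketch(void)
{
    if (current_thread == NULL)
        return; /* nothing is running, nothing to charge */
    if (current_thread->thread_ctx->sc->budget > 0)
        current_thread->thread_ctx->sc->budget--;
    /* Deliberately no sched() call here. */
}
```

Since budget is unsigned, decrementing it at 0 would wrap around to a huge value, which is why the sketch checks first.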

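First, the interrupt comes from the physical timer. Execution then traps into kernel/arch/aarch64/irq/irq_entry.S for interrupt handling, where we can see the jump into the handle_irq function.

Next, execution reaches kernel/arch/aarch64/irq/irq_entry.c and runs plat_handle_irq. That is where we arranged for handle_timer_irq to be called, followed by a return, when the interrupt number irq is INT_SRC_TIMER1 (the physical timer). If we also scheduled inside handle_timer_irq, it would interfere with the scheduling done at line 157 in the figure below, corrupting thread state and eventually causing a segmentation fault.

Finally, in kernel/sched/policy_rr.c, refill the budget and only then enqueue. refill_budget is defined in that same file. So we have:

Ans3: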
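A sketch of the refill-then-enqueue order. Everything here is a simplified assumption: the value of DEFAULT_BUDGET, the signature of refill_budget, the queued flag standing in for the real enqueue, and the rule that a thread with budget left keeps running. requeue_current_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_BUDGET 1 /* assumed value */

struct sched_ctx { unsigned int budget; };
struct thread_ctx { struct sched_ctx *sc; };

struct thread {
    struct thread_ctx *thread_ctx;
    int queued; /* stand-in for "is on the ready queue" */
};

/* Assumed signature: reset the thread's time slice. */
void refill_budget(struct thread *target, unsigned int budget)
{
    target->thread_ctx->sc->budget = budget;
}

/* Stub standing in for the real enqueue operation. */
int enqueue_stub(struct thread *t)
{
    t->queued = 1;
    return 0;
}

/* Returns 1 if old was refilled and re-queued, 0 if it keeps running. */
int requeue_current_sketch(struct thread *old)
{
    if (old == NULL)
        return 0;
    if (old->thread_ctx->sc->budget != 0)
        return 0; /* time slice not used up yet */
    refill_budget(old, DEFAULT_BUDGET);
    enqueue_stub(old);
    return 1;
}
```

Refilling before enqueueing matters: a thread that goes back on the queue with a zero budget would be preempted again as soon as it is chosen.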

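Inter-process communication (IPC)

Most of the IPC-related code is implemented in user/chcore-libc/musl-libc/src/chcore-port/ipc.c and kernel/ipc/connection.c. Fill in the missing code in kernel/ipc/connection.c according to the comments.

Five parts to complete!

Let's take them one at a time. The first step is in register_server: set the
default_ipc_routine and register_cb_thread fields of config. So we have:

Ans1:

Next we need to complete the function create_connection. Pay attention to the parameters passed in.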

static int create_connection(struct thread *client,
                             struct thread *server,
                             int shm_cap_client, unsigned long shm_addr_client,
                             struct client_connection_result *res);

So we set the four fields as the comments describe.

Ans2:

Next, complete ipc_thread_migrate_to_server.

static void ipc_thread_migrate_to_server(struct ipc_connection *conn,
                                         unsigned long shm_addr,
                                         size_t shm_size, unsigned int cap_num);

First, following the comments, look at the function sys_ipc_register_cb_return. From it we learn the following:

handler_config->ipc_routine_entry =
    arch_get_thread_next_ip(ipc_server_handler_thread);
handler_config->ipc_routine_stack =
    arch_get_thread_stack(ipc_server_handler_thread);

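This tells us which members of config hold the stack and the next ip.

Next we need to look at kernel/user-include/uapi/ipc.h.

Now we can fill in the four arguments one by one. The first is the start address of the shared memory, so it is shm_addr. The second is the size of the shared memory, so we fill in shm_size. The third is the number of capabilities sent by the client, so we fill in cap_num. The last is the badge used in the communication. That completes this part.

Ans3: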
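A model of what this step sets up. On AArch64 the first four integer arguments of a call travel in x0 to x3, so preparing the server handler means loading those registers along with its stack pointer and entry address. The types regs_sketch and ipc_connection_sketch, the field name client_badge, and the function prepare_server_call are all hypothetical stand-ins, not the ChCore API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal register file for the server handler thread. */
struct regs_sketch {
    uint64_t x[4]; /* x0..x3: the first four arguments */
    uint64_t sp;
    uint64_t pc;
};

struct ipc_connection_sketch {
    uint64_t client_badge; /* assumed field name */
};

/* Point the handler at its entry and stack, then load its arguments. */
void prepare_server_call(struct regs_sketch *server,
                         const struct ipc_connection_sketch *conn,
                         uint64_t entry, uint64_t stack,
                         unsigned long shm_addr, size_t shm_size,
                         unsigned int cap_num)
{
    server->sp = stack;
    server->pc = entry;
    server->x[0] = shm_addr;           /* start of the shared memory */
    server->x[1] = shm_size;           /* size of the shared memory */
    server->x[2] = cap_num;            /* number of capabilities sent */
    server->x[3] = conn->client_badge; /* badge identifying the client */
}
```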

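Next, complete the sys_register_client function.

cap_t sys_register_client(cap_t server_cap, unsigned long shm_config_ptr);

The first two lines set the thread's stack and ip, the same as above. Next we need to look at the function defined in user/chcore-libc/musl-libc/src/chcore-port/ipc.c. The main point here is to use the server entry that has already been stored in the server configuration (server_config). So the value to fill in is server_config->declared_ipc_routine_entry.

Ans4:

Next we complete the sys_ipc_register_cb_return part.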

int sys_ipc_register_cb_return(cap_t server_handler_thread_cap,
                               unsigned long server_thread_exit_routine,
                               unsigned long server_shm_addr);

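One line is all it takes.

Ans5:

A temporary stop

That wraps things up for now. On success, you will see the following result:

The optimization part differs from person to person, so the author will share some ideas on it later if time permits.
~~(Yes, there is a 99% chance the author abandons this and never fills it in)~~

posted @ 2024-12-20 11:58 木木ちゃん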

