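Understanding Go in Depth (04): the scheduler, a source-level look at the GMP structs

The previous installment briefly introduced Go's scheduling model, the GPM model, and the role each part plays. This article looks at the source code of the structs behind them.

Go version: go1.13.9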

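The M struct

The M struct is an abstraction of an OS thread. Its main job is to run Gs together with a P.
It has roughly 60 fields; the main ones are annotated below.
/src/runtime/runtime2.go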

type m struct {
	// A g managed by the runtime whose stack is used to run scheduler code.
	// Switching between a user goroutine and the scheduler means switching
	// between that goroutine's stack and the g0 stack.
	g0 *g // goroutine with scheduling stack
	morebuf gobuf // gobuf arg to morestack
	divmod uint32 // div/mod denominator for arm - known to liblink

	// Fields not known to debuggers.
	procid uint64 // for debuggers, but offset not hard-coded
	// g used for signal handling
	gsignal *g // signal-handling g
	goSigStack gsignalStack // Go-allocated signal handling stack
	sigmask sigset // storage for saved signal mask
	// Thread-local storage (TLS); this is the key link between the m and its OS thread
	tls [6]uintptr // thread-local storage (for x86 extern register)
	// function to run when this m starts, if set
	mstartfn func()
	// g struct of the user goroutine currently running
	curg *g // current running goroutine
	caughtsig guintptr // goroutine running during fatal signal
	// P bound to this worker thread; nil if there is none
	p puintptr // attached p for executing go code (nil if not executing go code)
	// P held temporarily that this M may attach to next
	nextp puintptr
	// P that was attached to this M before
	oldp puintptr // the p that was attached before executing a syscall
	id int64
	mallocing int32
	throwing int32
	// non-empty when preemption is disabled on this M
	preemptoff string // if != "", keep curg running on this m
	locks int32
	dying int32
	profilehz int32
	// true while the M is spinning: it has no work and is looking for Gs to steal from other Ps
	spinning bool // m is out of work and is actively looking for work
	blocked bool // m is blocked on a note
	newSigstack bool // minit on C thread called sigaltstack
	printlock int8
	incgo bool // m is executing a cgo call
	freeWait uint32 // if == 0, safe to free g0 and delete m (atomic)
	fastrand [2]uint32
	needextram bool
	traceback uint8
	ncgocall uint64 // number of cgo calls in total
	ncgo int32 // number of cgo calls currently in progress
	cgoCallersUse uint32 // if non-zero, cgoCallers in use temporarily
	cgoCallers *cgoCallers // cgo traceback if crashing in cgo call
	// When there is no goroutine to run, the worker thread sleeps on park;
	// other threads wake it up through this note.
	park note
	// links all worker threads into one list
	alllink *m // on allm
	schedlink muintptr
	// per-thread cache for memory allocation
	mcache *mcache
	// the G locked to this M
	lockedg guintptr
	createstack [32]uintptr // stack that created this thread.
	lockedExt uint32 // tracking for external LockOSThread
	lockedInt uint32 // tracking for internal lockOSThread
	nextwaitm muintptr // next m waiting for lock
	waitunlockf func(*g, unsafe.Pointer) bool
	waitlock unsafe.Pointer
	waittraceev byte
	waittraceskip int
	startingtrace bool
	syscalltick uint32
	// OS thread handle
	thread uintptr // thread handle
	freelink *m // on sched.freem

	// these are here because they are too large to be on the stack
	// of low-level NOSPLIT functions.
	libcall libcall
	libcallpc uintptr // for cpu profiler
	libcallsp uintptr
	libcallg guintptr
	syscall libcall // stores syscall parameters on windows

	vdsoSP uintptr // SP for traceback while in VDSO call (0 if not in call)
	vdsoPC uintptr // PC for traceback while in VDSO call

	dlogPerM

	mOS
}

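A few of the more important fields:
g0: the g used to run scheduler code
gsignal: the g used for signal handling
tls: thread-local storage
p: the P attached to this M, which provides the local resources needed to run goroutines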


The P struct

An M must be bound to a P before it can run goroutines. When an M blocks, its P is handed off to another M.

/src/runtime/runtime2.go

type p struct {
	// index in allp
	id int32
	// status of the p
	status uint32 // one of pidle/prunning/...
	link puintptr
	schedtick uint32 // incremented on every scheduler call
	syscalltick uint32 // incremented on every system call
	sysmontick sysmontick // last tick observed by sysmon
	// the m bound to this p; nil if the p is idle
	m muintptr // back-link to associated m (nil if idle)
	mcache *mcache
	raceprocctx uintptr

	// pools of available defer structs of different sizes
	deferpool [5][]*_defer // pool of available defer structs of different sizes (see panic.go)
	deferpoolbuf [5][32]*_defer

	// Cache of goroutine ids, amortizes accesses to runtime·sched.goidgen.
	goidcache uint64
	goidcacheend uint64

	// local run queue, accessed without a lock
	// Queue of runnable goroutines. Accessed without lock.
	runqhead uint32 // queue head
	runqtail uint32 // queue tail
	// circular queue backed by an array
	runq [256]guintptr
	// runnext, if non-nil, is a runnable G that was ready'd by
	// the current G and should be run next instead of what's in
	// runq if there's time remaining in the running G's time
	// slice. It will inherit the time left in the current time
	// slice. If a set of goroutines is locked in a
	// communicate-and-wait pattern, this schedules that set as a
	// unit and eliminates the (potentially large) scheduling
	// latency that otherwise arises from adding the ready'd
	// goroutines to the end of the run queue.
	runnext guintptr

	// Available G's (status == Gdead)
	// free g's
	gFree struct {
		gList
		n int32
	}

	sudogcache []*sudog
	sudogbuf [128]*sudog

	tracebuf traceBufPtr

	// traceSweep indicates the sweep events should be traced.
	// This is used to defer the sweep start event until a span
	// has actually been swept.
	traceSweep bool
	// traceSwept and traceReclaimed track the number of bytes
	// swept and reclaimed by sweeping in the current sweep loop.
	traceSwept, traceReclaimed uintptr

	palloc persistentAlloc // per-P to avoid mutex

	_ uint32 // Alignment for atomic fields below

	// Per-P GC state
	gcAssistTime int64 // Nanoseconds in assistAlloc
	gcFractionalMarkTime int64 // Nanoseconds in fractional mark worker (atomic)
	gcBgMarkWorker guintptr // (atomic)
	gcMarkWorkerMode gcMarkWorkerMode

	// gcMarkWorkerStartTime is the nanotime() at which this mark
	// worker started.
	gcMarkWorkerStartTime int64

	// gcw is this P's GC work buffer cache. The work buffer is
	// filled by write barriers, drained by mutator assists, and
	// disposed on certain GC state transitions.
	gcw gcWork

	// wbBuf is this P's GC write barrier buffer.
	//
	// TODO: Consider caching this in the running G.
	wbBuf wbBuf

	runSafePointFn uint32 // if 1, run sched.safePointFn at next safe point

	pad cpu.CacheLinePad
}

The remaining fields hold GC, trace, and debug information.
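The local run queue (runqhead, runqtail, and runq [256]guintptr) is an array-backed circular queue. The following is a minimal single-threaded sketch of that idea only. The runQueue type and its put and get methods are hypothetical names made up for illustration, goroutine ids stand in for guintptr values, and the real runtime updates the head and tail with atomic operations.

```go
package main

import "fmt"

// runQueue is a simplified, single-threaded model of p.runq.
// The type and method names are hypothetical, not the runtime's.
type runQueue struct {
	head uint32     // index of the next element to take
	tail uint32     // index of the next free slot
	buf  [256]int64 // goroutine ids stand in for guintptr
}

// put appends id at the tail and reports false when the queue is full.
func (q *runQueue) put(id int64) bool {
	if q.tail-q.head >= uint32(len(q.buf)) {
		return false
	}
	q.buf[q.tail%uint32(len(q.buf))] = id
	q.tail++
	return true
}

// get removes the element at the head and reports false when the queue is empty.
func (q *runQueue) get() (int64, bool) {
	if q.head == q.tail {
		return 0, false
	}
	id := q.buf[q.head%uint32(len(q.buf))]
	q.head++
	return id, true
}

func main() {
	var q runQueue
	for id := int64(1); id <= 3; id++ {
		q.put(id)
	}
	for {
		id, ok := q.get()
		if !ok {
			break
		}
		fmt.Println("run g", id)
	}
}
```

Because head and tail only ever increase and the indices are taken modulo the array length, the array is reused as a ring without moving any elements.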


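The G struct

G is the goroutine. The g struct holds everything about a goroutine, including its stack. Its gobuf field saves CPU register values, so that when the goroutine gets its turn to run again, the runtime knows where to resume.

/src/runtime/runtime2.go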

type stack struct {
	lo uintptr // top of the stack, the low memory address
	hi uintptr // bottom of the stack, the high memory address
}

type g struct {
	// Stack parameters.
	// stack describes the actual stack memory: [stack.lo, stack.hi).
	// stackguard0 is the stack pointer compared in the Go stack growth prologue.
	// It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption.
	// stackguard1 is the stack pointer compared in the C stack growth prologue.
	// It is stack.lo+StackGuard on g0 and gsignal stacks.
	// It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash).
	// the stack used by this goroutine
	stack stack // offset known to runtime/cgo
	// The next two fields are used for stack overflow checks, which implement
	// automatic stack growth; preemption also uses stackguard0.
	stackguard0 uintptr // offset known to liblink
	stackguard1 uintptr // offset known to liblink

	_panic *_panic // innermost panic - offset known to liblink
	_defer *_defer // innermost defer
	// the worker thread currently running this goroutine
	m *m // current m; offset known to arm liblink
	// Used for scheduling switches: the context is saved here when the G is
	// switched out. See the gobuf struct below for what is saved.
	sched gobuf
	syscallsp uintptr // if status==Gsyscall, syscallsp = sched.sp to use during gc
	syscallpc uintptr // if status==Gsyscall, syscallpc = sched.pc to use during gc
	stktopsp uintptr // expected sp at top of stack, to check in traceback
	param unsafe.Pointer // passed parameter on wakeup
	// status: Gidle, Grunnable, Grunning, Gsyscall, Gwaiting, Gdead
	atomicstatus uint32
	stackLock uint32 // sigprof/scang lock; TODO: fold in to atomicstatus
	goid int64
	// schedlink points to the next g in the global run queue;
	// all g's in the global run queue form a linked list.
	schedlink guintptr
	waitsince int64 // approx time when the g become blocked
	waitreason waitReason // if status==Gwaiting, the reason the g is blocked
	// preemption signal; set to true when the g should be preempted
	preempt bool // preemption signal, duplicates stackguard0 = stackpreempt
	paniconfault bool // panic (instead of crash) on unexpected fault address
	preemptscan bool // preempted g does scan for gc
	gcscandone bool // g has scanned stack; protected by _Gscan bit in status
	gcscanvalid bool // false at start of gc cycle, true if G has not run since last scan; TODO: remove?
	throwsplit bool // must not split stack
	raceignore int8 // ignore race detection events
	sysblocktraced bool // StartTrace has emitted EvGoInSyscall about this goroutine
	sysexitticks int64 // cputicks when syscall has returned (for tracing)
	traceseq uint64 // trace event sequencer
	tracelastp puintptr // last P emitted an event for this goroutine
	// if LockOSThread was called, this g is bound to a particular m
	lockedm muintptr
	sig uint32
	writebuf []byte
	sigcode0 uintptr
	sigcode1 uintptr
	sigpc uintptr
	// pc of the go statement that created this goroutine
	gopc uintptr // pc of go statement that created this goroutine
	ancestors *[]ancestorInfo // ancestor information goroutine(s) that created this goroutine (only used if debug.tracebackancestors)
	startpc uintptr // pc of goroutine function
	racectx uintptr
	waiting *sudog // sudog structures this g is waiting on (that have a valid elem ptr); in lock order
	cgoCtxt []uintptr // cgo traceback context
	labels unsafe.Pointer // profiler labels
	timer *timer // cached timer for time.Sleep
	selectDone uint32 // are we participating in a select and did someone win the race?

	// Per-G GC state

	// gcAssistBytes is this G's GC assist credit in terms of
	// bytes allocated. If this is positive, then the G has credit
	// to allocate gcAssistBytes bytes without assisting. If this
	// is negative, then the G must correct this by performing
	// scan work. We track this in bytes to make it fast to update
	// and check for debt in the malloc hot path. The assist ratio
	// determines how this corresponds to scan work debt.
	gcAssistBytes int64
}

gobuf

The gobuf struct stores a goroutine's scheduling information, mainly the values of a few CPU registers.

For background on what registers are, see:
Registers 1
Registers 2

/src/runtime/runtime2.go

type gobuf struct {
	// The offsets of sp, pc, and g are known to (hard-coded in) libmach.
	//
	// ctxt is unusual with respect to GC: it may be a
	// heap-allocated funcval, so GC needs to track it, but it
	// needs to be set and cleared from assembly, where it's
	// difficult to have write barriers. However, ctxt is really a
	// saved, live register, and we only ever exchange it between
	// the real register and the gobuf. Hence, we treat it as a
	// root during stack scanning, which means assembly that saves
	// and restores it doesn't need write barriers. It's still
	// typed as a pointer so that any other writes from Go get
	// write barriers.
	sp uintptr // saved value of the CPU's rsp register
	pc uintptr // saved value of the CPU's rip register
	g guintptr // the goroutine this gobuf belongs to
	ctxt unsafe.Pointer
	// Saves the return value of a system call. If the p is taken by another
	// worker thread after the system call returns, this goroutine is put on
	// the global run queue and scheduled by another worker thread, which
	// needs to know the system call's return value.
	ret sys.Uintreg // saved system call return value
	lr uintptr // saved link register, on architectures that have one (such as ARM)
	bp uintptr // for GOEXPERIMENT=framepointer
}

The scheduler struct: sched

All goroutines are run by the scheduler, which holds the global resources.

sched

/src/runtime/runtime2.go

type schedt struct {
	// accessed atomically. keep at top to ensure alignment on 32-bit systems.
	goidgen uint64
	lastpoll uint64

	lock mutex

	// When increasing nmidle, nmidlelocked, nmsys, or nmfreed, be
	// sure to call checkdead().
	// linked list of idle worker threads
	midle muintptr // idle m's waiting for work
	// number of idle worker threads
	nmidle int32 // number of idle m's waiting for work
	// number of idle m's that are locked
	nmidlelocked int32 // number of locked m's waiting for work
	// number of m's created so far, which is also the next m id
	mnext int64 // number of m's that have been created and next M ID
	// maximum number of m threads allowed
	maxmcount int32 // maximum number of m's allowed (or die)
	nmsys int32 // number of system m's not counted for deadlock
	// cumulative number of m's that have been freed
	nmfreed int64 // cumulative number of freed m's
	// number of system goroutines, updated atomically
	ngsys uint32 // number of system goroutines; updated atomically
	// linked list of idle p structs
	pidle puintptr // idle p's
	// number of idle p structs
	npidle uint32
	nmspinning uint32 // See "Worker thread parking/unparking" comment in proc.go.

	// Global runnable queue.
	// the global run queue of G's
	runq gQueue // gQueue is defined in proc.go
	// number of elements in runq
	runqsize int32

	// disable controls selective disabling of the scheduler.
	//
	// Use schedEnableUser to control this.
	//
	// disable is protected by sched.lock.
	disable struct {
		// user disables scheduling of user goroutines.
		user bool
		runnable gQueue // pending runnable Gs
		n int32 // length of runnable
	}

	// Global cache of dead G's.
	gFree struct {
		lock mutex
		stack gList // Gs with stacks
		noStack gList // Gs without stacks
		n int32
	}

	// Central cache of sudog structs.
	sudoglock mutex
	sudogcache *sudog

	// Central pool of available defer structs of different sizes.
	deferlock mutex
	deferpool [5]*_defer

	// freem is the list of m's waiting to be freed when their
	// m.exited is set. Linked through m.freelink.
	freem *m

	gcwaiting uint32 // gc is waiting to run
	stopwait int32
	stopnote note
	sysmonwait uint32
	sysmonnote note

	// safepointFn should be called on each P at the next GC
	// safepoint if p.runSafePointFn is set.
	safePointFn func(*p)
	safePointWait int32
	safePointNote note

	profilehz int32 // cpu profiling rate

	procresizetime int64 // nanotime() of last change to gomaxprocs
	totaltime int64 // ∫gomaxprocs dt up to procresizetime
}
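The maxmcount field is the thread limit that user code can adjust through runtime/debug.SetMaxThreads, which returns the previous setting. The sketch below reads the current limit by setting a value and restoring the old one straight away. The currentMaxThreads helper is a hypothetical name introduced for this example.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// currentMaxThreads reads the OS thread limit. SetMaxThreads has no
// read-only form, so this sets a temporary value and immediately
// restores the previous one.
func currentMaxThreads() int {
	prev := debug.SetMaxThreads(10000)
	debug.SetMaxThreads(prev)
	return prev
}

func main() {
	fmt.Println("max threads:", currentMaxThreads())
}
```

Unless the program has already changed the limit, this reports the runtime's initial setting of 10000 threads.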

gQueue

/src/runtime/proc.go

type gQueue struct {
	head guintptr // queue head
	tail guintptr // queue tail
}
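With only a head and a tail, gQueue relies on each g's schedlink field to chain the elements together. The following is a simplified sketch of that linked-list queue, assuming plain pointers in place of guintptr and a stand-in g type with just two fields. The bodies are written for illustration and are not copied from the runtime.

```go
package main

import "fmt"

// g is a stand-in for the runtime's g; only the fields needed here are modeled.
type g struct {
	goid      int64
	schedlink *g // next g in the queue (a guintptr in the runtime)
}

// gQueue has the same shape as the runtime type: a head and a tail.
type gQueue struct {
	head *g
	tail *g
}

// empty reports whether the queue holds no g.
func (q *gQueue) empty() bool { return q.head == nil }

// pushBack appends gp at the tail of the queue.
func (q *gQueue) pushBack(gp *g) {
	gp.schedlink = nil
	if q.tail != nil {
		q.tail.schedlink = gp
	} else {
		q.head = gp
	}
	q.tail = gp
}

// pop removes and returns the g at the head, or nil if the queue is empty.
func (q *gQueue) pop() *g {
	gp := q.head
	if gp != nil {
		q.head = gp.schedlink
		if q.head == nil {
			q.tail = nil
		}
		gp.schedlink = nil
	}
	return gp
}

func main() {
	var q gQueue
	for id := int64(1); id <= 3; id++ {
		q.pushBack(&g{goid: id})
	}
	for !q.empty() {
		fmt.Println("dequeued g", q.pop().goid)
	}
}
```

Storing the link inside g means the queue itself never allocates, which suits a structure used deep inside the scheduler.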

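Some important global variables

/src/runtime/proc.go

m0 m // represents the main thread
g0 g // the g0 bound to m0, that is, m0.g0 = &g0
allgs []*g // holds all g's

/src/runtime/runtime2.go

allm *m // linked list of all m's, including m0 above
allp []*p // holds all p's; len(allp) == gomaxprocs
sched schedt // the scheduler struct, which holds the scheduler's state
ncpu int32 // number of CPU cores, initialized by the runtime at program start
gomaxprocs int32 // maximum number of p's; defaults to ncpu and can be changed through GOMAXPROCS

When the program starts, all of these variables hold their zero values: pointers are nil, slices are nil slices, ints are 0, and every struct field is set to the zero value of its own type.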
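The zero-value rule is ordinary Go behavior and easy to check outside the runtime. In the short sketch below, every name (schedLike, demoPtr, demoSlice, demoCount, demoSched) is a hypothetical stand-in for the runtime globals listed above.

```go
package main

import "fmt"

// schedLike is a hypothetical stand-in for a struct such as schedt.
type schedLike struct {
	maxmcount int32
	runqsize  int32
}

// Package-level variables declared without initializers, in the same way
// as the runtime globals. All names here are made up for illustration.
var (
	demoPtr   *schedLike
	demoSlice []*schedLike
	demoCount int32
	demoSched schedLike
)

func main() {
	fmt.Println(demoPtr == nil)           // pointer starts as nil
	fmt.Println(demoSlice == nil)         // slice starts as a nil slice
	fmt.Println(demoCount)                // integer starts at 0
	fmt.Println(demoSched == schedLike{}) // every struct field is zeroed
}
```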


Scheduler initialization

Scheduler initialization is driven by one main function, schedinit(), in /src/runtime/proc.go.
A comment just above the function lists the bootstrap order:

// The bootstrap sequence is:
//
//  call osinit
//  call schedinit
//  make & queue new G
//  call runtime·mstart
//
// The new G calls runtime·main.

func schedinit() {
	// raceinit must be the first call to race detector.
	// In particular, it must be done before mallocinit below calls racemapshadow.
	_g_ := getg() // getg() is declared in src/runtime/stubs.go; the real code is generated by the compiler
	if raceenabled {
		_g_.racectx, raceprocctx0 = raceinit()
	}
	// set the maximum number of M's
	sched.maxmcount = 10000

	tracebackinit()
	moduledataverify()
	// initialize the lists used to manage stack memory
	stackinit()
	mallocinit()
	// initialize the current m
	mcommoninit(_g_.m)
	cpuinit() // must run before alginit
	alginit() // maps must not be used before this call
	modulesinit() // provides activeModules
	typelinksinit() // uses maps, activeModules
	itabsinit() // uses activeModules

	msigsave(_g_.m)
	initSigmask = _g_.m.sigmask

	goargs()
	goenvs()
	parsedebugvars()
	gcinit()

	sched.lastpoll = uint64(nanotime())
	// the number of P's defaults to the number of CPU cores
	procs := ncpu
	if n, ok := atoi32(gogetenv("GOMAXPROCS")); ok && n > 0 {
		procs = n
	}
	// Resize the set of P's. Every P is newly created here,
	// so procresize must not return any P with local work.
	if procresize(procs) != nil {
		throw("unknown runnable goroutine during bootstrap")
	}

	// For cgocheck > 1, we turn on the write barrier at all times
	// and check all pointer writes. We can't do this until after
	// procresize because the write barrier needs a P.
	if debug.cgocheck > 1 {
		writeBarrier.cgo = true
		writeBarrier.enabled = true
		for _, p := range allp {
			p.wbBuf.reset()
		}
	}

	if buildVersion == "" {
		// Condition should never trigger. This code just serves
		// to ensure runtime·buildVersion is kept in the resulting binary.
		buildVersion = "unknown"
	}
	if len(modinfo) == 1 {
		// Condition should never trigger. This code just serves
		// to ensure runtime·modinfo is kept in the resulting binary.
		modinfo = ""
	}
}

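Jumping to the definition of getg(), which is called at the top of schedinit, leads to a bare declaration, func getg() *g, with no body. What does that mean?
schedinit first calls getg() to obtain the currently running g. getg() is declared in src/runtime/stubs.go, and the actual code is generated by the compiler.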

// getg returns the pointer to the current g.
// The compiler rewrites calls to this function into instructions
// that fetch the g directly (from TLS or from the dedicated register).
func getg() *g

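As the comment says, getg returns a pointer to the currently running goroutine. It reads tls[0] from thread-local storage, which holds the address of the current goroutine. The compiler inserts code similar to the following: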

get_tls(CX)
MOVQ g(CX), BX // the BX register now holds the address of the current g struct

That explains the declaration without a body.

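Rough outline of scheduler initialization:

M initialization  -->  P initialization  -->  G initialization
mcommoninit            procresize             newproc
---------------------------------------------------------------------
allm pool              allp pool              g.sched (execution context)
                                              p.runq (run queue)

In short, M, P, and G are initialized by mcommoninit, procresize, and newproc respectively. These functions set up the M pool (allm), the P pool (allp), the G's execution context (g.sched), and the run queue (p.runq).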


 

The scheduling loop

Once all initialization is complete, the scheduler is started by calling mstart.
The same order can be seen in the assembly:

/src/runtime/asm_amd64.s  (on Linux)

TEXT runtime·rt0_go(SB),NOSPLIT,$0
	... ... ...
	MOVL 16(SP), AX // copy argc
	MOVL AX, 0(SP)
	MOVQ 24(SP), AX // copy argv
	MOVQ AX, 8(SP)
	CALL runtime·args(SB)
	CALL runtime·osinit(SB) // OS initialization
	CALL runtime·schedinit(SB) // scheduler initialization

	// create a new goroutine to start program
	MOVQ $runtime·mainPC(SB), AX // entry
	PUSHQ AX
	PUSHQ $0 // arg size
	CALL runtime·newproc(SB) // G initialization
	POPQ AX
	POPQ AX

	// start this M
	CALL runtime·mstart(SB)

	CALL runtime·abort(SB) // mstart should never return
	RET
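The goroutine created by newproc here starts at runtime·mainPC, that is, runtime.main, which later calls the program's main function. One way to see this from user code is to list the functions on the main goroutine's stack. The sketch below does so with runtime.Callers and runtime.CallersFrames. The callChain helper is a hypothetical name, and the exact frames printed may vary with the Go version and with how the program is run.

```go
package main

import (
	"fmt"
	"runtime"
)

// callChain returns the names of the functions on the current
// goroutine's stack, innermost first.
func callChain() []string {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(1, pcs) // skip the frame of runtime.Callers itself
	frames := runtime.CallersFrames(pcs[:n])

	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

func main() {
	for _, name := range callChain() {
		fmt.Println(name)
	}
}
```

When this runs as an ordinary program, the list typically ends with runtime.main followed by runtime.goexit, which matches the bootstrap sequence: the new G calls runtime·main.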
