errgroup: Propagation Control for Concurrent Task Goroutines

1. Getting to Know errgroup

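WaitGroup is mainly used to coordinate the concurrent subtasks of a task group. Before a subtask goroutine starts, the task count is incremented with Add. When a subtask goroutine finishes, it calls Done to mark its task as complete. The main goroutine calls Wait, which blocks until every task has finished, and only then runs any follow-up logic.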

 package main
 
 import (
     "net/http"
     "sync"
 )
 
 func main() {
    var wg sync.WaitGroup
    var urls = []string{
        "http://www.golang.org/",
        "http://www.baidu.com/",
        "http://www.bokeyuan12111.com/",
    }
    for _, url := range urls {
        wg.Add(1)
        go func(url string) {
            defer wg.Done()
            resp, err := http.Get(url)
            if err != nil {
                return
            }
            resp.Body.Close()
        }(url)
    }
    wg.Wait()
}

In the example above, three goroutines request the URLs concurrently. wg.Wait in the main goroutine stops blocking only after every subtask goroutine has finished its request.

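In real project code, however, subtask goroutines do not always run smoothly; they may produce errors. WaitGroup by itself gives us no way to pass an error from a subtask goroutine back to the main goroutine.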

This is where errgroup is worth considering.

 package main
 
 import (
     "fmt"
     "net/http"
 
     "golang.org/x/sync/errgroup"
 )
 
func main() {
    var urls = []string{
        "http://www.golang.org/",
        "http://www.baidu.com/",
        "http://www.bokeyuan12111.com/",
    }
    g := new(errgroup.Group)
    for _, url := range urls {
        url := url
        g.Go(func() error {
            resp, err := http.Get(url)
            if err != nil {
                fmt.Println(err)
                return err
            }
            fmt.Printf("get [%s] success: [%d] \n", url, resp.StatusCode)
            return resp.Body.Close()
        })
    }
    if err := g.Wait(); err != nil {
        fmt.Println(err)
    } else {
        fmt.Println("All success!")
    }
}

The output is as follows:

get [http://www.baidu.com/] success: [200] 
Get "http://www.bokeyuan12111.com/": dial tcp: lookup www.bokeyuan12111.com: no such host
Get "http://www.golang.org/": dial tcp 142.251.42.241:80: i/o timeout
Get "http://www.bokeyuan12111.com/": dial tcp: lookup www.bokeyuan12111.com: no such host

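As shown, the subtask goroutines fetching www.bokeyuan12111.com and www.golang.org both failed, and the main goroutine successfully captured the first error (the last line of the output, printed after g.Wait returns).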

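Besides WaitGroup's coordination ability and error propagation, errgroup has one more feature, and its most important: propagating cancellation back through a context. Let's look at how it is designed.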

2. errgroup Source Code Walkthrough

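The design of errgroup is very compact. In the version discussed here, the entire implementation is as follows (later releases add features such as SetLimit and TryGo):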

type Group struct {
     cancel func()
 
     wg sync.WaitGroup
 
     errOnce sync.Once
     err     error
 }
 
func WithContext(ctx context.Context) (*Group, context.Context) {
    ctx, cancel := context.WithCancel(ctx)
    return &Group{cancel: cancel}, ctx
}

func (g *Group) Wait() error {
    g.wg.Wait()
    if g.cancel != nil {
        g.cancel()
    }
    return g.err
}

func (g *Group) Go(f func() error) {
    g.wg.Add(1)

    go func() {
        defer g.wg.Done()

        if err := f(); err != nil {
            g.errOnce.Do(func() {
                g.err = err
                if g.cancel != nil {
                    g.cancel()
                }
            })
        }
    }()
}

As shown, errgroup is built on the Group struct. By wrapping a sync.WaitGroup it takes on WaitGroup's behavior: the Go() method starts a new subtask goroutine, and the Wait() method blocks on the sync.WaitGroup's Wait.

Group also uses sync.Once to guarantee that it keeps only the first error returned by a subtask goroutine.

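Finally, Group stores the cancel function produced by context.WithCancel, so when a subtask goroutine returns an error it can call cancel right away and propagate the Context's cancellation signal. This feature, of course, needs cooperation from user code.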

3. errgroup Context Cancellation

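The errgroup documentation (https://pkg.go.dev/golang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup#example-Group-Pipeline) contains an example, based on the pipelines article on the official Go blog (https://blog.golang.org/pipelines), that demonstrates Context cancellation within a group of task goroutines. That demo assumes a fair amount of background knowledge, so this article offers an easier example built on the same idea.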

package main

import (
    "context"
    "fmt"

    "golang.org/x/sync/errgroup"
)

func main() {

    g, ctx := errgroup.WithContext(context.Background())
    dataChan := make(chan int, 20)

    // Producer subtask goroutine
    g.Go(func() error {
        defer close(dataChan)
        for i := 1; ; i++ {
            if i == 10 {
                return fmt.Errorf("data 10 is wrong")
            }
            dataChan <- i
            fmt.Printf("sending %d\n", i)
        }
    })

    // Consumer subtask goroutines
    for i := 0; i < 3; i++ {
        g.Go(func() error {
            for {
                select {
                case <-ctx.Done():
                    return ctx.Err()
                case number, ok := <-dataChan:
                    if !ok {
                        // Channel closed and drained: stop here instead of
                        // receiving zero values from the closed channel.
                        return nil
                    }
                    fmt.Printf("receiving %d\n", number)
                }
            }
        })
    }

    // The main goroutine waits for the pipeline to finish
    err := g.Wait()
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("main goroutine done!")
}

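The example above simulates a data pipeline. The task group has four subtask goroutines: one producing data and three consuming it. When the producer hits bad data (the value 10), production and consumption both stop, the error is returned, and control goes back to the main goroutine.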

As shown, because errgroup holds the Context's cancel function, a subtask goroutine can in turn control the context shared by the whole task group.

The output of one run of the program is as follows:

sending 1
sending 2
sending 3
sending 4
sending 5
sending 6
sending 7
sending 8
sending 9
receiving 1
receiving 3
receiving 2
receiving 4
data 10 is wrong
main goroutine done!

4. Summary

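errgroup belongs to golang.org/x/sync, the Go team's supplementary library of concurrency primitives, and is less central than the primitives in the standard library. Its features are summarized below.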

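It inherits the functionality of WaitGroup.
Error propagation: it returns the first error that occurs in the task group, and only that error.
Context signal propagation: if a subtask goroutine contains a loop, it can check ctx.Done inside the loop, so that the context's cancellation signal ends the subtask early.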

Main reference: https://blog.csdn.net/slphahaha/article/details/119525401

posted @ 2022-07-20 08:34 人艰不拆_zmc
