[verbs] ibv_create_comp_channel()

Original: ibv_create_comp_channel() - RDMAmojo

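Description
ibv_create_comp_channel() creates a Completion event channel for an RDMA device context.

The Completion event channel is an abstraction introduced by libibverbs; it does not exist in the InfiniBand Architecture verbs specification or in the RDMA Protocol Verbs Specification. A Completion event channel is essentially a file descriptor that is used to deliver Work Completion notifications to a userspace process. When a Work Completion event is generated for a Completion Queue (CQ), the event is delivered through the Completion event channel attached to that CQ. Using multiple Completion event channels can help steer completion events to different threads, or give different priorities to different CQs.

One or more Completion Queues (CQs) can be associated with the same Completion event channel.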

Parameters

Name      Direction   Description
context   in          RDMA device context that was returned from ibv_open_device()

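Return Values

Value                      Description
Completion event channel   Pointer to the newly allocated Completion event channel
NULL                       On failure, errno indicates the failure reason:
                           EMFILE - Too many files are opened by this process
                           ENOMEM - Not enough resources to complete this operation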

Example

Create a Completion event channel and destroy it:

struct ibv_comp_channel *event_channel;
 
event_channel = ibv_create_comp_channel(context);
if (!event_channel) {
	fprintf(stderr, "Error, ibv_create_comp_channel() failed\n");
	return -1;
}
 
if (ibv_destroy_comp_channel(event_channel)) {
	fprintf(stderr, "Error, ibv_destroy_comp_channel() failed\n");
	return -1;
}

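FAQs

What is a Completion event channel good for, anyway?
A Completion event channel is an object that helps a userspace process handle Work Completions (WCs) in event mode rather than in polling mode.

Why would I want to read Work Completions (WCs) in event mode?

If your process does not need low latency, or needs to reduce CPU usage, event mode is the better choice. While no events arrive, the reading thread sleeps; when a new WC is generated in the CQ, the kernel wakes the thread up.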

(bandaoyu: presumably the Completion event channel is used much like a condition variable: a thread waits on the channel, and when a WC arrives the waiting thread is woken up.)

……
        channel_poll[0].fd = tx_cc->get_fd();
        channel_poll[0].events = POLLIN | POLLERR | POLLNVAL | POLLHUP;
        channel_poll[0].revents = 0;
        channel_poll[1].fd = rx_cc->get_fd();
        channel_poll[1].events = POLLIN | POLLERR | POLLNVAL | POLLHUP;
        channel_poll[1].revents = 0;
        r = 0;
        perf_logger->set(l_msgr_rdma_polling, 0);
        while (!done && r == 0) {
          r = poll(channel_poll, 2, 100); /* blocks until a WC event makes a channel fd readable (which wakes poll) or 100 ms pass */
          if (r < 0) {
            r = -errno;
            lderr(cct) << __func__ << " poll failed " << r << dendl;
            ceph_abort();
          }
        }

……

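Why does this event-polling mode have higher latency than busy polling? Most likely because entering the wait state in poll() and being woken up again both involve context switches.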
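The sleep-and-wake behaviour described in the note can be tried without RDMA hardware. The following is a minimal sketch, not the code from the snippet above. It assumes that two pipes stand in for the fd members of two Completion event channels, and wait_for_channels(), consume_one(), TX_READY and RX_READY are hypothetical names introduced only for illustration. With real channels, the step after poll() would be ibv_get_cq_event() followed by ibv_ack_cq_events() rather than read().

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define TX_READY 0x1
#define RX_READY 0x2

/*
 * Sleep until one of two descriptors needs attention.
 * Assumption: tx_fd and rx_fd are plain pipe read ends standing in for
 * the fd members of two completion event channels.
 * Returns a bitmask of TX_READY / RX_READY, 0 on timeout, -errno on error.
 */
int wait_for_channels(int tx_fd, int rx_fd, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = tx_fd, .events = POLLIN, .revents = 0 },
        { .fd = rx_fd, .events = POLLIN, .revents = 0 },
    };
    int ready = 0;
    int r;

    assert(tx_fd >= 0 && rx_fd >= 0);

    /* The caller sleeps inside poll(); no CPU is burned while waiting.
     * A signal restarts the wait with the full timeout. */
    do {
        r = poll(fds, 2, timeout_ms);
    } while (r < 0 && errno == EINTR);

    if (r < 0)
        return -errno;

    /* Any non-zero revents means readable data or an error condition. */
    if (fds[0].revents)
        ready |= TX_READY;
    if (fds[1].revents)
        ready |= RX_READY;
    return ready;
}

/*
 * Consume one pending notification byte from the stand-in pipe.
 * Returns 0 on success, -1 otherwise.
 */
int consume_one(int fd)
{
    char byte;

    return read(fd, &byte, 1) == 1 ? 0 : -1;
}
```

For example, after creating two pipes tx and rx with pipe(), wait_for_channels(tx[0], rx[0], 100) returns 0 while both pipes are empty and RX_READY once a byte has been written to rx[1]. The wake-up costs a context switch, which is the latency trade-off mentioned above.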

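Can I associate several CQs with the same Completion event channel?
Yes. Several CQs can be associated with the same Completion event channel.

Multiple Completion event channels can (also) be used to "aggregate" different CQs. This can help steer completion events to different threads, or give different priorities to different CQs.

(Presumably, different Completion event channels are bound to CQs of different priority levels, and a thread of the matching priority waits on each channel.)


How can I use multiple Completion event channels to give different priorities to different CQs?
You can associate all of the CQs that have the same priority with the same Completion event channel, and handle Work Completion events according to the priority of each CQ group.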
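One possible reading of this answer is sketched below under stated assumptions: a single thread polls one descriptor per CQ group and always serves the highest-priority group that is ready. Pipes stand in for the fd members of the channels so that the sketch runs without an RDMA device, and next_group_to_serve(), drain_one(), MAX_GROUPS, NO_GROUP_READY and POLL_FAILED are hypothetical names. Using one thread per channel, as discussed in the comments, is an equally valid design.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define MAX_GROUPS      8
#define NO_GROUP_READY (-1)
#define POLL_FAILED    (-2)

/*
 * fds[] is ordered from highest to lowest priority. Each entry stands in
 * for the fd of one completion event channel shared by a group of CQs
 * (assumption: pipe read ends are used here instead of real channels).
 * Returns the index of the highest-priority ready descriptor,
 * NO_GROUP_READY on timeout, or POLL_FAILED if poll() fails.
 */
int next_group_to_serve(const int *fds, int n, int timeout_ms)
{
    struct pollfd pfds[MAX_GROUPS];
    int i, r;

    assert(n > 0 && n <= MAX_GROUPS);

    for (i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    /* Sleep until any group has a pending event; retry if interrupted. */
    do {
        r = poll(pfds, (nfds_t)n, timeout_ms);
    } while (r < 0 && errno == EINTR);

    if (r < 0)
        return POLL_FAILED;

    /* Scan in priority order so a busy low-priority group cannot
     * delay a ready high-priority group. */
    for (i = 0; i < n; i++)
        if (pfds[i].revents & POLLIN)
            return i;

    return NO_GROUP_READY;
}

/*
 * Consume one pending event byte from a stand-in pipe. With a real
 * channel this would be ibv_get_cq_event() plus ibv_ack_cq_events().
 */
int drain_one(int fd)
{
    char byte;

    return read(fd, &byte, 1) == 1 ? 0 : -1;
}
```

When both a high-priority and a low-priority descriptor are readable, the function returns index 0 first; the low-priority group is reported only after the high-priority event has been drained.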

Comments


  1. Jingcha says:September 18, 2013

    Can you please explain what you mean by "You can associate all of the CQs that have the same priority with the same Completion event channel, and handle Work Completion events according to the priority of the CQs group."

    Thanks,


    • Dotan Barak says:September 19, 2013

      Hi.

      I'll try to explain what I meant here:
      Let's assume that you have several CQs: CQx1..CQxN and CQy (CQy has higher priority),
      and your application needs to consume as less as CPU as possible
      (i.e. work with events and not with polling when reading Work Completions).

      Instead of using one Completion Event channel, you can have two of them,
      one for the CQx QPs and one for the CQy, each of them will be handled in a different thread.

      This way, as soon as CQy has an event, it will be handled
      (i.e. it gets higher priority than the other CQs).

      If your RDMA device supports multiple Completion vectors,
      CQy may get a different Completion vector than the other,
      and by doing this, get a better performance.

      Instead of using one CQy, you can have CQy1..CQyM but the idea is the same.

      I hope that I was clear with this explanation.

      Thanks
      Dotan


  2. Omar says:April 5, 2014

    Dear Dotan

    I want to associate different completion channel to each CQ i create. I have different threads waiting for notification from the CQ assigned to them. I think if i associate the same completion channel to all my CQ's and messages arrive at all of them simultaneously, i get a big performance hit. Should this happen? Can you share some code snippet where you are creating a new completion channel for each CQ. The code i use is
    "cb->channel = ibv_create_comp_channel(cm_id->verbs);
    cb->cq[SEND_CQ_INDEX] = ibv_create_cq(cm_id->verbs, cqe, cb, cb->channel, 0);
    cb->cq[RECV_CQ_INDEX] = ibv_create_cq(cm_id->verbs, cqe, cb, cb->channel, 0);"

    Since the context is the same, for each queue I suppose even if i call ibv_create_comp_channel() for each cq, they are shared. Also i am using the same channel for send/receive CQ. Will this cause a performance drop (due to some internal Locking mechanism).
    I hope i have made my point clear. To summarize: I want all my CQ's to receive notification from the completion channel independent of each other. I have a large number of CQ's.

    Warm regards

    Omar Khan


    • Dotan Barak says:April 6, 2014

      Hi Omar.

      I must admit that I never did it and never checked the performance difference.
      But you need to create Completion channels and create every group of CQs with different
      Completion channels.

      IMHO, using multiple completion vectors may provide even better performance improvement than using the same completion vector in all the CQs.

      Thanks
      Dotan


  3. Omar Khan says:April 6, 2014

    Dear dotan

    How do you know that your device supports multiple completion vectors? and what does it mean multiple completion vectors? if my device does not support multiple completion vectors, how can i set up a communication channel between multiple processes and have them communicate independently. can i have multiple threads using independent completion channels to notify completions from different processes?

    Warm regards
    Omar


    • Dotan Barak says:April 6, 2014

      Hi Omar.

      struct ibv_context contains an attribute called: num_comp_vectors.
      This value specify the number of completion vectors which the RDMA device supports.

      I'm not consider my self an expert in this, but I can try to explain *my* rational why using multiple completion vectors improves the performance:
      Using several completion vectors means that the interrupts will be spread across several vectors (which means that several cores in your system will handle them) and not only one vector.
      I'm sure that you can find information in the internet about this..

      If your device supports only one completion vector, I would try to set the affinity of the processes, that each one will use different core.

      I hope that this answer helped you.

      Thanks
      Dotan


  4. Tingyu says:July 7, 2015

    Hi Dotan,

    Can I use the same event channel for both accepting connection and creating connection? In this way, a program can connect to itself, and the server-side event such as RDMA_CM_EVENT_CONNECT_REQUEST and client-side event such as RDMA_CM_EVENT_ADDR_RESOLVED can be polled in the same loop. It is possible?

    Thanks,
    Tingyu


    • Dotan Barak says:July 13, 2015

      Hi Tingyu.

      The event channel is a mechanism for handle the events;
      it doesn't have any notion of the side it serves.

      For my understanding, the answer for you question is "yes".
      It should be possible to use the same event channel for both accepting and initiating connections.

      Thanks
      Dotan


  5. Lingyan says:January 25, 2017

    Hi Dotan,
    Could multiple threads share one completion event channel associated with multiple CQs?
    Thanks,
    Lingyan


    • Dotan Barak says:February 13, 2017

      Yes.

      Multiple threads can share one completion event channel associated with multiple CQs,
      it will be hard to predict though which thread will get the event.

      Thanks
      Dotan


  6. HuaiEn says:August 6, 2019

    Hi Dotan,
    I saw your reply above
    ibv_create_comp_channel() - RDMAmojo RDMAmojo
    in your reply, you said "But you need to create Completion channels and create every group of CQs with different Completion channels."
    What does every group of CQs mean?
    As my realize, a completion channel is better to match to each CQ, or it is hard to know which CQ will receive event.
    Do I miss something?
    thanks

    BR,
    HuaiEn


    • Dotan Barak says:August 16, 2019

      Hi.

      "Every group of CQs" means that if for example you create 100 CQs and you want to use 2 completion channels,
      you can attach the first 50 CQs to the first completion channel and the last 50 CQs to the second completion channel.

      When you get a CQ event, you get the CQ handle - so it is easy to know which CQ received the event.
      However, depend on your flow, maybe you won't be able to predict ahead which CQ will get the event.

      Is it more clear now?

      Thanks
      Dotan


  7. HuaiEn says:August 18, 2019

    Hi Dotan,
    Thanks for clearly reply.
    There is another question.
    When do we need multiple cq?
    Thanks a lot


    • Dotan Barak says:August 21, 2019

      Hi.

      There may be several reasons for using multiple CQs, here are some reasons that I just thought about:
      * Using *many* QPs and the number of generated completions may cause a CQ overrun to one CQ
      * If you want to provide different QoS (application-wise) to different QPs
      * If you want to provide different functionality to different QPs
      * Separating Send and Receive Queues
      * more?

      It is up to the application to decide this, it isn't a technology decision ..

      Thanks
      Dotan


 

posted on 2022-10-04 01:22 by bdy
