[HBase] Server-Side RPC Mechanism and Code Walkthrough





1. Overview




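  1. Requests sent through the client API are picked up by the Listener thread of the RpcServer.

  2. The Listener assigns a Reader to the new Channel; that Reader serves the Channel's subsequent requests.

  3. The Reader thread wraps each request in a CallRunner instance and hands it to the RpcScheduler, which dispatches it to one of several Executors according to the request's attributes.

  4. The Executor stores the CallRunner instance in one of its queues.

  5. Each Executor queue is bound to a fixed number of Handler threads that consume it. Consumption is simple: take a CallRunner off the queue and invoke its run() method.

  6. run() assembles the response and hands it to the Responder thread.

  7. The Responder thread continuously writes the results for the different Channels back to the clients.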
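
The seven steps above can be sketched with plain JDK classes. This is a toy model rather than HBase code: `MiniRpcPipeline`, its methods, and the string-based request are hypothetical names chosen for illustration. Only the overall shape (a reader wraps the request into a runnable, a queue holds it, a handler thread takes and runs it, and a response queue collects the result) follows the description.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Toy model of the request path; all names are hypothetical, not HBase APIs. */
final class MiniRpcPipeline {
  // Steps 4-5: the call queue owned by an executor and drained by a handler thread.
  private final BlockingQueue<Runnable> callQueue = new LinkedBlockingQueue<>();
  // Steps 6-7: finished responses waiting to be written back to the client.
  private final BlockingQueue<String> responseQueue = new LinkedBlockingQueue<>();
  private final Thread handler;

  MiniRpcPipeline() {
    handler = new Thread(() -> {
      try {
        while (true) {
          callQueue.take().run(); // step 5: take a call runner and execute run()
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // shutdown requested
      }
    }, "handler-0");
    handler.setDaemon(true);
    handler.start();
  }

  /** Steps 2-3: wrap the raw request in a runnable and enqueue it. */
  void read(String request) {
    // Step 6 happens inside the runnable: the response is handed to the response queue.
    callQueue.add(() -> responseQueue.add("echo:" + request));
  }

  /** Step 7: return the next finished response, or null if none arrives in time. */
  String respond(long timeoutMs) {
    try {
      return responseQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }

  void stop() {
    handler.interrupt();
  }
}
```

A caller would invoke `read("get")` and then `respond(...)` to observe the round trip; the real server replaces each of these pieces with selector-driven threads.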


2. Code Walkthrough



2.1 Component Initialization


  • The RpcServer is initialized when the master or a regionserver starts. The key code is:

public HRegionServer(Configuration conf, CoordinatedStateManager csm)
     throws IOException, InterruptedException {
   this.fsOk = true;
   this.conf = conf;
  • rpcServices is declared as RSRpcServices, the class that holds the RpcServer instance. The start() method launches the three main producer and consumer threads:

      /** Starts the service.  Must be called before any calls will be handled. */
    public synchronized void start() {
      if (started) return;
      started = true;


2.2 Receiving and Wrapping Client API Requests



  • How the Listener accepts API requests:

    void doAccept(SelectionKey key) throws IOException, OutOfMemoryError {
     Connection c;
     ServerSocketChannel server = (ServerSocketChannel) key.channel();

     SocketChannel channel;
     while ((channel = server.accept()) != null) {
       try {
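       // When an API request arrives, a Channel is opened and the Listener registers it
       // with a Reader. The number of Reader instances is limited; they are assigned in
       // order and reused, so one Reader may serve several Channels.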
       Reader reader = getReader();
       try {
         SelectionKey readKey = reader.registerChannel(channel);
         // The Channel is also saved so that results can be returned on it later.
         c = getConnection(channel, System.currentTimeMillis());
         synchronized (connectionList) {
           connectionList.add(numConnections, c);


this.readThreads = conf.getInt("hbase.ipc.server.read.threadpool.size", 10);
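
The "assigned in order and reused" behaviour of readers amounts to round-robin selection over a fixed-size pool. The sketch below illustrates that idea only; `ReaderPool` and `nextReader` are hypothetical names, and readers are represented by strings instead of threads.

```java
/** Hypothetical sketch of sequential (round-robin) reader assignment; not the HBase class. */
final class ReaderPool {
  private final String[] readers;
  private int current = -1;

  ReaderPool(int readThreads) {
    readers = new String[readThreads];
    for (int i = 0; i < readThreads; i++) {
      readers[i] = "reader-" + i;
    }
  }

  /** Returns the next reader in order, wrapping around so that readers are reused. */
  synchronized String nextReader() {
    current = (current + 1) % readers.length;
    return readers[current];
  }
}
```

With a pool of three readers, the fourth channel lands on the first reader again, which is why one reader can end up serving several channels.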



  • How the Reader reads and wraps requests:


        void doRead(SelectionKey key) throws InterruptedException {
         try {
           // Invoke the connection's read-and-process method.
           count = c.readAndProcess();
        public int readAndProcess() throws IOException, InterruptedException {
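         // The connectionPreambleRead flag tells whether this is a new connection. If it is,
         // the preamble must be read first to determine the connection's properties, for
         // example which security mode is in use.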
         if (!connectionPreambleRead) {
           count = readPreamble();
           if (!connectionPreambleRead) {
             return count;

         count = channelRead(channel, data);
         if (count >= 0 && data.remaining() == 0) { // count==0 if dataLength == 0
           // Process the request. The handling differs depending on which of the two modes
           // was detected when the connection preamble was read.

         return count;
        private void process() throws IOException, InterruptedException {
           if (useSasl) {
              // Kerberos secure mode
          } else {
              // AuthMethod.SIMPLE mode


        private void processOneRpc(byte[] buf) throws IOException, InterruptedException {
         if (connectionHeaderRead) {
           // Process the actual request.
        } else {
           // The connection header has not been read yet: read it to determine the
           // requested service, method, and so on.
           this.connectionHeaderRead = true;
           if (!authorizeConnection()) {
             throw new AccessDeniedException("Connection from " + this + " for service " +
               connectionHeader.getServiceName() + " is unauthorized for user: " + user);
      protected void processRequest(byte[] buf) throws IOException, InterruptedException {
         long totalRequestSize = buf.length;
         // Check whether the requests accepted by the RpcServer would exceed maxQueueSize.
         // Note that this value is an RpcServer-level variable.
         if ((totalRequestSize + callQueueSize.get()) > maxQueueSize) {
           final Call callTooBig =
             new Call(id, this.service, null, null, null, null, this,
               responder, totalRequestSize, null);
           ByteArrayOutputStream responseBuffer = new ByteArrayOutputStream();
           setupResponse(responseBuffer, callTooBig, new CallQueueTooBigException(),
             "Call queue is full on " + getListenerAddress() +
             ", is hbase.ipc.server.max.callqueue.size too small?");
         Call call = new Call(id, this.service, md, header, param, cellScanner, this, responder,
         // Request-side processing ends here: the call is wrapped in a CallRunner and
         // sent to the queue of the appropriate Executor.
         scheduler.dispatch(new CallRunner(RpcServer.this, call, userProvider));

    Note that maxQueueSize is an RpcServer-level variable with a default of 1 GB. Once this threshold is exceeded, the "Call queue is full" error appears.


    this.maxQueueSize = this.conf.getInt("hbase.ipc.server.max.callqueue.size", DEFAULT_MAX_CALLQUEUE_SIZE);
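
The size check in processRequest() is a form of byte-based admission control. The following is a minimal sketch of that idea, assuming a single shared counter; `CallQueueGate`, `tryAdmit`, and `release` are hypothetical names and do not appear in HBase.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a size-based admission check; not the HBase implementation. */
final class CallQueueGate {
  private final long maxQueueSize;
  private final AtomicLong callQueueSize = new AtomicLong();

  CallQueueGate(long maxQueueSize) {
    this.maxQueueSize = maxQueueSize;
  }

  /** Admits a request unless it would push the queued bytes past the limit. */
  boolean tryAdmit(long requestSize) {
    if (requestSize + callQueueSize.get() > maxQueueSize) {
      return false; // the caller would answer with a "call queue is full" style error
    }
    callQueueSize.addAndGet(requestSize);
    return true;
  }

  /** Gives the bytes back once the call has been processed. */
  void release(long requestSize) {
    callQueueSize.addAndGet(-requestSize);
  }

  long queuedBytes() {
    return callQueueSize.get();
  }
}
```

In this sketch the check and the increment are separate steps, so under concurrency the limit is approximate rather than strict.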


2.3 Request Dispatch and Scheduling



  public RSRpcServices(HRegionServer rs) throws IOException {
   RpcSchedulerFactory rpcSchedulerFactory;
   try {
     Class<?> rpcSchedulerFactoryClass = rs.conf.getClass(
     rpcSchedulerFactory = ((RpcSchedulerFactory) rpcSchedulerFactoryClass.newInstance());


  • Request dispatch



     public void dispatch(CallRunner callTask) throws InterruptedException {
       RpcServer.Call call = callTask.getCall();
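        // Get the priority. Typically a predefined set of operations, identified from
        // the request content, is treated as high priority.
       int level = priority.getPriority(call.getHeader(), call.param);
       if (priorityExecutor != null && level > highPriorityLevel) {
         // High-priority calls go to the high-priority executor.
      } else if (replicationExecutor != null && level == HConstants.REPLICATION_QOS) {
         // Replication-level calls go to the replication executor.
      } else {
         // Everything else goes to the general executor; most requests land here.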
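
The three-way branch in dispatch() can be shown in isolation. In the sketch below the class name, the enum, and both numeric constants are assumptions made for illustration; they are not taken from HConstants.

```java
/** Hypothetical sketch of priority-based routing; the constants are assumed values. */
final class DispatchSketch {
  static final int HIGH_PRIORITY_LEVEL = 100; // assumed threshold, for illustration only
  static final int REPLICATION_QOS = 5;       // assumed replication level, for illustration only

  enum Target { PRIORITY, REPLICATION, GENERAL }

  /** Mirrors the branch order: high priority first, then replication, otherwise general. */
  static Target route(int level, boolean hasPriorityExecutor, boolean hasReplicationExecutor) {
    if (hasPriorityExecutor && level > HIGH_PRIORITY_LEVEL) {
      return Target.PRIORITY;
    } else if (hasReplicationExecutor && level == REPLICATION_QOS) {
      return Target.REPLICATION;
    }
    return Target.GENERAL;
  }
}
```

Note that a call falls through to the general executor whenever the specialised executor for its level was not configured.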
  • Executors: queue initialization


    1. High-priority request executor

    2. General request executor

    3. Replication request executor

        private final RpcExecutor callExecutor;
       private final RpcExecutor priorityExecutor;
       private final RpcExecutor replicationExecutor;



    float callQueuesHandlersFactor = conf.getFloat(CALL_QUEUE_HANDLER_FACTOR_CONF_KEY, 0);
    int numCallQueues = Math.max(1, (int)Math.round(handlerCount * callQueuesHandlersFactor));

    LOG.info("Using " + callQueueType + " as user call queue, count=" + numCallQueues);

    if (numCallQueues > 1 && callqReadShare > 0) {
    // multiple read/write queues
    if (callQueueType.equals(CALL_QUEUE_TYPE_DEADLINE_CONF_VALUE)) {
      CallPriorityComparator callPriority = new CallPriorityComparator(conf, this.priority);
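        // Instantiate the read/write executor. The constructor takes the read/write ratio,
        // and reads are further split into a ratio of ordinary reads to scans. It then calls
        // other overloaded constructors, which compute the number of queues and the share of
        // handlers for each kind of request.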
      callExecutor = new RWQueueRpcExecutor("RW.default", handlerCount, numCallQueues,
          callqReadShare, callqScanShare, maxQueueLength, conf, abortable,
          BoundedPriorityBlockingQueue.class, callPriority);
    } else {
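
The call-queue count formula in the excerpt is easy to check by hand. The sketch below restates it as a standalone method; the class name `CallQueueCount` is hypothetical.

```java
/** Sketch of the call-queue count formula quoted above; the class name is hypothetical. */
final class CallQueueCount {
  /** numCallQueues = max(1, round(handlerCount * callQueuesHandlersFactor)). */
  static int numCallQueues(int handlerCount, float callQueuesHandlersFactor) {
    return Math.max(1, (int) Math.round(handlerCount * callQueuesHandlersFactor));
  }
}
```

With the default factor of 0 used in the excerpt, the result is always 1, so all handlers share a single queue and the multi-queue read/write branch is not taken.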


    public RWQueueRpcExecutor(final String name, int writeHandlers, int readHandlers,
       int numWriteQueues, int numReadQueues, float scanShare,
       final Class<? extends BlockingQueue> writeQueueClass, Object[] writeQueueInitArgs,
       final Class<? extends BlockingQueue> readQueueClass, Object[] readQueueInitArgs) {
     super(name, Math.max(writeHandlers, numWriteQueues) + Math.max(readHandlers, numReadQueues));
     int numScanQueues = Math.max(0, (int)Math.floor(numReadQueues * scanShare));
     int scanHandlers = Math.max(0, (int)Math.floor(readHandlers * scanShare));
     if ((numReadQueues - numScanQueues) > 0) {
       numReadQueues -= numScanQueues;
       readHandlers -= scanHandlers;
    } else {
       numScanQueues = 0;
       scanHandlers = 0;
     // Determine the main queue and handler parameters.
     this.writeHandlersCount = Math.max(writeHandlers, numWriteQueues);
     this.readHandlersCount = Math.max(readHandlers, numReadQueues);
     this.scanHandlersCount = Math.max(scanHandlers, numScanQueues);
     this.numWriteQueues = numWriteQueues;
     this.numReadQueues = numReadQueues;
     this.numScanQueues = numScanQueues;
     this.writeBalancer = getBalancer(numWriteQueues);
     this.readBalancer = getBalancer(numReadQueues);
     this.scanBalancer = getBalancer(numScanQueues);
     queues = new ArrayList<BlockingQueue<CallRunner>>(writeHandlersCount + readHandlersCount);
     LOG.debug(name + " writeQueues=" + numWriteQueues + " writeHandlers=" + writeHandlersCount +
               " readQueues=" + numReadQueues + " readHandlers=" + readHandlersCount +
              ((numScanQueues == 0) ? "" : " scanQueues=" + numScanQueues +
                 " scanHandlers=" + scanHandlersCount));
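     // Initialize the queue list. Note that queues is an ordered list: positions do not change
     // after initialization, and later lookups for a given request go through the balancer
     // obtained from getBalancer().
     for (int i = 0; i < numWriteQueues; ++i) {
       queues.add((BlockingQueue<CallRunner>)
         ReflectionUtils.newInstance(writeQueueClass, writeQueueInitArgs));
     }
     for (int i = 0; i < (numReadQueues + numScanQueues); ++i) {
       queues.add((BlockingQueue<CallRunner>)
         ReflectionUtils.newInstance(readQueueClass, readQueueInitArgs));
     }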
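
The scan split in the constructor is pure arithmetic, so it can be reproduced outside HBase. The sketch below copies the calculation from the excerpt into a hypothetical `RwSplit` class; only the class and field names are invented.

```java
/** Sketch of the queue/handler split arithmetic from the excerpt; the class name is hypothetical. */
final class RwSplit {
  final int writeHandlers;
  final int readHandlers;
  final int scanHandlers;
  final int writeQueues;
  final int readQueues;
  final int scanQueues;

  RwSplit(int writeHandlers, int readHandlers, int numWriteQueues, int numReadQueues,
      float scanShare) {
    int numScanQueues = Math.max(0, (int) Math.floor(numReadQueues * scanShare));
    int scanH = Math.max(0, (int) Math.floor(readHandlers * scanShare));
    if ((numReadQueues - numScanQueues) > 0) {
      // Carve the scan queues and handlers out of the read share.
      numReadQueues -= numScanQueues;
      readHandlers -= scanH;
    } else {
      // Scans would take every read queue, so no separate scan queues are created.
      numScanQueues = 0;
      scanH = 0;
    }
    this.writeHandlers = Math.max(writeHandlers, numWriteQueues);
    this.readHandlers = Math.max(readHandlers, numReadQueues);
    this.scanHandlers = Math.max(scanH, numScanQueues);
    this.writeQueues = numWriteQueues;
    this.readQueues = numReadQueues;
    this.scanQueues = numScanQueues;
  }
}
```

For example, 10 read handlers over 4 read queues with a scan share of 0.5 become 2 read queues with 5 handlers and 2 scan queues with 5 handlers, while a scan share of 1.0 falls back to no dedicated scan queues.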
  • Executors: handler binding



     protected void startHandlers(final int port) {
       startHandlers(".write", writeHandlersCount, queues, 0, numWriteQueues, port);
       startHandlers(".read", readHandlersCount, queues, numWriteQueues, numReadQueues, port);
       startHandlers(".scan", scanHandlersCount, queues,
                     numWriteQueues + numReadQueues, numScanQueues, port);


      protected void startHandlers(final String nameSuffix, final int numHandlers,
         final List<BlockingQueue<CallRunner>> callQueues,
         final int qindex, final int qsize, final int port) {
       final String threadPrefix = name + Strings.nullToEmpty(nameSuffix);
       for (int i = 0; i < numHandlers; i++) {
         final int index = qindex + (i % qsize);
         Thread t = new Thread(new Runnable() {
           public void run() {
             // Only process requests from the assigned queue.
         t.setName(threadPrefix + "RpcServer.handler=" + handlers.size() +
           ",queue=" + index + ",port=" + port);
         LOG.debug(threadPrefix + " Start Handler index=" + handlers.size() + " queue=" + index);
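
The binding rule `index = qindex + (i % qsize)` spreads handlers evenly over a contiguous slice of the queue list. The sketch below evaluates that rule on its own; `HandlerBinding` and `bind` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of how handler i is bound to a queue index; names are hypothetical. */
final class HandlerBinding {
  /** Returns the queue index for each handler, using index = qindex + (i % qsize). */
  static List<Integer> bind(int numHandlers, int qindex, int qsize) {
    List<Integer> indexes = new ArrayList<>();
    for (int i = 0; i < numHandlers; i++) {
      indexes.add(qindex + (i % qsize));
    }
    return indexes;
  }
}
```

Five read handlers over two read queues that start at offset 2 are bound to queues 2, 3, 2, 3, 2. When there are no scan queues the scan handler count is also 0, so the loop body never runs and the modulo by zero is never evaluated.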
  • Executors: handler consumption


      protected void consumerLoop(final BlockingQueue<CallRunner> myQueue) {
         while (running) {
           try {
             // Take the next request from the queue.
             CallRunner task = myQueue.take();
             try {
               // Invoke the CallRunner's run() method.


      public void run() {
           // Perform the actual operation.
           // make the call
           resultPair = this.rpcServer.call(call.service, call.md, call.param, call.cellScanner,
         // Set the response for undelayed calls and delayed calls with
         // undelayed responses.
         // Store the response in the call instance.
         if (!call.isDelayed() || !call.isReturnValueDelayed()) {
           Message param = resultPair != null ? resultPair.getFirst() : null;
           CellScanner cells = resultPair != null ? resultPair.getSecond() : null;
           call.setResponse(param, cells, errorThrowable, error);
         // The call holds a handle to its connection; the response is placed on that
         // connection's response queue.



  // If there is already a write in progress, we don't wait. This allows to free the handlers
 // immediately for other tasks.
 if (call.connection.responseQueue.isEmpty() && call.connection.responseWriteLock.tryLock()) {
   try {
     if (call.connection.responseQueue.isEmpty()) {
       // If we're alone, we can try to do a direct call to the socket. It's
       // an optimisation to save on context switches and data transfer between cores..
       if (processResponse(call)) {
         return; // we're done.
       // Too big to fit, putting ahead.
       added = true; // We will register to the selector later, outside of the lock.
  } finally {

 if (!added) {

 // set the serve time when the response has to be sent later
 call.timestamp = System.currentTimeMillis();
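
The excerpt above follows a "write directly if the connection is idle, otherwise queue for the Responder" pattern. The sketch below imitates that control flow with JDK classes; `ResponseFastPath`, `submit`, and the `socketWritable` flag standing in for a complete socket write are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of the direct-write fast path; not HBase code. */
final class ResponseFastPath {
  private final ConcurrentLinkedDeque<String> responseQueue = new ConcurrentLinkedDeque<>();
  private final ReentrantLock responseWriteLock = new ReentrantLock();
  private final List<String> written = new ArrayList<>(); // guarded by responseWriteLock

  /** Returns true if the response was written directly, false if it was queued. */
  boolean submit(String response, boolean socketWritable) {
    boolean added = false;
    // Do not wait if a write is already in progress; the handler stays free for other work.
    if (responseQueue.isEmpty() && responseWriteLock.tryLock()) {
      try {
        if (responseQueue.isEmpty()) {
          if (socketWritable) {
            written.add(response); // stands in for a complete direct socket write
            return true;
          }
          responseQueue.addFirst(response); // could not finish: put it at the head
          added = true;
        }
      } finally {
        responseWriteLock.unlock();
      }
    }
    if (!added) {
      responseQueue.addLast(response); // someone else is writing: queue behind them
    }
    return false;
  }

  /** Snapshot of the responses still waiting for the responder. */
  List<String> queued() {
    return new ArrayList<>(responseQueue);
  }
}
```

The design choice is that a handler never blocks on a busy connection: it either writes immediately or leaves the response for the Responder thread.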


2.4 Returning the Response



 private void doAsyncWrite(SelectionKey key) throws IOException {
     Connection connection = (Connection) key.attachment();
     if (connection == null) {
       throw new IOException("doAsyncWrite: no connection");
     if (key.channel() != connection.channel) {
       throw new IOException("doAsyncWrite: bad channel");

     if (processAllResponses(connection)) {
       try {
         // We wrote everything, so we don't need to be told when the socket is ready for
         // write anymore.
      } catch (CancelledKeyException e) {
         /* The Listener/reader might have closed the socket.
          * We don't explicitly cancel the key, so not sure if this will
          * ever fire.
          * This warning could be removed.
          */
         LOG.warn("Exception while changing ops : " + e);



3. Conclusion









