I. Background

After migrating HBase from 0.94.6 to 0.98.3 while keeping the old HBase configuration, we found that no matter how we tuned hbase.regionserver.handler.count, the number of RPC Handler Tasks never changed.

Only after reading the source code did we realize that the HBase RPC implementation has been rewritten and the meaning of the parameter has changed: the number of RPC Handlers is now controlled by ipc.server.read.threadpool.size.

hbase.regionserver.handler.count is actually the number of server-side threads that process requests; the RPC Handlers and the Request Handlers form a producer-consumer relationship.
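To see which values a given deployment actually ends up with, here is a minimal Java sketch (my own illustration, not HBase code) that reads the two parameters with the same keys and defaults that appear in the source walked through below:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CheckRpcThreadConfig {
  public static void main(String[] args) {
    // HBaseConfiguration.create() loads hbase-default.xml and hbase-site.xml from the classpath
    Configuration conf = HBaseConfiguration.create();
    // RPC reader threads: the value that actually changes the number of RPC Handler Tasks
    int readers = conf.getInt("ipc.server.read.threadpool.size", 10);
    // scheduler handler threads: what hbase.regionserver.handler.count really controls
    int handlers = conf.getInt("hbase.regionserver.handler.count", 30);
    System.out.println("readers=" + readers + ", handlers=" + handlers);
  }
}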

Checking the HBase wiki, I found that the parameter description has not been updated:

hbase.regionserver.handler.count

Count of RPC Listener instances spun up on RegionServers. Same property is used by the Master for count of master handlers.

 

Default. 30

ipc.server.read.threadpool.size is not documented at all.

II. Source Code Analysis

HRegionServer

HRegionServer makes a set of HRegions available to clients. It checks in with the HMaster. There are many HRegionServers in a single HBase deployment.

HRegionServer holds an RpcServer instance, which is used to respond to the various requests from clients.

An RPC server that hosts protobuf described Services. An RpcServer instance has a Listener that hosts the socket. Listener has fixed number of Readers in an ExecutorPool, 10 by default. The Listener does an accept and then round robin a Reader is chosen to do the read. The reader is registered on Selector. Read does total read off the channel and the parse from which it makes a Call. The call is wrapped in a CallRunner and passed to the scheduler to be run. Reader goes back to see if more to be done and loops till done.

Scheduler can be variously implemented but default simple scheduler has handlers to which it has given the queues into which calls (i.e. CallRunner instances) are inserted. Handlers run taking from the queue. They run the CallRunner#run method on each item gotten from queue and keep taking while the server is up. CallRunner#run executes the call. When done, asks the included Call to put itself on new queue for Responder to pull from and return result to client.

An RpcServer instance holds a Listener, and the Listener owns the server socket. Each Listener has a fixed number of Readers, 10 by default. The Listener accepts connections in a loop and picks a Reader to read the request from the SocketChannel; each accepted channel is registered with the chosen Reader's selector. The Reader parses the request it reads into a Call, wraps it in a CallRunner, and adds it to the queue maintained by the scheduler, where it waits to be run. The Handlers take CallRunners off the scheduler's queue. In short, the scheduler acts as the producer-consumer queue.

1. RpcServer constructor

/**
   * Constructs a server listening on the named port and address.
   * @param serverInstance hosting instance of {@link Server}. We will do authentications if an
   * instance else pass null for no authentication check.
   * @param name Used keying this rpc servers' metrics and for naming the Listener thread.
   * @param services A list of services.
   * @param isa Where to listen
   * @param conf
   * @throws IOException
   */
  public RpcServer(final Server serverInstance, final String name,
      final List<BlockingServiceAndInterface> services,
      final InetSocketAddress isa, Configuration conf,
      RpcScheduler scheduler)
  throws IOException {
    this.serverInstance = serverInstance;
    this.services = services;
    this.isa = isa;
    this.conf = conf;
    this.socketSendBufferSize = 0;
    this.maxQueueSize =
      this.conf.getInt("ipc.server.max.callqueue.size", DEFAULT_MAX_CALLQUEUE_SIZE);
    this.readThreads = conf.getInt("ipc.server.read.threadpool.size", 10);
    this.maxIdleTime = 2*conf.getInt("ipc.client.connection.maxidletime", 1000);
    this.maxConnectionsToNuke = conf.getInt("ipc.client.kill.max", 10);
    this.thresholdIdleConnections = conf.getInt("ipc.client.idlethreshold", 4000);
    this.purgeTimeout = conf.getLong("ipc.client.call.purge.timeout",
      2 * HConstants.DEFAULT_HBASE_RPC_TIMEOUT);
    this.warnResponseTime = conf.getInt(WARN_RESPONSE_TIME, DEFAULT_WARN_RESPONSE_TIME);
    this.warnResponseSize = conf.getInt(WARN_RESPONSE_SIZE, DEFAULT_WARN_RESPONSE_SIZE);

    // Start the listener here and let it bind to the port
    listener = new Listener(name);
    this.port = listener.getAddress().getPort();

    this.metrics = new MetricsHBaseServer(name, new MetricsHBaseServerWrapperImpl(this));
    this.tcpNoDelay = conf.getBoolean("ipc.server.tcpnodelay", true);
    this.tcpKeepAlive = conf.getBoolean("ipc.server.tcpkeepalive", true);

    this.warnDelayedCalls = conf.getInt(WARN_DELAYED_CALLS, DEFAULT_WARN_DELAYED_CALLS);
    this.delayedCalls = new AtomicInteger(0);
    this.ipcUtil = new IPCUtil(conf);


    // Create the responder here
    responder = new Responder();
    this.authorize = conf.getBoolean(HADOOP_SECURITY_AUTHORIZATION, false);
    this.userProvider = UserProvider.instantiate(conf);
    this.isSecurityEnabled = userProvider.isHBaseSecurityEnabled();
    if (isSecurityEnabled) {
      HBaseSaslRpcServer.init(conf);
    }
    this.scheduler = scheduler;
    this.scheduler.init(new RpcSchedulerContext(this));
  }

2. Listener constructor

public Listener(final String name) throws IOException {
      super(name);
      // Create a new server socket and set to non blocking mode
      acceptChannel = ServerSocketChannel.open();
      acceptChannel.configureBlocking(false);

      // Bind the server socket to the local host and port
      bind(acceptChannel.socket(), isa, backlogLength);
      port = acceptChannel.socket().getLocalPort(); //Could be an ephemeral port
      // create a selector;
      selector= Selector.open();

      readers = new Reader[readThreads];
      readPool = Executors.newFixedThreadPool(readThreads,
        new ThreadFactoryBuilder().setNameFormat(
          "RpcServer.reader=%d,port=" + port).setDaemon(true).build());
      for (int i = 0; i < readThreads; ++i) {
        Reader reader = new Reader();
        readers[i] = reader;
        readPool.execute(reader);
      }
      LOG.info(getName() + ": started " + readThreads + " reader(s).");

      // Register accepts on the server socket with the selector.
      acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
      this.setName("RpcServer.listener,port=" + port);
      this.setDaemon(true);
    }

The Listener owns the ServerSocketChannel (see Java NIO for details) and, based on the ipc.server.read.threadpool.size parameter, creates a Reader array of that length. To understand the relationship between the Listener and the Readers, you need to understand the relationship between ServerSocketChannel and SocketChannel.

In Java NIO, a ServerSocketChannel is a channel that can listen for incoming TCP connections. The observer pattern is at work here: when a new connection arrives (the event was registered with acceptChannel.register(selector, SelectionKey.OP_ACCEPT);), the Listener is notified and runs doAccept():

void doAccept(SelectionKey key) throws IOException, OutOfMemoryError {
      Connection c;
      ServerSocketChannel server = (ServerSocketChannel) key.channel();

      SocketChannel channel;
      while ((channel = server.accept()) != null) {
        try {
          channel.configureBlocking(false);
          channel.socket().setTcpNoDelay(tcpNoDelay);
          channel.socket().setKeepAlive(tcpKeepAlive);
        } catch (IOException ioe) {
          channel.close();
          throw ioe;
        }

        Reader reader = getReader(); // pick the next Reader from the Reader array (round robin)
        try {
          reader.startAdd();
          SelectionKey readKey = reader.registerChannel(channel); // observer pattern again: register the channel with this Reader's selector
          c = getConnection(channel, System.currentTimeMillis());
          readKey.attach(c);                                      // attach the Connection to the SelectionKey so the given channel can be identified later
          synchronized (connectionList) {
            connectionList.add(numConnections, c);
            numConnections++;
          }
          if (LOG.isDebugEnabled())
            LOG.debug(getName() + ": connection from " + c.toString() +
                "; # active connections: " + numConnections);
        } finally {
          reader.finishAdd();
        }
      }
    }

When a new connection is accepted, its SocketChannel is registered with one Reader's selector, so a single Reader may be maintaining many SocketChannels. The event the Reader's selector responds to is the one registered with channel.register(readSelector, SelectionKey.OP_READ);. When the Reader's selector receives an OP_READ event, the observer pattern does its work again.
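The division of labor is easier to see in a stripped-down, standalone NIO sketch (my own illustration, not HBase code): one selector accepts connections, and every accepted channel is registered with a second selector that only handles reads, just as the Listener hands channels off to a Reader. For simplicity the sketch runs both selectors on one thread, whereas HBase runs the Listener and each Reader on their own threads.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class MiniListener {
  public static void main(String[] args) throws IOException, InterruptedException {
    Selector acceptSelector = Selector.open();                // plays the Listener's selector
    Selector readSelector = Selector.open();                  // plays one Reader's selector
    ServerSocketChannel server = ServerSocketChannel.open();
    server.configureBlocking(false);
    server.bind(new InetSocketAddress(8080));                 // port is arbitrary for the sketch
    server.register(acceptSelector, SelectionKey.OP_ACCEPT);  // observer-style: notify on new connections

    ByteBuffer buf = ByteBuffer.allocate(1024);
    while (true) {
      // "Listener" side: accept new connections and hand them to the "Reader"
      if (acceptSelector.selectNow() > 0) {
        for (SelectionKey key : acceptSelector.selectedKeys()) {
          SocketChannel ch = ((ServerSocketChannel) key.channel()).accept();
          if (ch != null) {
            ch.configureBlocking(false);
            ch.register(readSelector, SelectionKey.OP_READ);  // one Reader can own many channels
          }
        }
        acceptSelector.selectedKeys().clear();
      }
      // "Reader" side: read whichever channels are ready
      if (readSelector.selectNow() > 0) {
        for (SelectionKey key : readSelector.selectedKeys()) {
          SocketChannel ch = (SocketChannel) key.channel();
          buf.clear();
          if (ch.read(buf) < 0) {
            ch.close();                                       // peer closed the connection
          }
          // a real Reader would parse the bytes into a Call here
        }
        readSelector.selectedKeys().clear();
      }
      Thread.sleep(10);                                       // avoid a tight spin in this toy loop
    }
  }
}

Back in HBase, each Reader spends its life in the following run loop: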

public void run() {
        try {
          doRunLoop();
        } finally {
          try {
            readSelector.close();
          } catch (IOException ioe) {
            LOG.error(getName() + ": error closing read selector in " + getName(), ioe);
          }
        }
      }

      private synchronized void doRunLoop() { // the Reader runs this loop as long as the server is up, waiting for notifications from readSelector
        while (running) {
          SelectionKey key = null;
          try {
            readSelector.select();
            while (adding) {
              this.wait(1000);
            }

            Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
            while (iter.hasNext()) {
              key = iter.next();
              iter.remove();
              if (key.isValid()) {
                if (key.isReadable()) {
                  doRead(key);
                }
              }
              key = null;
            }
          } catch (InterruptedException e) {
            if (running) {                      // unexpected -- log it
              LOG.info(getName() + ": unexpectedly interrupted: " +
                StringUtils.stringifyException(e));
            }
          } catch (IOException ex) {
            LOG.error(getName() + ": error in Reader", ex);
          }
        }
      }
void doRead(SelectionKey key) throws InterruptedException {
      int count = 0;
      Connection c = (Connection)key.attachment();
      if (c == null) {
        return;
      }
      c.setLastContact(System.currentTimeMillis());
      try {
        count = c.readAndProcess();
      } catch (InterruptedException ieo) {
        throw ieo;
      } catch (Exception e) {
        LOG.warn(getName() + ": count of bytes read: " + count, e);
        count = -1; //so that the (count < 0) block is executed
      }
      if (count < 0) {
        if (LOG.isDebugEnabled()) {
          LOG.debug(getName() + ": DISCONNECTING client " + c.toString() +
            " because read count=" + count +
            ". Number of active connections: " + numConnections);
        }
        closeConnection(c);
        // c = null;
      } else {
        c.setLastContact(System.currentTimeMillis());
      }
    }
/**
     * Read off the wire.
     * @return Returns -1 if failure (and caller will close connection) else return how many
     * bytes were read and processed
     * @throws IOException
     * @throws InterruptedException
     */
    public int readAndProcess() throws IOException, InterruptedException {
      while (true) {
        // Try and read in an int.  If new connection, the int will hold the 'HBas' HEADER.  If it
        // does, read in the rest of the connection preamble, the version and the auth method.
        // Else it will be length of the data to read (or -1 if a ping).  We catch the integer
        // length into the 4-byte this.dataLengthBuffer.
        int count;
        if (this.dataLengthBuffer.remaining() > 0) {
          count = channelRead(channel, this.dataLengthBuffer);
          if (count < 0 || this.dataLengthBuffer.remaining() > 0) {
            return count;
          }
        }
        // If we have not read the connection setup preamble, look to see if that is on the wire.
        if (!connectionPreambleRead) {
          // Check for 'HBas' magic.
          this.dataLengthBuffer.flip();
          if (!HConstants.RPC_HEADER.equals(dataLengthBuffer)) {
            return doBadPreambleHandling("Expected HEADER=" +
              Bytes.toStringBinary(HConstants.RPC_HEADER.array()) +
              " but received HEADER=" + Bytes.toStringBinary(dataLengthBuffer.array()) +
              " from " + toString());
          }
          // Now read the next two bytes, the version and the auth to use.
          ByteBuffer versionAndAuthBytes = ByteBuffer.allocate(2);
          count = channelRead(channel, versionAndAuthBytes);
          if (count < 0 || versionAndAuthBytes.remaining() > 0) {
            return count;
          }
          int version = versionAndAuthBytes.get(0);
          byte authbyte = versionAndAuthBytes.get(1);
          this.authMethod = AuthMethod.valueOf(authbyte);
          if (version != CURRENT_VERSION) {
            String msg = getFatalConnectionString(version, authbyte);
            return doBadPreambleHandling(msg, new WrongVersionException(msg));
          }
          if (authMethod == null) {
            String msg = getFatalConnectionString(version, authbyte);
            return doBadPreambleHandling(msg, new BadAuthException(msg));
          }
          if (isSecurityEnabled && authMethod == AuthMethod.SIMPLE) {
            AccessControlException ae = new AccessControlException("Authentication is required");
            setupResponse(authFailedResponse, authFailedCall, ae, ae.getMessage());
            responder.doRespond(authFailedCall);
            throw ae;
          }
          if (!isSecurityEnabled && authMethod != AuthMethod.SIMPLE) {
            doRawSaslReply(SaslStatus.SUCCESS, new IntWritable(
                SaslUtil.SWITCH_TO_SIMPLE_AUTH), null, null);
            authMethod = AuthMethod.SIMPLE;
            // client has already sent the initial Sasl message and we
            // should ignore it. Both client and server should fall back
            // to simple auth from now on.
            skipInitialSaslHandshake = true;
          }
          if (authMethod != AuthMethod.SIMPLE) {
            useSasl = true;
          }
          connectionPreambleRead = true;
          // Preamble checks out. Go around again to read actual connection header.
          dataLengthBuffer.clear();
          continue;
        }
        // We have read a length and we have read the preamble.  It is either the connection header
        // or it is a request.
        if (data == null) {
          dataLengthBuffer.flip();
          int dataLength = dataLengthBuffer.getInt();
          if (dataLength == RpcClient.PING_CALL_ID) {
            if (!useWrap) { //covers the !useSasl too
              dataLengthBuffer.clear();
              return 0;  //ping message
            }
          }
          if (dataLength < 0) {
            throw new IllegalArgumentException("Unexpected data length "
                + dataLength + "!! from " + getHostAddress());
          }
          data = ByteBuffer.allocate(dataLength);
          incRpcCount();  // Increment the rpc count
        }
        count = channelRead(channel, data);
        if (count < 0) {
          return count;
        } else if (data.remaining() == 0) {
          dataLengthBuffer.clear();
          data.flip();
          if (skipInitialSaslHandshake) {
            data = null;
            skipInitialSaslHandshake = false;
            continue;
          }
          boolean headerRead = connectionHeaderRead;
          if (useSasl) {
            saslReadAndProcess(data.array());
          } else {
            processOneRpc(data.array());
          }
          this.data = null;
          if (!headerRead) {
            continue;
          }
        } else if (count > 0) {
          // We got some data and there is more to read still; go around again.
          if (LOG.isTraceEnabled()) LOG.trace("Continue to read rest of data " + data.remaining());
          continue;
        }
        return count;
      }
    }
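Taken together, the loop above implies a simple wire layout for a new connection: a 4-byte 'HBas' magic (HConstants.RPC_HEADER), one version byte, one auth-method byte, and from then on a stream of length-prefixed frames (the connection header first, then requests). The following client-side sketch of that preamble is purely illustrative; the host, port, version, and auth values are placeholders rather than constants taken from the source, and real clients should go through RpcClient instead of hand-rolling this:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class PreambleSketch {
  public static void main(String[] args) throws IOException {
    // host and port are placeholders for a RegionServer endpoint
    SocketChannel ch = SocketChannel.open(new InetSocketAddress("regionserver-host", 60020));
    ByteBuffer preamble = ByteBuffer.allocate(6);
    preamble.put("HBas".getBytes(StandardCharsets.UTF_8)); // must match HConstants.RPC_HEADER
    preamble.put((byte) 0);  // version byte: placeholder, must equal RpcServer.CURRENT_VERSION
    preamble.put((byte) 80); // auth method code: placeholder standing in for AuthMethod.SIMPLE
    preamble.flip();
    while (preamble.hasRemaining()) {
      ch.write(preamble);
    }
    // every subsequent message is a 4-byte length followed by that many bytes,
    // which is exactly what dataLengthBuffer and data capture on the server side
    ch.close();
  }
}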

processOneRpc(data.array()) is where the data read from the SocketChannel actually gets processed.

Next, let's see how that data is wrapped into a CallRunner and added to the Scheduler's queue:

/**
     * @param buf Has the request header and the request param and optionally encoded data buffer
     * all in this one array.
     * @throws IOException
     * @throws InterruptedException
     */
    protected void processRequest(byte[] buf) throws IOException, InterruptedException {
      long totalRequestSize = buf.length;
      int offset = 0;
      // Here we read in the header.  We avoid having pb
      // do its default 4k allocation for CodedInputStream.  We force it to use backing array.
      CodedInputStream cis = CodedInputStream.newInstance(buf, offset, buf.length);
      int headerSize = cis.readRawVarint32();
      offset = cis.getTotalBytesRead();
      RequestHeader header = RequestHeader.newBuilder().mergeFrom(buf, offset, headerSize).build();
      offset += headerSize;
      int id = header.getCallId();
      if (LOG.isTraceEnabled()) {
        LOG.trace("RequestHeader " + TextFormat.shortDebugString(header) +
          " totalRequestSize: " + totalRequestSize + " bytes");
      }
      // Enforcing the call queue size, this triggers a retry in the client
      // This is a bit late to be doing this check - we have already read in the total request.
      if ((totalRequestSize + callQueueSize.get()) > maxQueueSize) {
        final Call callTooBig =
          new Call(id, this.service, null, null, null, null, this,
            responder, totalRequestSize, null);
        ByteArrayOutputStream responseBuffer = new ByteArrayOutputStream();
        setupResponse(responseBuffer, callTooBig, new CallQueueTooBigException(),
          "Call queue is full, is ipc.server.max.callqueue.size too small?");
        responder.doRespond(callTooBig);
        return;
      }
      MethodDescriptor md = null;
      Message param = null;
      CellScanner cellScanner = null;
      try {
        if (header.hasRequestParam() && header.getRequestParam()) {
          md = this.service.getDescriptorForType().findMethodByName(header.getMethodName());
          if (md == null) throw new UnsupportedOperationException(header.getMethodName());
          Builder builder = this.service.getRequestPrototype(md).newBuilderForType();
          // To read the varint, I need an inputstream; might as well be a CIS.
          cis = CodedInputStream.newInstance(buf, offset, buf.length);
          int paramSize = cis.readRawVarint32();
          offset += cis.getTotalBytesRead();
          if (builder != null) {
            param = builder.mergeFrom(buf, offset, paramSize).build();
          }
          offset += paramSize;
        }
        if (header.hasCellBlockMeta()) {
          cellScanner = ipcUtil.createCellScanner(this.codec, this.compressionCodec,
            buf, offset, buf.length);
        }
      } catch (Throwable t) {
        String msg = "Unable to read call parameter from client " + getHostAddress();
        LOG.warn(msg, t);

        // probably the hbase hadoop version does not match the running hadoop version
        if (t instanceof LinkageError) {
          t = new DoNotRetryIOException(t);
        }
        // If the method is not present on the server, do not retry.
        if (t instanceof UnsupportedOperationException) {
          t = new DoNotRetryIOException(t);
        }

        final Call readParamsFailedCall =
          new Call(id, this.service, null, null, null, null, this,
            responder, totalRequestSize, null);
        ByteArrayOutputStream responseBuffer = new ByteArrayOutputStream();
        setupResponse(responseBuffer, readParamsFailedCall, t,
          msg + "; " + t.getMessage());
        responder.doRespond(readParamsFailedCall);
        return;
      }

      TraceInfo traceInfo = header.hasTraceInfo()
          ? new TraceInfo(header.getTraceInfo().getTraceId(), header.getTraceInfo().getParentId())
          : null;
      Call call = new Call(id, this.service, md, header, param, cellScanner, this, responder,
              totalRequestSize,
              traceInfo);
      scheduler.dispatch(new CallRunner(RpcServer.this, call, userProvider));
    }

Next, let's look at how SimpleRpcScheduler manages the task queues:

public SimpleRpcScheduler(
      Configuration conf,
      int handlerCount, // handlerCount comes from the hbase.regionserver.handler.count parameter
      int priorityHandlerCount,
      int replicationHandlerCount,
      PriorityFunction priority,
      int highPriorityLevel) {
    int maxQueueLength = conf.getInt("ipc.server.max.callqueue.length",
        handlerCount * RpcServer.DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER);
    this.handlerCount = handlerCount;
    this.priorityHandlerCount = priorityHandlerCount;
    this.replicationHandlerCount = replicationHandlerCount;
    this.priority = priority;
    this.highPriorityLevel = highPriorityLevel;
    this.callQueue = new LinkedBlockingQueue<CallRunner>(maxQueueLength);
    this.priorityCallQueue = priorityHandlerCount > 0
        ? new LinkedBlockingQueue<CallRunner>(maxQueueLength)
        : null;
    this.replicationQueue = replicationHandlerCount > 0
        ? new LinkedBlockingQueue<CallRunner>(maxQueueLength)
        : null;
  }
private void startHandlers(
      int handlerCount,
      final BlockingQueue<CallRunner> callQueue,
      String threadNamePrefix) {
    for (int i = 0; i < handlerCount; i++) { // so hbase.regionserver.handler.count is really the number of call-processing threads; it no longer controls the number of RPC readers, and the two sides form a producer-consumer pair
      Thread t = new Thread(new Runnable() {
        @Override
        public void run() {
          consumerLoop(callQueue);
        }
      });
      t.setDaemon(true);
      t.setName(Strings.nullToEmpty(threadNamePrefix) + "RpcServer.handler=" + i + ",port=" + port);
      t.start();
      handlers.add(t);
    }
  }
@Override
  public void dispatch(CallRunner callTask) throws InterruptedException {
    RpcServer.Call call = callTask.getCall();
    int level = priority.getPriority(call.header, call.param);
    if (priorityCallQueue != null && level > highPriorityLevel) {
      priorityCallQueue.put(callTask);
    } else if (replicationQueue != null && level == HConstants.REPLICATION_QOS) {
      replicationQueue.put(callTask);
    } else {
      callQueue.put(callTask); // queue the call; maybe blocked here
    }
  }
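Boiled down, dispatch() and the consumerLoop() run by the handler threads are a textbook producer-consumer pair around a LinkedBlockingQueue: the Readers produce CallRunners and the handlers consume them. A minimal standalone sketch of that shape (my own illustration with placeholder types, not HBase code):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MiniScheduler {
  // stands in for CallRunner in the real code
  static class Task implements Runnable {
    private final int id;
    Task(int id) { this.id = id; }
    @Override public void run() { System.out.println("handled call " + id); }
  }

  private final BlockingQueue<Task> callQueue;

  MiniScheduler(int maxQueueLength, int handlerCount) {
    this.callQueue = new LinkedBlockingQueue<>(maxQueueLength);
    for (int i = 0; i < handlerCount; i++) {      // like startHandlers(): handlerCount consumers
      Thread t = new Thread(() -> {
        try {
          while (true) {
            callQueue.take().run();               // like consumerLoop(): block until a call arrives
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      t.setDaemon(true);
      t.setName("MiniScheduler.handler=" + i);
      t.start();
    }
  }

  // like dispatch(): the Readers are the producers and may block when the queue is full
  void dispatch(Task task) throws InterruptedException {
    callQueue.put(task);
  }

  public static void main(String[] args) throws InterruptedException {
    MiniScheduler scheduler = new MiniScheduler(100, 4);
    for (int i = 0; i < 10; i++) {
      scheduler.dispatch(new Task(i));
    }
    Thread.sleep(200); // give the daemon handler threads time to drain the queue
  }
}

Seen this way, ipc.server.read.threadpool.size sizes the producers while hbase.regionserver.handler.count sizes the consumers, which is exactly the mismatch described in the background section.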
posted on 2014-06-27 00:13 wangqianbo