[Repost] Reading the Source Code (1): The HBase Client
Reposted from: https://www.iteye.com/blog/aperise-2372350
1. Using the HBase client
1.1 Adding the HBase client jar to a Maven project
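A minimal dependency sketch for the 1.x client API used throughout this post; the version shown here is an assumption, so match it to the version your cluster actually runs:

```xml
<!-- hypothetical version: pick the hbase-client release that matches your cluster -->
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.6</version>
</dependency>
```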
Recommended client usage, approach 1:
```java
Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.client.write.buffer", "2097152");
configuration.set("hbase.zookeeper.quorum",
    "192.168.199.31,192.168.199.32,192.168.199.33,192.168.199.34,192.168.199.35");
// The default Connection implementation is
// org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation
Connection connection = ConnectionFactory.createConnection(configuration);
// The default Table implementation is org.apache.hadoop.hbase.client.HTable
Table table = connection.getTable(TableName.valueOf("tableName"));

// 3177 is not made up: it is 2 * hbase.client.write.buffer / put.heapSize()
int bestBatchPutSize = 3177;

try {
    // Use the table as needed, for a single operation and a single thread
    // construct List<Put> putLists
    List<Put> putLists = new ArrayList<Put>();
    for (int count = 0; count < 100000; count++) {
        Put put = new Put(rowkey.getBytes());
        put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
        put.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
        put.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
        put.setDurability(Durability.SKIP_WAL);
        putLists.add(put);

        if (putLists.size() == bestBatchPutSize) {
            // the batch has reached the best size, submit it right away
            table.put(putLists);
            putLists.clear();
        }
    }
    // submit whatever has not been submitted yet
    table.put(putLists);
} finally {
    table.close();
    connection.close();
}
```
Recommended client usage, approach 2:
```java
Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.client.write.buffer", "2097152");
configuration.set("hbase.zookeeper.quorum",
    "192.168.199.31,192.168.199.32,192.168.199.33,192.168.199.34,192.168.199.35");

BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf("tableName"));

// 3177 is not made up: it is 2 * hbase.client.write.buffer / put.heapSize()
int bestBatchPutSize = 3177;

// JDK 1.7 try-with-resources: each resource declared in try(...) implements java.io.Closeable,
// and its close() is called automatically, as if in a finally block,
// i.e. conn.close() and mutator.close() are both invoked
try (
    // The default Connection implementation is
    // org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation
    Connection conn = ConnectionFactory.createConnection(configuration);
    // The default BufferedMutator implementation is org.apache.hadoop.hbase.client.BufferedMutatorImpl
    BufferedMutator mutator = conn.getBufferedMutator(params);
) {
    List<Put> putLists = new ArrayList<Put>();
    for (int count = 0; count < 100000; count++) {
        Put put = new Put(rowkey.getBytes());
        put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
        put.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
        put.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
        put.setDurability(Durability.SKIP_WAL);
        putLists.add(put);

        if (putLists.size() == bestBatchPutSize) {
            // the batch has reached the best size, submit it right away
            mutator.mutate(putLists);
            mutator.flush();
            putLists.clear();
        }
    }
    // submit whatever has not been submitted yet
    mutator.mutate(putLists);
    mutator.flush();
} catch (IOException e) {
    LOG.info("exception while creating/destroying Connection or BufferedMutator", e);
}
```
The two write paths compare as follows:

| Table.put(List<Put>) | BufferedMutator.mutate(List<Put>) |
| --- | --- |
| Essentially a wrapper around BufferedMutator.mutate(List<Put>) plus an autoFlush flag: it first calls BufferedMutator.mutate(List<Put>), which keeps submitting whenever the configured hbase.client.write.buffer (default 2 MB) fills up, and then, because autoFlush defaults to true, it flushes on every call. | Computes the heap size of the given list; whenever it exceeds hbase.client.write.buffer (default 2 MB) it submits once, otherwise it simply keeps buffering, submitting one last time before the table is closed. |
1.3 Deprecated HBase client usage
Deprecated approach 1: creating an HTable directly via the constructor HTable(Configuration conf, final String tableName).
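Sketched, for the 1.x API where this constructor still exists but is deprecated:

```java
// Deprecated style: the HTable manages its own ZooKeeper connection internally
Configuration configuration = HBaseConfiguration.create();
HTable table = new HTable(configuration, "tableName");
```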
Deprecated approach 2: obtaining an HTableInterface through HConnectionManager.createConnection(Configuration conf).
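Also sketched for the 1.x API; HConnectionManager/HConnection were replaced by ConnectionFactory/Connection:

```java
// Deprecated style: HConnection and HTableInterface instead of Connection and Table
Configuration configuration = HBaseConfiguration.create();
HConnection connection = HConnectionManager.createConnection(configuration);
HTableInterface table = connection.getTable("tableName");
```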
2. Reading the HBase client source code
As mentioned earlier, the recommended client usage boils down to two calls: ConnectionFactory.createConnection(configuration) and connection.getTable(TableName.valueOf("tableName")).
So the source reading starts from those two lines, beginning with ConnectionFactory.createConnection(configuration).
2.1 ConnectionFactory.createConnection(Configuration conf)
First, the source of createConnection(Configuration conf):
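In the HBase 1.x source this one-argument overload simply passes a null pool and user to the three-argument variant shown next; a minimal sketch:

```java
public static Connection createConnection(Configuration conf) throws IOException {
  // pool and user are left null here; the three-argument overload fills in the current user
  return createConnection(conf, null, null);
}
```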
It receives the Configuration we built and delegates to ConnectionFactory.createConnection(Configuration conf, ExecutorService pool, User user), whose source is:
```java
public static Connection createConnection(Configuration conf, ExecutorService pool, User user)
    throws IOException {
    // user was passed in as null above, so this block resolves the current user
    if (user == null) {
        UserProvider provider = UserProvider.instantiate(conf);
        user = provider.getCurrent();
    }

    return createConnection(conf, false, pool, user);
}
```
This in turn calls ConnectionFactory.createConnection(final Configuration conf, final boolean managed, final ExecutorService pool, final User user), so let's look at that:
```java
static Connection createConnection(final Configuration conf, final boolean managed,
    final ExecutorService pool, final User user) throws IOException {
    // HBASE_CLIENT_CONNECTION_IMPL = "hbase.client.connection.impl"
    // hbase.client.connection.impl lets users plug in their own Connection implementation;
    // HBase already ships one, so the lookup falls back to
    // ConnectionManager.HConnectionImplementation.class.getName(),
    // i.e. the default Connection implementation is HConnectionImplementation
    String className = conf.get(HConnection.HBASE_CLIENT_CONNECTION_IMPL,
        ConnectionManager.HConnectionImplementation.class.getName());
    Class<?> clazz = null;
    try {
        clazz = Class.forName(className);
    } catch (ClassNotFoundException e) {
        throw new IOException(e);
    }
    try {
        // Default HCM#HCI is not accessible; make it so before invoking.
        // This invokes the constructor
        // HConnectionImplementation(Configuration conf, boolean managed, ExecutorService pool, User user)
        Constructor<?> constructor = clazz.getDeclaredConstructor(Configuration.class,
            boolean.class, ExecutorService.class, User.class);
        constructor.setAccessible(true);
        return (Connection) constructor.newInstance(conf, managed, pool, user);
    } catch (Exception e) {
        throw new IOException(e);
    }
}
```
By default, then, a ConnectionManager.HConnectionImplementation instance is created and returned as the Connection. Next, the constructor HConnectionImplementation(Configuration conf, boolean managed, ExecutorService pool, User user):
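The constructor body is fairly long; below is an abridged sketch based on the 1.x source, keeping only the two calls discussed next and omitting most field initialization:

```java
HConnectionImplementation(Configuration conf, boolean managed,
    ExecutorService pool, User user) throws IOException {
  this(conf);                       // builds the connection's own configuration, see below
  this.user = user;
  this.batchPool = pool;
  this.managed = managed;
  this.registry = setupRegistry();  // defaults to ZooKeeperRegistry
  // remaining initialization (cluster id, rpc client, etc.) omitted in this sketch
}
```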
In that constructor the call to focus on is this(conf); the other thing worth noting is setupRegistry(), which by default installs org.apache.hadoop.hbase.client.ZooKeeperRegistry, and we will come back to that line later. The rest of the code is straightforward, so we continue with this(conf):
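A trimmed sketch of the protected HConnectionImplementation(Configuration conf) constructor that this(conf) reaches, again assuming the 1.x source and omitting the unrelated fields:

```java
protected HConnectionImplementation(Configuration conf) {
  this.conf = conf;
  // the connection builds its own ConnectionConfiguration on top of the caller's Configuration
  this.connectionConfig = new ConnectionConfiguration(conf);
  // other defaults (pause, retries, metrics, ...) are read from conf here; omitted in this sketch
}
```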
The important point here is that HConnectionImplementation does not use the caller's Configuration directly. The caller typically sets only a handful of properties, so HConnectionImplementation builds its own configuration object on top of it: it starts from the defaults, overrides them with whatever the caller did set, and keeps the defaults for everything else. Let's look at the source of this.connectionConfig = new ConnectionConfiguration(conf):
```java
ConnectionConfiguration(Configuration conf) {
    // if the client did not set hbase.client.write.buffer, use the default writeBufferSize = 2097152
    this.writeBufferSize = conf.getLong(WRITE_BUFFER_SIZE_KEY, WRITE_BUFFER_SIZE_DEFAULT);

    // if the client did not set hbase.client.meta.operation.timeout, use the default metaOperationTimeout = 1200000
    this.metaOperationTimeout = conf.getInt(HConstants.HBASE_CLIENT_META_OPERATION_TIMEOUT,
        HConstants.DEFAULT_HBASE_CLIENT_OPERATION_TIMEOUT);

    // if the client did not set hbase.client.operation.timeout, use the default operationTimeout = 1200000
    this.operationTimeout = conf.getInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT,
        HConstants.DEFAULT_HBASE_CLIENT_OPERATION_TIMEOUT);

    // if the client did not set hbase.client.scanner.caching, use the default scannerCaching = Integer.MAX_VALUE
    this.scannerCaching = conf.getInt(HConstants.HBASE_CLIENT_SCANNER_CACHING,
        HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING);

    // if the client did not set hbase.client.scanner.max.result.size, use the default scannerMaxResultSize = 2 * 1024 * 1024
    this.scannerMaxResultSize = conf.getLong(HConstants.HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE_KEY,
        HConstants.DEFAULT_HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE);

    // if the client did not set hbase.client.primaryCallTimeout.get, use the default primaryCallTimeoutMicroSecond = 10000
    this.primaryCallTimeoutMicroSecond = conf.getInt("hbase.client.primaryCallTimeout.get", 10000); // 10000 us = 10 ms

    // if the client did not set hbase.client.replicaCallTimeout.scan, use the default replicaCallTimeoutMicroSecondScan = 1000000
    this.replicaCallTimeoutMicroSecondScan = conf.getInt("hbase.client.replicaCallTimeout.scan", 1000000); // 1000000 us = 1000 ms

    // if the client did not set hbase.client.retries.number, use the default retries = 31
    this.retries = conf.getInt(HConstants.HBASE_CLIENT_RETRIES_NUMBER,
        HConstants.DEFAULT_HBASE_CLIENT_RETRIES_NUMBER);

    // if the client did not set hbase.client.keyvalue.maxsize, use the default maxKeyValueSize = -1
    this.maxKeyValueSize = conf.getInt(MAX_KEYVALUE_SIZE_KEY, MAX_KEYVALUE_SIZE_DEFAULT);
}
```
The code above initializes HConnectionImplementation's own ConnectionConfiguration (this.connectionConfig). Any property the client does not set keeps its default value. The client-side defaults are summarized below:
- hbase.client.write.buffer: default 2097152 bytes, i.e. 2 MB
- hbase.client.meta.operation.timeout: default 1200000 ms
- hbase.client.operation.timeout: default 1200000 ms
- hbase.client.scanner.caching: default Integer.MAX_VALUE
- hbase.client.scanner.max.result.size: default 2 MB
- hbase.client.primaryCallTimeout.get: default 10000 microseconds (10 ms)
- hbase.client.replicaCallTimeout.scan: default 1000000 microseconds (1000 ms)
- hbase.client.retries.number: default 31 retries
- hbase.client.keyvalue.maxsize: default -1, i.e. unlimited
- hbase.client.ipc.pool.type
- hbase.client.ipc.pool.size
- hbase.client.pause: 100
- hbase.client.max.total.tasks: 100
- hbase.client.max.perserver.tasks: 2
- hbase.client.max.perregion.tasks: 1
- hbase.client.instance.id
- hbase.client.scanner.timeout.period: 60000
- hbase.client.rpc.codec
- hbase.regionserver.lease.period: superseded by hbase.client.scanner.timeout.period, 60000
- hbase.client.fast.fail.mode.enabled: false
- hbase.client.fastfail.threshold: 60000
- hbase.client.fast.fail.cleanup.duration: 600000
- hbase.client.fast.fail.interceptor.impl
- hbase.client.backpressure.enabled: false
2.2 ZooKeeperRegistry, the bridge to ZooKeeper
From the analysis above we know that only the properties the caller explicitly sets take effect on the client; everything else falls back to the defaults. The other important piece mentioned earlier is the class that handles all interaction with ZooKeeper, org.apache.hadoop.hbase.client.ZooKeeperRegistry:
```java
package org.apache.hadoop.hbase.client;

import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.List;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.RegionLocations;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.zookeeper.MetaTableLocator;
import org.apache.hadoop.hbase.zookeeper.ZKClusterId;
import org.apache.hadoop.hbase.zookeeper.ZKTableStateClientSideReader;
import org.apache.hadoop.hbase.zookeeper.ZKUtil;
import org.apache.zookeeper.KeeperException;

/**
 * A cluster registry that stores to zookeeper.
 */
class ZooKeeperRegistry implements Registry {
  private static final Log LOG = LogFactory.getLog(ZooKeeperRegistry.class);
  // the HBase connection, set in init()
  ConnectionManager.HConnectionImplementation hci;

  @Override
  public void init(Connection connection) {
    if (!(connection instanceof ConnectionManager.HConnectionImplementation)) {
      throw new RuntimeException("This registry depends on HConnectionImplementation");
    }
    // remember the HBase connection
    this.hci = (ConnectionManager.HConnectionImplementation)connection;
  }

  @Override
  public RegionLocations getMetaRegionLocation() throws IOException {
    // read the ZooKeeper quorum from the connection's Configuration and obtain the
    // ZooKeeperKeepAliveConnection used to talk to ZooKeeper
    ZooKeeperKeepAliveConnection zkw = hci.getKeepAliveZooKeeperWatcher();

    try {
      if (LOG.isTraceEnabled()) {
        LOG.trace("Looking up meta region location in ZK," + " connection=" + this);
      }
      // fetch the servers hosting the hbase:meta region from ZooKeeper
      List<ServerName> servers = new MetaTableLocator().blockUntilAvailable(zkw, hci.rpcTimeout,
          hci.getConfiguration());
      if (LOG.isTraceEnabled()) {
        if (servers == null) {
          LOG.trace("Looked up meta region location, connection=" + this + "; servers = null");
        } else {
          StringBuilder str = new StringBuilder();
          for (ServerName s : servers) {
            str.append(s.toString());
            str.append(" ");
          }
          LOG.trace("Looked up meta region location, connection=" + this + "; servers = " + str.toString());
        }
      }
      if (servers == null) return null;

      // assemble and return the RegionLocations array
      HRegionLocation[] locs = new HRegionLocation[servers.size()];
      int i = 0;
      for (ServerName server : servers) {
        HRegionInfo h = RegionReplicaUtil.getRegionInfoForReplica(HRegionInfo.FIRST_META_REGIONINFO, i);
        if (server == null) locs[i++] = null;
        else locs[i++] = new HRegionLocation(h, server, 0);
      }
      return new RegionLocations(locs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      zkw.close();
    }
  }

  private String clusterId = null;

  @Override
  public String getClusterId() {
    if (this.clusterId != null) return this.clusterId;
    // No synchronized here, worse case we will retrieve it twice, that's
    // not an issue.
    ZooKeeperKeepAliveConnection zkw = null;
    try {
      zkw = hci.getKeepAliveZooKeeperWatcher();
      this.clusterId = ZKClusterId.readClusterIdZNode(zkw);
      if (this.clusterId == null) {
        LOG.info("ClusterId read in ZooKeeper is null");
      }
    } catch (KeeperException e) {
      LOG.warn("Can't retrieve clusterId from Zookeeper", e);
    } catch (IOException e) {
      LOG.warn("Can't retrieve clusterId from Zookeeper", e);
    } finally {
      if (zkw != null) zkw.close();
    }
    return this.clusterId;
  }

  @Override
  public boolean isTableOnlineState(TableName tableName, boolean enabled)
      throws IOException {
    ZooKeeperKeepAliveConnection zkw = hci.getKeepAliveZooKeeperWatcher();
    try {
      if (enabled) {
        return ZKTableStateClientSideReader.isEnabledTable(zkw, tableName);
      }
      return ZKTableStateClientSideReader.isDisabledTable(zkw, tableName);
    } catch (KeeperException e) {
      throw new IOException("Enable/Disable failed", e);
    } catch (InterruptedException e) {
      throw new InterruptedIOException();
    } finally {
      zkw.close();
    }
  }

  @Override
  public int getCurrentNrHRS() throws IOException {
    ZooKeeperKeepAliveConnection zkw = hci.getKeepAliveZooKeeperWatcher();
    try {
      // We go to zk rather than to master to get count of regions to avoid
      // HTable having a Master dependency. See HBase-2828
      return ZKUtil.getNumberOfChildren(zkw, zkw.rsZNode);
    } catch (KeeperException ke) {
      throw new IOException("Unexpected ZooKeeper exception", ke);
    } finally {
      zkw.close();
    }
  }
}
```
This class matters because all interaction with ZooKeeper goes through it.
2.3 HConnectionImplementation.getTable(TableName tableName)
As mentioned earlier, the recommended client usage is:
```java
Connection connection = ConnectionFactory.createConnection(configuration);
Table table = connection.getTable(TableName.valueOf("tableName"));
```
Section 2.1 showed that the default Connection implementation is HConnectionImplementation, so we now follow HConnectionImplementation.getTable(TableName tableName):
```java
public HTableInterface getTable(TableName tableName) throws IOException {
    return getTable(tableName, getBatchPool());
}
```
Then HConnectionImplementation.getTable(TableName tableName, ExecutorService pool):
```java
public HTableInterface getTable(TableName tableName, ExecutorService pool) throws IOException {
    // managed is false by default
    if (managed) {
        throw new NeedUnmanagedConnectionException();
    }
    return new HTable(tableName, this, connectionConfig, rpcCallerFactory, rpcControllerFactory, pool);
}
```
Next, the HTable constructor HTable(TableName tableName, final ClusterConnection connection, final ConnectionConfiguration tableConfig, final RpcRetryingCallerFactory rpcCallerFactory, final RpcControllerFactory rpcControllerFactory, final ExecutorService pool):
```java
public HTable(TableName tableName, final ClusterConnection connection,
    final ConnectionConfiguration tableConfig,
    final RpcRetryingCallerFactory rpcCallerFactory,
    final RpcControllerFactory rpcControllerFactory,
    final ExecutorService pool) throws IOException {
    if (connection == null || connection.isClosed()) {
        throw new IllegalArgumentException("Connection is null or closed.");
    }
    // the HBase table name
    this.tableName = tableName;
    // do not close the connection when close() is called; this is important: by default
    // table.close() does NOT close the Connection it was created from (see the discussion below)
    this.cleanupConnectionOnClose = false;
    // the Connection created by HConnectionImplementation
    this.connection = connection;
    // the Configuration the caller passed in, obtained from HConnectionImplementation
    this.configuration = connection.getConfiguration();
    // the ConnectionConfiguration that HConnectionImplementation built on top of the caller's Configuration
    this.connConfiguration = tableConfig;
    // the thread pool from HConnectionImplementation; its default is
    // this.batchPool = getThreadPool(conf.getInt("hbase.hconnection.threads.max", 256), ...)
    this.pool = pool;
    if (pool == null) {
        this.pool = getDefaultExecutor(this.configuration);
        this.cleanupPoolOnClose = true;
    } else {
        // HConnectionImplementation already owns this.batchPool, so the pool is not
        // shut down when the table is closed
        this.cleanupPoolOnClose = false;
    }

    this.rpcCallerFactory = rpcCallerFactory;
    this.rpcControllerFactory = rpcControllerFactory;

    // examined below: initializes HTable's fields from the Configuration
    this.finishSetup();
}
```
Note the field cleanupConnectionOnClose: it defaults to false, so calling table.close() closes only the table, not the Connection behind it. Likewise cleanupPoolOnClose: even though we did not pass in a thread pool, HConnectionImplementation created its own (this.batchPool = getThreadPool(conf.getInt("hbase.hconnection.threads.max", 256), ...)) and handed it over, so cleanupPoolOnClose is set to false and table.close() does not shut the pool down either. Now let's follow the last call above, this.finishSetup():
```java
private void finishSetup() throws IOException {
    // if connConfiguration is null, build a new one from the caller's Configuration
    if (connConfiguration == null) {
        connConfiguration = new ConnectionConfiguration(configuration);
    }

    // HTable field setup
    this.operationTimeout = tableName.isSystemTable() ?
        connConfiguration.getMetaOperationTimeout() : connConfiguration.getOperationTimeout();
    this.scannerCaching = connConfiguration.getScannerCaching();
    this.scannerMaxResultSize = connConfiguration.getScannerMaxResultSize();
    if (this.rpcCallerFactory == null) {
        this.rpcCallerFactory = connection.getNewRpcRetryingCallerFactory(configuration);
    }
    if (this.rpcControllerFactory == null) {
        this.rpcControllerFactory = RpcControllerFactory.instantiate(configuration);
    }

    // puts need to track errors globally due to how the APIs currently work.
    // HBase's asynchronous request processor
    multiAp = this.connection.getAsyncProcess();

    this.closed = false;
    // region locator for this table
    this.locator = new HRegionLocator(tableName, connection);
}
```
2.4 HTable.put(final List<Put> puts)
By now we have created a connection via ConnectionFactory.createConnection(configuration), whose default implementation is org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation, and a table via connection.getTable(...), whose default implementation is org.apache.hadoop.hbase.client.HTable. The next thing to analyze is the client's batch write path, HTable.put(final List<Put> puts):
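In the 1.x line, HTable.put(final List<Put> puts) is essentially the following short method (a sketch consistent with the flush behaviour described later): buffer through the internal BufferedMutator, then flush because autoFlush defaults to true.

```java
public void put(final List<Put> puts)
    throws InterruptedIOException, RetriesExhaustedWithDetailsException {
  getBufferedMutator().mutate(puts);  // buffers, flushing whenever the write buffer overflows
  if (autoFlush) {
    flushCommits();                   // autoFlush defaults to true, so the remainder is flushed here
  }
}
```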
Inside it, look first at HTable.getBufferedMutator():
```java
BufferedMutator getBufferedMutator() throws IOException {
    if (mutator == null) {
        // uses the pool from HConnectionImplementation (default:
        // this.batchPool = getThreadPool(conf.getInt("hbase.hconnection.threads.max", 256), ...))
        // and a write buffer sized by hbase.client.write.buffer (default 2 MB)
        this.mutator = (BufferedMutatorImpl) connection.getBufferedMutator(
            new BufferedMutatorParams(tableName)
                .pool(pool)
                .writeBufferSize(connConfiguration.getWriteBufferSize())
                .maxKeyValueSize(connConfiguration.getMaxKeyValueSize())
        );
    }
    return mutator;
}
```
By default this builds a BufferedMutatorImpl and returns it. The next stop is BufferedMutatorImpl.mutate(List<? extends Mutation> ms):
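A condensed sketch of that method, assuming the 1.x source (field names such as writeAsyncBuffer and currentWriteBufferSize are taken from it); only the buffering logic discussed below is kept:

```java
public void mutate(List<? extends Mutation> ms)
    throws InterruptedIOException, RetriesExhaustedWithDetailsException {
  long toAddSize = 0;
  for (Mutation m : ms) {
    toAddSize += m.heapSize();        // accumulate the heap size of the submitted mutations
  }
  if (ap.hasError()) {                // a previous async submit failed: buffer, then flush synchronously
    currentWriteBufferSize += toAddSize;
    writeAsyncBuffer.addAll(ms);
    backgroundFlushCommits(true);
  } else {                            // normal path: just buffer
    currentWriteBufferSize += toAddSize;
    writeAsyncBuffer.addAll(ms);
  }
  while (currentWriteBufferSize > writeBufferSize) {
    backgroundFlushCommits(false);    // asynchronous flush once the buffer exceeds hbase.client.write.buffer
  }
}
```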
mutate() accumulates the heap size of the submitted mutations; as soon as the buffered size exceeds hbase.client.write.buffer it calls backgroundFlushCommits(false), otherwise it just keeps buffering; and if a previous submission has failed it performs a synchronous backgroundFlushCommits(true). So BufferedMutatorImpl.backgroundFlushCommits(boolean synchronous) is worth a closer look:
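A condensed sketch of backgroundFlushCommits(boolean synchronous), reconstructed from the behaviour described here and in the flush discussion further down; treat the details as illustrative:

```java
private void backgroundFlushCommits(boolean synchronous)
    throws InterruptedIOException, RetriesExhaustedWithDetailsException {
  if (!synchronous) {
    // fire-and-forget: submit whatever is buffered, do not wait for results
    ap.submit(tableName, writeAsyncBuffer, true, null, false);
  }
  if (synchronous || ap.hasError()) {
    // keep submitting until the buffer is drained, then wait for all in-flight operations
    while (!writeAsyncBuffer.isEmpty()) {
      ap.submit(tableName, writeAsyncBuffer, true, null, false);
    }
    RetriesExhaustedWithDetailsException error = ap.waitForAllPreviousOpsAndReset(null);
    if (error != null) {
      throw error;
    }
  }
}
```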
The asynchronous path calls ap.submit(tableName, buffer, true, null, false), submitting directly without waiting for the results. That overload delegates to AsyncProcess.submit(ExecutorService pool, TableName tableName, List<? extends Row> rows, boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults); both are shown below:
```java
public <CResult> AsyncRequestFuture submit(TableName tableName, List<? extends Row> rows,
    boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults)
    throws InterruptedIOException {
    return submit(null, tableName, rows, atLeastOne, callback, needResults);
}
```
```java
public <CResult> AsyncRequestFuture submit(ExecutorService pool, TableName tableName,
    List<? extends Row> rows, boolean atLeastOne, Batch.Callback<CResult> callback,
    boolean needResults) throws InterruptedIOException {
    // nothing to submit: return NO_REQS_RESULT immediately
    if (rows.isEmpty()) {
        return NO_REQS_RESULT;
    }

    Map<ServerName, MultiAction<Row>> actionsByServer = new HashMap<ServerName, MultiAction<Row>>();
    // retainedActions is sized by the number of submitted rows
    List<Action<Row>> retainedActions = new ArrayList<Action<Row>>(rows.size());

    NonceGenerator ng = this.connection.getNonceGenerator();
    long nonceGroup = ng.getNonceGroup(); // Currently, nonce group is per entire client.

    // Location errors that happen before we decide what requests to take.
    List<Exception> locationErrors = null;
    List<Integer> locationErrorRows = null;
    // keep looping until at least one action has been retained
    do {
        // Wait until there is at least one slot for a new task.
        // maxTotalConcurrentTasks defaults to 100, i.e. at most 100 tasks may be outstanding;
        // if the limit is reached, wait
        waitForMaximumCurrentTasks(maxTotalConcurrentTasks - 1);

        // Remember the previous decisions about regions or region servers we put in the
        // final multi.
        // records the regions and region servers this batch of Puts maps to
        Map<HRegionInfo, Boolean> regionIncluded = new HashMap<HRegionInfo, Boolean>();
        Map<ServerName, Boolean> serverIncluded = new HashMap<ServerName, Boolean>();

        int posInList = -1;
        Iterator<? extends Row> it = rows.iterator();
        while (it.hasNext()) {
            // each element here is a Put, since Put extends Row
            Row r = it.next();
            // loc holds the region location metadata for this Put
            HRegionLocation loc;
            try {
                if (r == null) {
                    throw new IllegalArgumentException("#" + id + ", row cannot be null");
                }
                // Make sure we get 0-s replica.
                // look up the region locations (all replicas) for this Put's row key; on the first
                // call the cache is empty, so the location is looked up and cached, and subsequent
                // calls are served straight from the cache
                RegionLocations locs = connection.locateRegion(
                    tableName, r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID);
                if (locs == null || locs.isEmpty() || locs.getDefaultRegionLocation() == null) {
                    throw new IOException("#" + id + ", no location found, aborting submit for"
                        + " tableName=" + tableName + " rowkey=" + Bytes.toStringBinary(r.getRow()));
                }
                // take the default (first) replica's location
                loc = locs.getDefaultRegionLocation();
            } catch (IOException ex) {
                locationErrors = new ArrayList<Exception>();
                locationErrorRows = new ArrayList<Integer>();
                LOG.error("Failed to get region location ", ex);
                // This action failed before creating ars. Retain it, but do not add to submit list.
                // We will then add it to ars in an already-failed state.
                retainedActions.add(new Action<Row>(r, ++posInList));
                locationErrors.add(ex);
                locationErrorRows.add(posInList);
                it.remove();
                break; // Backward compat: we stop considering actions on location error.
            }

            // check whether this action can be taken now; if the total/per-server/per-region
            // task limits are reached, it is left for a later round
            if (canTakeOperation(loc, regionIncluded, serverIncluded)) {
                Action<Row> action = new Action<Row>(r, ++posInList);
                setNonce(ng, r, action);
                retainedActions.add(action);
                // TODO: replica-get is not supported on this path
                byte[] regionName = loc.getRegionInfo().getRegionName();
                // group actions that target the same region; at this point we only resolve which
                // region and region server each action goes to, the actual submit happens after the loop
                addAction(loc.getServerName(), regionName, action, actionsByServer, nonceGroup);
                it.remove();
            }
        }
    } while (retainedActions.isEmpty() && atLeastOne && (locationErrors == null));

    if (retainedActions.isEmpty()) return NO_REQS_RESULT;

    // every action now knows its target region and region server, so submit them in bulk
    return submitMultiActions(tableName, retainedActions, nonceGroup, callback, null, needResults,
        locationErrors, locationErrorRows, actionsByServer, pool);
}
```
So submit() resolves, for every Put in the list, which region and which region server it belongs to, and then submits the batch. This is where hbase.client.max.total.tasks comes in (default 100, the maximum number of concurrent client tasks): if more than 100 of these locate-and-submit tasks are already outstanding, the call has to wait. Now back to the original client batch-write code.
From the analysis above, the client handles the submitted List<Put> differently depending on how much buffer space it occupies:
- List<Put> size < hbase.client.write.buffer: getBufferedMutator().mutate(puts) returns without submitting anything; the actual submit happens in the subsequent flushCommits()
- hbase.client.write.buffer < List<Put> size < 2 * hbase.client.write.buffer: getBufferedMutator().mutate(puts) triggers backgroundFlushCommits(false), and flushCommits() then handles the rest
- List<Put> size > 2 * hbase.client.write.buffer: getBufferedMutator().mutate(puts) triggers backgroundFlushCommits(false), any remaining unsubmitted data stays buffered, and flushCommits() then submits it
Then, because HTable's autoFlush field defaults to true, whatever remains is submitted to the server in one final pass: flushCommits() calls getBufferedMutator().flush(), which calls BufferedMutatorImpl.backgroundFlushCommits(true), which in turn calls the ap.submit(tableName, buffer, true, null, false) seen above and then ap.waitForAllPreviousOpsAndReset(null) to wait for the results. That completes the client-side batch-write path; a sketch of the flush chain follows.
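A short sketch of that final-flush chain, assuming the 1.x API:

```java
// HTable: the final flush performed because autoFlush defaults to true
public void flushCommits() throws InterruptedIOException, RetriesExhaustedWithDetailsException {
  getBufferedMutator().flush();
}

// BufferedMutatorImpl: flush() is simply a synchronous backgroundFlushCommits
@Override
public void flush() throws InterruptedIOException, RetriesExhaustedWithDetailsException {
  backgroundFlushCommits(true);
}
```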
2.5 HConnectionImplementation.locateRegionInMeta
In the HTable.put(final List<Put> puts) analysis above there is one more piece worth attention: inside AsyncProcess.submit(TableName tableName, List<? extends Row> rows, boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults) we saw the call connection.locateRegion(tableName, r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID).
In essence this invokes HConnectionImplementation.locateRegion(final TableName tableName, final byte[] row, boolean useCache, boolean retry, int replicaId), the method that loads the region information of our table, and from there the interesting work happens in locateRegionInMeta(tableName, row, useCache, retry, replicaId).
From locateRegionInMeta we can see how HConnectionImplementation loads the metadata of our table the first time around. HBase keeps a metadata table hbase:meta (namespace hbase, table name meta) that stores the region metadata of every user-created table. On the first lookup the client queries hbase:meta, caches the result, and on subsequent lookups the table's region information is served straight from the cache.
3. Client-side tuning lessons from the source code
- Set hbase.client.write.buffer in the client configuration (default 2 MB) to enlarge the client-side write buffer;
- Submit in batches. The best batch size is hbase.client.write.buffer * 2 / Put size, where the Put size comes from put.heapSize(). With hbase.client.write.buffer = 2097152 and put.heapSize() = 1320, the best batch size is 2 * 2097152 / 1320 = 3177 records (see the sketch after this list);
- Write from multiple client threads concurrently;
- Run the client on a capable machine, otherwise the client itself becomes the bottleneck;
- If you can tolerate it, disable the WAL (Durability.SKIP_WAL, as in the examples above) for an additional speedup.
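A small worked example of the batch-size rule of thumb from the second bullet; writeBuffer, sample and bestBatchPutSize are illustrative names, and the heap size of a real Put depends on its contents:

```java
// derive a batch size from hbase.client.write.buffer and a representative Put
long writeBuffer = configuration.getLong("hbase.client.write.buffer", 2097152L);
Put sample = new Put(Bytes.toBytes("rowkey"));
sample.addImmutable(Bytes.toBytes("columnFamily1"), Bytes.toBytes("columnName1"),
    Bytes.toBytes("columnValue1"));
int bestBatchPutSize = (int) (2 * writeBuffer / sample.heapSize()); // about 3177 for a ~1320-byte Put
```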
4. HBase write performance test notes