MyCat Source Code Analysis Series: BufferPool and the Caching Mechanism
For more MyCat source code analysis, see the MyCat Source Code Analysis series.
BufferPool
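MyCat's buffers are java.nio.ByteBuffer instances, managed centrally by the BufferPool class; the related settings live in SystemConfig. First, the relevant concepts and configuration:
- Each buffer unit is called a chunk; the default chunk size (DEFAULT_BUFFER_CHUNK_SIZE) is 4096 bytes
- The total size of the BufferPool is DEFAULT_BUFFER_CHUNK_SIZE * processors * 1000, where processors is the number of processors (Runtime.getRuntime().availableProcessors())
- There are two kinds of buffers: buffers for local-cache threads and all other buffers. A local-cache thread is a thread whose name starts with "$_" (frankly, I do not know how such threads are created; pointers are welcome)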
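As a minimal sketch of the naming rule described above, the check below classifies a thread purely by its name prefix. The class name, constant name and thread names are made up for illustration; only the "$_" prefix comes from the text, and the sketch says nothing about where MyCat actually creates such threads.

```java
public class LocalCacheThreadCheck {
    // The "$_" prefix is taken from the text; the constant name is hypothetical.
    static final String LOCAL_BUF_THREAD_PREFIX = "$_";

    // Returns true when the given thread name marks a local-cache thread.
    public static boolean isLocalCacheThread(String threadName) {
        return threadName != null && threadName.startsWith(LOCAL_BUF_THREAD_PREFIX);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            String name = Thread.currentThread().getName();
            System.out.println(name + " -> local-cache thread? " + isLocalCacheThread(name));
        };
        Thread local = new Thread(task, "$_worker-0"); // hypothetical thread name
        Thread other = new Thread(task, "worker-1");   // hypothetical thread name
        local.start();
        other.start();
        local.join();
        other.join();
    }
}
```

Under this rule, any thread constructed with a name that begins with "$_" would be treated as a local-cache thread.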
The core fields of BufferPool are:
    private final ThreadLocalBufferPool localBufferPool;
    private final int chunkSize;
    private final ConcurrentLinkedQueue<ByteBuffer> items = new ConcurrentLinkedQueue<ByteBuffer>();
    private final long threadLocalCount;
    private final long capactiy;
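These fields mean the following:
- chunkSize: the size of each chunk
- capacity: the number of chunks, computed as the total BufferPool size / chunkSize (spelled capactiy in the field declaration)
- items: the queue of ByteBuffers, initially holding capacity buffers, each created with ByteBuffer.allocateDirect(chunkSize)
- threadLocalCount: the number of buffers reserved for each local-cache thread, computed as a percentage of capacity; it appears to equal the number of chunks allotted to each processor
- localBufferPool: the thread-local buffer pool, of type ThreadLocalBufferPool, which extends ThreadLocal<BufferQueue>. Each BufferQueue holds a similar linked list of ByteBuffers, also named items, whose capacity is fixed at threadLocalCount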
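The sizing arithmetic above can be worked through in a few lines. This is only a sketch: the class and method names are invented, and the percentage passed to threadLocalCount is a hypothetical parameter, since the text only says the value is derived from "some percentage" of capacity.

```java
public class BufferPoolSizing {
    static final int DEFAULT_BUFFER_CHUNK_SIZE = 4096; // bytes, as stated in the text

    // Total pool size in bytes: chunk size * processors * 1000.
    public static long totalSize(int processors) {
        return (long) DEFAULT_BUFFER_CHUNK_SIZE * processors * 1000;
    }

    // Number of chunks: total size divided by the chunk size.
    public static long capacity(long totalSize, int chunkSize) {
        return totalSize / chunkSize;
    }

    // Hypothetical: per-thread quota as a percentage of capacity.
    public static long threadLocalCount(long capacity, int percent) {
        return percent * capacity / 100;
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        long total = totalSize(processors);
        long cap = capacity(total, DEFAULT_BUFFER_CHUNK_SIZE);
        System.out.println("processors=" + processors
                + " totalBytes=" + total + " chunks=" + cap);
    }
}
```

With the default chunk size, capacity works out to processors * 1000 chunks; on a 4-processor machine that is 16,384,000 bytes split into 4000 chunks.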
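The rest of this section focuses on how buffers are allocated and recycled.
1. Allocating a Buffer
A buffer can be allocated with an explicit size or with the default size, via public ByteBuffer allocate(int size) and public ByteBuffer allocate() respectively. The implementation is: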
    public ByteBuffer allocate(int size) {
        if (size <= this.chunkSize) {
            return allocate();
        } else {
            LOGGER.warn("allocate buffer size large than default chunksize:"
                    + this.chunkSize + " he want " + size);
            return createTempBuffer(size);
        }
    }

    public ByteBuffer allocate() {
        ByteBuffer node = null;
        if (isLocalCacheThread()) {
            // allocate from threadlocal
            node = localBufferPool.get().poll();
            if (node != null) {
                return node;
            }
        }
        node = items.poll();
        if (node == null) {
            //newCreated++;
            newCreated.incrementAndGet();
            node = this.createDirectBuffer(chunkSize);
        }
        return node;
    }
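- allocate(): if the calling thread is a local-cache thread (isLocalCacheThread() returns true), it first tries to take a free ByteBuffer from localBufferPool. If the thread is not a local-cache thread, or its local queue is empty, it takes a free ByteBuffer from items; if that also fails, it calls createDirectBuffer(chunkSize) to create a new ByteBuffer: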
    private ByteBuffer createDirectBuffer(int size) {
        // for performance
        return ByteBuffer.allocateDirect(size);
    }
- allocate(size): if the requested size is no larger than chunkSize, it delegates to allocate(); otherwise it calls createTempBuffer(size) to create a temporary heap buffer:
    private ByteBuffer createTempBuffer(int size) {
        return ByteBuffer.allocate(size);
    }
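To make the two allocation paths concrete, here is a stripped-down sketch that keeps only the shared queue and leaves out the thread-local pool, logging and counters. AllocateSketch and pooled() are invented names for this illustration, not part of MyCat.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

public class AllocateSketch {
    private final int chunkSize;
    private final ConcurrentLinkedQueue<ByteBuffer> items = new ConcurrentLinkedQueue<>();

    // Pre-fills the shared queue with `capacity` direct buffers of chunkSize bytes.
    public AllocateSketch(int chunkSize, int capacity) {
        this.chunkSize = chunkSize;
        for (int i = 0; i < capacity; i++) {
            items.offer(ByteBuffer.allocateDirect(chunkSize));
        }
    }

    // Default-size allocation: reuse a pooled buffer, or create a new direct one.
    public ByteBuffer allocate() {
        ByteBuffer node = items.poll();
        return node != null ? node : ByteBuffer.allocateDirect(chunkSize);
    }

    // Sized allocation: oversized requests bypass the pool and get a heap buffer.
    public ByteBuffer allocate(int size) {
        return size <= chunkSize ? allocate() : ByteBuffer.allocate(size);
    }

    // Number of buffers currently sitting in the shared queue.
    public int pooled() {
        return items.size();
    }

    public static void main(String[] args) {
        AllocateSketch pool = new AllocateSketch(4096, 2);
        ByteBuffer small = pool.allocate(100);
        ByteBuffer big = pool.allocate(8192);
        System.out.println("small: direct=" + small.isDirect() + " capacity=" + small.capacity());
        System.out.println("big:   direct=" + big.isDirect() + " capacity=" + big.capacity());
        System.out.println("still pooled: " + pool.pooled());
    }
}
```

A request for 100 bytes still hands back a full 4096-byte direct chunk, while the 8192-byte request yields a heap buffer that never touches the pool.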
2. Recycling a Buffer
Buffers are recycled by calling recycle(). The relevant code is:
    public void recycle(ByteBuffer buffer) {
        if (!checkValidBuffer(buffer)) {
            return;
        }
        if (isLocalCacheThread()) {
            BufferQueue localQueue = localBufferPool.get();
            if (localQueue.snapshotSize() < threadLocalCount) {
                localQueue.put(buffer);
            } else {
                // recycle 3/4 thread local buffer
                items.addAll(localQueue.removeItems(threadLocalCount * 3 / 4));
                items.offer(buffer);
                sharedOptsCount++;
            }
        } else {
            sharedOptsCount++;
            items.offer(buffer);
        }
    }

    private boolean checkValidBuffer(ByteBuffer buffer) {
        // refuse to recycle null buffers and buffers whose capacity exceeds chunkSize
        if (buffer == null || !buffer.isDirect()) {
            return false;
        } else if (buffer.capacity() > chunkSize) {
            LOGGER.warn("cant' recycle a buffer large than my pool chunksize "
                    + buffer.capacity());
            return false;
        }
        totalCounts++;
        totalBytes += buffer.limit();
        buffer.clear();
        return true;
    }
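recycle() first calls checkValidBuffer() to decide whether the buffer qualifies for recycling (and later reuse). A buffer is rejected in any of these three cases:
- The buffer is null
- The buffer is not a direct buffer, meaning it was created by createTempBuffer() rather than createDirectBuffer()
- The buffer's capacity is greater than chunkSize
If the buffer qualifies and the calling thread is a local-cache thread (isLocalCacheThread() returns true), the buffer goes into localBufferPool when there is room; otherwise 3/4 of the buffers in localBufferPool are moved to items and this buffer is added to items as well. If the calling thread is not a local-cache thread, the buffer goes straight into items.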
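The recycle path can be sketched in isolation as follows. This is an approximation under stated assumptions: an ArrayDeque stands in for MyCat's BufferQueue, the statistics counters are dropped, and RecycleSketch, localSize() and sharedSize() are invented names.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.concurrent.ConcurrentLinkedQueue;

public class RecycleSketch {
    private final int chunkSize;
    private final long threadLocalCount;
    private final ConcurrentLinkedQueue<ByteBuffer> items = new ConcurrentLinkedQueue<>();
    // Stand-in for ThreadLocalBufferPool / BufferQueue (assumption of this sketch).
    private final ThreadLocal<ArrayDeque<ByteBuffer>> local =
            ThreadLocal.withInitial(ArrayDeque::new);

    public RecycleSketch(int chunkSize, long threadLocalCount) {
        this.chunkSize = chunkSize;
        this.threadLocalCount = threadLocalCount;
    }

    private static boolean isLocalCacheThread() {
        return Thread.currentThread().getName().startsWith("$_");
    }

    // Returns false when the buffer is rejected, true when it was pooled.
    public boolean recycle(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect() || buffer.capacity() > chunkSize) {
            return false;
        }
        buffer.clear();
        if (isLocalCacheThread()) {
            ArrayDeque<ByteBuffer> queue = local.get();
            if (queue.size() < threadLocalCount) {
                queue.offer(buffer);
            } else {
                // Local queue is full: spill 3/4 of the quota to the shared queue,
                // then put the incoming buffer into the shared queue too.
                long toMove = threadLocalCount * 3 / 4;
                for (long i = 0; i < toMove && !queue.isEmpty(); i++) {
                    items.offer(queue.poll());
                }
                items.offer(buffer);
            }
        } else {
            items.offer(buffer);
        }
        return true;
    }

    public int localSize() { return local.get().size(); }  // calling thread's queue
    public int sharedSize() { return items.size(); }

    public static void main(String[] args) throws InterruptedException {
        RecycleSketch pool = new RecycleSketch(4096, 4);
        Thread t = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                pool.recycle(ByteBuffer.allocateDirect(4096));
                System.out.println("local=" + pool.localSize() + " shared=" + pool.sharedSize());
            }
        }, "$_demo"); // hypothetical thread name carrying the "$_" prefix
        t.start();
        t.join();
    }
}
```

With a quota of 4, the first four recycled buffers stay in the thread's own queue; the fifth call finds the queue full, moves three buffers to the shared queue and adds the incoming one there, leaving one local and four shared.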
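That is how buffers are allocated and recycled. I do not yet understand the purpose of a separate buffer pool for local-cache threads, nor why 3/4 of the local buffers are moved on recycle.
Caching Mechanism
MyCat's caching mechanism saves the cost of recomputing routing information in certain scenarios by fetching the result directly from the corresponding cache.
The configuration file is cacheservice.properties. It can set a common type, size and expiry time for each kind of cache, and can also set parameters for individual tables. Three cache implementations are available: ehcache, leveldb and mapdb.
The cache pool abstraction is the CachePool interface; each concrete CachePool implementation is created by its corresponding CachePoolFactory: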
    public interface CachePool {
        public void putIfAbsent(Object key, Object value);
        public Object get(Object key);
        public void clearCache();
        public CacheStatic getCacheStatic();
        public long getMaxSize();
    }
CacheService is the cache service class. Its init() method reads the cache configuration file and creates the corresponding CachePoolFactory and CachePool instances:
    private void init() throws Exception {
        Properties props = new Properties();
        props.load(CacheService.class
                .getResourceAsStream("/cacheservice.properties"));
        final String poolFactoryPref = "factory.";
        final String poolKeyPref = "pool.";
        final String layedPoolKeyPref = "layedpool.";
        String[] keys = props.keySet().toArray(new String[0]);
        Arrays.sort(keys);
        for (String key : keys) {
            if (key.startsWith(poolFactoryPref)) {
                createPoolFactory(key.substring(poolFactoryPref.length()),
                        (String) props.get(key));
            } else if (key.startsWith(poolKeyPref)) {
                String cacheName = key.substring(poolKeyPref.length());
                String value = (String) props.get(key);
                String[] valueItems = value.split(",");
                if (valueItems.length < 3) {
                    throw new java.lang.IllegalArgumentException(
                            "invalid cache config ,key:" + key + " value:" + value);
                }
                String type = valueItems[0];
                int size = Integer.valueOf(valueItems[1]);
                int timeOut = Integer.valueOf(valueItems[2]);
                createPool(cacheName, type, size, timeOut);
            } else if (key.startsWith(layedPoolKeyPref)) {
                String cacheName = key.substring(layedPoolKeyPref.length());
                String value = (String) props.get(key);
                String[] valueItems = value.split(",");
                int index = cacheName.indexOf(".");
                if (index < 0) { // root layer
                    String type = valueItems[0];
                    int size = Integer.valueOf(valueItems[1]);
                    int timeOut = Integer.valueOf(valueItems[2]);
                    createLayeredPool(cacheName, type, size, timeOut);
                } else {
                    // root layers' children
                    String parent = cacheName.substring(0, index);
                    String child = cacheName.substring(index + 1);
                    CachePool pool = this.allPools.get(parent);
                    if (pool == null || !(pool instanceof LayerCachePool)) {
                        throw new java.lang.IllegalArgumentException(
                                "parent pool not exists or not layered cache pool:"
                                        + parent + " the child cache is:" + child);
                    }
                    int size = Integer.valueOf(valueItems[0]);
                    int timeOut = Integer.valueOf(valueItems[1]);
                    ((DefaultLayedCachePool) pool).createChildCache(child, size, timeOut);
                }
            }
        }
    }
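To show the key and value shapes that init() expects, the sketch below parses a small in-memory configuration using the same three prefixes and the same sort-then-dispatch loop. The sample entries (factory name, class name, table name and all numbers) are hypothetical and are not MyCat's shipped defaults; instead of creating pools, the sketch only records what it would create.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class CacheConfigSketch {
    // Hypothetical sample configuration, for illustration only.
    public static final String SAMPLE =
            "factory.demo=com.example.DemoCachePoolFactory\n"
          + "pool.SQLRouteCache=demo,10000,1800\n"
          + "layedpool.TableId2DataNodeCache=demo,10000,18000\n"
          + "layedpool.TableId2DataNodeCache.TESTDB_ORDERS=50000,18000\n";

    // Returns one description per key, in the order init() would process them.
    public static List<String> describe(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String[] keys = props.stringPropertyNames().toArray(new String[0]);
        Arrays.sort(keys);
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            String[] v = props.getProperty(key).split(",");
            if (key.startsWith("factory.")) {
                out.add("factory " + key.substring("factory.".length()) + " class=" + v[0]);
            } else if (key.startsWith("pool.")) {
                if (v.length < 3) {
                    throw new IllegalArgumentException("invalid cache config, key:" + key);
                }
                out.add("pool " + key.substring("pool.".length())
                        + " type=" + v[0] + " size=" + v[1] + " timeout=" + v[2]);
            } else if (key.startsWith("layedpool.")) {
                String name = key.substring("layedpool.".length());
                int dot = name.indexOf('.');
                if (dot < 0) { // root layer: type,size,timeout
                    out.add("layered " + name
                            + " type=" + v[0] + " size=" + v[1] + " timeout=" + v[2]);
                } else {       // child of a root layer: size,timeout only
                    out.add("child " + name.substring(dot + 1) + " of "
                            + name.substring(0, dot) + " size=" + v[0] + " timeout=" + v[1]);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        describe(SAMPLE).forEach(System.out::println);
    }
}
```

One thing the sketch makes visible is why the keys are sorted first: "factory." sorts before "layedpool." and "pool.", and a parent layered pool's key is a prefix of its children's keys, so factories and parents are handled before anything that depends on them.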
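MyCat configures three caches: SQLRouteCache, TableId2DataNodeCache and ER_SQL2PARENTID:
- SQLRouteCache:
  - Caches routing information by SQL statement. It is a CachePool; the key is the schema (virtual database) name + the SQL statement, and the value is the routing result, a RouteResultset
  - This cache applies only to SELECT statements. When a previously executed SQL statement runs again (a cache hit), the routing information is taken from the cache rather than recomputed. The route() method of RouteService contains the relevant code: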
    /**
     * For SELECT statements, check the cache first
     */
    if (sqlType == ServerParse.SELECT) {
        cacheKey = schema.getName() + stmt;
        rrs = (RouteResultset) sqlRouteCache.get(cacheKey);
        if (rrs != null) {
            return rrs;
        }
    }

    // ... (route computation omitted in this excerpt) ...

    if (rrs != null && sqlType == ServerParse.SELECT && rrs.isCacheAble()) {
        sqlRouteCache.putIfAbsent(cacheKey, rrs);
    }
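The lookup-then-populate pattern in that excerpt can be reproduced with a plain map. In this sketch a String stands in for RouteResultset, computeRoute() is a made-up placeholder for the real routing logic, and the isCacheAble() check is left out for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class RouteCacheSketch {
    private final Map<String, String> sqlRouteCache = new ConcurrentHashMap<>();
    // Counts how often the (expensive) route computation actually runs.
    public final AtomicInteger computed = new AtomicInteger();

    // Hypothetical placeholder for real route computation.
    private String computeRoute(String schema, String stmt) {
        computed.incrementAndGet();
        return "dn" + (Math.floorMod(stmt.hashCode(), 2) + 1);
    }

    public String route(String schema, String stmt, boolean isSelect) {
        String cacheKey = schema + stmt; // key = schema name + SQL text
        if (isSelect) {
            String cached = sqlRouteCache.get(cacheKey);
            if (cached != null) {
                return cached;           // cache hit: skip route computation
            }
        }
        String rrs = computeRoute(schema, stmt);
        if (isSelect) {
            sqlRouteCache.putIfAbsent(cacheKey, rrs);
        }
        return rrs;
    }

    public static void main(String[] args) {
        RouteCacheSketch service = new RouteCacheSketch();
        String sql = "select * from orders where id = 1";
        service.route("TESTDB", sql, true);
        service.route("TESTDB", sql, true);
        System.out.println("route computations for two identical SELECTs: "
                + service.computed.get());
    }
}
```

Because the key is the raw SQL text, two statements that differ only in whitespace or literal values would occupy separate cache entries in this sketch.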
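- TableId2DataNodeCache:
  - Caches the mapping from a table's primary key to a datanode. It is a LayerCachePool, i.e. a two-level CachePool. First level: the key is the schema name + table name and the value is a CachePool. Second level: the key is the primary key value and the value is the datanode name
  - The point of this cache: when the sharding column differs from the primary key column, a query by primary key value alone cannot locate the shard (it has to be sent to every shard). With the cache in place, the shard name can be looked up from the primary key value
  - Entries are added in rowResponse() of MultiNodeQueryHandler: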
    // cache primaryKey-> dataNode
    if (primaryKeyIndex != -1) {
        RowDataPacket rowDataPkg = new RowDataPacket(fieldCount);
        rowDataPkg.read(row);
        String primaryKey = new String(rowDataPkg.fieldValues.get(primaryKeyIndex));
        LayerCachePool pool = MycatServer.getInstance()
                .getRouterservice().getTableId2DataNodeCache();
        pool.putIfAbsent(priamaryKeyTable, primaryKey, dataNode);
    }
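The two-level structure can be pictured as a map of maps. The sketch below assumes nested ConcurrentHashMaps in place of MyCat's LayerCachePool and ignores size limits and expiry; the table key and datanode names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LayeredCacheSketch {
    // First level: schema + table name -> second-level cache.
    // Second level: primary key value -> datanode name.
    private final Map<String, Map<String, String>> layers = new ConcurrentHashMap<>();

    public void putIfAbsent(String table, String primaryKey, String dataNode) {
        layers.computeIfAbsent(table, t -> new ConcurrentHashMap<>())
              .putIfAbsent(primaryKey, dataNode);
    }

    // Returns the cached datanode, or null when the table or key is unknown.
    public String get(String table, String primaryKey) {
        Map<String, String> second = layers.get(table);
        return second == null ? null : second.get(primaryKey);
    }

    public static void main(String[] args) {
        LayeredCacheSketch cache = new LayeredCacheSketch();
        cache.putIfAbsent("TESTDB_ORDERS", "1001", "dn2"); // hypothetical values
        System.out.println("1001 -> " + cache.get("TESTDB_ORDERS", "1001"));
        System.out.println("9999 -> " + cache.get("TESTDB_ORDERS", "9999"));
    }
}
```

A miss (null) corresponds to the case where the query still has to be sent to every shard.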
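- ER_SQL2PARENTID: used only for ER relationships. When a row is inserted into a child table, the child's shard is determined from the parent-child join column; the next time, the shard can be read directly from the cache. The key is the schema name + the SQL statement, and the value is the datanode name
Inspecting the caches: connect to MyCat through the management port 9066 and run mysql> show @@cache; to see the caches currently configured in the system, along with their sizes, access counts and hit statistics.
Out of respect for original work, please credit the source if you repost this article:
http://www.cnblogs.com/fernandolee24/p/5198192.html. Thank you.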