Solr 4.8.0 Source Code Analysis (21): SolrCloud Recovery Strategies (Part 2)
2014-12-06 00:08 追风的蓝宝
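Preface: The previous post, "Solr 4.8.0 Source Code Analysis (20): SolrCloud Recovery Strategies (Part 1)", mentioned that Recovery has two strategies, PeerSync and Replication. This post looks at the PeerSync strategy in detail.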
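PeerSync is Solr's preferred strategy. Whenever recovery is needed, Solr first checks whether it can use PeerSync, and only moves on to Replication when PeerSync is configured to be skipped or when PeerSync finds that its preconditions are not met. This follows from what PeerSync is designed for: short interruptions after which only a small number of documents need to be recovered. It therefore recovers quickly and has little impact on Solr. Replication is meant for long interruptions after which many documents need to be recovered, and it takes longer.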
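The decision described above can be sketched roughly as follows. This is a minimal illustration only: `RecoveryChoiceSketch`, `Strategy`, `recover`, and the `skipPeerSync` flag are hypothetical names invented for this sketch, not Solr's actual classes or fields.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "try PeerSync first, fall back to Replication".
// None of these names come from Solr itself.
final class RecoveryChoiceSketch {
    enum Strategy { PEER_SYNC, REPLICATION }

    /**
     * @param skipPeerSync true when PeerSync is configured to be skipped
     * @param peerSync     runs the PeerSync attempt and returns true on success
     * @return the strategy that ends up performing the recovery
     */
    static Strategy recover(boolean skipPeerSync, BooleanSupplier peerSync) {
        // The attempt only runs when PeerSync is not skipped (short-circuit &&).
        if (!skipPeerSync && peerSync.getAsBoolean()) {
            return Strategy.PEER_SYNC;
        }
        // Skipped, or the preconditions were not met: fall back.
        return Strategy.REPLICATION;
    }

    public static void main(String[] args) {
        System.out.println(recover(false, () -> true));   // PEER_SYNC
        System.out.println(recover(false, () -> false));  // REPLICATION
        System.out.println(recover(true, () -> true));    // REPLICATION
    }
}
```

Passing the attempt in as a `BooleanSupplier` keeps the sketch independent of any real Solr type while still showing that the attempt is not even started when it is skipped.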
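The previous post already covered the overall Recovery flow, so this post goes straight to the PeerSync flow, which is shown in the figure below and walked through step by step: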
- First, Solr sends a getVersions request to every replica to fetch its most recent nUpdates versions (100 by default).
    // Fire off the requests before getting our own recent updates (for better concurrency)
    // This also allows us to avoid getting updates we don't need... if we got our updates and then got their updates, they would
    // have newer stuff that we also had (assuming updates are going on and are being forwarded).
    for (String replica : replicas) {
      requestVersions(replica);
    }

    private void requestVersions(String replica) {
      SyncShardRequest sreq = new SyncShardRequest();
      sreq.purpose = 1;
      sreq.shards = new String[]{replica};
      sreq.actualShards = sreq.shards;
      sreq.params = new ModifiableSolrParams();
      sreq.params.set("qt","/get");
      sreq.params.set("distrib",false);
      sreq.params.set("getVersions",nUpdates);
      shardHandler.submit(sreq, replica, sreq.params);
    }
- Fetch this replica's own most recent nUpdates versions (100 by default) and sort them.
    recentUpdates = ulog.getRecentUpdates();
    try {
      ourUpdates = recentUpdates.getVersions(nUpdates);
    } finally {
      recentUpdates.close();
    }

    Collections.sort(ourUpdates, absComparator);
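The excerpt does not show `absComparator` itself. Because later code treats the last element of `ourUpdates` as the smallest update, the list appears to be ordered by absolute value with the largest (newest) version first. A minimal sketch under that assumption follows; `AbsSortSketch`, `ABS_DESC`, and `sortNewestFirst` are hypothetical names, and the negative value stands for a delete, which Solr records with a negated version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the sort step; not the Solr source.
final class AbsSortSketch {
    // Assumed ordering: larger absolute value (newer version) sorts first.
    static final Comparator<Long> ABS_DESC =
        (a, b) -> Long.compare(Math.abs(b), Math.abs(a));

    // Returns a sorted copy so the caller's list is left untouched.
    static List<Long> sortNewestFirst(List<Long> versions) {
        List<Long> copy = new ArrayList<>(versions);
        copy.sort(ABS_DESC);
        return copy;
    }

    public static void main(String[] args) {
        // -105 is ordered by its absolute value, so it comes first.
        System.out.println(sortNewestFirst(Arrays.asList(101L, -105L, 103L)));
        // prints [-105, 103, 101]
    }
}
```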
- Obtain startingVersions, the version information recorded before recovery began. Comparing startingVersions with ourUpdates shows whether index updates arrived while recovery was in progress.
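- Check whether ourUpdates and startingVersions overlap. Both lists are capped at nUpdates versions, so this amounts to checking whether more than nUpdates updates arrived in the meantime. If too many updates arrived, that is, ourUpdates and startingVersions no longer overlap, PeerSync fails and recovery falls back to Replication.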
    // now make sure that the starting updates overlap our updates
    // there shouldn't be reorders, so any overlap will do.

    long smallestNewUpdate = Math.abs(ourUpdates.get(ourUpdates.size()-1));

    if (Math.abs(startingVersions.get(0)) < smallestNewUpdate) {
      log.warn(msg() + "too many updates received since start - startingUpdates no longer overlaps with our currentUpdates");
      return false;
    }
- If ourUpdates and startingVersions do overlap, merge the two lists, that is, take their union.
    // let's merge the lists
    List<Long> newList = new ArrayList<>(ourUpdates);
    for (Long ver : startingVersions) {
      if (Math.abs(ver) < smallestNewUpdate) {
        newList.add(ver);
      }
    }

    ourUpdates = newList;
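The overlap check and the merge can be tried out together on small lists. The sketch below mirrors the two excerpts above but is not the Solr source: `OverlapMergeSketch` and `merge` are hypothetical names, and both inputs are assumed to be non-empty and sorted by absolute value, largest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch combining the overlap check and the merge step.
final class OverlapMergeSketch {
    /**
     * Returns the merged list, or Optional.empty() when the two windows no
     * longer overlap (the case in which PeerSync gives up).
     * Assumes both lists are non-empty and sorted largest-absolute-value first.
     */
    static Optional<List<Long>> merge(List<Long> ourUpdates, List<Long> startingVersions) {
        long smallestNewUpdate = Math.abs(ourUpdates.get(ourUpdates.size() - 1));

        // The newest starting version is older than our oldest current update:
        // more than a window's worth of updates arrived since recovery started.
        if (Math.abs(startingVersions.get(0)) < smallestNewUpdate) {
            return Optional.empty();
        }

        // Append only the starting versions that are older than everything
        // already in ourUpdates, so nothing is added twice.
        List<Long> merged = new ArrayList<>(ourUpdates);
        for (Long ver : startingVersions) {
            if (Math.abs(ver) < smallestNewUpdate) {
                merged.add(ver);
            }
        }
        return Optional.of(merged);
    }

    public static void main(String[] args) {
        List<Long> ours = Arrays.asList(110L, 109L, 108L);
        System.out.println(merge(ours, Arrays.asList(108L, 107L, 106L)));
        // prints Optional[[110, 109, 108, 107, 106]]
        System.out.println(merge(ours, Arrays.asList(105L, 104L, 103L)));
        // prints Optional.empty
    }
}
```

Because the appended versions are all older than `smallestNewUpdate`, the merged list stays sorted without a second sort.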
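- If this replica's versions are older than another replica's, PeerSync fails and recovery falls back to Replication. The comparison does not use the maximum or minimum version; it compares the versions found at the 0.2 and 0.8 positions of each sorted list.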
    long otherHigh = percentile(otherVersions, .2f);
    long otherLow = percentile(otherVersions, .8f);

    if (ourHighThreshold < otherLow) {
      // Small overlap between version windows and ours is older
      // This means that we might miss updates if we attempted to use this method.
      // Since there exists just one replica that is so much newer, we must
      // fail the sync.
      log.info(msg() + " Our versions are too old. ourHighThreshold="+ourHighThreshold + " otherLowThreshold="+otherLow);
      return false;
    }
- If this replica's versions are newer than the other replica's, no recovery is needed and PeerSync exits immediately, reporting success.
    if (ourLowThreshold > otherHigh) {
      // Small overlap between windows and ours is newer.
      // Using this list to sync would result in requesting/replaying results we don't need
      // and possibly bringing deleted docs back to life.
      log.info(msg() + " Our versions are newer. ourLowThreshold="+ourLowThreshold + " otherHigh="+otherHigh);
      return true;
    }
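The two threshold checks are easier to follow with concrete numbers. The sketch below makes several assumptions: `percentile` is taken to return the absolute value of the element at index `size * frac` of a list sorted largest first (the excerpt does not show its body), and our own thresholds are computed from our own list. `WindowCompareSketch`, `Outcome`, `compare`, and `descending` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the version-window comparison; not the Solr source.
final class WindowCompareSketch {
    enum Outcome { TOO_OLD, ALREADY_NEWER, NEEDS_DIFF }

    // Assumed helper: absolute value of the element at position size * frac.
    // With a largest-first list, frac 0.2 is near the top, 0.8 near the bottom.
    static long percentile(List<Long> sortedDesc, float frac) {
        int index = (int) (sortedDesc.size() * frac);
        return Math.abs(sortedDesc.get(index));
    }

    static Outcome compare(List<Long> ours, List<Long> others) {
        long ourHighThreshold = percentile(ours, 0.2f);
        long ourLowThreshold = percentile(ours, 0.8f);
        long otherHigh = percentile(others, 0.2f);
        long otherLow = percentile(others, 0.8f);

        if (ourHighThreshold < otherLow) return Outcome.TOO_OLD;        // sync must fail
        if (ourLowThreshold > otherHigh) return Outcome.ALREADY_NEWER;  // nothing to fetch
        return Outcome.NEEDS_DIFF;                                      // windows overlap
    }

    // Builds [highest, highest-1, ...] with the given number of elements.
    static List<Long> descending(long highest, int count) {
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add(highest - i);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> ours = descending(20, 10);                // 20..11: high=18, low=12
        System.out.println(compare(ours, descending(40, 10))); // TOO_OLD (otherLow=32)
        System.out.println(compare(ours, descending(10, 10))); // ALREADY_NEWER (otherHigh=8)
        System.out.println(compare(ours, descending(25, 10))); // NEEDS_DIFF
    }
}
```

Comparing positions inside the window rather than its extremes tolerates a few reordered or outlying versions at either end, which seems to be the motivation for the 0.2 and 0.8 cut-offs.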
- Take the difference between the other replica's versions and this replica's versions to find the versions this replica is missing.
    for (Long otherVersion : otherVersions) {
      // stop when the entries get old enough that reorders may lead us to see updates we don't need
      if (!completeList && Math.abs(otherVersion) < ourLowThreshold) break;

      if (ourUpdateSet.contains(otherVersion) || requestedUpdateSet.contains(otherVersion)) {
        // we either have this update, or already requested it
        // TODO: what if the shard we previously requested this from returns failure (because it goes
        // down)
        continue;
      }

      toRequest.add(otherVersion);
      requestedUpdateSet.add(otherVersion);
    }
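The loop above can be run in isolation to see which versions end up being requested. In this sketch `MissingVersionsSketch` and `missing` are hypothetical names, and `completeList` is assumed to mean that the other replica returned its complete list of versions, so the low-threshold cut-off does not apply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the difference step; not the Solr source.
final class MissingVersionsSketch {
    /**
     * @param otherVersions      the other replica's versions, largest absolute value first
     * @param ourUpdateSet       versions this replica already has
     * @param requestedUpdateSet versions already requested from some replica (mutated here)
     * @param ourLowThreshold    versions older than this are ignored unless completeList is true
     * @param completeList       assumed: the other replica returned all the versions it has
     */
    static List<Long> missing(List<Long> otherVersions, Set<Long> ourUpdateSet,
                              Set<Long> requestedUpdateSet, long ourLowThreshold,
                              boolean completeList) {
        List<Long> toRequest = new ArrayList<>();
        for (Long otherVersion : otherVersions) {
            // Stop once entries are old enough that reorders could mislead us.
            if (!completeList && Math.abs(otherVersion) < ourLowThreshold) break;
            // Skip what we already have or have already asked for.
            if (ourUpdateSet.contains(otherVersion) || requestedUpdateSet.contains(otherVersion)) {
                continue;
            }
            toRequest.add(otherVersion);
            requestedUpdateSet.add(otherVersion);
        }
        return toRequest;
    }

    public static void main(String[] args) {
        Set<Long> ours = new HashSet<>(Arrays.asList(109L, 107L, 106L, 105L));
        Set<Long> requested = new HashSet<>();
        List<Long> other = Arrays.asList(110L, 109L, -108L, 107L, 101L);
        System.out.println(missing(other, ours, requested, 105L, false));
        // prints [110, -108]
    }
}
```

Sharing `requestedUpdateSet` across replicas is what keeps the same version from being requested twice when several replicas report it.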
- Finally, send a getUpdates request to the other replica to fetch the documents for the versions computed above. This completes the PeerSync process.
    private boolean requestUpdates(ShardResponse srsp, List<Long> toRequest) {
      String replica = srsp.getShardRequest().shards[0];

      log.info(msg() + "Requesting updates from " + replica + "n=" + toRequest.size() + " versions=" + toRequest);

      // reuse our original request object
      ShardRequest sreq = srsp.getShardRequest();

      sreq.purpose = 0;
      sreq.params = new ModifiableSolrParams();
      sreq.params.set("qt", "/get");
      sreq.params.set("distrib", false);
      sreq.params.set("getUpdates", StrUtils.join(toRequest, ','));
      sreq.params.set("onlyIfActive", onlyIfActive);
      sreq.responses.clear();  // needs to be zeroed for correct correlation to occur

      shardHandler.submit(sreq, sreq.shards[0], sreq.params);

      return true;
    }
Summary:
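This post walked through the PeerSync process in detail. As the steps show, recovery under the PeerSync strategy is fairly simple. The next post will cover the Replication strategy, which is more complex than PeerSync.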
When reposting, please credit the source: http://www.cnblogs.com/rcfeng/