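ZooKeeper (2): The File System
In this section we look at how ZooKeeper's file system is implemented.
Tree structure
To speed up operations on a given node, ZooKeeper stores the tree's data in a HashMap: the key is the node's path and the value is the node's data.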
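The idea of a path-keyed map can be sketched as follows. This is a minimal illustration, not ZooKeeper's code: the class FlatTree, its Node type, and the create and get methods are all hypothetical names. It assumes absolute paths such as /app/config and shows that both the node and its parent are found with a single map lookup each.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a path-keyed tree (not ZooKeeper's DataTree).
class FlatTree {
    static class Node {
        byte[] data;
        // Names (not full paths) of the direct children
        final Set<String> children = new HashSet<>();

        Node(byte[] data) {
            this.data = data;
        }
    }

    // Full path -> node, so any node is reachable with one lookup
    private final Map<String, Node> nodes = new HashMap<>();

    FlatTree() {
        nodes.put("/", new Node(new byte[0]));
    }

    // Create a node under an existing parent.
    void create(String path, byte[] data) {
        if (path == null || !path.startsWith("/") || path.length() < 2 || path.endsWith("/")) {
            throw new IllegalArgumentException("Invalid path " + path);
        }
        if (nodes.containsKey(path)) {
            throw new IllegalArgumentException("Node exists " + path);
        }
        int slash = path.lastIndexOf('/');
        String parentPath = (slash == 0) ? "/" : path.substring(0, slash);
        String childName = path.substring(slash + 1);
        Node parent = nodes.get(parentPath);
        if (parent == null) {
            throw new IllegalArgumentException("No parent for " + path);
        }
        parent.children.add(childName);
        nodes.put(path, new Node(data));
    }

    // Returns null when the path does not exist.
    Node get(String path) {
        return nodes.get(path);
    }
}
```

The trade-off is that a lookup by exact path is cheap, while questions about prefixes of a path need a separate structure.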
Tree node (DataNode)
import java.util.Set;

public class DataNode implements Record {
    // Parent node
    DataNode parent;
    // Node data
    byte data[];
    // Node ACL (stored as a reference id)
    Long acl;
    // Node status information
    public StatPersisted stat;
    // Names of the child nodes
    private Set<String> children = null;

}
Tree node status (StatPersisted)
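public class StatPersisted implements Record {
    // zxid of the transaction that created this node
    private long czxid;
    // zxid of the transaction that last modified this node
    private long mzxid;
    // Creation time
    private long ctime;
    // Time of the last modification
    private long mtime;
    // Version number of the node data
    private int version;
    // Version number of the children
    private int cversion;
    // Version number of the ACL
    private int aversion;
    // Session id of the owner if this is an ephemeral node, otherwise 0
    private long ephemeralOwner;
    // zxid of the last change to the list of children
    private long pzxid;
}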
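Quota management
ZooKeeper lets you set quotas on specific paths, which are checked when nodes are created or modified. In the implementation, the quotas themselves are stored in the file system, together with the node's current usage. Quota data lives under fixed paths: /zookeeper/quota/{node path}/zookeeper_limits is the node holding the limit, and /zookeeper/quota/{node path}/zookeeper_stats is the node holding the current usage. Only one quota limit may exist along any single node path.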
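As a small illustration of the path layout described above, the following sketch builds the two bookkeeping paths for a node. The class QuotaPaths and its two methods are hypothetical helpers written for this example, and the sketch assumes the node path is absolute and is appended directly after /zookeeper/quota.

```java
// Hypothetical helper that builds the quota bookkeeping paths for a node.
class QuotaPaths {
    static final String QUOTA_ROOT = "/zookeeper/quota";
    static final String LIMIT_NODE = "zookeeper_limits";
    static final String STAT_NODE = "zookeeper_stats";

    // Path of the node that stores the configured limit
    static String limitPath(String nodePath) {
        return QUOTA_ROOT + nodePath + "/" + LIMIT_NODE;
    }

    // Path of the node that stores the current usage
    static String statPath(String nodePath) {
        return QUOTA_ROOT + nodePath + "/" + STAT_NODE;
    }
}
```

For a node at /app/db, the limit would then sit at /zookeeper/quota/app/db/zookeeper_limits and the usage at /zookeeper/quota/app/db/zookeeper_stats.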
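When operating on a node, we need to know which node along its path has a quota set. Because the tree itself is stored in a hash map, it is awkward to search by path prefix, so a separate tree structure (a trie) is used to record which nodes have a quota configured.
Quota paths (PathTrie)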
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class PathTrie {
    // Root node of the trie
    private final TrieNode rootNode;

    // A node of the trie
    static class TrieNode {
        // Whether a quota is set on this node: false = no, true = yes
        boolean property = false;
        // Child nodes, keyed by path component
        final HashMap<String, TrieNode> children;
        // Parent node
        TrieNode parent = null;

        // (The constructor and the accessors used below, such as getChild,
        // addChild, getChildren, getParent, setParent, getProperty and
        // setProperty, are omitted in this excerpt.)

        // Remove the quota marker of a child node
        void deleteChild(String childName) {
            synchronized (children) {
                if (!children.containsKey(childName)) {
                    return;
                }
                TrieNode childNode = children.get(childName);
                // Remove the child node outright when the check below passes
                // (as quoted, it tests for exactly one child); otherwise
                // only reset its property flag to false
                if (childNode.getChildren().length == 1) {
                    childNode.setParent(null);
                    children.remove(childName);
                } else {
                    childNode.setProperty(false);
                }
            }
        }
    }

    // Add a quota node
    public void addPath(String path) {
        if (path == null) {
            return;
        }
        String[] pathComponents = path.split("/");
        TrieNode parent = rootNode;
        String part = null;
        if (pathComponents.length <= 1) {
            throw new IllegalArgumentException("Invalid path " + path);
        }
        // Walk down the trie, creating missing nodes along the way
        for (int i = 1; i < pathComponents.length; i++) {
            part = pathComponents[i];
            if (parent.getChild(part) == null) {
                parent.addChild(part, new TrieNode(parent));
            }
            parent = parent.getChild(part);
        }
        parent.setProperty(true);
    }

    // Delete a quota node
    public void deletePath(String path) {
        if (path == null) {
            return;
        }
        String[] pathComponents = path.split("/");
        TrieNode parent = rootNode;
        String part = null;
        if (pathComponents.length <= 1) {
            throw new IllegalArgumentException("Invalid path " + path);
        }
        for (int i = 1; i < pathComponents.length; i++) {
            part = pathComponents[i];
            if (parent.getChild(part) == null) {
                // The path is not in the trie, nothing to delete
                return;
            }
            parent = parent.getChild(part);
        }
        TrieNode realParent = parent.getParent();
        realParent.deleteChild(part);
    }

    // Return the longest prefix of the given path that has a quota set
    public String findMaxPrefix(String path) {
        if (path == null) {
            return null;
        }
        if ("/".equals(path)) {
            return path;
        }
        String[] pathComponents = path.split("/");
        TrieNode parent = rootNode;
        List<String> components = new ArrayList<String>();
        if (pathComponents.length <= 1) {
            throw new IllegalArgumentException("Invalid path " + path);
        }
        int i = 1;
        String part = null;
        StringBuilder sb = new StringBuilder();
        // Index (into components) of the deepest node with a quota
        int lastindex = -1;
        while ((i < pathComponents.length)) {
            if (parent.getChild(pathComponents[i]) != null) {
                part = pathComponents[i];
                parent = parent.getChild(part);
                components.add(part);
                if (parent.getProperty()) {
                    lastindex = i - 1;
                }
            } else {
                break;
            }
            i++;
        }
        for (int j = 0; j < (lastindex + 1); j++) {
            sb.append("/" + components.get(j));
        }
        return sb.toString();
    }
}
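To make the longest-prefix lookup easier to try out, here is a stripped-down, self-contained sketch of the same idea. MiniPathTrie and its Node class are hypothetical names, and the sketch leaves out deletion and synchronization; it only shows how marking a path and then walking a longer path yields the deepest marked ancestor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified trie that only supports add and longest-prefix lookup.
class MiniPathTrie {
    private static class Node {
        // True when a quota is set on this path
        boolean marked;
        final Map<String, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Mark the given absolute path as carrying a quota.
    void addPath(String path) {
        Node cur = root;
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty component before the leading slash
            }
            cur = cur.children.computeIfAbsent(part, k -> new Node());
        }
        cur.marked = true;
    }

    // Return the deepest marked prefix of path, or "" when there is none.
    String findMaxPrefix(String path) {
        Node cur = root;
        StringBuilder walked = new StringBuilder();
        String best = "";
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue;
            }
            cur = cur.children.get(part);
            if (cur == null) {
                break; // the rest of the path is not in the trie
            }
            walked.append('/').append(part);
            if (cur.marked) {
                best = walked.toString();
            }
        }
        return best;
    }
}
```

With a quota marked on /a/b, looking up /a/b/c/d returns /a/b, while looking up /a or /x returns the empty string because no marked node lies on those paths.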
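Watcher management
ZooKeeper lets clients watch a given path; when that path changes, the watcher performs the corresponding action. This works by associating paths with watchers and calling the relevant watcher's method whenever an operation is performed on a watched path.
Watch manager (WatchManager)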
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.Watcher.Event.KeeperState;

public class WatchManager {
    // Key: path; value: the set of watchers registered on that path
    private final HashMap<String, HashSet<Watcher>> watchTable =
        new HashMap<String, HashSet<Watcher>>();
    // Key: watcher; value: the set of paths that watcher is registered on.
    // Two hash maps are kept because watchers and paths are many-to-many;
    // this way a lookup is fast whether it starts from a watcher or a path.
    private final HashMap<Watcher, HashSet<String>> watch2Paths =
        new HashMap<Watcher, HashSet<String>>();

    public synchronized void addWatch(String path, Watcher watcher) {
        // Add the watcher to watchTable
        HashSet<Watcher> list = watchTable.get(path);
        if (list == null) {
            list = new HashSet<Watcher>(4);
            watchTable.put(path, list);
        }
        list.add(watcher);
        // Add the path to watch2Paths
        HashSet<String> paths = watch2Paths.get(watcher);
        if (paths == null) {
            paths = new HashSet<String>();
            watch2Paths.put(watcher, paths);
        }
        paths.add(path);
    }

    public synchronized void removeWatcher(Watcher watcher) {
        // Remove the watcher from both watch2Paths and watchTable
        HashSet<String> paths = watch2Paths.remove(watcher);
        if (paths == null) {
            return;
        }
        for (String p : paths) {
            HashSet<Watcher> list = watchTable.get(p);
            if (list != null) {
                list.remove(watcher);
                if (list.size() == 0) {
                    watchTable.remove(p);
                }
            }
        }
    }

    // Trigger the watchers registered on a path
    public Set<Watcher> triggerWatch(String path, EventType type, Set<Watcher> supress) {
        WatchedEvent e = new WatchedEvent(type,
                KeeperState.SyncConnected, path);
        HashSet<Watcher> watchers;
        synchronized (this) {
            // ZooKeeper notifications are one-shot: once a path triggers a
            // notification, its watchers are removed from both hash maps.
            watchers = watchTable.remove(path);
            // Nothing is registered on this path, so there is nothing to notify
            if (watchers == null || watchers.isEmpty()) {
                return null;
            }
            for (Watcher w : watchers) {
                HashSet<String> paths = watch2Paths.get(w);
                if (paths != null) {
                    paths.remove(path);
                }
            }
        }
        // Notify outside the lock, skipping any suppressed watchers
        for (Watcher w : watchers) {
            if (supress != null && supress.contains(w)) {
                continue;
            }
            w.process(e);
        }
        return watchers;
    }
}
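The one-shot behaviour is the part that most often surprises users, so here is a minimal sketch that isolates it. OneShotWatches and its Listener interface are hypothetical and deliberately much simpler than WatchManager (there is no reverse index and no event object); the sketch only demonstrates that a listener fires once and must be registered again to receive further notifications.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified registry showing one-shot notification semantics.
class OneShotWatches {
    interface Listener {
        void onEvent(String path);
    }

    private final Map<String, Set<Listener>> byPath = new HashMap<>();

    synchronized void add(String path, Listener listener) {
        byPath.computeIfAbsent(path, k -> new HashSet<>()).add(listener);
    }

    // Fire and forget: the listeners for the path are removed before being called.
    // Returns how many listeners were notified.
    int trigger(String path) {
        Set<Listener> fired;
        synchronized (this) {
            fired = byPath.remove(path);
        }
        if (fired == null) {
            return 0;
        }
        // Callbacks run outside the lock so a slow listener does not block registration
        for (Listener listener : fired) {
            listener.onEvent(path);
        }
        return fired.size();
    }
}
```

Triggering the same path twice notifies a listener only the first time; the second call finds nothing registered.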
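Ephemeral nodes
ZooKeeper has a class of nodes that are removed once the session that created them ends. When it creates such a node, ZooKeeper records the mapping between the node and its session, and when the session ends it deletes those nodes.
Ending a session (DataTree.killSession)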
// Mapping from a session id to the ephemeral nodes owned by that session
private final Map<Long, HashSet<String>> ephemerals =
    new ConcurrentHashMap<Long, HashSet<String>>();

// Close a session
void killSession(long session, long zxid) {
    // Once the session has ended, delete its ephemeral nodes
    HashSet<String> list = ephemerals.remove(session);
    if (list != null) {
        for (String path : list) {
            try {
                deleteNode(path, zxid);
            } catch (NoNodeException e) {
                LOG.warn("Ignoring NoNodeException for path " + path
                        + " while removing ephemeral for dead session 0x"
                        + Long.toHexString(session));
            }
        }
    }
}
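The bookkeeping behind this can be sketched in isolation. EphemeralIndex, register and closeSession are hypothetical names chosen for the example; instead of deleting nodes, closeSession simply returns the paths that a caller would then have to delete.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of session -> ephemeral path bookkeeping.
class EphemeralIndex {
    private final Map<Long, Set<String>> bySession = new ConcurrentHashMap<>();

    // Remember that the given session owns the ephemeral node at path.
    void register(long sessionId, String path) {
        Set<String> paths = bySession.computeIfAbsent(sessionId, k -> new HashSet<>());
        // The inner set is a plain HashSet, so guard it explicitly
        synchronized (paths) {
            paths.add(path);
        }
    }

    // Forget the session and return the paths that should now be deleted.
    Set<String> closeSession(long sessionId) {
        Set<String> paths = bySession.remove(sessionId);
        return paths == null ? new HashSet<>() : paths;
    }
}
```

Closing a session a second time returns an empty set, which matches the null check in killSession above.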
Access control
Every ZooKeeper node stores which users may access it and which operations they may perform.
Permissions (ACL)
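public class ACL implements Record {
    // perms holds the permissions. There are five of them: READ (may read),
    // WRITE (may write), CREATE (may create child nodes), DELETE (may delete
    // child nodes) and ADMIN (administrative rights). Each bit of perms
    // stands for one permission.
    private int perms;
    // id is the identity the permissions are granted to.
    private org.apache.zookeeper.data.Id id;
}

public class Id implements Record {
    // scheme is the authentication scheme. There are five schemes:
    // digest (username and password; the id is user:password); auth();
    // ip (by IP address; the id is the IP address); world (the id is fixed
    // to anyone, granting access to all clients); super (the corresponding
    // id has superuser privileges).
    private String scheme;
    private String id;
}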
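Since each bit of perms stands for one permission, checking a permission is a bitwise AND. The sketch below illustrates this; the class PermBits and its allows method are hypothetical, while the constant values mirror those defined in org.apache.zookeeper.ZooDefs.Perms.

```java
// Hypothetical helper illustrating how permission bits combine and are checked.
class PermBits {
    static final int READ = 1 << 0;   // 1
    static final int WRITE = 1 << 1;  // 2
    static final int CREATE = 1 << 2; // 4
    static final int DELETE = 1 << 3; // 8
    static final int ADMIN = 1 << 4;  // 16
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;

    // True when every bit in wanted is also set in perms.
    static boolean allows(int perms, int wanted) {
        return (perms & wanted) == wanted;
    }
}
```

A value of READ | WRITE therefore allows reading and writing but not deleting child nodes.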
A node can have several ACL entries. The node itself stores only a single integer; the corresponding ACL information is kept in two hash maps (DataTree.java).
public final Map<Long, List<ACL>> longKeyMap = new HashMap<Long, List<ACL>>();
public final Map<List<ACL>, Long> aclKeyMap = new HashMap<List<ACL>, Long>();
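One way such a pair of maps can be used is to intern each distinct ACL list and hand out a small id for it, so that nodes sharing the same ACL list share one stored copy. The sketch below is an assumption about the general technique, not a copy of DataTree: AclCache, toId and toAcls are hypothetical names, and plain strings of the form scheme:id:perms stand in for ACL objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: intern ACL lists and refer to them by a long id.
class AclCache {
    // id -> ACL list, used when a node's ACL must be read back
    private final Map<Long, List<String>> longKeyMap = new HashMap<>();
    // ACL list -> id, used to reuse an id for an already known list
    private final Map<List<String>, Long> aclKeyMap = new HashMap<>();
    private long nextId = 0;

    // Return the id for this ACL list, allocating a new one on first sight.
    synchronized long toId(List<String> acls) {
        Long id = aclKeyMap.get(acls);
        if (id == null) {
            id = ++nextId;
            // Defensive copy so later changes by the caller cannot corrupt the map keys
            List<String> copy = new ArrayList<>(acls);
            longKeyMap.put(id, copy);
            aclKeyMap.put(copy, id);
        }
        return id;
    }

    // Return the ACL list for an id, or null when the id is unknown.
    synchronized List<String> toAcls(long id) {
        return longKeyMap.get(id);
    }
}
```

Two nodes created with equal ACL lists end up holding the same id, which keeps the per-node footprint at one number regardless of how long the list is.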
Node operations