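ZooKeeper Study Notes
I. Cluster Deployment:
A ZooKeeper cluster (ensemble) is best deployed on an odd number of machines (for example 3 or 5). ZooKeeper's ZAB protocol, a Paxos-like consensus protocol, needs a strict majority of the servers to elect a leader and commit writes, so an even-sized ensemble tolerates no more failures than the odd-sized one just below it (4 servers survive one failure, the same as 3).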
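The odd-size recommendation follows from majority-quorum arithmetic: the ensemble stays available only while more than half of its servers are up. The sketch below just works through that arithmetic; the class and method names (`QuorumMath`, `majority`, `tolerated`) are made up for illustration and are not part of ZooKeeper.

```java
/** Majority-quorum arithmetic for an ensemble of a given size (illustrative only). */
class QuorumMath {
    /** Smallest number of servers that forms a strict majority of n. */
    static int majority(int n) {
        return n / 2 + 1;
    }

    /** Number of servers that can fail while a strict majority is still running. */
    static int tolerated(int n) {
        return n - majority(n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            System.out.println(n + " servers: quorum=" + majority(n)
                    + ", tolerated failures=" + tolerated(n));
        }
    }
}
```

Going from 3 to 4 servers raises the quorum from 2 to 3 without raising the number of tolerated failures, which is why the extra even machine adds cost but no resilience.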
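Assume three CentOS 7 machines with the IPs 10.103.16.207, 10.103.16.208, and 10.103.16.209. To set up ZooKeeper on each of them:
1. Set up the JDK: ZooKeeper runs on the JVM, so a working JDK must be installed first; the details are not covered here.
2. Download the ZooKeeper package:
Download the release archive from https://archive.apache.org/dist/zookeeper/ ; version 3.4.10 is used here.
3. Extract the archive:
    mkdir -p /usr/local/zookeeper
    tar -zxvf zookeeper-3.4.10.tar.gz -C /usr/local/zookeeper --strip-components=1
The target directory must exist before tar -C can use it, and --strip-components=1 drops the archive's top-level zookeeper-3.4.10/ directory so that conf/ and bin/ sit directly under /usr/local/zookeeper, as the later steps assume.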
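4. Edit the configuration:
    cd /usr/local/zookeeper/conf
    cp zoo_sample.cfg zoo.cfg
    vim zoo.cfg
Add the following entries (if an entry already exists, replace its value):
    dataDir=/usr/local/zookeeper/data
    dataLogDir=/usr/local/zookeeper/logs
    server.1=10.103.16.207:2888:3888
    server.2=10.103.16.208:2888:3888
    server.3=10.103.16.209:2888:3888
In each server.N line, N is the server ID, the first port (2888) is used by followers to connect to the leader, and the second port (3888) is used for leader election.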
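Create the data and log directories under /usr/local/zookeeper:
    cd /usr/local/zookeeper
    mkdir data
    mkdir logs
Then create the myid file in the data directory:
    echo 1 > /usr/local/zookeeper/data/myid
Each machine gets its own ID, and it must match the N of that machine's server.N line in zoo.cfg: 1 on 10.103.16.207, 2 on 10.103.16.208, and 3 on 10.103.16.209.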
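The relationship between the server.N lines and each machine's myid file is easy to get wrong by hand. As a minimal sketch, assuming the hosts are numbered in list order and use the ports 2888 and 3888 from this walkthrough, the entries could be derived like this (`EnsembleConfig`, `serverLine`, and `myId` are hypothetical helper names, not ZooKeeper tooling):

```java
import java.util.Arrays;
import java.util.List;

/** Derives zoo.cfg server entries and myid values from an ordered host list (illustrative only). */
class EnsembleConfig {
    /** Builds the server.N line for the host at zero-based index i; N is i + 1. */
    static String serverLine(List<String> hosts, int i) {
        return "server." + (i + 1) + "=" + hosts.get(i) + ":2888:3888";
    }

    /** Returns the value to write into myid on the given host, or -1 if the host is not listed. */
    static int myId(List<String> hosts, String host) {
        int index = hosts.indexOf(host);
        return index < 0 ? -1 : index + 1;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("10.103.16.207", "10.103.16.208", "10.103.16.209");
        for (int i = 0; i < hosts.size(); i++) {
            // The same line goes into zoo.cfg on every machine; myid differs per machine.
            System.out.println(serverLine(hosts, i) + "   (myid on " + hosts.get(i)
                    + " = " + myId(hosts, hosts.get(i)) + ")");
        }
    }
}
```

The zoo.cfg content is identical on all three machines; only the single number in myid changes from host to host.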
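5. Start ZooKeeper:
    cd /usr/local/zookeeper/bin
    ./zkServer.sh start
After starting, run ./zkServer.sh status to check whether startup succeeded and whether this node is the leader or a follower. The status is only reported once a majority of the servers are running, so start at least two of the three machines first.
Running jps should also list a QuorumPeerMain process, which means the ZooKeeper process is up.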
6. Operate ZooKeeper with the command-line client:
    cd /usr/local/zookeeper/bin
Connect: ./zkCli.sh (connects to localhost:2181) or ./zkCli.sh -server 10.103.16.208:2181. Once connected, the following commands are available:
List child nodes: ls / or ls /zookeeper (the path must already exist)
Create a node: create /zookeeper/subnode "this is sub node data"
Read a node: get /zookeeper/subnode
Update a node: set /zookeeper/subnode "change node data"
Delete a node: delete /zookeeper/subnode
Show help: help
Exit: quit
II. Using the ZooKeeper API:
1. Add the dependency:
```xml
<dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>3.4.5</version>
</dependency>
```
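2. Write the server:
```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class Server {
    private static final String connectString = "10.103.16.207:2181";
    private static final int sessionTimeout = 2000;
    private static final String parentNode = "/zookeeper";
    private static final String hostname = "10.103.16.214";
    private ZooKeeper zk = null;

    /**
     * Open a client connection to ZooKeeper and block until the session is connected.
     * A real watcher is passed instead of null so that connection events are handled cleanly.
     */
    public void getConnect() throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        zk = new ZooKeeper(connectString, sessionTimeout, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();
    }

    /**
     * Register this server under the parent node as an ephemeral sequential node;
     * ZooKeeper deletes the node automatically when the session ends.
     */
    public void registerServer(String hostname) throws Exception {
        String create = zk.create(parentNode + "/server", hostname.getBytes(StandardCharsets.UTF_8),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println(hostname + " is online.." + create);
    }

    /**
     * Business logic (here it only keeps the process alive).
     */
    public void handleBusiness(String hostname) throws InterruptedException {
        System.out.println(hostname + " start working.....");
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper
        Server server = new Server();
        server.getConnect();
        // Register this server through the connection
        server.registerServer(hostname);
        // Start the business logic
        server.handleBusiness(hostname);
    }
}
```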
3. Write the client:
```java
import java.util.List;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class Client implements Watcher {
    private static final String connectString = "10.103.16.207:2181";
    private static final int sessionTimeout = 2000;
    private static final String parentNode = "/zookeeper";
    private volatile List<String> serverList;
    private ZooKeeper zk = null;

    /**
     * Called for every event, including the initial connection event,
     * so the first server list is fetched as soon as the session is connected.
     */
    @Override
    public void process(WatchedEvent event) {
        System.out.println(event.getType() + "-------->" + event.getPath());
        // Skip Disconnected/Expired events; the list can only be read while connected.
        if (event.getState() != Event.KeeperState.SyncConnected) {
            return;
        }
        try {
            getServerList();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Open a client connection to ZooKeeper, with this object as the default watcher.
     */
    public void getConnect() throws Exception {
        zk = new ZooKeeper(connectString, sessionTimeout, this);
    }

    /**
     * Fetch the list of registered servers.
     */
    public void getServerList() throws Exception {
        // Read the children of the parent node and (re-)register the watch on it
        serverList = zk.getChildren(parentNode, true);
        System.out.println(serverList);
    }

    /**
     * Business logic (here it only keeps the process alive).
     */
    public void handleBusiness() throws InterruptedException {
        System.out.println("client start working.....");
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper
        Client client = new Client();
        client.getConnect();
        // Start the business thread
        client.handleBusiness();
    }
}
```
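Note that ZooKeeper watches are one-shot by design (this is not a bug): once a watch fires it is removed, and it must be registered again to receive the next event. In this demo the watch is re-registered by calling zk.getChildren with the watch argument set to true every time an event is processed (any other method that takes a boolean watch argument, such as exists or getData, can be used the same way).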
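To make the one-shot behaviour concrete without needing a running ensemble, here is a toy in-memory model of it. It is only an illustration of the semantics, not how ZooKeeper is implemented; `OneShotWatchModel`, `watch`, `change`, and `Rearming` are invented names. A callback that does not re-register sees a single event, while one that re-registers itself inside the callback (the analogue of calling getChildren(path, true) again) keeps receiving events.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy in-memory model of one-shot watches; NOT the ZooKeeper implementation. */
class OneShotWatchModel {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Registers a watch on a path; it fires at most once. */
    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    /** Simulates a change: the path's watches are removed first, then fired. */
    void change(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> callback : fired) {
                callback.accept(path);
            }
        }
    }

    /** Callback that counts events and re-registers itself each time it fires. */
    static class Rearming implements Consumer<String> {
        private final OneShotWatchModel model;
        int count;

        Rearming(OneShotWatchModel model) {
            this.model = model;
        }

        @Override
        public void accept(String path) {
            count++;
            model.watch(path, this); // analogue of getChildren(path, true)
        }
    }

    public static void main(String[] args) {
        OneShotWatchModel model = new OneShotWatchModel();
        int[] plainCount = {0};
        model.watch("/zookeeper", path -> plainCount[0]++); // never re-registered
        Rearming rearming = new Rearming(model);
        model.watch("/zookeeper", rearming);
        for (int i = 0; i < 3; i++) {
            model.change("/zookeeper");
        }
        System.out.println("plain watch fired " + plainCount[0] + " time(s)");
        System.out.println("re-registering watch fired " + rearming.count + " time(s)");
    }
}
```

Against a real ensemble, changes that happen between a watch firing and its re-registration are not delivered as separate events, which is why the client above re-reads the full child list each time rather than relying on the event alone.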
III. Using ZkClient:
- Add the dependency:
```xml
<dependency>
    <groupId>com.github.sgroschupf</groupId>
    <artifactId>zkclient</artifactId>
    <version>0.1</version>
</dependency>
```
- Write the server:
```java
import org.I0Itec.zkclient.ZkClient;
import org.apache.zookeeper.CreateMode;

public class Server {
    private static final String connectString = "10.103.16.207:2181,10.103.16.208:2181,10.103.16.209:2181";
    private static final String parentNode = "/zookeeper";
    private static final String hostname = "10.103.16.214";
    private ZkClient zkClient = null;

    public void getConnect() {
        // The constructor blocks until the connection is established
        zkClient = new ZkClient(connectString);
    }

    public void registerServer(String hostname) {
        // Ephemeral sequential node: removed automatically when this session ends
        String create = zkClient.create(parentNode + "/child", "hello, child node", CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println(hostname + " is online.." + create);
    }

    public void handleBusiness(String hostname) throws InterruptedException {
        System.out.println(hostname + " start working.....");
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws InterruptedException {
        // Connect to ZooKeeper
        Server server = new Server();
        server.getConnect();
        // Register this server through the connection
        server.registerServer(hostname);
        // Start the business logic
        server.handleBusiness(hostname);
    }
}
```
- Write the client:
```java
import java.util.List;

import org.I0Itec.zkclient.IZkChildListener;
import org.I0Itec.zkclient.IZkDataListener;
import org.I0Itec.zkclient.IZkStateListener;
import org.I0Itec.zkclient.ZkClient;
import org.apache.zookeeper.Watcher;

public class Client implements IZkChildListener, IZkDataListener, IZkStateListener {
    private static final String connectString = "10.103.16.207:2181,10.103.16.208:2181,10.103.16.209:2181";
    private static final String parentNode = "/zookeeper";
    private volatile List<String> serverList;
    private ZkClient zkClient = null;

    @Override
    public void handleChildChange(String parentPath, List<String> currentChilds) throws Exception {
        serverList = currentChilds;
        System.out.println("-----------child change: " + parentPath + " " + currentChilds);
    }

    @Override
    public void handleDataChange(String dataPath, Object data) throws Exception {
        System.out.println("-----------data change: " + dataPath + " " + data);
    }

    @Override
    public void handleDataDeleted(String dataPath) throws Exception {
        System.out.println("-----------data delete: " + dataPath);
    }

    @Override
    public void handleStateChanged(Watcher.Event.KeeperState state) throws Exception {
        System.out.println("-----------state change: " + state);
    }

    @Override
    public void handleNewSession() throws Exception {
        System.out.println("-----------new session***********");
    }

    public void getConnect() {
        zkClient = new ZkClient(connectString);

        // Subscriptions stay active; ZkClient re-registers the underlying watches itself
        zkClient.subscribeChildChanges(parentNode, this);
        zkClient.subscribeDataChanges(parentNode, this);
        zkClient.subscribeStateChanges(this);
    }

    public void handleBusiness() throws InterruptedException {
        System.out.println("client start working.....");
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws InterruptedException {
        // Connect to ZooKeeper
        Client client = new Client();
        client.getConnect();
        // Start the business thread
        client.handleBusiness();
    }
}
```
IV. Using Curator:
- Add the dependency:
```xml
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>2.7.0</version>
</dependency>
```
- Write the server:
```java
import java.nio.charset.StandardCharsets;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.nodes.PersistentEphemeralNode;
import org.apache.curator.retry.RetryNTimes;

public class Server1 {
    private static final String connectString = "10.103.16.207:2181,10.103.16.208:2181,10.103.16.209:2181";
    private static final String parentNode = "/zookeeper";
    private static final String hostname = "10.103.16.214";
    private CuratorFramework client = null;

    public void getConnect() {
        // Retry up to 10 times, waiting 5000 ms between attempts
        client = CuratorFrameworkFactory.newClient(
                connectString,
                new RetryNTimes(10, 5000)
        );
        client.start();
    }

    public void registerServer(String hostname) throws Exception {
        // Ephemeral node that Curator re-creates after a reconnect or session loss
        PersistentEphemeralNode node = new PersistentEphemeralNode(client, PersistentEphemeralNode.Mode.EPHEMERAL,
                parentNode + "/" + hostname, "ephemeral node".getBytes(StandardCharsets.UTF_8));
        node.start();
    }

    public void handleBusiness(String hostname) throws InterruptedException {
        System.out.println(hostname + " start working.....");
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper
        Server1 server = new Server1();
        server.getConnect();
        // Register this server through the connection
        server.registerServer(hostname);
        // Start the business logic
        server.handleBusiness(hostname);
    }
}
```
- Write the client:
```java
import java.nio.charset.StandardCharsets;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.ChildData;
import org.apache.curator.framework.recipes.cache.TreeCache;
import org.apache.curator.retry.RetryNTimes;

public class Client1 {
    private static final String connectString = "10.103.16.207:2181,10.103.16.208:2181,10.103.16.209:2181";
    private static final String parentNode = "/zookeeper";
    private static final String hostname = "10.103.16.214";
    private CuratorFramework client = null;

    public void getConnect() {
        // Retry up to 10 times, waiting 5000 ms between attempts
        client = CuratorFrameworkFactory.newClient(
                connectString,
                new RetryNTimes(10, 5000)
        );
        client.start();
    }

    public void watchServer(String hostname) throws Exception {
        // TreeCache watches the node and its whole subtree and keeps its watches registered
        TreeCache watcher = new TreeCache(client, parentNode + "/" + hostname);
        watcher.getListenable().addListener((client1, event) -> {
            ChildData data = event.getData();
            if (data == null) {
                System.out.println("No data in event[" + event + "]");
            } else {
                // A node may have no payload, so guard against a null byte array
                byte[] bytes = data.getData();
                String text = bytes == null ? "" : new String(bytes, StandardCharsets.UTF_8);
                System.out.println("Receive event: "
                        + "type=[" + event.getType() + "]"
                        + ", path=[" + data.getPath() + "]"
                        + ", data=[" + text + "]"
                        + ", stat=[" + data.getStat() + "]");
            }
        });
        watcher.start();
    }

    public void handleBusiness(String hostname) throws InterruptedException {
        System.out.println(hostname + " start working.....");
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper
        Client1 client = new Client1();
        client.getConnect();
        // Watch the node registered by the server
        client.watchServer(hostname);
        // Start the business logic
        client.handleBusiness(hostname);
    }
}
```
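V. Common Problems:
1. In a cluster deployment, ZooKeeper keeps failing to start even though every node is configured:
zkServer.sh writes its startup log to zookeeper.out in the directory it was started from (the bin directory if the steps above were followed). If the log shows errors connecting to the other nodes and the network itself is fine, the firewall is the likely cause, and stopping it resolves the problem.
Stop the firewall on CentOS 7:
    systemctl stop firewalld.service
Disable the firewall on CentOS 7 (so that it is not started again at the next reboot):
    systemctl disable firewalld.service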
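Before or after changing the firewall, it can help to confirm which ZooKeeper ports are actually reachable from another machine. The following is a minimal sketch using only the JDK; `PortCheck` and `isReachable` are illustrative names, and the hosts and ports are the ones assumed throughout this walkthrough.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Checks whether TCP ports on the ensemble hosts accept connections (illustrative only). */
class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, filtered by a firewall, or timed out
            return false;
        }
    }

    public static void main(String[] args) {
        String[] hosts = {"10.103.16.207", "10.103.16.208", "10.103.16.209"};
        // 2181: clients, 2888: followers to leader, 3888: leader election.
        // Only the current leader listens on 2888, so "closed" there is normal on followers.
        int[] ports = {2181, 2888, 3888};
        for (String host : hosts) {
            for (int port : ports) {
                String state = isReachable(host, port, 2000) ? "open" : "closed";
                System.out.println(host + ":" + port + " " + state);
            }
        }
    }
}
```

If 2181 and 3888 show as closed on a host where QuorumPeerMain is running, the firewall on that host is a reasonable first suspect.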
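2. Connections to ZooKeeper take too long: after starting ZooKeeper, zkServer.sh status takes a long time to report leader or follower, and Java clients using the API hang for a long time while connecting or fail to connect intermittently:
This is probably caused by host name resolution on the system. On CentOS 7, edit the hosts file (vim /etc/hosts), delete the 127.0.0.1 line, restart networking (/etc/rc.d/init.d/network restart), and start ZooKeeper again; connections should then be noticeably faster.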
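One way to check whether name resolution really is the slow part is to time it directly from the JVM, before and after editing /etc/hosts. This is a rough diagnostic sketch, not something ZooKeeper provides; `ResolveTimer` and `timeResolveMillis` are made-up names, and the names passed on the command line are whatever hosts the machine in question needs to resolve.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Times host name resolution from the JVM (illustrative diagnostic only). */
class ResolveTimer {
    /** Returns the resolution time in milliseconds, or -1 if the name cannot be resolved. */
    static long timeResolveMillis(String name) {
        long start = System.nanoTime();
        try {
            InetAddress.getByName(name);
        } catch (UnknownHostException e) {
            return -1;
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Defaults to localhost; pass the machine's own host name to test it as well.
        String[] names = args.length > 0 ? args : new String[] {"localhost"};
        for (String name : names) {
            System.out.println(name + ": " + timeResolveMillis(name) + " ms");
        }
    }
}
```

The JVM caches successful lookups, so each run should be started as a fresh process to get a meaningful number.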