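Hadoop Study Notes: Learning ZooKeeper (Part 1)
In earlier posts I mentioned several times how important ZooKeeper is for developing distributed systems, so it is well worth learning. This post covers installing ZooKeeper and some of its basic uses. I will also show how to set up a pseudo-distributed installation. I could not get pseudo-distributed mode working on Windows, only on Linux, so I did it in a virtual machine on my computer. Enough preamble; here are the details.
First we need to download ZooKeeper. The download page is: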
http://www.apache.org/dyn/closer.cgi/zookeeper/
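Normally you would pick a stable release; the version I downloaded is zookeeper-3.4.5.
My laptop runs Windows 7. Windows can serve as a development platform for ZooKeeper but not as a production platform. We start by installing a standalone ZooKeeper on Windows.
First unpack the ZooKeeper archive. I put the unpacked files at: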
E:\zookeeper\zookeeper-3.4.5
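The figure below shows the ZooKeeper directory structure:
Go into the conf directory, make a copy of zoo_sample.cfg, and rename the copy to zoo.cfg. Open the new zoo.cfg and edit it so that its contents are as follows: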
#initLimit=10
#syncLimit=5
tickTime=2000
dataDir=E:/zookeeper/zookeeper-3.4.5/data
clientPort=2181
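Let me explain the parameters in the configuration file:
initLimit and syncLimit apply to clusters; I will explain them in the pseudo-distributed section below.
tickTime: the length of one tick in milliseconds, which is the interval used for heartbeats. ZooKeeper clients and servers share a session concept similar to sessions in web development, and the minimum session timeout in ZooKeeper is twice the tickTime.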
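To make the relationship between tickTime and the session timeout concrete, here is a minimal sketch of how a requested timeout ends up bounded. It assumes the default bounds described in the administrator guide (minimum 2 x tickTime, maximum 20 x tickTime); the class and method names are made up for illustration and are not part of the ZooKeeper API.

```java
// Sketch only: models how a requested session timeout is bounded by tickTime.
// Assumes the default limits: minimum 2 * tickTime, maximum 20 * tickTime.
class SessionTimeoutSketch {

    // Returns the timeout (in ms) that would be granted for a requested value.
    static int negotiate(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;
        int max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        int tickTime = 2000; // the tickTime value from zoo.cfg
        System.out.println(negotiate(1000, tickTime));  // below the minimum: prints 4000
        System.out.println(negotiate(30000, tickTime)); // within range: prints 30000
        System.out.println(negotiate(60000, tickTime)); // above the maximum: prints 40000
    }
}
```

With tickTime=2000, a client that asks for a 30000 ms session keeps that value, while a request for 1000 ms would be raised to 4000 ms.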
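dataDir: the English comment in the sample file describes it as the directory where snapshots of the in-memory database are stored. We can look at what ends up in the directory that dataDir points to after the server has run, as shown in the figure below:
So dataDir holds log data too: unless dataLogDir is set, the transaction logs are written here as well. The sample file's comment also warns against using /tmp for dataDir, since files there are typically not preserved.
clientPort: the port on which the server listens for client connections.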
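Since zoo.cfg is a plain key=value file in Java properties format, a quick way to check what a configuration actually contains is to load it with java.util.Properties. The sketch below does that for the standalone configuration from this section; the class name ZooCfgReaderSketch is hypothetical, and the text is read from a string rather than from disk so the example stays self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch only: reads zoo.cfg-style text with java.util.Properties.
class ZooCfgReaderSketch {

    // Parses key=value configuration text; lines starting with '#' are comments.
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String cfg = String.join("\n",
                "#initLimit=10",
                "#syncLimit=5",
                "tickTime=2000",
                "dataDir=E:/zookeeper/zookeeper-3.4.5/data",
                "clientPort=2181");
        Properties props = parse(cfg);
        System.out.println("tickTime   = " + props.getProperty("tickTime"));
        System.out.println("dataDir    = " + props.getProperty("dataDir"));
        System.out.println("clientPort = " + props.getProperty("clientPort"));
        // Commented-out keys are simply absent, so this prints null
        System.out.println("initLimit  = " + props.getProperty("initLimit"));
    }
}
```

Note that the properties format treats a backslash as an escape character, which is one reason the dataDir value uses forward slashes even on Windows.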
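Next we add the ZooKeeper installation to the Windows environment variables. Right-click "My Computer", choose Properties, click Advanced system settings, then click the Environment Variables button. Under System variables, click New and add:
Variable name: ZOOKEEPER_HOME  Variable value: E:\zookeeper\zookeeper-3.4.5
Still under System variables, find Path, click Edit, and append to its value: %ZOOKEEPER_HOME%\bin;%ZOOKEEPER_HOME%\conf;
ZooKeeper is written in Java, so a JDK must be installed before ZooKeeper, and the JDK version must be 1.6 or later.
That completes the standalone installation. Now let's run ZooKeeper.
Open a Windows command prompt, change to the bin directory under the ZooKeeper installation directory, type the zkServer command, and press Enter. The ZooKeeper server then starts.
Now we connect to the server with a client. Open another command prompt and enter: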
zkCli -server localhost:2181
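Some tests follow, as shown in the figure below:
Pseudo-distributed installation: like Hadoop, ZooKeeper can also be installed in pseudo-distributed mode. Below I explain how.
I first tried to set up pseudo-distributed mode on Windows but did not succeed; I finally got it working on Linux. First download the ZooKeeper distribution, then create three configuration files: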
zoo1.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:/zookeeper/zookeeper-3.4.5/d_1
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=E:/zookeeper/zookeeper-3.4.5/log1_2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
zoo2.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:/zookeeper/zookeeper-3.4.5/d_2
# the port at which the clients will connect
clientPort=2182
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=E:/zookeeper/zookeeper-3.4.5/logs_2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
zoo3.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:/zookeeper/zookeeper-3.4.5/d_3
# the port at which the clients will connect
clientPort=2183
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=E:/zookeeper/zookeeper-3.4.5/logs_3
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
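Here the clientPort is changed in each configuration file so that no two files share the same port, and dataDir and dataLogDir are adjusted in the same way. The paths shown are written Windows-style; on Linux, point them at directories that exist on your machine. Some new entries are added as well:
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889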
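Here localhost is the address of a machine in the ZooKeeper ensemble. The first port, 2887, is the one followers use to connect to the leader and exchange data with it; the second port, 3887, is used for leader election. Because all three servers run on the same machine in pseudo-distributed mode, each server needs its own pair of ports.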
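To make the layout of a server.N entry explicit, the sketch below splits one line into its parts. It only handles the three-field form used in this post (host, then the follower-to-leader port, then the election port); the class ServerEntrySketch and its field names are hypothetical and not part of ZooKeeper.

```java
// Sketch only: splits a "server.N=host:port:port" line into its parts.
class ServerEntrySketch {
    final int id;            // the N in server.N
    final String host;       // machine address
    final int quorumPort;    // followers connect to the leader on this port
    final int electionPort;  // used for leader election

    ServerEntrySketch(int id, String host, int quorumPort, int electionPort) {
        this.id = id;
        this.host = host;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }

    static ServerEntrySketch parse(String line) {
        String[] kv = line.trim().split("=", 2);
        if (kv.length != 2 || !kv[0].startsWith("server.")) {
            throw new IllegalArgumentException("not a server entry: " + line);
        }
        int id = Integer.parseInt(kv[0].substring("server.".length()));
        String[] parts = kv[1].split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host:port:port in: " + line);
        }
        return new ServerEntrySketch(id, parts[0],
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        ServerEntrySketch e = parse("server.1=localhost:2887:3887");
        System.out.println("id=" + e.id + " host=" + e.host
                + " quorumPort=" + e.quorumPort + " electionPort=" + e.electionPort);
        // prints: id=1 host=localhost quorumPort=2887 electionPort=3887
    }
}
```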
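initLimit: the time allowed for a follower to connect to the leader and complete its initial synchronization, expressed as a multiple of tickTime. In the configuration above it is 10 times tickTime; if the initial connection takes longer than that, it fails.
syncLimit: the time allowed between a request and its acknowledgement when a follower and the leader exchange messages, also expressed as a multiple of tickTime. If a follower cannot communicate with the leader within that time, the follower is dropped.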
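Because both limits are counted in ticks, it can help to convert them to milliseconds. The following is a small worked calculation for the values used above (tickTime=2000, initLimit=10, syncLimit=5); the class and method names are invented for this example.

```java
// Sketch only: converts tick-based limits into milliseconds.
class QuorumTimeoutSketch {

    // A limit of N ticks lasts N * tickTime milliseconds.
    static long ticksToMillis(int ticks, int tickTimeMs) {
        return (long) ticks * tickTimeMs;
    }

    public static void main(String[] args) {
        int tickTime = 2000;
        int initLimit = 10;
        int syncLimit = 5;
        System.out.println("initLimit = " + ticksToMillis(initLimit, tickTime) + " ms"); // 20000 ms
        System.out.println("syncLimit = " + ticksToMillis(syncLimit, tickTime) + " ms"); // 10000 ms
    }
}
```

So with this configuration a follower has 20 seconds for its initial sync and 10 seconds for each request and acknowledgement exchange.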
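dataLogDir: the directory where ZooKeeper writes its transaction logs. Once it is set, the transaction logs are no longer written to dataDir. Giving the logs a dedicated path is good for ZooKeeper's performance and stability.
Each configuration file here represents one ZooKeeper server. Now we start the pseudo-distributed ZooKeeper cluster.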
zkServer.sh start zoo1.cfg
zkServer.sh start zoo2.cfg
zkServer.sh start zoo3.cfg
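One thing to check if the servers refuse to start or to find each other: when server.N entries are present, each server reads its own id from a file named myid inside its dataDir, and the number in that file must match the N of its server.N line. The sketch below writes those files from Java. The class name MyIdWriterSketch and the base directory zk-demo are placeholders, so substitute the dataDir values from your own configuration files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: creates the myid file each server expects in its dataDir.
class MyIdWriterSketch {

    // Writes <dataDir>/myid containing the server id and returns its path.
    static Path writeMyId(Path dataDir, int serverId) {
        try {
            Files.createDirectories(dataDir);
            Path myid = dataDir.resolve("myid");
            Files.write(myid, (serverId + "\n").getBytes(StandardCharsets.US_ASCII));
            return myid;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the id back, trimming the trailing newline.
    static String readMyId(Path dataDir) {
        try {
            byte[] bytes = Files.readAllBytes(dataDir.resolve("myid"));
            return new String(bytes, StandardCharsets.US_ASCII).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder base directory; pass the real one as the first argument.
        Path base = Paths.get(args.length > 0 ? args[0] : "zk-demo");
        for (int id = 1; id <= 3; id++) {
            // d_1, d_2, d_3 mirror the dataDir names in zoo1.cfg to zoo3.cfg
            Path myid = writeMyId(base.resolve("d_" + id), id);
            System.out.println("wrote " + myid + " with id " + id);
        }
    }
}
```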
Next I write a Java program that acts as a client and calls the ZooKeeper service. The code is as follows:
package cn.com.test;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class zkClient {

    public static void main(String[] args) throws Exception {
        // Watcher that simply prints every event the client receives
        Watcher wh = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                System.out.println(event.toString());
            }
        };

        // Connect to the server on port 2181 with a 30-second session timeout
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, wh);

        System.out.println("=========create node===========");
        // Persistent znode with an open ACL and the initial data "znode1"
        zk.create("/sharpxiajun", "znode1".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        System.out.println("=============check that the node was created===============");
        System.out.println(new String(zk.getData("/sharpxiajun", false, null)));

        System.out.println("=========update node data==========");
        // Version -1 matches any version of the znode
        zk.setData("/sharpxiajun", "sharpxiajun130901".getBytes(), -1);

        System.out.println("========check that the update succeeded=========");
        System.out.println(new String(zk.getData("/sharpxiajun", false, null)));

        System.out.println("=======delete node==========");
        zk.delete("/sharpxiajun", -1);

        System.out.println("==========check that the node was deleted============");
        // exists() returns null when the znode is gone
        System.out.println("node status: " + zk.exists("/sharpxiajun", false));

        zk.close();
    }
}
The output is as follows:
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
=========create node===========
WatchedEvent state:SyncConnected type:None path:null
=============check that the node was created===============
znode1
=========update node data==========
========check that the update succeeded=========
sharpxiajun130901
=======delete node==========
==========check that the node was deleted============
node status: null
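I won't walk through the program today; it is only meant to show how ZooKeeper is used. This post may not contain anything novel, but it lays the foundation we need before we can really work with ZooKeeper.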