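ZooKeeper 3.6.X Configuration Reference
"The faintest ink beats the strongest memory." - Zhang Pu
0x00 Outline
0x01 Preface
Parts of this article are translated from the ZooKeeper 3.6 Documentation. The original text of the relevant sections is appended at the end for comparison.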
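0x02 Standalone Operation
Setting up a ZooKeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation amounts to creating a configuration.
After downloading a stable ZooKeeper release, unpack it and cd to its root directory.
To start ZooKeeper you need a configuration file. Here is a sample; create it as conf/zoo.cfg: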
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
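The file can have any name, but for this discussion call it conf/zoo.cfg. Change the value of dataDir to point to an existing directory that is empty to start with. The fields mean the following:
- tickTime: the basic time unit used by ZooKeeper, in milliseconds. It drives heartbeats, and the minimum session timeout is twice the tickTime.
- dataDir: the location where the in-memory database snapshots are stored and, unless specified otherwise, where the transaction log of updates to the database is written as well.
- clientPort: the port on which the server listens for client connections.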
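Now that the configuration file exists, you can start ZooKeeper:
bin/zkServer.sh start
(On Windows, run the following instead.)
bin/zkServer.cmd
ZooKeeper logs messages using log4j; more detail is available in the Logging section of the Programmer's Guide. Depending on the log4j configuration, log messages go to the console (the default), to a log file, or to both.
The steps outlined here run ZooKeeper in standalone mode. There is no replication, so if the ZooKeeper process fails, the service goes down. This is fine for most development situations; to run ZooKeeper in replicated mode, see section 0x03.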
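0x03 Running as a Cluster
Running ZooKeeper in standalone mode is convenient for evaluation, development, and testing. In production, however, you should run ZooKeeper in replicated (cluster) mode. A replicated group of servers in the same application is called a quorum, and in replicated mode all servers in the quorum have copies of the same configuration file.
Note
Replicated mode requires at least three servers, and an odd number of servers is strongly recommended. With only two servers, if one of them fails there are not enough machines left to form a majority quorum. Two servers are inherently less stable than a single server, because there are two single points of failure.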
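The majority rule behind this note can be checked with a little arithmetic. The sketch below assumes the usual definition of a majority quorum, floor(n/2) + 1 servers out of n; the function names majority and tolerated_failures are made up for illustration and are not part of ZooKeeper.

```shell
#!/bin/sh
# Majority quorum size for an ensemble of n servers: floor(n/2) + 1.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the figures for small ensembles.
for n in 1 2 3 4 5; do
  echo "servers=$n majority=$(majority "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Under this definition two servers tolerate zero failures, the same as one server, and four servers tolerate one failure, the same as three, which is consistent with the recommendation to use an odd number.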
The conf/zoo.cfg file required for replicated mode is similar to the one used in standalone mode, with a few differences. Here is an example:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
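The new entry initLimit is a timeout that limits how long the ZooKeeper servers in the quorum have to connect to a leader. The entry syncLimit limits how far out of date a server can be from a leader.
Both timeouts are expressed in units of tickTime. In this example, the timeout for initLimit is 5 ticks at 2000 milliseconds per tick, or 10 seconds.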
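The conversion from ticks to wall-clock time can be sketched as follows. The values are copied from the sample configuration above; the helper name ticks_to_ms is an assumption made for this illustration only.

```shell
#!/bin/sh
# Values copied from the sample zoo.cfg above.
tickTime=2000   # milliseconds per tick
initLimit=5     # ticks
syncLimit=2     # ticks

# Convert a number of ticks into milliseconds.
ticks_to_ms() {
  echo $(( $1 * tickTime ))
}

echo "initLimit: $(ticks_to_ms "$initLimit") ms"
echo "syncLimit: $(ticks_to_ms "$syncLimit") ms"
# The minimum session timeout is twice the tickTime.
echo "minimum session timeout: $(( 2 * tickTime )) ms"
```

With these values initLimit works out to 10000 ms (the 10 seconds mentioned above), and syncLimit to 4000 ms by the same rule.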
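The entries of the form server.X list the servers that make up the ZooKeeper service. When a server starts, it determines which server it is by looking for the file myid in its data directory. That file contains the server number, in ASCII.
Finally, note the two port numbers after each server name: 2888 and 3888. Peers use the first port to connect to other peers. Such a connection is necessary so that peers can communicate, for example to agree on the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader: when a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, a separate port is currently required for leader election. That is the second port in the server entry.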
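Note
To test multiple servers on a single machine, specify the server name as localhost and give each server.X entry unique quorum and leader election ports (for example 2888:3888, 2889:3889, 2890:3890). Separate dataDir values and distinct clientPort values are also necessary, so running the replicated example above on a single host still requires three configuration files.
Be aware that setting up multiple servers on a single machine does not create any redundancy. If the machine fails, all of the ZooKeeper servers go offline. Full redundancy requires that each server have its own machine, and it must be a completely separate physical server. Multiple virtual machines on the same physical host are still vulnerable to the failure of that host.
If your ZooKeeper machines have multiple network interfaces, you can also instruct ZooKeeper to bind on all interfaces and automatically switch to a healthy interface in case of a network failure. For details, see the Configuration Parameters.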
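0x04 Supplement: Cluster Configuration on a Single Machine
Make three copies of the unpacked ZooKeeper package and rename them, for example, server1, server2, and server3.
Create the following configuration in server1/conf/zoo.cfg: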
tickTime=2000
dataDir=/server1/data
clientPort=2181
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
Create the following configuration in server2/conf/zoo.cfg:
tickTime=2000
dataDir=/server2/data
clientPort=2182
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
Create the following configuration in server3/conf/zoo.cfg:
tickTime=2000
dataDir=/server3/data
clientPort=2183
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
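Because the three files differ only in dataDir and clientPort, they can also be generated in a loop. This is a minimal sketch, not part of the official documentation: BASE is a hypothetical parent directory (a temporary directory unless you set it), write_cfg is a made-up helper name, and dataDir is written relative to BASE rather than at the filesystem root.

```shell
#!/bin/sh
# BASE is a hypothetical parent directory holding server1, server2, server3.
# It defaults to a fresh temporary directory so the sketch is safe to run.
BASE="${BASE:-$(mktemp -d)}"

# Write conf/zoo.cfg for server number $1 under $BASE.
write_cfg() {
  id=$1
  mkdir -p "$BASE/server$id/conf"
  cat > "$BASE/server$id/conf/zoo.cfg" <<EOF
tickTime=2000
dataDir=$BASE/server$id/data
clientPort=$(( 2180 + id ))
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
EOF
}

for id in 1 2 3; do
  write_cfg "$id"
done

echo "configs written under $BASE"
```

Only the dataDir and clientPort lines vary per server; the server.X list is identical in all three files, as in the hand-written configurations.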
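In /server1/data, create a file named myid (no extension) whose content is the number 1.
In /server2/data, create a file named myid (no extension) whose content is the number 2.
In /server3/data, create a file named myid (no extension) whose content is the number 3.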
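The three myid files can be created from the command line. The sketch below assumes a hypothetical parent directory BASE (a temporary directory unless you set it) in place of the filesystem root, and write_myid is a made-up helper name.

```shell
#!/bin/sh
# BASE is a hypothetical parent directory; adjust it to match each dataDir.
BASE="${BASE:-$(mktemp -d)}"

# Create the data directory for server $1 and write its number into myid.
write_myid() {
  id=$1
  mkdir -p "$BASE/server$id/data"
  printf '%s\n' "$id" > "$BASE/server$id/data/myid"
}

for id in 1 2 3; do
  write_myid "$id"
done

# Show one of the files as a quick check.
cat "$BASE/server2/data/myid"
```

The number in each myid file has to match the X of that server's own server.X entry, and the file has to sit in the directory named by that server's dataDir.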
Start the three ZooKeeper servers one after another.
0x05 Official Original Text
Standalone Operation
Setting up a ZooKeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation consists of creating a configuration.
Once you've downloaded a stable ZooKeeper release unpack it and cd to the root
To start ZooKeeper you need a configuration file. Here is a sample, create it in conf/zoo.cfg:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
This file can be called anything, but for the sake of this discussion call it conf/zoo.cfg. Change the value of dataDir to specify an existing (empty to start with) directory. Here are the meanings for each of the fields:
tickTime : the basic time unit in milliseconds used by ZooKeeper. It is used to do heartbeats and the minimum session timeout will be twice the tickTime.
dataDir : the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.
clientPort : the port to listen for client connections
Now that you created the configuration file, you can start ZooKeeper:
bin/zkServer.sh start
ZooKeeper logs messages using log4j -- more detail available in the Logging section of the Programmer's Guide. You will see log messages coming to the console (default) and/or a log file depending on the log4j configuration.
The steps outlined here run ZooKeeper in standalone mode. There is no replication, so if ZooKeeper process fails, the service will go down. This is fine for most development situations, but to run ZooKeeper in replicated mode, please see Running Replicated ZooKeeper.
Running Replicated ZooKeeper
Running ZooKeeper in standalone mode is convenient for evaluation, some development, and testing. But in production, you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode, all servers in the quorum have copies of the same configuration file.
Note
For replicated mode, a minimum of three servers are required, and it is strongly recommended that you have an odd number of servers. If you only have two servers, then you are in a situation where if one of them fails, there are not enough machines to form a majority quorum. Two servers are inherently less stable than a single server, because there are two single points of failure.
The required conf/zoo.cfg file for replicated mode is similar to the one used in standalone mode, but with a few differences. Here is an example:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
The new entry, initLimit is timeouts ZooKeeper uses to limit the length of time the ZooKeeper servers in quorum have to connect to a leader. The entry syncLimit limits how far out of date a server can be from a leader.
With both of these timeouts, you specify the unit of time using tickTime. In this example, the timeout for initLimit is 5 ticks at 2000 milliseconds a tick, or 10 seconds.
The entries of the form server.X list the servers that make up the ZooKeeper service. When the server starts up, it knows which server it is by looking for the file myid in the data directory. That file has the contains the server number, in ASCII.
Finally, note the two port numbers after each server name: " 2888" and "3888". Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.
Note
If you want to test multiple servers on a single machine, specify the servername as localhost with unique quorum & leader election ports (i.e. 2888:3888, 2889:3889, 2890:3890 in the example above) for each server.X in that server's config file. Of course separate _dataDir_s and distinct _clientPort_s are also necessary (in the above replicated example, running on a single localhost, you would still have three config files).
Please be aware that setting up multiple servers on a single machine will not create any redundancy. If something were to happen which caused the machine to die, all of the zookeeper servers would be offline. Full redundancy requires that each server have its own machine. It must be a completely separate physical server. Multiple virtual machines on the same physical host are still vulnerable to the complete failure of that host.
If you have multiple network interfaces in your ZooKeeper machines, you can also instruct ZooKeeper to bind on all of your interfaces and automatically switch to a healthy interface in case of a network failure. For details, see the Configuration Parameters.