Learning ZooKeeper: Basic Usage and Cluster Setup

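This chapter records how to set up a ZooKeeper environment and use its basic API. Some of the content is excerpted from the official ZooKeeper site. The ZooKeeper version used here is 3.4.9. I installed VMware on my local machine and created three virtual machines, which makes the cluster setup later on easier.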

Introduction

ZooKeeper: A Distributed Coordination Service for Distributed Applications

ZooKeeper is a distributed, open-source coordination service for distributed applications. It exposes a simple set of primitives that distributed applications can build upon to implement higher level services for synchronization, configuration maintenance, and groups and naming. It is designed to be easy to program to, and uses a data model styled after the familiar directory tree structure of file systems. It runs in Java and has bindings for both Java and C.

Coordination services are notoriously hard to get right. They are especially prone to errors such as race conditions and deadlock. The motivation behind ZooKeeper is to relieve distributed applications the responsibility of implementing coordination services from scratch.

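In short: ZooKeeper is a distributed coordination service for distributed applications.

It is open source and exposes a small set of primitives on which applications can build higher-level services such as synchronization, configuration maintenance, group membership, and naming. It is designed to be easy to program against, its data model resembles the directory tree of a file system, and it runs on Java with client bindings for Java and C.

Coordination services are notoriously hard to get right and are prone to race conditions and deadlocks. ZooKeeper exists so that distributed applications do not have to implement coordination from scratch.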

The guarantees that ZooKeeper provides are listed below.

Guarantees

ZooKeeper is very fast and very simple. Since its goal, though, is to be a basis for the construction of more complicated services, such as synchronization, it provides a set of guarantees. These are:

  • Sequential Consistency - Updates from a client will be applied in the order that they were sent.
  • Atomicity - Updates either succeed or fail. No partial results.
  • Single System Image - A client will see the same view of the service regardless of the server that it connects to.
  • Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
  • Timeliness - The clients view of the system is guaranteed to be up-to-date within a certain time bound.

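In plain terms, ZooKeeper is fast and simple, but because it is meant to be the basis for more complicated services such as synchronization, it makes these guarantees:

  • Sequential consistency: updates from a client are applied in the order in which they were sent.
  • Atomicity: an update either succeeds or fails. There are no partial results.
  • Single system image: a client sees the same view of the service no matter which server it connects to.
  • Reliability: once an update has been applied, it persists until a client overwrites it.
  • Timeliness: a client's view of the system is guaranteed to be up to date within a certain time bound.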

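Downloading the Server

Go to the download page on the official ZooKeeper site, download ZooKeeper 3.4.9, and extract it into a directory on the virtual machine.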

Running the Server and Basic Operations

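# Start zkServer (run from the bin directory of the extracted package).
sh zkServer.sh start
# Check the zkServer status.
sh zkServer.sh status
# "Mode: standalone" means the server was started in standalone (non-clustered) mode.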

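To connect to zkServer, use the zkCli tool that ships with ZooKeeper. The -server ip:port argument can be omitted, in which case zkCli connects to localhost:2181.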

sh zkCli.sh -server 127.0.0.1:2181

The prompt then shows that the client is connected.

Run ls / to list the paths under the root node:

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]

Run get /zookeeper to view the stat information of the /zookeeper path:

[zk: localhost:2181(CONNECTED) 2] get /zookeeper

cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1

For the other CRUD operations, see the following excerpt from the official documentation:

Next, create a new znode by running create /zk_test my_data. This creates a new znode and associates the string "my_data" with the node. You should see:

[zkshell: 9] create /zk_test my_data
Created /zk_test

Issue another ls / command to see what the directory looks like:

[zkshell: 11] ls /
[zookeeper, zk_test]

Notice that the zk_test directory has now been created.

Next, verify that the data was associated with the znode by running the get command, as in:

[zkshell: 12] get /zk_test
my_data
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 5
mtime = Fri Jun 05 13:57:06 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0
dataLength = 7
numChildren = 0

We can change the data associated with zk_test by issuing the set command, as in:

[zkshell: 14] set /zk_test junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0
[zkshell: 15] get /zk_test
junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0

(Notice we did a get after setting the data and it did, indeed, change.)

Finally, let's delete the node by issuing:

[zkshell: 16] delete /zk_test
[zkshell: 17] ls /
[zookeeper]
[zkshell: 18]

Client Connection

Creating the Client Connection

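import java.io.IOException;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

/**
 * @author: to_be_continued
 * @Date: 2020/6/22 10:36
 */
public class TestZk {
    private static ZooKeeper zooKeeper;

    public static void main(String[] args) throws IOException {
        // Initialize the ZooKeeper client; TestZkWatch is used to test the watch mechanism.
        // Arguments: connect string, session timeout in milliseconds, default watcher.
        zooKeeper = new ZooKeeper("192.168.80.128:2181", 5000, new TestZkWatch());
        // Block on stdin so the process stays alive and watch events can be printed.
        System.in.read();
    }
}

/**
 * Watcher used for testing.
 * @author to_be_continued
 */
class TestZkWatch implements Watcher {

    @Override
    public void process(WatchedEvent watchedEvent) {
        System.out.println("testZkWatch watchedEvent: " + watchedEvent);
    }
}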

As shown above, the ZooKeeper client connection is now initialized. Next come the most common CRUD operations.
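
One detail worth knowing before issuing requests: the ZooKeeper constructor returns immediately, and the session is established asynchronously, with the default watcher notified once the client is connected. A common way to wait for that notification is a CountDownLatch released from the watcher. The sketch below shows only the waiting pattern using the JDK. FakeAsyncClient and its Listener are hypothetical stand-ins for the real client and Watcher, so nothing here talks to a server.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a client whose constructor returns before it is connected. */
final class FakeAsyncClient {
    /** Hypothetical callback that plays the role of the default watcher. */
    interface Listener {
        void onConnected();
    }

    FakeAsyncClient(Listener listener) {
        // Simulate the connection completing later on a background thread.
        Thread connector = new Thread(listener::onConnected, "fake-connect");
        connector.setDaemon(true);
        connector.start();
    }
}

final class ConnectLatchDemo {
    /** Returns true if the connected callback fired within timeoutMillis. */
    static boolean connectAndWait(long timeoutMillis) {
        CountDownLatch connected = new CountDownLatch(1);
        new FakeAsyncClient(connected::countDown);
        try {
            return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + connectAndWait(5000));
    }
}
```

With the real client, the equivalent would be to count the latch down inside TestZkWatch#process when the event state is SyncConnected, and to await the latch in main before calling create or getData.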

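Creating a Node

/**
 * Creates a node.
 * @param path path of the node to create
 * @param data initial data of the node
 * @return the actual path of the created node
 */
public static String create(String path, byte[] data) throws KeeperException, InterruptedException {
    // Create the node with an explicit ACL and CreateMode.
    return zooKeeper.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
}

public static void main(String[] args) throws IOException, KeeperException, InterruptedException {
    // Initialize the ZooKeeper client.
    zooKeeper = new ZooKeeper("192.168.80.128:2181", 5000, new TestZkWatch());

    // Create a node /testZk whose data is "testCreateData".
    // While this runs, TestZkWatch#process prints the session's connection event;
    // the create itself fires a watch only if one was registered on the path beforehand.
    create("/testZk", "testCreateData".getBytes());

    System.in.read();
}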

Reading a Node

public static void main(String[] args) throws IOException, KeeperException, InterruptedException {
    // Initialize the ZooKeeper client.
    zooKeeper = new ZooKeeper("192.168.80.128:2181", 5000, new TestZkWatch());

    // /testZk was already created in the previous step, so the create call stays commented out.
    // create("/testZk", "testCreateData".getBytes());

    // Read the data of /testZk; stat is filled in with the node's metadata.
    Stat stat = new Stat();
    System.out.println(new String(getData("/testZk", stat)));
    System.out.println(stat);

    System.in.read();
}

/**
 * Reads the data of a node.
 * @param path path of the node
 * @param stat receives the node's metadata
 * @return the node's data
 * @throws KeeperException
 * @throws InterruptedException
 */
public static byte[] getData(String path, Stat stat) throws KeeperException, InterruptedException {
    // Passing true for watch registers the default watcher (the one given to the constructor),
    // so the next change to this node fires one event.
    return zooKeeper.getData(path, true, stat);
}

Updating a Node

public static void main(String[] args) throws IOException, KeeperException, InterruptedException {
    // Initialize the ZooKeeper client.
    zooKeeper = new ZooKeeper("192.168.80.128:2181", 5000, new TestZkWatch());

    // /testZk was already created in an earlier step, so the create call stays commented out.
    // create("/testZk", "testCreateData".getBytes());

    // Read the data of /testZk; stat is filled in with the node's metadata.
    Stat stat = new Stat();
    System.out.println(new String(getData("/testZk", stat)));
    System.out.println(stat);

    // Update /testZk, passing the data version that was just read.
    System.out.println(update("/testZk", "updateData".getBytes(), stat.getVersion()));

    // Read it again to confirm the new data and the incremented version.
    System.out.println(new String(getData("/testZk", stat)));
    System.out.println(stat);

    System.in.read();
}

/**
 * Updates the data of the given node.
 * @param path path of the node
 * @param data new data
 * @param version expected data version; this works as an optimistic lock
 * @return the node's Stat after the update
 * @throws KeeperException
 * @throws InterruptedException
 */
public static Stat update(String path, byte[] data, int version) throws KeeperException, InterruptedException {
    return zooKeeper.setData(path, data, version);
}
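
The version argument of setData is what makes the update an optimistic lock: the write succeeds only if the node is still at the version the caller read, and a version of -1 matches any version. In the real client a stale version results in KeeperException.BadVersionException. The sketch below imitates that rule with an in-memory class. VersionedNode is hypothetical and is not part of the ZooKeeper API; it uses IllegalStateException in place of the real exception.

```java
import java.nio.charset.StandardCharsets;

/** In-memory stand-in for a single znode; hypothetical, not part of the ZooKeeper API. */
final class VersionedNode {
    private byte[] data;
    private int version = 0; // plays the role of the data version in Stat

    VersionedNode(byte[] initial) {
        this.data = initial.clone();
    }

    /**
     * Writes newData only if expectedVersion equals the current version,
     * or if expectedVersion is -1 (match any version). Returns the new version.
     */
    synchronized int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            throw new IllegalStateException(
                "bad version: expected " + expectedVersion + " but node is at " + version);
        }
        data = newData.clone();
        version++;
        return version;
    }

    synchronized int getVersion() {
        return version;
    }

    synchronized String getDataAsString() {
        return new String(data, StandardCharsets.UTF_8);
    }
}

final class OptimisticUpdateDemo {
    public static void main(String[] args) {
        VersionedNode node = new VersionedNode("testCreateData".getBytes(StandardCharsets.UTF_8));
        int seen = node.getVersion(); // two writers, A and B, both read version 0

        node.setData("fromA".getBytes(StandardCharsets.UTF_8), seen); // A wins; version becomes 1
        try {
            node.setData("fromB".getBytes(StandardCharsets.UTF_8), seen); // B still holds version 0
        } catch (IllegalStateException e) {
            System.out.println("B rejected: " + e.getMessage());
        }
        System.out.println(node.getDataAsString() + " @ version " + node.getVersion());
    }
}
```

A writer that is rejected this way would normally read the node again, reapply its change to the fresh data, and retry with the new version.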

Deleting a Node

public static void main(String[] args) throws IOException, KeeperException, InterruptedException {
    // Initialize the ZooKeeper client.
    zooKeeper = new ZooKeeper("192.168.80.128:2181", 5000, new TestZkWatch());

    // /testZk was already created in an earlier step, so the create call stays commented out.
    // create("/testZk", "testCreateData".getBytes());

    // Read /testZk first to obtain its current version.
    Stat stat = new Stat();
    System.out.println(new String(getData("/testZk", stat)));
    System.out.println(stat);

    // Delete /testZk; the delete is applied only if the version still matches.
    delete("/testZk", stat.getVersion());

    System.in.read();
}

/**
 * Deletes the given node.
 * @param path path of the node
 * @param version expected data version of the node
 * @throws KeeperException
 * @throws InterruptedException
 */
public static void delete(String path, int version) throws KeeperException, InterruptedException {
    zooKeeper.delete(path, version);
}

ACL and CreateMode

As the create call above shows, an ACL and a CreateMode are specified when a node is created.

ACL: ZooKeeper supports access control on znodes.

ACL Permissions

ZooKeeper supports the following permissions:

  • CREATE: you can create a child node
  • READ: you can get data from a node and list its children.
  • WRITE: you can set data for a node
  • DELETE: you can delete a child node
  • ADMIN: you can set permissions
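
Each ACL entry stores these permissions as a bitmask, so several permissions can be combined with a bitwise OR and checked with a bitwise AND. The sketch below is a standalone illustration: PermBits is a hypothetical class whose constants use the same values as org.apache.zookeeper.ZooDefs.Perms, and the c, d, r, w, a letters follow the form used with zkCli's ACL commands.

```java
/** Permission bits with the same values as org.apache.zookeeper.ZooDefs.Perms (class name is hypothetical). */
final class PermBits {
    static final int READ = 1 << 0;   // 1
    static final int WRITE = 1 << 1;  // 2
    static final int CREATE = 1 << 2; // 4
    static final int DELETE = 1 << 3; // 8
    static final int ADMIN = 1 << 4;  // 16
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN; // 31

    /** True if every bit in required is present in granted. */
    static boolean allows(int granted, int required) {
        return (granted & required) == required;
    }

    /** Renders a mask as letters in the order c, d, r, w, a. */
    static String describe(int perms) {
        StringBuilder sb = new StringBuilder();
        if ((perms & CREATE) != 0) sb.append('c');
        if ((perms & DELETE) != 0) sb.append('d');
        if ((perms & READ) != 0) sb.append('r');
        if ((perms & WRITE) != 0) sb.append('w');
        if ((perms & ADMIN) != 0) sb.append('a');
        return sb.toString();
    }

    public static void main(String[] args) {
        int readWrite = READ | WRITE;
        System.out.println(describe(readWrite) + " allows WRITE: " + allows(readWrite, WRITE));
        System.out.println(describe(READ) + " allows WRITE: " + allows(READ, WRITE));
        System.out.println("ALL = " + ALL + " (" + describe(ALL) + ")");
    }
}
```

ZooDefs.Ids.OPEN_ACL_UNSAFE, used in the create example, grants all of these permissions to anyone, which is convenient for a test setup but not for production.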

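CreateMode: the node types that ZooKeeper supports.

  • EPHEMERAL: ephemeral node. It is deleted when the session that created it ends.

  • EPHEMERAL_SEQUENTIAL: ephemeral sequential node. It is deleted when the session ends, and a monotonically increasing number is appended to its name.

  • PERSISTENT: persistent node. It is not deleted when the session ends.

  • PERSISTENT_SEQUENTIAL: persistent sequential node. It is not deleted when the session ends, and a monotonically increasing number is appended to its name.

  • TTL (Added in 3.6.0): a newer node type that supports an expiry time, so it is not available in the 3.4.9 release used in this post. It can be thought of as an attribute of the node. A TTL can only be set on PERSISTENT and PERSISTENT_SEQUENTIAL nodes; in the Java API the corresponding modes are CreateMode.PERSISTENT_WITH_TTL and CreateMode.PERSISTENT_SEQUENTIAL_WITH_TTL.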

    TTL Nodes

    Added in 3.6.0

    When creating PERSISTENT or PERSISTENT_SEQUENTIAL znodes, you can optionally set a TTL in milliseconds for the znode. If the znode is not modified within the TTL and has no children it will become a candidate to be deleted by the server at some point in the future.

    Note: TTL Nodes must be enabled via System property as they are disabled by default. See the Administrator's Guide for details. If you attempt to create TTL Nodes without the proper System property set the server will throw KeeperException.UnimplementedException.

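  • CONTAINER (Added in 3.6.0): a newer, special-purpose node type that is likewise not available in 3.4.9. Once the last child of a container node has been deleted, the container node becomes a candidate for deletion by the server.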

    Container Nodes

    Added in 3.6.0

    ZooKeeper has the notion of container znodes. Container znodes are special purpose znodes useful for recipes such as leader, lock, etc. When the last child of a container is deleted, the container becomes a candidate to be deleted by the server at some point in the future.

    Given this property, you should be prepared to get KeeperException.NoNodeException when creating children inside of container znodes. i.e. when creating child znodes inside of container znodes always check for KeeperException.NoNodeException and recreate the container znode when it occurs.
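
For the two SEQUENTIAL modes, the ZooKeeper documentation describes the appended counter as ten digits with zero padding (format %010d). The sketch below only reproduces that naming in plain Java so the resulting paths are easy to picture. SequentialNameDemo and the /testZk/item- prefix are made up for illustration, and the counter is passed in by hand here, whereas in ZooKeeper the server chooses it.

```java
/** Illustrates how a sequential znode name is formed; hypothetical helper, no ZooKeeper calls. */
final class SequentialNameDemo {
    /** Appends the counter as 10 zero-padded digits to the requested path. */
    static String sequentialPath(String requestedPath, int counter) {
        return requestedPath + String.format("%010d", counter);
    }

    /** Extracts the numeric suffix from a sequential path. */
    static int sequenceOf(String path) {
        return Integer.parseInt(path.substring(path.length() - 10));
    }

    public static void main(String[] args) {
        String first = sequentialPath("/testZk/item-", 0);
        String second = sequentialPath("/testZk/item-", 1);
        System.out.println(first);
        System.out.println(second);
        System.out.println("sequence of second: " + sequenceOf(second));
    }
}
```

Because the suffix has a fixed width, the names also sort correctly as plain strings, which is what recipes such as locks and queues rely on.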

Using the Curator Library

What is Curator?

Curator n ˈkyoor͝ˌātər: a keeper or custodian of a museum or other collection - A ZooKeeper Keeper.

Apache Curator is a Java/JVM client library for Apache ZooKeeper, a distributed coordination service. It includes a highlevel API framework and utilities to make using Apache ZooKeeper much easier and more reliable. It also includes recipes for common use cases and extensions such as service discovery and a Java 8 asynchronous DSL.

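In short:

Apache Curator is a Java/JVM client library for Apache ZooKeeper. It provides a high-level API framework and utilities that make ZooKeeper easier and more reliable to use, plus recipes for common use cases and extensions such as service discovery and a Java 8 asynchronous DSL.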

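create and getData

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

/**
 * The Curator API removes many inconveniences of the native API,
 * such as creating multi-level nodes and handling many checked exceptions.
 * @author: tu
 * @Date: 2020/6/22 11:22
 */
public class TestCurator {

    public static void main(String[] args) throws Exception {

        // Retry policy: base sleep time of 1000 ms, at most 3 retries.
        CuratorFramework curatorFramework = CuratorFrameworkFactory.newClient("192.168.80.128:2181", new ExponentialBackoffRetry(1000, 3));
        curatorFramework.start();
        String path = "/testCurator";

        System.out.println(curatorFramework.create().forPath(path, "testCurator".getBytes()));
        // getData returns a byte[]; convert it to a String so the content is printed rather than the array reference.
        System.out.println(new String(curatorFramework.getData().forPath(path)));
        System.in.read();
    }
}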

Distributed Lock

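A distributed lock can be implemented with the Curator API. In the snippet below, client is a started CuratorFramework, and lockPath, maxWait, and waitUnit are supplied by the caller.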

InterProcessMutex lock = new InterProcessMutex(client, lockPath);
if ( lock.acquire(maxWait, waitUnit) ) 
{
    try 
    {
        // do some work inside of the critical section here
    }
    finally
    {
        lock.release();
    }
}
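
The post does not describe how the lock works internally. As background, the lock recipe in the ZooKeeper documentation has every contender create an ephemeral sequential child under the lock path: the contender with the lowest sequence number holds the lock, and each of the others watches the node just before its own. The sketch below models only that ordering rule in plain Java. The class, the node names, and the assumption that InterProcessMutex follows this recipe exactly are all illustrative; no ZooKeeper or Curator call is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Models the ordering rule of the lock recipe; hypothetical helper, no ZooKeeper calls. */
final class LockOrderDemo {
    /** Reads the 10-digit sequence suffix of a child name. */
    static int sequenceOf(String name) {
        return Integer.parseInt(name.substring(name.length() - 10));
    }

    /** Returns the child that holds the lock: the one with the lowest sequence number. */
    static String owner(List<String> children) {
        return sorted(children).get(0);
    }

    /** Returns the node that self should watch, or null if self holds the lock. */
    static String predecessor(List<String> children, String self) {
        List<String> ordered = sorted(children);
        int index = ordered.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("not a contender: " + self);
        }
        return index == 0 ? null : ordered.get(index - 1);
    }

    private static List<String> sorted(List<String> children) {
        List<String> copy = new ArrayList<>(children);
        copy.sort(Comparator.comparingInt(LockOrderDemo::sequenceOf));
        return copy;
    }

    public static void main(String[] args) {
        // Children are returned in no particular order, so they must be sorted first.
        List<String> children = Arrays.asList(
            "lock-0000000012", "lock-0000000010", "lock-0000000011");
        System.out.println("owner: " + owner(children));
        System.out.println("lock-0000000012 watches: " + predecessor(children, "lock-0000000012"));
    }
}
```

Watching only the immediate predecessor, rather than the lock owner, means that a release wakes up a single waiter instead of all of them.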

Leader Election

Leader election can also be implemented with the Curator API.

LeaderSelectorListener listener = new LeaderSelectorListenerAdapter()
{
    public void takeLeadership(CuratorFramework client) throws Exception
    {
        // this callback will get called when you are the leader
        // do whatever leader work you need to and only exit
        // this method when you want to relinquish leadership
    }
};

LeaderSelector selector = new LeaderSelector(client, path, listener);
selector.autoRequeue();  // not required, but this is behavior that you will probably expect
selector.start();

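Cluster Deployment

Deployment Overview

For cluster testing, deploying the servers on separate machines is recommended. Deploying them on a single machine requires changing dataDir, the ports, and other settings; see the official documentation: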

If you want to test multiple servers on a single machine, specify the servername as localhost with unique quorum & leader election ports (i.e. 2888:3888, 2889:3889, 2890:3890 in the example above) for each server.X in that server's config file. Of course separate dataDirs and distinct clientPorts are also necessary (in the above replicated example, running on a single localhost, you would still have three config files).

Please be aware that setting up multiple servers on a single machine will not create any redundancy. If something were to happen which caused the machine to die, all of the zookeeper servers would be offline. Full redundancy requires that each server have its own machine. It must be a completely separate physical server. Multiple virtual machines on the same physical host are still vulnerable to the complete failure of that host.

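As for the deployment strategy, the official recommendation is at least three servers, and an odd number of servers is strongly recommended.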

For replicated mode, a minimum of three servers are required, and it is strongly recommended that you have an odd number of servers. If you only have two servers, then you are in a situation where if one of them fails, there are not enough machines to form a majority quorum. Two servers are inherently less stable than a single server, because there are two single points of failure.

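In other words: replicated mode needs at least three servers, preferably an odd number. The ensemble stays available only while a majority of its servers is up. With two servers, losing either one leaves no majority, so two servers are inherently less stable than one: there are two single points of failure.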
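
The majority rule is easy to check with a little arithmetic: an ensemble of n servers needs n / 2 + 1 servers (integer division) to form a quorum, so it tolerates n minus that many failures. The sketch below, with the hypothetical class name QuorumMath, prints the numbers for small ensembles. It shows why two servers tolerate no failure and why a fourth server adds no fault tolerance over three.

```java
/** Majority-quorum arithmetic for an ensemble of a given size (hypothetical helper). */
final class QuorumMath {
    /** Smallest number of servers that forms a strict majority. */
    static int quorumSize(int servers) {
        return servers / 2 + 1;
    }

    /** Number of servers that may fail while a majority remains. */
    static int tolerableFailures(int servers) {
        return servers - quorumSize(servers);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                + ", tolerates " + tolerableFailures(n) + " failure(s)");
        }
    }
}
```

Going from three to four servers raises the quorum from two to three but still tolerates only one failure, which is the reason behind the odd-number recommendation.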

Configuration Changes

Cluster mode requires editing conf/zoo.cfg so that it lists every server in the cluster.

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
# Cluster members
server.1=192.168.80.128:2888:3888
server.2=192.168.80.129:2888:3888
server.3=192.168.80.130:2888:3888

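server is the fixed prefix of each entry. The .1 after it is the ZooKeeper server id. Each server stores its own id in a file named myid under its dataDir, and the file contains nothing but the id value. The part after the equals sign is ip:quorum port:leader election port, where the first port is the one followers use to connect to the leader and the second is used for leader election. For the other settings, see Configuration Parameters in the official documentation.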
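
To make the layout of a server.N entry concrete, the sketch below parses such lines with a regular expression. ServerEntry and ZooCfgServers are hypothetical names and not part of ZooKeeper, and the parser handles only the ip:port:port form shown above, not any optional suffixes that the configuration format may allow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed server.N entry (hypothetical type for illustration). */
final class ServerEntry {
    final int id;
    final String host;
    final int quorumPort;
    final int electionPort;

    ServerEntry(int id, String host, int quorumPort, int electionPort) {
        this.id = id;
        this.host = host;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }
}

final class ZooCfgServers {
    // server.<id>=<host>:<quorum port>:<leader election port>
    private static final Pattern SERVER_LINE =
        Pattern.compile("^server\\.(\\d+)=([^:\\s]+):(\\d+):(\\d+)$");

    /** Returns the server entries found in the given lines; other lines are ignored. */
    static List<ServerEntry> parse(List<String> lines) {
        List<ServerEntry> entries = new ArrayList<>();
        for (String line : lines) {
            Matcher m = SERVER_LINE.matcher(line.trim());
            if (m.matches()) {
                entries.add(new ServerEntry(
                    Integer.parseInt(m.group(1)), m.group(2),
                    Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4))));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> cfg = Arrays.asList(
            "tickTime=2000",
            "# Cluster members",
            "server.1=192.168.80.128:2888:3888",
            "server.2=192.168.80.129:2888:3888",
            "server.3=192.168.80.130:2888:3888");
        for (ServerEntry e : parse(cfg)) {
            System.out.println("id " + e.id + " -> " + e.host
                + " (quorum " + e.quorumPort + ", election " + e.electionPort + ")");
        }
    }
}
```

With the configuration above, the myid file on 192.168.80.128 would contain 1, the one on 192.168.80.129 would contain 2, and the one on 192.168.80.130 would contain 3.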

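Once all three machines are configured, starting the server on each of them completes the cluster deployment. Running sh zkServer.sh status on a machine shows that server's mode: follower, leader, or observer.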

posted @ 2022-03-03 15:16  生如梦境