clickhouse数据库1分片1副本搭建

【需求】

根据需求需要搭建clickhouse数据库副本操作

【环境介绍】
  系统环境:CentOS Linux release 7.6 + ClickHouse Server version 20.10.1 + apache-zookeeper-3.6.3 + java version "1.8.0_311"

 

 

 

【hosts环境】

两个节点添加hosts信息,这里生产建议统一主机名称识别,例如:cknode1,cknode2等

10.0.0.xx1 productionbackup
10.0.0.xx2 mysql-dev

【java环境】

可以从Oracle官网或者GitHub下载JAVA

https://www.oracle.com/java/technologies/downloads

https://github.com/AdoptOpenJDK/openjdk8-binaries/releases

 

解压及软连接
tar -zxvf jdk-8u311-linux-x64.tar.gz -C /usr/local/
ln -s /usr/local/jdk1.8.0_311 /usr/local/java

设置环境变量
vim /etc/profile
export JAVA_HOME=/usr/local/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

source /etc/profile

查看版本
java -version
java version "1.8.0_311"
Java(TM) SE Runtime Environment (build 1.8.0_311-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.311-b11, mixed mode

 

【zookeeper环境】

两个节点安装,节点一为主节点(一般建议3个节点安装)

两节点集群安装
选择安装的版本,建议使用稳定版
http://archive.apache.org/dist/zookeeper/

两个节点下载安装包做软链
wget http://archive.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
tar -zxvf apache-zookeeper-3.6.3-bin.tar.gz -C /usr/local/

ln -s /usr/local/apache-zookeeper-3.6.3-bin /usr/local/zookeeper

两个节点配置环境变量
vim /etc/profile
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin

source /etc/profile

两个节点添加配置文件
cd /usr/local/zookeeper/conf
参考官方
https://clickhouse.tech/docs/en/operations/tips/#zookeeper

两个节点部署zoo.cfg配置文件
vim zoo.cfg
tickTime=2000
initLimit=30000
syncLimit=10
maxClientCnxns=2000
maxSessionTimeout=60000000
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/logs
autopurge.snapRetainCount=10
autopurge.purgeInterval=1
preAllocSize=131072
snapCount=3000000
leaderServes=yes
clientPort=2181
quorumListenOnAllIPs=true

 

这里需要在zoo.cfg添加集群信息,两个节点分别为
# 主节点为
server.1=10.0.0.xx1:2889:3889
server.2=10.0.0.xx2:2888:3888


# 从节点为
server.1=10.0.0.xx1:2889:3889
server.2=10.0.0.xx2:2888:3888

创建目录
mkdir -p /data/zookeeper/data
mkdir -p /usr/local/zookeeper/logs

myid设置
# 主节点为
echo "1">/data/zookeeper/data/myid
# 从节点为
echo "2">/data/zookeeper/data/myid

 

 

配置zk日志
默认zk日志输出到一个文件,且不会自动清理,所以,一段时间后zk日志会非常大
1.zookeeper-env.sh conf目录下新建zookeeper-env.sh文件
vim zookeeper-env.sh
#!/usr/bin/env bash
#tip:custom configurationfile,do not amend the zkEnv.sh file
#chang the log dir and output of rolling file

ZOO_LOG_DIR="/usr/local/zookeeper/logs"
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

添加权限
chmod 755 zookeeper-env.sh

2.log4j.properties 修改日志的输入形式
zookeeper.root.logger=INFO, ROLLINGFILE
#zookeeper.root.logger=INFO, CONSOLE

# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=128MB
# uncomment the next line to limit number of backup files
log4j.appender.ROLLINGFILE.MaxBackupIndex=10


3.配置运行zk的JVM
conf目录下新建java.env文件,修改到sudo chmod 755 java.env权限,主要用于GC log,RAM等的配置.
vim java.env
#!/usr/bin/env bash
#config the jvm parameter in a reasonable
#note that the shell is source in so that do not need to use export
#set java classpath
#CLASSPATH=""
#set jvm start parameter , also can set JVMFLAGS variable
SERVER_JVMFLAGS="-Xms1024m -Xmx2048m $JVMFLAGS"

添加权限
chmod 755 java.env

启动zookeeper服务(所有节点)
zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

状态查看
zkServer.sh status
节点一状态:Mode: leader
节点二状态:Mode: follower

部署注意事项:
zoo.cfg配置文件中server这里使用不同端口,查看网上说端口一一致,但是却启动不了;
需要添加quorumListenOnAllIPs=true参宿,否则两节点互相访问不了;
如果设置了iptables,需要开通对应端口,2181/2889:3889/2888:3888;
如果是低版本调试可以使用zkServer.sh start-foreground日志输出观察信息;
myid设置建议对应server.x,x对应值配置;
生产建议使用三个节点,故障选举;

 

测试集群信息
两节点登陆:zkCli.sh -server 127.0.0.1:2181
登陆后两节点确认信息:ls /
节点一创建对象:create /zk_test mydata
节点二查询对象:get /zk_test ===》存在zk_test的mydata信息
节点一修改对象:set /zk_test junk
节点二查询对象:get /zk_test ===》存在zk_test的junk信息
节点一删除对象:delete /zk_test
节点二查询对象:ls / ===》不存在zk_test信息
说明:节点二能看到对应操作步骤结果即可,follower只能查询不能设置、修改对象

 

【clickhouse安装】

按照官网安装clickhouse

准备操作
需要验证当前服务器的 CPU 是否支持 SSE 4.2 指令集,因为向量化执行需要用到这项特性:
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

连网安装
依次执行如下命令:
yum install yum-utils -y
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/clickhouse.repo yum install clickhouse-server clickhouse-client -y

如果您没法链接互联网,则也可以使用 rpm 的方式来进行离线安装:需要下载的安装包有:
clickhouse-server-20.5.4.40-2.noarch.rpm
clickhouse-common-static-20.5.4.40-2.x86_64.rpm
clickhouse-client-20.5.4.40-2.noarch.rpm

下载地址在:
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/ https://packagecloud.io/Altinity/clickhouse

安装完成后,修改配置文件/etc/clickhouse-server/config.xml:
去掉注释,能所有连接:<listen_host>::</listen_host> 
添加配置,关联集群配置文件:<include_from>/etc/clickhouse-server/metrika.xml</include_from>

添加default用户密码,一般建议加密方式
/etc/clickhouse-server/users.xml
<password>default</password>

 

添加metrika配置文件
/etc/clickhouse-server/metrika.xml

<?xml version="1.0"?>
<yandex>
<!--ck集群节点-->
<clickhouse_remote_servers>
<ch_cluster_all>
<!--分片1-->
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>productionbackup</host>
<port>9000</port>
<user>default</user>
<password>default</password>
</replica>
<!--复制集1-->
<replica>
<host>mysql-dev</host>
<port>9000</port>
<user>default</user>
<password>default</password>
</replica>
</shard>
</ch_cluster_all>
</clickhouse_remote_servers>
<!--zookeeper相关配置-->
<zookeeper-servers>
<node index="1">
<host>productionbackup</host>
<port>2181</port>
</node>
<node index="2">
<host>mysql-dev</host>
<port>2181</port>
</node>
</zookeeper-servers>
<macros>
<layer>01</layer>
<shard>01</shard>
<!--分片号-->
<replica>10.0.0.xx1</replica>
<!--当前节点IP-->
</macros>
<networks>
<ip>::/0</ip>
</networks>
<!--压缩相关配置-->
<clickhouse_compression>
<case>
<min_part_size>10000000000</min_part_size>
<min_part_size_ratio>0.01</min_part_size_ratio>
<method>lz4</method>
<!--压缩算法lz4压缩比zstd快, 更占磁盘-->
</case>
</clickhouse_compression>
</yandex>
节点二
<?xml version="1.0"?>
<yandex>
<!--ck集群节点-->
<clickhouse_remote_servers>
<ch_cluster_all>
<!--分片1-->
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>productionbackup</host>
<port>9000</port>
<user>default</user>
<password>default</password>
</replica>
<!--复制集1-->
<replica>
<host>mysql-dev</host>
<port>9000</port>
<user>default</user>
<password>default</password>
</replica>
</shard>
</ch_cluster_all>
</clickhouse_remote_servers>
<!--zookeeper相关配置-->
<zookeeper-servers>
<node index="1">
<host>productionbackup</host>
<port>2181</port>
</node>
<node index="2">
<host>mysql-dev</host>
<port>2181</port>
</node>
</zookeeper-servers>
<macros>
<layer>01</layer>
<shard>01</shard>
<!--分片号-->
<replica>10.0.0.xx2</replica>
<!--当前节点IP-->
</macros>
<networks>
<ip>::/0</ip>
</networks>
<!--压缩相关配置-->
<clickhouse_compression>
<case>
<min_part_size>10000000000</min_part_size>
<min_part_size_ratio>0.01</min_part_size_ratio>
<method>lz4</method>
<!--压缩算法lz4压缩比zstd快, 更占磁盘-->
</case>
</clickhouse_compression>
</yandex>

 

两节点启动clickhouse
systemctl start clickhouse-server
systemctl status clickhouse-server

查看日志没有错误后,测试登录
clickhouse-client -udefault --password default --port 9000

查看集群,查看到部署的集群信息,其他为默认信息,忽略或者删除中config.xml的remote_servers里面内容即可
select * from system.clusters;

 

 

 

【测试集群操作】

创建数据库:
create database testdb on cluster ch_cluster_all;

 

创建表,注意ck-node001/2信息
节点一创建
CREATE TABLE testdb.sbtest_local(EventDate DateTime,CounterID UInt32,UserID UInt32) ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/sbtest','ck-node001')PARTITION BY toYYYYMM(EventDate)ORDER BY (CounterID, EventDate, intHash32(UserID))SETTINGS index_granularity = 8192;

节点二创建
CREATE TABLE testdb.sbtest_local(EventDate DateTime,CounterID UInt32,UserID UInt32) ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/sbtest','ck-node002')PARTITION BY toYYYYMM(EventDate)ORDER BY (CounterID, EventDate, intHash32(UserID))SETTINGS index_granularity = 8192;

 

写入分片1的写入节点的local表
insert into testdb.sbtest_local VALUES (now(), 10000, 10000)
在分片1的local表可见一条记录
select * from testdb.sbtest_local
在分片1副本节点local读取可见一条记录
select * from testdb.sbtest_local

 可以看到创建表的时候比较麻烦,需要两个节点创建表,可以使用以下方法创建表

参考:https://cloud.tencent.com/developer/article/1762200

CREATE TABLE testdb.test ON CLUSTER ch_cluster_all(`id` Int64,`ymd` Int64)ENGINE = ReplicatedMergeTree('/clickhouse/tables/replicated/{shard}/test', '{replica}')PARTITION BY ymd ORDER BY id;
insert into testdb.test values('1', '20201112');

posted @ 2021-12-15 11:17  zetan·chen  阅读(802)  评论(0编辑  收藏  举报