ZooKeeper的日志和快照
ZooKeeper有两种日志、一种快照。日志分为事务日志和ZooKeeper运行时的系统日志。
1.事务日志和快照
ZooKeeper集群中的每个服务器节点每次接收到写操作请求时,都会先将这次请求发送给leader,leader将这次写操作转换为带有状态的事务,然后leader会对这次写操作广播出去以便进行协调。当协调通过(大多数节点允许这次写)后,leader通知所有的服务器节点,让它们将这次写操作应用到内存数据库中,并将其记录到事务日志中。
当事务日志记录的次数达到一定数量后(默认10W次),就会将内存数据库序列化一次,使其持久化保存到磁盘上,序列化后的文件称为"快照文件"。每次拍快照都会生成新的事务日志。
有了事务日志和快照,就可以让任意节点恢复到任意时间点(只要没有清理事务日志和快照)。
1.1 事务日志和快照相关的配置项
-
dataDir
:
ZooKeeper的数据目录,主要目的是存储内存数据库序列化后的快照路径。如果没有配置事务日志(即dataLogDir配置项)的路径,那么ZooKeeper的事务日志也存放在数据目录中。 -
dataLogDir
:
指定事务日志的存放目录。事务日志对ZooKeeper的影响非常大,强烈建议事务日志目录和数据目录分开,不要将事务日志记录在数据目录(主要用来存放内存数据库快照)下。 -
preAllocSize
:
为事务日志预先开辟磁盘空间。默认是64M,意味着每个事务日志大小就是64M(可以去事务日志目录中看一下,每个事务日志只要被创建出来,就是64M)。如果ZooKeeper产生快照频率较大,可以考虑减小这个参数,因为每次快照后都会切换到新的事务日志,但前面的64M根本就没写完。(见snapCount配置项) -
snapCount
:
ZooKeeper使用事务日志和快照来持久化每个事务(注意是日志先写)。该配置项指定ZooKeeper在将内存数据库序列化为快照之前,需要先写多少次事务日志。也就是说,每写几次事务日志,就快照一次。默认值为100000。为了防止所有的ZooKeeper服务器节点同时生成快照(一般情况下,所有实例的配置文件是完全相同的),当某节点的先写事务数量在(snapCount/2+1,snapCount)范围内时(挑选一个随机值),这个值就是该节点拍快照的时机。 -
autopurge.snapRetainCount
:
该配置项指定开启了ZooKeeper的自动清理功能后(见下一个配置项),每次自动清理时要保留的版本数量。默认值为3,最小值也为3。它表示在自动清理时,会保留最近3个快照以及这3个快照对应的事务日志。其它的所有快照和日志都清理。 -
autopurge.purgeInterval
:
指定触发自动清理功能的时间间隔,单位为小时,值为大于或等于1的整数,默认值为0,表示不开启自动清理功能。
1.2 事务日志和快照的命名规则
在ZooKeeper集群启动后,当第一个客户端连接到某个服务器节点时,会创建一个会话,这个会话也是事务,于是创建第一个事务日志,一般名为log.100000001
,这里的100000001是这次会话的事务id(zxid)。之后的事务都将写入到这个文件中,直到拍下一个快照。
如果是事务ZXID5触发的拍快照,那么快照名就是snapshot.ZXID5,拍完后,下一个事务的ID就是ZXID6,于是新的事务日志名为log.ZXID6。
1.3 查看事务日志
事务日志是一个二进制文件,无法直接查看。好在ZooKeeper提供了一个LogFormatter工具类。
假设ZooKeeper安装目录为/usr/local/zookeeper,那么可以通过下面的方法来查看事务日志log.100000001中的内容。
[root@node02 ~]# java -cp /usr/local/zookeeper/zookeeper-3.4.7.jar:/usr/local/zookeeper/lib/slf4j-api-1.6.1.jar org.apache.zookeeper.server.LogFormatter /usr/local/zookeeper/data/version-2/log.100000001 SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. ZooKeeper Transactional Log File with dbid 0 txnlog format version 2 20-3-5 下午09时10分54秒 session 0x170aaceef8c0000 cxid 0x0 zxid 0x100000001 createSession 30000 20-3-5 下午09时10分54秒 session 0x170aaceef8c0000 cxid 0x1 zxid 0x100000002 create '/test_znode,#68656c6c6f20776f726c64,v{s{31,s{'world,'anyone}}},F,1 20-3-5 下午09时11分26秒 session 0x170aaceef8c0000 cxid 0x0 zxid 0x100000003 closeSession null 20-3-5 下午09时11分52秒 session 0x270aacedac50000 cxid 0x0 zxid 0x100000004 createSession 30000 20-3-5 下午09时12分22秒 session 0x270aacedac50001 cxid 0x0 zxid 0x100000005 createSession 30000 20-3-5 下午09时12分24秒 session 0x270aacedac50000 cxid 0x0 zxid 0x100000006 closeSession null 20-3-5 下午09时12分54秒 session 0x270aacedac50001 cxid 0x0 zxid 0x100000007 closeSession null 20-3-5 下午09时13分42秒 session 0x370aacef0570000 cxid 0x0 zxid 0x100000008 createSession 30000 20-3-5 下午09时14分14秒 session 0x370aacef0570000 cxid 0x0 zxid 0x100000009 closeSession null 20-3-5 下午09时16分55秒 session 0x170aaceef8c0001 cxid 0x0 zxid 0x10000000a createSession 30000 20-3-5 下午09时17分56秒 session 0x170aaceef8c0001 cxid 0x1 zxid 0x10000000b closeSession null EOF reached after 11 txns.
1.4 自动清理功能
从ZooKeeper 3.4.0开始,ZooKeeper提供了自动清理事务日志和快照的功能,见事务日志和快照相关的配置项。
此外,还提供了一个脚本zkCleanup.sh,它也用来清理事务日志和快照。但比较少用。
有时也会写定时任务脚本,来删除定时、定点的事务日志和快照数据。
2.ZooKeeper系统的系统日志
ZooKeeper使用log4j(log for java)来记录系统日志。默认情况下,系统日志文件为ZooKeeper安装目录下的zookeeper.out,这是由log4j的配置文件决定的。(实际上,zkEnv.sh和zkServer.sh中也设置了日志的路径,见下文)。
ZooKeeper使用的log4j的配置文件为$ZOOKEEPER_HOME/conf/log4j.properties
。
[root@node02 zookeeper]# cat conf/log4j.properties # Define some default values that can be overridden by system properties zookeeper.root.logger=INFO, CONSOLE zookeeper.console.threshold=INFO zookeeper.log.dir=. # 日志目录 zookeeper.log.file=zookeeper.log # 日志文件名称 zookeeper.log.threshold=DEBUG zookeeper.tracelog.dir=. zookeeper.tracelog.file=zookeeper_trace.log # # ZooKeeper Logging Configuration # # Format is "<default threshold> (, <appender>)+ # DEFAULT: console appender only log4j.rootLogger=${zookeeper.root.logger} # Example with rolling log file #log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE # Example with rolling log file and tracing #log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE # # Log INFO level and above messages to the console # log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold} log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n # # Add ROLLINGFILE to rootLogger to get log file output # Log DEBUG level and above messages to a log file log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold} log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file} # Max log file size of 10MB log4j.appender.ROLLINGFILE.MaxFileSize=10MB # uncomment the next line to limit number of backup files #log4j.appender.ROLLINGFILE.MaxBackupIndex=10 log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n # # Add TRACEFILE to rootLogger to get log file output # Log DEBUG level and above messages to a log file log4j.appender.TRACEFILE=org.apache.log4j.FileAppender log4j.appender.TRACEFILE.Threshold=TRACE log4j.appender.TRACEFILE.File=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file} log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout ### Notice we are including log4j's NDC here (%x) log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L][%x] - %m%n
log4j.properties中没有指定zookeeper.out啊?但为什么会输出到zookeeper.out中呢?这是因为zkServer.sh中指定了这个文件。以下是zkServer.sh中和zookeeper.out相关的内容:
[root@node02 conf]# cat /usr/local/zookeeper/bin/zkServer.sh #!/usr/bin/env bash # Licensed to the Apache Software Foundation (ASF) under one or more # contributor license agreements. See the NOTICE file distributed with # this work for additional information regarding copyright ownership. # The ASF licenses this file to You under the Apache License, Version 2.0 # (the "License"); you may not use this file except in compliance with # the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # # If this scripted is run out of /usr/bin or some other system bin directory # it should be linked to and not copied. Things like java jar files are found # relative to the canonical path of this script. # # use POSTIX interface, symlink is followed automatically ZOOBIN="${BASH_SOURCE-$0}" ZOOBIN="$(dirname "${ZOOBIN}")" ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then . "$ZOOBINDIR/../libexec/zkEnv.sh" else . "$ZOOBINDIR/zkEnv.sh" fi # See the following page for extensive details on setting # up the JVM to accept JMX remote management: # http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html # by default we allow local JMX connections if [ "x$JMXLOCALONLY" = "x" ] then JMXLOCALONLY=false fi if [ "x$JMXDISABLE" = "x" ] then echo "ZooKeeper JMX enabled by default" >&2 if [ "x$JMXPORT" = "x" ] then # for some reason these two options are necessary on jdk6 on Ubuntu # accord to the docs they are not necessary, but otw jconsole cannot # do a local attach ZOOMAIN="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=$JMXLOCALONLY org.apache.zookeeper.server.quorum.QuorumPeerMain" else if [ "x$JMXAUTH" = "x" ] then JMXAUTH=false fi if [ "x$JMXSSL" = "x" ] then JMXSSL=false fi if [ "x$JMXLOG4J" = "x" ] then JMXLOG4J=true fi echo "ZooKeeper remote JMX Port set to $JMXPORT" >&2 echo "ZooKeeper remote JMX authenticate set to $JMXAUTH" >&2 echo "ZooKeeper remote JMX ssl set to $JMXSSL" >&2 echo "ZooKeeper remote JMX log4j set to $JMXLOG4J" >&2 ZOOMAIN="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=$JMXPORT -Dcom.sun.management.jmxremote.authenticate=$JMXAUTH -Dcom.sun.management.jmxremote.ssl=$JMXSSL -Dzookeeper.jmx.log4j.disable=$JMXLOG4J org.apache.zookeeper.server.quorum.QuorumPeerMain" fi else echo "JMX disabled by user request" >&2 ZOOMAIN="org.apache.zookeeper.server.quorum.QuorumPeerMain" fi if [ "x$SERVER_JVMFLAGS" != "x" ] then JVMFLAGS="$SERVER_JVMFLAGS $JVMFLAGS" fi if [ "x$2" != "x" ] then ZOOCFG="$ZOOCFGDIR/$2" fi # if we give a more complicated path to the config, don't screw around in $ZOOCFGDIR if [ "x$(dirname "$ZOOCFG")" != "x$ZOOCFGDIR" ] then ZOOCFG="$2" fi if $cygwin then ZOOCFG=`cygpath -wp "$ZOOCFG"` # cygwin has a "kill" in the shell itself, gets confused KILL=/bin/kill else KILL=kill fi echo "Using config: $ZOOCFG" >&2 case "$OSTYPE" in *solaris*) GREP=/usr/xpg4/bin/grep ;; *) GREP=grep ;; esac if [ -z "$ZOOPIDFILE" ]; then ZOO_DATADIR="$($GREP "^[[:space:]]*dataDir" "$ZOOCFG" | sed -e 's/.*=//')" if [ ! -d "$ZOO_DATADIR" ]; then mkdir -p "$ZOO_DATADIR" fi ZOOPIDFILE="$ZOO_DATADIR/zookeeper_server.pid" else # ensure it exists, otw stop will fail mkdir -p "$(dirname "$ZOOPIDFILE")" fi if [ ! -w "$ZOO_LOG_DIR" ] ; then mkdir -p "$ZOO_LOG_DIR" fi _ZOO_DAEMON_OUT="$ZOO_LOG_DIR/zookeeper.out" case $1 in start) echo -n "Starting zookeeper ... " if [ -f "$ZOOPIDFILE" ]; then if kill -0 `cat "$ZOOPIDFILE"` > /dev/null 2>&1; then echo $command already running as process `cat "$ZOOPIDFILE"`. exit 0 fi fi nohup "$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \ -cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null & if [ $? -eq 0 ] then case "$OSTYPE" in *solaris*) /bin/echo "${!}\\c" > "$ZOOPIDFILE" ;; *) /bin/echo -n $! > "$ZOOPIDFILE" ;; esac if [ $? -eq 0 ]; then sleep 1 echo STARTED else echo FAILED TO WRITE PID exit 1 fi else echo SERVER DID NOT START exit 1 fi ;; start-foreground) ZOO_CMD=(exec "$JAVA") if [ "${ZOO_NOEXEC}" != "" ]; then ZOO_CMD=("$JAVA") fi "${ZOO_CMD[@]}" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \ -cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" ;; print-cmd) echo "\"$JAVA\" -Dzookeeper.log.dir=\"${ZOO_LOG_DIR}\" -Dzookeeper.root.logger=\"${ZOO_LOG4J_PROP}\" -cp \"$CLASSPATH\" $JVMFLAGS $ZOOMAIN \"$ZOOCFG\" > \"$_ZOO_DAEMON_OUT\" 2>&1 < /dev/null" ;; stop) echo -n "Stopping zookeeper ... " if [ ! -f "$ZOOPIDFILE" ] then echo "no zookeeper to stop (could not find file $ZOOPIDFILE)" else $KILL -9 $(cat "$ZOOPIDFILE") rm "$ZOOPIDFILE" echo STOPPED fi exit 0 ;; upgrade) shift echo "upgrading the servers to 3.*" "$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \ -cp "$CLASSPATH" $JVMFLAGS org.apache.zookeeper.server.upgrade.UpgradeMain ${@} echo "Upgrading ... " ;; restart) shift "$0" stop ${@} sleep 3 "$0" start ${@} ;; status) # -q is necessary on some versions of linux where nc returns too quickly, and no stat result is output clientPortAddress=`$GREP "^[[:space:]]*clientPortAddress[^[:alpha:]]" "$ZOOCFG" | sed -e 's/.*=//'` if ! [ $clientPortAddress ] then clientPortAddress="localhost" fi clientPort=`$GREP "^[[:space:]]*clientPort[^[:alpha:]]" "$ZOOCFG" | sed -e 's/.*=//'` STAT=`"$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \ -cp "$CLASSPATH" $JVMFLAGS org.apache.zookeeper.client.FourLetterWordMain \ $clientPortAddress $clientPort srvr 2> /dev/null \ | $GREP Mode` if [ "x$STAT" = "x" ] then echo "Error contacting service. It is probably not running." exit 1 else echo $STAT exit 0 fi ;; *) echo "Usage: $0 {start|start-foreground|stop|restart|status|upgrade|print-cmd}" >&2 esac [root@node02 conf]#
可以看到,在zkServer.sh的start选项中,使用nohup启动ZooKeeper,并将日志输出到"$ZOO_LOG_DIR/zookeeper.out"中。
一般来说,没有特殊需求,没必要去改log4j日志配置。要改的话,记得把log4j.properties和zkEnv.sh和zkServer.sh中相关的内容都修改掉。