zookeeper 超时问题

问题1:

2020-03-01 16:04:06,085 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@694] - Established session 0x10635fe2a6368f1 with negotiated timeout 120000 for client /10.62.3.14:55222
2020-03-01 16:06:12,006 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 5073ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2020-03-01 16:06:13,906 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 1123ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide

分析: ZK服务端在fsync-ing the write ahead log日志时超长引起。

解决办法:

1、在zoo.cfg添加:

forceSync=no

默认是开启的,为避免同步延迟问题,ZK接收到数据后会立刻去讲当前状态信息同步到磁盘日志文件中,同步完成后才会应答。将此项关闭后,客户端连接可以得到快速响应。Zk涮日志源码如下图:

 

关闭forceSync选项后,会存在潜在风险,虽然依旧会刷磁盘(log.flush()首先被执行),但因为操作系统为提高写磁盘效率,会先写缓存,当机器异常后,可能导致一些zk状态信息没有同步到磁盘,从而带来ZK前后信息不一样问题。
2、把zookeeper的日志文件和数据文件分开存储,不存在在一块磁盘

 

问题2:

2020-03-01 16:27:16,786 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@694] - Established session 0x3075e0e93860151 with negotiated timeout 120000 for client /10.62.3.2:60124
2020-03-01 16:34:52,706 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x3075e0e93860135, likely client has closed socket
2020-03-01 16:34:52,706 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.2:50244 which had sessionid 0x3075e0e93860135
2020-03-01 16:35:48,351 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636912, likely client has closed socket
2020-03-01 16:35:48,351 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.14:60822 which had sessionid 0x10635fe2a636912
2020-03-01 16:35:58,226 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636914, likely client has closed socket
2020-03-01 16:35:58,226 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.14:60856 which had sessionid 0x10635fe2a636914
2020-03-01 16:36:04,902 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636910, likely client has closed socket

分析: 客户端连接Zookeeper时,配置的超时时长过短。

从上述的信息可以看出来,,会话超时时间已经设置了120s,对于hbase集群来说,,这个超时时间应该是没问题的,但是还是有的regionserver机器由于在flush memstor时失败了,,这里暂且在zoo.cfg文件,修改tickTime参数在观察看看。

posted @   北漂-boy  阅读(9673)  评论(1编辑  收藏  举报
编辑推荐:
· .NET Core 中如何实现缓存的预热?
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
阅读排行:
· TypeScript + Deepseek 打造卜卦网站:技术与玄学的结合
· 阿里巴巴 QwQ-32B真的超越了 DeepSeek R-1吗?
· 【译】Visual Studio 中新的强大生产力特性
· 2025年我用 Compose 写了一个 Todo App
· 张高兴的大模型开发实战:(一)使用 Selenium 进行网页爬虫
点击右上角即可分享
微信分享提示