Troubleshooting the OpsCenter dashboard

System environment

OpsCenter 5.2
CentOS 6.6
Cassandra 2.0.x

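Problem

After monitoring the Cassandra cluster for a while (about one day), the OpsCenter dashboard always stops displaying data.

However, the datastax-agent process on the Cassandra nodes was still running normally.

The datastax-agent log then showed: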

WARN [Thread-10] .... operations dropped so far.
WARN [Thread-10] .... Cassandra operation queue is full, discarding cassandra operation

Error when proccessing cassandra callcom.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /192.168.47.222:9042 (com.datastax.driver.core.TransportException: [/192.168.47.222:9042] Connection has been closed))

ERROR [Reconnection-0] 2015-08-05 16:06:39,841 Unknown error during reconnection to /192.168.47.222:9042, scheduling retry in 8000 milliseconds
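
To see how often these symptoms occur, the agent log can be scanned with a short script. The following is a minimal sketch: the patterns are copied from the log excerpts above, while the category names and the functions `count_symptoms` and `scan_file` are made up for illustration. The log path is not hard-coded because it depends on the installation.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts in this post.
# The category names on the left are hypothetical labels for this sketch.
PATTERNS = {
    "dropped": re.compile(r"operations dropped so far"),
    "queue_full": re.compile(r"operation queue is full"),
    "no_host": re.compile(r"NoHostAvailableException"),
    "reconnect_error": re.compile(r"error during reconnection", re.IGNORECASE),
}


def count_symptoms(lines):
    """Count how many lines match each symptom pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_file(path):
    """Read a log file and return the symptom counts."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_symptoms(fh)
```

Calling `scan_file` with the path of the agent log returns a `Counter`; a steadily growing `queue_full` count would be consistent with the agent being unable to keep up.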

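My initial conclusion is that the cause is too many Cassandra requests, which fill the agent's operation queue.

Solution

Add the following parameters to /var/lib/datastax-agent/conf/address.yaml: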

stomp_interface: opscenterIP
use_ssl: 0
async_pool_size: 200
thrift_max_conns: 200
async_queue_size: 20000
hosts: cluster IPs, in the format ["host1","host2"]
local_interface: localhost
cassandra_conf: /xxx/apache-cassandra-2.0.15/conf/cassandra.yaml
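
A misspelled key in address.yaml is easy to miss, for example `thrift_max_cons` written instead of `thrift_max_conns`. The sketch below checks a file's top-level keys against the keys used in this post and suggests the closest known name. It assumes flat `key: value` lines and is not a real YAML parser; `KNOWN_KEYS` is not a complete list of agent options, and the function names are hypothetical.

```python
import difflib

# Keys used in this post only; NOT a complete list of datastax-agent options.
KNOWN_KEYS = sorted({
    "stomp_interface", "use_ssl", "async_pool_size", "thrift_max_conns",
    "async_queue_size", "hosts", "local_interface", "cassandra_conf",
})


def top_level_keys(text):
    """Extract keys from flat 'key: value' lines (simplified, not full YAML)."""
    keys = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and indented (nested) lines.
        if not stripped or stripped.startswith("#") or line[0].isspace():
            continue
        if ":" in stripped:
            keys.append(stripped.split(":", 1)[0].strip())
    return keys


def unknown_keys(text):
    """Map each unrecognized key to the closest known key, or None."""
    result = {}
    for key in top_level_keys(text):
        if key not in KNOWN_KEYS:
            match = difflib.get_close_matches(key, KNOWN_KEYS, n=1)
            result[key] = match[0] if match else None
    return result
```

An empty result only means that every key is in the short list above, not that the values are sensible.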

In $CASSANDRA_HOME/conf/clusters/cluster_name.conf, modify:

[stomp]
batch_size = 10000
push_interval = 10
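
If the same change has to be applied to several cluster files, it can be scripted. This is a rough sketch using Python's standard `configparser`; the function name `set_stomp_options` is made up, and the defaults are simply the values from this post. Note that `configparser` drops comments when it rewrites a file, so back up the original first.

```python
import configparser


def set_stomp_options(path, batch_size=10000, push_interval=10):
    """Set [stomp] batch_size and push_interval, keeping other sections."""
    # interpolation=None so that '%' characters in existing values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")
    if not parser.has_section("stomp"):
        parser.add_section("stomp")
    parser.set("stomp", "batch_size", str(batch_size))
    parser.set("stomp", "push_interval", str(push_interval))
    # Rewrite the whole file; comments in the original are not preserved.
    with open(path, "w", encoding="utf-8") as fh:
        parser.write(fh)
```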

Some parameters

# address.yaml parameters
thrift_max_conns - the max number of concurrent connections to make to the local node

async_pool_size - the size of the thread pool that pulls from a queue of inserts and inserts into cassandra

async_queue_size - the size of the queue of inserts to send to cassandra; if the queue fills up, additional operations will be dropped
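
The interaction between the queue size and dropped operations can be illustrated with a toy model. This is not the agent's actual implementation, only a sketch of the behavior described above: a bounded queue that counts an operation as dropped when it is full. The class `DroppingQueue` and its methods are hypothetical.

```python
import queue


class DroppingQueue:
    """Toy model of a bounded insert queue that discards operations when full."""

    def __init__(self, max_size):
        self._queue = queue.Queue(maxsize=max_size)
        self.dropped = 0

    def submit(self, operation):
        """Enqueue an operation; count it as dropped if the queue is full."""
        try:
            self._queue.put_nowait(operation)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, n):
        """Simulate worker threads taking up to n operations off the queue."""
        taken = []
        for _ in range(n):
            try:
                taken.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return taken
```

In this model, a larger `max_size` absorbs longer bursts, while draining faster (more workers, as with a larger pool) keeps the queue from filling in the first place. If operations arrive faster than they are drained for long enough, any finite queue eventually drops.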

# stomp parameters
batch_size - The number of request updates OpsCenter will push out at once. The default value is 100. This is used to avoid overloading the browser.

push_interval - How often OpsCenter will push out updates to requests. The default value is 3 seconds. This is used to avoid overloading the browser.

done.

Official OpsCenter configuration documentation

posted on 2015-08-10 10:03 by 毛小娃
