Ganglia: a summary of issues

1. The relationship between gmetad and rrdtool

   gmetad polls gmond and stores the data it pulls in rrdtool (RRD) files.

 

2. gmetad.conf

① Command: /usr/sbin/gmetad -d 1

Raising debug_level (here via the -d command-line option) helps find out why gmetad is failing.

#-------------------------------------------------------------------------------
# Setting the debug_level to 1 will keep daemon in the foreground and
# show only error messages. Setting this value higher than 1 will make
# gmetad output debugging information and stay in the foreground.
# default: 0
# debug_level 10

 

② data_source "my cluster" [polling interval] address1:port address2:port ...

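When polling, gmetad automatically detects whether a data_source is a cluster or a grid. For a cluster it collects the detailed per-metric data; for a grid it collects only summary data. A grid source is effectively a remote grid: to see its detailed metrics, the gweb service has to be enabled on that grid's own gmetad (see the description of multi-level gmetad setups).

When collecting from a grid, that is, from another gmetad, the port should be set to 8651. If that node runs only gmetad and no gmond, the port does not need to be set explicitly; 8651 is used by default.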

# The data_source tag specifies either a cluster or a grid to
# monitor. If we detect the source is a cluster, we will maintain a complete
# set of RRD databases for it, which can be used to create historical
# graphs of the metrics. If the source is a grid (it comes from another gmetad),
# we will only maintain summary RRDs for it.
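
To make the cluster/grid distinction concrete, here is a minimal Python sketch. It is not gmetad's actual code. It assumes that XML polled from another gmetad wraps its data in a <GRID> element, while XML polled from gmond exposes <CLUSTER> directly. The two sample documents and the function name classify_source are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed samples of what a poll might return.
GMOND_SAMPLE = """<GANGLIA_XML VERSION="3.1" SOURCE="gmond">
  <CLUSTER NAME="my cluster">
    <HOST NAME="node1"/>
  </CLUSTER>
</GANGLIA_XML>"""

GMETAD_SAMPLE = """<GANGLIA_XML VERSION="3.1" SOURCE="gmetad">
  <GRID NAME="remote grid">
    <CLUSTER NAME="my cluster"/>
  </GRID>
</GANGLIA_XML>"""


def classify_source(xml_text):
    """Return "grid" if the document has a top-level <GRID>, else "cluster"."""
    root = ET.fromstring(xml_text)
    if root.find("GRID") is not None:
        return "grid"      # only summary RRDs would be kept
    if root.find("CLUSTER") is not None:
        return "cluster"   # a complete set of RRDs would be kept
    return "unknown"


print(classify_source(GMOND_SAMPLE))   # cluster
print(classify_source(GMETAD_SAMPLE))  # grid
```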

 

③ data_source "my cluster" [polling interval] address1:port address2:port ...

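The polling interval comes right after the data source name and sets how often gmetad polls that data_source. The default is 15 seconds.

If you customize this interval, note one thing: the gweb frontend decides whether a host is down by checking whether the time since the host last reported is still within a maximum allowed time.

That maximum is 4 * TMAX (TMAX is 20 sec by default), i.e. 80 seconds. If the interval is set to more than 80 seconds, gweb will consider the host down.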

 

# The keyword 'data_source' must immediately be followed by a unique
# string which identifies the source, then an optional polling interval in
# seconds. The source will be polled at this interval on average.
# If the polling interval is omitted, 15sec is assumed.
#
# If you choose to set the polling interval to something other than the default,
# note that the web frontend determines a host as down if its TN value is greater
# than 4 * TMAX (20sec by default).  Therefore, if you set the polling interval
# to something around or greater than 80sec, this will cause the frontend to
# incorrectly display hosts as down even though they are not.
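
The arithmetic behind the 80-second limit can be sketched in a few lines of Python. This is only an illustration of the rule quoted above, using the default TMAX of 20 seconds. The function names are hypothetical, and since the comment warns about intervals "around" 80 seconds as well, a real setup should leave some margin below the threshold.

```python
DEFAULT_TMAX = 20  # seconds, the default quoted in the gmetad.conf comment


def down_threshold(tmax=DEFAULT_TMAX):
    """Seconds without a report after which the frontend shows a host as down."""
    return 4 * tmax


def interval_is_safe(polling_interval, tmax=DEFAULT_TMAX):
    """True if the polling interval stays below the down threshold."""
    return polling_interval < down_threshold(tmax)


print(down_threshold())      # 80
print(interval_is_safe(15))  # True
print(interval_is_safe(90))  # False
```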

 

data_source "my cluster" [polling interval] address1:port address2:port ...

If no port is specified, the default port is 8649.

# A list of machines which service the data source follows, in the
# format ip:port, or name:port. If a port is not specified then 8649
# (the default gmond port) is assumed.
# default: There is no default value
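
As a way to check how a data_source line will be read, the following Python sketch applies the two defaults described above: 15 seconds for the polling interval and 8649 for the port. It is a simplified illustration, not gmetad's own parser. The function name parse_data_source and the host names in the example are hypothetical, and it handles only host names and IPv4 addresses.

```python
import shlex

DEFAULT_INTERVAL = 15  # seconds, used when the polling interval is omitted
DEFAULT_PORT = 8649    # default gmond port, used when ":port" is omitted


def parse_data_source(line):
    """Split a data_source line into (name, interval, [(host, port), ...])."""
    tokens = shlex.split(line)  # keeps the quoted source name as one token
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("expected: data_source \"name\" [interval] host[:port] ...")
    name, rest = tokens[1], tokens[2:]
    interval = DEFAULT_INTERVAL
    if rest[0].isdigit():       # the optional polling interval
        interval = int(rest[0])
        rest = rest[1:]
    if not rest:
        raise ValueError("at least one host is required")
    hosts = []
    for item in rest:
        host, sep, port = item.rpartition(":")
        if sep:
            hosts.append((host, int(port)))
        else:
            hosts.append((item, DEFAULT_PORT))
    return name, interval, hosts


print(parse_data_source('data_source "my cluster" 30 10.0.0.1:8649 node2'))
# ('my cluster', 30, [('10.0.0.1', 8649), ('node2', 8649)])
print(parse_data_source('data_source "remote grid" gmetad.example.com:8651'))
# ('remote grid', 15, [('gmetad.example.com', 8651)])
```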

 

3.

 

4.

#-------------------------------------------------------------------------------
# Scalability mode. If on, we summarize over downstream grids, and respect
# authority tags. If off, we take on 2.5.0-era behavior: we do not wrap our output
# in <GRID></GRID> tags, we ignore all <GRID> tags we see, and always assume
# we are the "authority" on data source feeds. This approach does not scale to
# large groups of clusters, but is provided for backwards compatibility.
# default: on
# scalable off

posted on 2016-06-14 16:48  roger888