Docker: Monitoring and Logging
1. Native monitoring with cAdvisor
docker run -d --name=cadvisor \
-v /:/rootfs:ro \
-v /var/run:/var/run:rw \
-v /sys:/sys:ro \
-v /var/lib/docker/:/var/lib/docker:ro \
-p 8080:8080 \
google/cadvisor:latest
2. Deploy the Prometheus server
1. Create a prometheus.yml file with the following content:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      # Address of the target to scrape
      - targets: ['192.168.1.203:9090']
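2. Start the Prometheus container with Docker (with --net=host the container shares the host's network stack, so port 9090 is reachable without a -p mapping)
docker run -d --name prometheus \
-v /root/prometheus.yml:/etc/prometheus/prometheus.yml \
-v "/etc/localtime:/etc/localtime" \
--net=host \
prom/prometheus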
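To confirm that the server actually started, one option is to probe Prometheus's built-in /-/healthy endpoint. The following is a minimal sketch, assuming curl is installed and the server is reachable at the address used in prometheus.yml; the function names prom_endpoint and prom_is_up are made up for this example.

```shell
# Hypothetical helper: build the URL of a Prometheus management endpoint.
# $1 = host:port of the Prometheus server, $2 = "healthy" or "ready".
prom_endpoint() {
    printf 'http://%s/-/%s\n' "$1" "$2"
}

# Hypothetical helper: succeed only if the health endpoint answers with a
# successful HTTP status (curl -f turns HTTP errors into a non-zero exit).
prom_is_up() {
    curl -fsS -o /dev/null "$(prom_endpoint "$1" healthy)"
}
```

Usage, assuming the address from the configuration above: prom_is_up 192.168.1.203:9090 && echo "Prometheus is up"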
3. Monitor the Docker host and containers with Prometheus
1. Install node exporter to collect host hardware metrics (with --net=host it listens on host port 9100 directly)
docker run -d \
-v "/proc:/host/proc" \
-v "/sys:/host/sys" \
-v "/:/rootfs" \
-v "/etc/localtime:/etc/localtime" \
--net=host \
prom/node-exporter \
--path.procfs /host/proc \
--path.sysfs /host/sys \
--collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)"
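2. Install cAdvisor to collect container metrics (with --net=host it listens on host port 8080 directly)
If the cadvisor container from section 1 still exists, remove it first with docker rm -f cadvisor, because the name is reused here.
docker run -d \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--name=cadvisor \
--net=host \
-v "/etc/localtime:/etc/localtime" \
google/cadvisor:latest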
3. cAdvisor exposes the collected data through its metrics endpoint
URL: http://ip:8080/metrics
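The endpoint returns plain text in the Prometheus exposition format: one sample per line, plus comment lines starting with #. As a rough sketch of how to inspect it from the command line, the helper below keeps only the samples whose metric name starts with a given prefix. The function name filter_metrics is invented for illustration, and curl and awk are assumed to be available.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and print
# only the sample lines that start with the given metric-name prefix.
# Lines beginning with '#' (HELP and TYPE comments) are skipped.
filter_metrics() {
    prefix=$1
    awk -v p="$prefix" '
        /^#/ { next }                 # skip comment lines
        index($0, p) == 1 { print }   # keep lines that start with the prefix
    '
}
```

Usage, with ip replaced by the address of the monitored host: curl -s http://ip:8080/metrics | filter_metrics container_memory_usage_bytes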
4. Edit the Prometheus configuration file so that it scrapes the metrics of the monitored host
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# Scrape configurations: Prometheus itself plus the monitored Docker host.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      # Address of the target to scrape
      - targets: ['192.168.1.203:9090']
  - job_name: 'docker203'
    static_configs:
      # cAdvisor metrics endpoint (port 8080, as started above)
      - targets: ['192.168.1.203:8080']
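After editing the file, it can help to double-check which addresses Prometheus will scrape. The sketch below is a simple line-based filter and not a real YAML parser: it only understands the inline targets: ['host:port', ...] form used in this file. The function name list_targets is hypothetical.

```shell
# Hypothetical helper: print every host:port found in an inline
# "- targets: [...]" line of the given Prometheus config file, one per line.
list_targets() {
    grep -E '^[[:space:]]*-[[:space:]]*targets:[[:space:]]*\[' "$1" |
        sed -e 's/^[^[]*\[//' -e 's/\].*$//' |   # keep what is between [ and ]
        tr ',' '\n' |                            # one target per line
        tr -d "' \""                             # drop quotes and spaces
}
```

Usage: list_targets /root/prometheus.yml. Prometheus reads the file at startup, so the running container has to pick up the change, for example with docker restart prometheus (the container name used when it was started above).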
5. In the Prometheus web UI, type container into the expression box; the matching container metrics are listed and can be displayed as graphs.
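The same data can be fetched without the web UI through Prometheus's HTTP API (/api/v1/query). The following is a hedged sketch: cpu_rate_expr and prom_query are invented helper names, container_cpu_usage_seconds_total is one of the counters cAdvisor exports, and the name label is assumed to carry the Docker container name.

```shell
# Hypothetical helper: build a PromQL expression for the per-second CPU usage
# of one container. $1 = container name, $2 = rate window such as 1m.
cpu_rate_expr() {
    printf 'rate(container_cpu_usage_seconds_total{name="%s"}[%s])\n' "$1" "$2"
}

# Hypothetical helper: run an instant query against a Prometheus server.
# $1 = host:port, $2 = PromQL expression (curl URL-encodes it for us).
prom_query() {
    curl -sG --data-urlencode "query=$2" "http://$1/api/v1/query"
}
```

Usage, assuming the server address from prometheus.yml: prom_query 192.168.1.203:9090 "$(cpu_rate_expr cadvisor 1m)". The response is a JSON document whose data.result array holds the matching series.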
4. Install Grafana (with --net=host it listens on host port 3000 directly)
docker run -d -i \
-v "/etc/localtime:/etc/localtime" \
-e "GF_SERVER_ROOT_URL=http://grafana.server.name" \
-e "GF_SECURITY_ADMIN_PASSWORD=admin8888" \
--net=host \
grafana/grafana
5. Configure Grafana
1. Add a data source
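Besides the web UI, a Prometheus data source can also be registered through Grafana's HTTP API (POST /api/datasources). The sketch below is one possible way to do it, not the method used in this post. The helper names datasource_json and add_datasource are hypothetical, the admin user is assumed to be admin, and the values passed in must not contain double quotes because no JSON escaping is done.

```shell
# Hypothetical helper: JSON body for Grafana's POST /api/datasources call.
# $1 = data source name, $2 = base URL of the Prometheus server.
datasource_json() {
    printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# Hypothetical helper: create the data source using basic authentication.
# $1 = Grafana host:port, $2 = admin password, $3 = name, $4 = Prometheus URL.
add_datasource() {
    curl -fsS -u "admin:$2" -H 'Content-Type: application/json' \
        -X POST -d "$(datasource_json "$3" "$4")" "http://$1/api/datasources"
}
```

Usage, assuming Grafana runs on the same host as Prometheus and uses the admin password set when the container was started: add_datasource 192.168.1.203:3000 admin8888 Prometheus http://192.168.1.203:9090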

2. Add a dashboard
You can import a ready-made dashboard template by its ID: 193
You can also download a suitable template from: https://grafana.com/grafana/dashboards?dataSource=prometheus&search=docker