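I. Customer background and requirements
1. The customer team uses the Prometheus CloudWatch Exporter to bring AWS monitoring metrics into Prometheus:
https://github.com/prometheus/cloudwatch_exporter
2. The customer team would like Microsoft Azure to offer a similar exporter, so that Azure monitoring metrics (virtual machines, Redis PaaS, MySQL PaaS, and so on) can also be integrated with Prometheus.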
II. Notes
Microsoft does not currently provide an official exporter, but a third-party open-source solution is available:
https://github.com/webdevops/azure-metrics-exporter
III. Technical implementation
The solution is built on the Azure SDK for Go and implements an Azure Monitor metrics exporter.
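IV. Key implementation steps
1. Create and use an Azure subscription (steps omitted).
2. Create a Service Principal and grant it the Reader role on the subscription (steps omitted).
3. Create an Azure virtual machine; CentOS 7.9 is used here as an example (steps omitted).
4. Open the shell profile to set environment variables:
vi ~/.bashrc
5. Add the Service Principal details:
# App ID
export AZURE_CLIENT_ID=XXXXXXXX
# Tenant ID
export AZURE_TENANT_ID=XXXXXXXX
# App Key (client secret)
export AZURE_CLIENT_SECRET=XXXXXXXX
6. Apply the environment variables:
source ~/.bashrc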
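Before starting the exporter it can help to confirm that the three variables actually reached the current shell. The following is a minimal sketch, not part of the exporter itself; the helper name missing_azure_env is hypothetical, and only the variable names come from the steps above.

```python
import os

# Variable names taken from step 5 above.
REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")


def missing_azure_env(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_azure_env()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All Service Principal variables are set.")
```

Run it in the same shell session that will launch the exporter, since variables exported in ~/.bashrc are only visible after the file has been sourced.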
V. Install Prometheus
1. I use Prometheus 2.50.1 here (installation steps omitted).
2. Download the Azure Monitor Metric Exporter. The release binaries are at: https://github.com/webdevops/azure-metrics-exporter/releases
3. After downloading, run it:
nohup ./azure-metrics-exporter.linux.amd64 &
VI. Configure the Prometheus yml file for Azure Storage
Edit prometheus.yml and add the content below.
1. job_name sets the name of the job.
2. Line 7 of the snippet below is my subscription ID. The PE team should change it to match their own environment.
3. Line 11 of the snippet below is the metric name. Here we query BlobCapacity, the storage capacity.
4. Note the port used on line 21 of the snippet below: 8080.
For the available parameters, see the documentation: https://github.com/webdevops/azure-metrics-exporter
  - job_name: azure-metrics-storageaccount-connections
    scrape_interval: 1m
    metrics_path: /probe/metrics/list
    params:
      name: ["my_own_metric_name"]
      subscription:
        - 166157a8-9ce9-400b-91c7-1d42482b83d6
      resourceType: ["Microsoft.Storage/storageAccounts"]
      metricNamespace: ["Microsoft.Storage/storageAccounts/blobServices"]
      metric:
        - BlobCapacity
      interval: ["PT1H"]
      timespan: ["PT1H"]
      aggregation:
        - average
        - count
      # by blobtype (dimension support)
      # metricFilter: ["BlobType eq '*'"]
      metricTop: ["10"]
    static_configs:
      - targets: ["localhost:8080"]
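Prometheus sends every entry under params as a URL query parameter to metrics_path on the target, so the job above is equivalent to a single HTTP GET against the exporter. The sketch below rebuilds that URL so it can be pasted into curl or a browser for testing; build_probe_url and STORAGE_PARAMS are hypothetical names, and the values are copied from the job above.

```python
from urllib.parse import urlencode


def build_probe_url(target, metrics_path, params, scheme="http"):
    """Build the URL Prometheus would scrape for a job with these params."""
    # doseq=True repeats the key for list values, e.g. aggregation=average&aggregation=count
    query = urlencode(params, doseq=True)
    return f"{scheme}://{target}{metrics_path}?{query}"


# Same keys and values as the params block of the storage job.
STORAGE_PARAMS = {
    "name": ["my_own_metric_name"],
    "subscription": ["166157a8-9ce9-400b-91c7-1d42482b83d6"],
    "resourceType": ["Microsoft.Storage/storageAccounts"],
    "metricNamespace": ["Microsoft.Storage/storageAccounts/blobServices"],
    "metric": ["BlobCapacity"],
    "interval": ["PT1H"],
    "timespan": ["PT1H"],
    "aggregation": ["average", "count"],
    "metricTop": ["10"],
}

if __name__ == "__main__":
    print(build_probe_url("localhost:8080", "/probe/metrics/list", STORAGE_PARAMS))
```

Requesting the printed URL directly is a quick way to tell an exporter-side problem (credentials, wrong metric name) from a Prometheus-side one (wrong target or path).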
VII. Configure the Prometheus yml file for Azure MySQL Flexible Server
1. The following Prometheus configuration is for Azure MySQL Flexible Server.
2. For the available metrics, see: https://learn.microsoft.com/en-us/azure/azure-monitor/reference/supported-metrics/microsoft-dbformysql-flexibleservers-metrics
  - job_name: azure-metrics-databases
    scrape_interval: 1m
    metrics_path: /probe/metrics/list
    params:
      name: ["azure-database"]
      subscription:
        - 166157a8-9ce9-400b-91c7-1d42482b83d6
      #filter: ["resourceType eq 'Microsoft.DBforMySQL/servers'"]
      resourceType: ["Microsoft.DBforMySQL/flexibleServers"]
      #metricNamespace: ["Microsoft.DBforMySQL/flexibleServers"]
      metric:
        - cpu_percent
        - memory_percent
      interval: ["PT1M"]
      timespan: ["2024-08-09T07:00:00Z/2024-08-09T08:00:00Z"]
      aggregation:
        - average
        #- count
      # by blobtype (dimension support)
      # metricFilter: ["BlobType eq '*'"]
      metricTop: ["10"]
    static_configs:
      - targets: ["localhost:8080"]
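Note that the timespan in this job is a fixed start/end pair, so the job asks for the same hour of 2024-08-09 on every scrape; a relative duration such as PT1H (as in the storage job) follows the clock instead. When an absolute window is wanted, for example to inspect a specific incident, the string can be generated rather than typed by hand. This is a minimal sketch; absolute_timespan is a hypothetical helper, and it assumes the start/end format shown in the config above is what the exporter expects.

```python
from datetime import datetime, timedelta, timezone


def absolute_timespan(end, hours=1):
    """Format a 'start/end' window in UTC that ends at `end`.

    `end` must be a timezone-aware datetime.
    """
    end = end.astimezone(timezone.utc).replace(microsecond=0)
    start = end - timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


if __name__ == "__main__":
    # Window covering the most recent hour.
    print(absolute_timespan(datetime.now(timezone.utc)))
```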
VIII. Azure PostgreSQL Flexible Server
1. The following Prometheus configuration is for Azure PGSQL Flexible Server.
2. Two metrics are monitored: CPU utilization and memory utilization.
3. For the available metrics, see: https://learn.microsoft.com/en-us/azure/azure-monitor/reference/supported-metrics/microsoft-dbforpostgresql-flexibleservers-metrics
  - job_name: azure-metrics-pgsql
    scrape_interval: 1m
    metrics_path: /probe/metrics/list
    params:
      name: ["azure-pgsql"]
      subscription:
        - 166157a8-9ce9-400b-91c7-1d42482b83d6
      resourceType: ["Microsoft.DBforPostgreSQL/flexibleServers"]
      metric:
        - cpu_percent
        - memory_percent
      interval: ["PT1M"]
      # P7D means the last 7 days
      timespan: ["P7D"]
      aggregation:
        - average
        #- count
      metricTop: ["20"]
    static_configs:
      - targets: ["localhost:8080"]
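The interval and timespan values are ISO 8601 durations, and together they determine how many time buckets a query covers. The sketch below does that arithmetic for the values used in this job; parse_duration and bucket_count are hypothetical helpers, and they only handle the simple day/hour/minute/second forms that appear in this article. How many of those buckets the exporter actually exposes is not covered here.

```python
import re
from datetime import timedelta

# Supports forms such as P7D, PT1H, PT1M, P1DT12H (no weeks, months, or years).
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_duration(text):
    """Convert a simple ISO 8601 duration string into a timedelta."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def bucket_count(timespan, interval):
    """Number of whole `interval` buckets that fit into `timespan`."""
    return parse_duration(timespan) // parse_duration(interval)


if __name__ == "__main__":
    # 7 days at one-minute granularity.
    print(bucket_count("P7D", "PT1M"))
```

With timespan P7D and interval PT1M the window spans 10080 one-minute buckets per metric and per server, which is worth keeping in mind when raising metricTop or adding more metrics.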
IX. Filter by PGSQL Flexible Server name equal to a value
1. The following Prometheus configuration is for Azure PGSQL Flexible Server.
2. The resource type is PGSQL Flexible Server.
3. The filter keeps only the server whose name equals leipgsql01.
4. This uses the OData operator eq (equal).
  - job_name: azure-metrics-pgsql
    scrape_interval: 1m
    metrics_path: /probe/metrics/list
    params:
      name: ["azure-pgsql"]
      subscription:
        - 166157a8-9ce9-400b-91c7-1d42482b83d6
      filter: ["resourceType eq 'Microsoft.DBforPostgreSQL/flexibleServers' and name eq 'leipgsql01'"]
      #filter: ["resourceName eq 'leipgsql01'"]
      #resourceType: ["Microsoft.DBforPostgreSQL/flexibleServers"]
      #metricNamespace: ["Microsoft.DBforMySQL/flexibleServers"]
      metric:
        - cpu_percent
        - memory_percent
      interval: ["PT1M"]
      #timespan: ["2024-12-26T03:00:00Z/2024-12-28T08:00:00Z"]
      timespan: ["P7D"]
      aggregation:
        - average
        #- count
      # by blobtype (dimension support)
      # metricFilter: ["BlobType eq '*'"]
      metricTop: ["20"]
    static_configs:
      - targets: ["localhost:8080"]
Result: only the cpu_percent and memory_percent metrics of the server named leipgsql01 are returned:
# HELP azure_pgsql Azure monitor insight metric
# TYPE azure_pgsql gauge
azure_pgsql{aggregation="average",interval="PT1M",metric="cpu_percent",resourceGroup="sig-rg",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/sig-rg/providers/microsoft.dbforpostgresql/flexibleservers/leipgsql01",resourceName="leipgsql01",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="P7D",unit="Percent"} 10.5
azure_pgsql{aggregation="average",interval="PT1M",metric="memory_percent",resourceGroup="sig-rg",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/sig-rg/providers/microsoft.dbforpostgresql/flexibleservers/leipgsql01",resourceName="leipgsql01",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="P7D",unit="Percent"} 65.5
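Output like the above is in the Prometheus text exposition format, so it is easy to check from a script that a filter returned only the expected servers. The following is a minimal sketch with a hypothetical parse_samples helper; it only handles samples that carry labels, does not unescape label values, and the SAMPLE text is an abbreviated copy of the output above (most labels dropped for readability).

```python
import re

# metric_name{label="value",...} number
_SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (metric_name, labels_dict, value) for each labelled sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if not match:
            continue  # lines without labels are ignored in this sketch
        labels = dict(_LABEL.findall(match.group("labels")))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Abbreviated copy of the exporter output shown above.
SAMPLE = """# HELP azure_pgsql Azure monitor insight metric
# TYPE azure_pgsql gauge
azure_pgsql{aggregation="average",metric="cpu_percent",resourceName="leipgsql01",unit="Percent"} 10.5
azure_pgsql{aggregation="average",metric="memory_percent",resourceName="leipgsql01",unit="Percent"} 65.5
"""

if __name__ == "__main__":
    for name, labels, value in parse_samples(SAMPLE):
        print(labels["resourceName"], labels["metric"], value)
```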
X. Filter by PGSQL Flexible Server name not equal to a value
1. The following Prometheus configuration is for Azure PGSQL Flexible Server.
2. The resource type is PGSQL Flexible Server.
3. The filter keeps only the servers whose name is not equal to leipgsql01.
4. This uses the OData operator ne (not equal).
  - job_name: azure-metrics-pgsql
    scrape_interval: 1m
    metrics_path: /probe/metrics/list
    params:
      name: ["azure-pgsql"]
      subscription:
        - 166157a8-9ce9-400b-91c7-1d42482b83d6
      filter: ["resourceType eq 'Microsoft.DBforPostgreSQL/flexibleServers' and name ne 'leipgsql01'"]
      #filter: ["resourceName eq 'leipgsql01'"]
      #resourceType: ["Microsoft.DBforPostgreSQL/flexibleServers"]
      #metricNamespace: ["Microsoft.DBforMySQL/flexibleServers"]
      metric:
        - cpu_percent
        - memory_percent
      interval: ["PT1M"]
      #timespan: ["2024-12-26T03:00:00Z/2024-12-28T08:00:00Z"]
      timespan: ["P7D"]
      aggregation:
        - average
        #- count
      # by blobtype (dimension support)
      # metricFilter: ["BlobType eq '*'"]
      metricTop: ["20"]
    static_configs:
      - targets: ["localhost:8080"]
XI. Filter by PGSQL Flexible Server name containing a value
1. The following Prometheus configuration is for Azure PGSQL Flexible Server.
2. The resource type is PGSQL Flexible Server.
3. The filter keeps only the servers whose name contains lei.
4. This uses the OData function substringof.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: azure-metrics-pgsql
    scrape_interval: 1m
    metrics_path: /probe/metrics/list
    params:
      name: ["azure-pgsql"]
      subscription:
        - 166157a8-9ce9-400b-91c7-1d42482b83d6
      filter: ["resourceType eq 'Microsoft.DBforPostgreSQL/flexibleServers' and substringof('lei',name)"]
      #filter: ["resourceName eq 'leipgsql01'"]
      #resourceType: ["Microsoft.DBforPostgreSQL/flexibleServers"]
      #metricNamespace: ["Microsoft.DBforMySQL/flexibleServers"]
      metric:
        - cpu_percent
        - memory_percent
      interval: ["PT1M"]
      #timespan: ["2024-12-26T03:00:00Z/2024-12-28T08:00:00Z"]
      timespan: ["P7D"]
      aggregation:
        - average
        #- count
      # by blobtype (dimension support)
      # metricFilter: ["BlobType eq '*'"]
      metricTop: ["20"]
    static_configs:
      - targets: ["localhost:8080"]
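The three filter variants in sections IX to XI differ only in the name clause, so the filter string can be assembled from its parts, which avoids quoting mistakes when the server name changes. This is a minimal sketch; quote_odata and resource_filter are hypothetical helpers, and the "contains" option is just a label chosen here for the substringof form shown above.

```python
def quote_odata(value):
    """Quote an OData string literal; a single quote is escaped by doubling it."""
    return "'" + value.replace("'", "''") + "'"


def resource_filter(resource_type, name=None, op="eq"):
    """Build a filter string in the same shape as the examples above.

    op: "eq" (equals), "ne" (not equal), or "contains" (substringof).
    """
    clauses = [f"resourceType eq {quote_odata(resource_type)}"]
    if name is not None:
        if op in ("eq", "ne"):
            clauses.append(f"name {op} {quote_odata(name)}")
        elif op == "contains":
            clauses.append(f"substringof({quote_odata(name)},name)")
        else:
            raise ValueError(f"unsupported operator: {op!r}")
    return " and ".join(clauses)


if __name__ == "__main__":
    pgsql = "Microsoft.DBforPostgreSQL/flexibleServers"
    print(resource_filter(pgsql, "leipgsql01", "eq"))
    print(resource_filter(pgsql, "leipgsql01", "ne"))
    print(resource_filter(pgsql, "lei", "contains"))
```

The three printed strings match the filter values used in sections IX, X, and XI respectively.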
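XII. Start Prometheus
1. Run the command:
./prometheus --config.file=prometheus.yml
2. Prometheus listens on port 9090 by default.
3. Open http://ip:9090 in a browser and click Status, then Targets.
4. The azure-metrics-storageaccount-connections target listed there is the job we configured earlier.
5. Opening the exporter endpoint of that target returns the name and blob capacity of every storage account in my subscription: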
# HELP my_own_metric_name Azure monitor insight metric
# TYPE my_own_metric_name gauge
my_own_metric_name{aggregation="average",interval="PT1H",metric="BlobCapacity",resourceGroup="cdn-rg",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/cdn-rg/providers/microsoft.storage/storageaccounts/leicdnoriginalstorage",resourceName="leicdnoriginalstorage",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="PT1H",unit="Bytes"} 33835
my_own_metric_name{aggregation="average",interval="PT1H",metric="BlobCapacity",resourceGroup="cloud-shell-storage-southeastasia",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/cloud-shell-storage-southeastasia/providers/microsoft.storage/storageaccounts/cs110032002647d220b",resourceName="cs110032002647d220b",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="PT1H",unit="Bytes"} 0
my_own_metric_name{aggregation="average",interval="PT1H",metric="BlobCapacity",resourceGroup="fw-hybrid-test",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/fw-hybrid-test/providers/microsoft.storage/storageaccounts/niostoragetest01",resourceName="niostoragetest01",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="PT1H",unit="Bytes"} 0
my_own_metric_name{aggregation="average",interval="PT1H",metric="BlobCapacity",resourceGroup="lab-rg",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/lab-rg/providers/microsoft.storage/storageaccounts/leiadls",resourceName="leiadls",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="PT1H",unit="Bytes"} 1104
my_own_metric_name{aggregation="average",interval="PT1H",metric="BlobCapacity",resourceGroup="lab-rg",resourceID="/subscriptions/166157a8-9ce9-400b-91c7-1d42482b83d6/resourcegroups/lab-rg/providers/microsoft.storage/storageaccounts/leilabstorage01",resourceName="leilabstorage01",subscriptionID="166157a8-9ce9-400b-91c7-1d42482b83d6",subscriptionName="leizhang-non-prod",tag_owner="",timespan="PT1H",unit="Bytes"} 34272
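These are the 5 storage accounts in my environment.
6. The Azure Monitor Metric Exporter also provides a debugging feature. Start it with:
nohup ./azure-metrics-exporter.linux.amd64 --development.webui &
7. The exporter then serves a web interface for queries. In my environment, for example, the link is http://20.52.9.41:8080/query, and queries can be tested interactively on that page.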
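The target check from step 3 can also be done without a browser, because Prometheus exposes the same information through its HTTP API at /api/v1/targets. The sketch below lists each active target with its health; fetch_targets and summarize_targets are hypothetical helper names, and http://localhost:9090 is an assumption that should be replaced with the address of the Prometheus server.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url, timeout=10):
    """Download the target list from the Prometheus HTTP API."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as response:
        return json.load(response)


def summarize_targets(payload):
    """Return (job, health, last_error) for every active target."""
    rows = []
    for target in payload.get("data", {}).get("activeTargets", []):
        job = target.get("labels", {}).get("job", "")
        rows.append((job, target.get("health", ""), target.get("lastError", "")))
    return rows


if __name__ == "__main__":
    # Assumed address; adjust to the host where Prometheus is running.
    for job, health, error in summarize_targets(fetch_targets("http://localhost:9090")):
        print(job, health, error)
```

A job such as azure-metrics-storageaccount-connections should appear with health "up"; when it does not, the last_error column usually points at the cause.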