2.Influxdb函数
载自:http://www.linuxdaxue.com/
官网:https://archive.docs.influxdata.com/influxdb/v0.13/query_language/functions/
一、聚合类函数(Aggregations)
1. COUNT()函数
返回一个(field)字段中的非空值的数量。
语法:
SELECT COUNT(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
>select count(inodes_total) from disk;
name: disk
time count
---- -----
1970-01-01T00:00:00Z 98331
注意:InfluxDB中的函数如果没有指定时间的话,会默认以 epoch 0 (1970-01-01T00:00:00Z
) 作为时间。
可以在where 中加入时间条件,并使用group by用时间进行分组,如下:
>select count(inodes_total) from disk where time >='2020-12-26T06:20:20Z' and time <='2020-12-29T06:20:20Z' group by time(2d);
name: disk
time count
---- -----
2020-12-26T00:00:00Z 44991
2020-12-28T00:00:00Z 32733
2. DISTINCT()函数
语法:
SELECT DISTINCT(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例
select distinct(total) from disk;
name: disk
time distinct
---- --------
1970-01-01T00:00:00Z 1063256064
1970-01-01T00:00:00Z 50446991360
1970-01-01T00:00:00Z 536604577792
3. MEAN() 函数
返回一个字段(field)中的值的算术平均值(平均值),字段类型必须是长整型或float64。
语法格式:
SELECT MEAN(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
使用分组可以按时间值进行平均值计算,当group by时间和采集时间粒度一致时,平均值就代表了实时值(当前采集粒度为10s)
> select mean(used) from disk GROUP BY time(10s) limit 5;
name: disk
time mean
---- ----
2020-12-26T06:20:00Z 29156952746.666668
2020-12-26T06:20:10Z 29156989610.666668
2020-12-26T06:20:20Z 29156997802.666668
2020-12-26T06:20:30Z
2020-12-26T06:20:40Z 29157014186.666668
4. SPREAD()函数
返回字段的最小值和最大值之间的差值。数据的类型必须是长整型或float64。
SELECT SPREAD(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例
> select SPREAD(used) from disk;
name: disk
time spread
---- ------
1970-01-01T00:00:00Z 77111853056
5. SUM()函数
返回一个字段中的所有值的和。字段的类型必须是长整型或float64。
语法:
SELECT SUM(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
6. MEDIAN()函数
从单个字段(field)中的排序值返回中间值(中位数)。字段值的类型必须是长整型或float64格式。
语法:
SELECT MEDIAN(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
使用示例
> SELECT MEDIAN(water_level) from h2o_feet
name: h2o_feet
--------------
time median
1970-01-01T00:00:00Z 4.124
说明表中 water_level字段的中位数是 4.124
二、选择类函数(Selectors)
1. TOP()函数
作用:返回一个字段中最大的N个值,字段类型必须是长整型或float64类型。
语法:
SELECT TOP( <field_key>[,<tag_key(s)>],<N> )[,<tag_key(s)>|<field_key(s)>] [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] [LIMIT_clause] [OFFSET_clause] [SLIMIT_clause] [SOFFSET_clause]
示例
select top(used,5) from disk;
name: disk
time top
---- ---
2020-12-28T06:42:00Z 77260963840
2020-12-28T06:42:10Z 77260955648
2020-12-28T06:42:20Z 77260980224
2020-12-28T06:42:30Z 77260877824
2020-12-28T06:43:30Z 77260877824
2. BOTTOM()函数
作用:返回一个字段中最小的N个值。字段类型必须是长整型或float64类型。
语法:
SELECT BOTTOM(<field_key>[,<tag_keys>],<N>)[,<tag_keys>] FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例
> select BOTTOM(used,5) from disk;
name: disk
time bottom
---- ------
2020-12-26T06:20:00Z 149127168
2020-12-26T06:20:10Z 149127168
2020-12-26T06:20:20Z 149127168
2020-12-26T06:20:40Z 149127168
2020-12-26T06:20:50Z 149127168
3. FIRST()函数
作用:按照时间维度,返回一个字段中最早时间录入的值。
语法:
SELECT FIRST(<field_key>)[,<tag_key(s)>] FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例:
> select first(used) from disk;
name: disk
time first
---- -----
2020-12-26T06:20:00Z 10270826496
4. LAST()函数
作用:按照时间维度,返回一个字段中最新时间录入的值。
语法:
SELECT LAST(<field_key>)[,<tag_key(s)>] FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例:
5. MAX()函数
作用:返回一个字段中的最大值。该字段类型必须是长整型,float64,或布尔类型。
语法:
SELECT MAX(<field_key>)[,<tag_key(s)>] FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
6. MIN()函数
作用:返回一个字段中的最小值。该字段类型必须是长整型,float64,或布尔类型。
语法:
SELECT MIN(<field_key>)[,<tag_key(s)>] FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
7. PERCENTILE()函数
作用:返回排序值排位为N的百分值。字段的类型必须是长整型或float64。
百分值是介于100到0之间的整数或浮点数,包括100。
语法:
SELECT PERCENTILE(<field_key>, <N>)[,<tag_key(s)>] FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例:
> SELECT PERCENTILE(water_level,5),location FROM h2o_feet
name: h2o_feet
--------------
time percentile location
2015-08-28T12:06:00Z 1.122 santa_monica
就是将water_level字段按照不同的location求百分比,然后取第五位数据。
PERCENTILE(<field_key>,100)
等效于MAX(<field_key>)
;
PERCENTILE(<field_key>,0)
不等同于MIN(<field_key>)
;
注意:
PERCENTILE(<field_key>, 50)
几乎等于MEDIAN()
,除了MEDIAN()
字段包含偶数点,则返回两个中间值的平均值。
三、变换类函数(Transformations)
1. DERIVATIVE()函数
作用:返回一个字段在一个series中的变化率。
InfluxDB会计算按照时间进行排序的字段值之间的差异,并将这些结果转化为单位变化率。其中,单位可以指定,默认为1s。
语法:
SELECT DERIVATIVE(<field_key>, [<unit>]) FROM <measurement_name> [WHERE <stuff>]
其中,unit
取值可以为以下几种:
u --microseconds
s --seconds
m --minutes
h --hours
d --days
w --weeks
DERIVATIVE()函数还可以在GROUP BY time()的条件下与聚合函数嵌套使用,格式如下:
SELECT DERIVATIVE(AGGREGATION_FUNCTION(<field_key>),[<unit>]) FROM <measurement_name> WHERE <stuff> GROUP BY time(<aggregation_interval>)
示例:
假设location = santa_monica 条件下数据有以下几条:
name: h2o_feet
--------------
time water_level
2015-08-18T00:00:00Z 2.064
2015-08-18T00:06:00Z 2.116
2015-08-18T00:12:00Z 2.028
2015-08-18T00:18:00Z 2.126
2015-08-18T00:24:00Z 2.041
2015-08-18T00:30:00Z 2.051
计算每一秒的变化率:
> SELECT DERIVATIVE(water_level) FROM h2o_feet WHERE location = 'santa_monica' LIMIT 5
name: h2o_feet
--------------
time derivative
2015-08-18T00:06:00Z 0.00014444444444444457
2015-08-18T00:12:00Z -0.00024444444444444465
2015-08-18T00:18:00Z 0.0002722222222222218
2015-08-18T00:24:00Z -0.000236111111111111
2015-08-18T00:30:00Z 2.777777777777842e-05
第一行数据的计算公式为(2.116 - 2.064) / (360s / 1s)
计算每六分钟的变化率
> SELECT DERIVATIVE(water_level,6m) FROM h2o_feet WHERE location = 'santa_monica' LIMIT 5
name: h2o_feet
--------------
time derivative
2015-08-18T00:06:00Z 0.052000000000000046
2015-08-18T00:12:00Z -0.08800000000000008
2015-08-18T00:18:00Z 0.09799999999999986
2015-08-18T00:24:00Z -0.08499999999999996
2015-08-18T00:30:00Z 0.010000000000000231
第一行数据的计算过程如下:(2.116 - 2.064) / (6m / 6m)
计算每12分钟的变化率:
> SELECT DERIVATIVE(water_level,12m) FROM h2o_feet WHERE location = 'santa_monica' LIMIT 5
name: h2o_feet
--------------
time derivative
2015-08-18T00:06:00Z 0.10400000000000009
2015-08-18T00:12:00Z -0.17600000000000016
2015-08-18T00:18:00Z 0.19599999999999973
2015-08-18T00:24:00Z -0.16999999999999993
2015-08-18T00:30:00Z 0.020000000000000462
第一行数据计算过程为:(2.116 - 2.064) / (6m / 12m)
计算每12分钟最大值的变化率
> SELECT DERIVATIVE(MAX(water_level)) FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' AND time < '2015-08-18T00:36:00Z' GROUP BY time(12m)
name: h2o_feet
--------------
time derivative
2015-08-18T00:12:00Z 0.009999999999999787
2015-08-18T00:24:00Z -0.07499999999999973
2. DIFFERENCE()函数
作用:返回一个字段中连续的时间值之间的差异。字段类型必须是长整型或float64。
最基本的语法:
SELECT DIFFERENCE(<field_key>) FROM <measurement_name> [WHERE <stuff>]
与GROUP BY time()以及其他嵌套函数一起使用的语法格式:
SELECT DIFFERENCE(<function>(<field_key>)) FROM <measurement_name> WHERE <stuff> GROUP BY time(<time_interval>)
其中,函数可以包含以下几个:
COUNT()`, `MEAN()`, `MEDIAN()`,`SUM()`, `FIRST()`, `LAST()`, `MIN()`, `MAX()`, 和 `PERCENTILE()。
使用示例
例子中使用的源数据如下所示:
> SELECT water_level FROM h2o_feet WHERE location='santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:36:00Z'
name: h2o_feet
--------------
time water_level
2015-08-18T00:00:00Z 2.064
2015-08-18T00:06:00Z 2.116
2015-08-18T00:12:00Z 2.028
2015-08-18T00:18:00Z 2.126
2015-08-18T00:24:00Z 2.041
2015-08-18T00:30:00Z 2.051
2015-08-18T00:36:00Z 2.067
计算water_level
间的差异:
> SELECT DIFFERENCE(water_level) FROM h2o_feet WHERE location='santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:36:00Z'
name: h2o_feet
--------------
time difference
2015-08-18T00:06:00Z 0.052000000000000046
2015-08-18T00:12:00Z -0.08800000000000008
2015-08-18T00:18:00Z 0.09799999999999986
2015-08-18T00:24:00Z -0.08499999999999996
2015-08-18T00:30:00Z 0.010000000000000231
2015-08-18T00:36:00Z 0.016000000000000014
数据类型都为float类型。
3. ELAPSED()函数
作用:返回一个字段在连续的时间内的时间间隔差值,间隔单位可选,默认为1纳秒。
单位可选 u(纳秒),ms,s,m,h,d,w
语法:
SELECT ELAPSED(<field_key>, <unit>) FROM <measurement_name> [WHERE <stuff>]
其中,unit单位可选:
u --microseconds
ms--milliseconds
s --seconds
m --minutes
h --hours
d --days
w --weeks
示例:
计算h2o_feet字段在纳秒间隔下的差异。
> SELECT ELAPSED(water_level) FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:24:00Z'
name: h2o_feet
--------------
time elapsed
2015-08-18T00:06:00Z 360000000000
2015-08-18T00:12:00Z 360000000000
2015-08-18T00:18:00Z 360000000000
2015-08-18T00:24:00Z 360000000000
在一分钟间隔下的差异率:
> SELECT ELAPSED(water_level,1m) FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:24:00Z'
name: h2o_feet
--------------
time elapsed
2015-08-18T00:06:00Z 6
2015-08-18T00:12:00Z 6
2015-08-18T00:18:00Z 6
2015-08-18T00:24:00Z 6
注意:如果设置的时间间隔比字段数据间的时间间隔更大时,则函数会返回0,如下所示:
> SELECT ELAPSED(water_level,1h) FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:24:00Z'
name: h2o_feet
--------------
time elapsed
2015-08-18T00:06:00Z 0
2015-08-18T00:12:00Z 0
2015-08-18T00:18:00Z 0
2015-08-18T00:24:00Z 0
4. MOVING_AVERAGE()函数
作用:返回一个连续字段值的移动平均值,字段类型必须是长整形或者float64类型。
语法:
基本语法
SELECT MOVING_AVERAGE(<field_key>,<window>) FROM <measurement_name> [WHERE <stuff>]
与其他函数和GROUP BY time()语句一起使用时的语法
SELECT MOVING_AVERAGE(<function>(<field_key>),<window>) FROM <measurement_name> WHERE <stuff> GROUP BY time(<time_interval>)
此函数可以和以下函数一起使用:
COUNT()`, `MEAN()`,`MEDIAN()`, `SUM()`, `FIRST()`, `LAST()`, `MIN()`, `MAX()`, and `PERCENTILE().
Examples:
The following examples focus on the field water_level
in santa_monica
between 2015-08-18T00:00:00Z
and 2015-08-18T00:36:00Z
:
> SELECT water_level FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:36:00Z'
name: h2o_feet
--------------
time water_level
2015-08-18T00:00:00Z 2.064
2015-08-18T00:06:00Z 2.116
2015-08-18T00:12:00Z 2.028
2015-08-18T00:18:00Z 2.126
2015-08-18T00:24:00Z 2.041
2015-08-18T00:30:00Z 2.051
2015-08-18T00:36:00Z 2.067
Calculate the moving average across every 2 field values:
> SELECT MOVING_AVERAGE(water_level,2) FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:36:00Z'
CLI response:
name: h2o_feet
--------------
time moving_average
2015-08-18T00:06:00Z 2.09
2015-08-18T00:12:00Z 2.072
2015-08-18T00:18:00Z 2.077
2015-08-18T00:24:00Z 2.0835
2015-08-18T00:30:00Z 2.0460000000000003
2015-08-18T00:36:00Z 2.059
The first value in the moving_average
column is the average of 2.064
and 2.116
, the second value in the moving_average
column is the average of 2.116
and 2.028
.
Select the minimum water_level
at 12 minute intervals and calculate the moving average across every 2 field values:
> SELECT MOVING_AVERAGE(MIN(water_level),2) FROM h2o_feet WHERE location = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' and time <= '2015-08-18T00:36:00Z' GROUP BY time(12m)
CLI response:
name: h2o_feet
--------------
time moving_average
2015-08-18T00:12:00Z 2.0460000000000003
2015-08-18T00:24:00Z 2.0345000000000004
2015-08-18T00:36:00Z 2.0540000000000003
To get those results, InfluxDB first selects the MIN()
water_level
for every 12 minute interval:
name: h2o_feet
--------------
time min
2015-08-18T00:00:00Z 2.064
2015-08-18T00:12:00Z 2.028
2015-08-18T00:24:00Z 2.041
2015-08-18T00:36:00Z 2.067
It then uses those values to calculate the moving average across every 2 field values; the first result in the moving_average
column the average of 2.064
and 2.028
, and the second result is the average of 2.028
and 2.041
.
5. NON_NEGATIVE_DERIVATIVE()函数
作用:返回在一个series中的一个字段中值的变化的非负速率。
语法:
SELECT NON_NEGATIVE_DERIVATIVE(<field_key>, [<unit>]) FROM <measurement_name> [WHERE <stuff>]
其中unit取值可以为以下几个:
u --microseconds
s --seconds
m --minutes
h --hours
d --days
w --weeks
与聚合类函数放在一起使用时的语法如下所示:
SELECT NON_NEGATIVE_DERIVATIVE(AGGREGATION_FUNCTION(<field_key>),[<unit>]) FROM <measurement_name> WHERE <stuff> GROUP BY time(<aggregation_interval>)
6. STDDEV()函数
作用:返回一个字段中的值的标准偏差。值的类型必须是长整型或float64类型。
标准差:所有数减去其平均值的平方以该组数之个数(或个数减一,即变异数),再把所得值开根号,所得之数就是这组数据的标准差。
语法:
SELECT STDDEV(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]
示例:
> SELECT STDDEV(water_level) FROM h2o_feet
name: h2o_feet
--------------
time stddev
1970-01-01T00:00:00Z 2.279144584196145
示例2:
> SELECT STDDEV(water_level) FROM h2o_feet WHERE time >= '2015-08-18T00:00:00Z' and time < '2015-09-18T12:06:00Z' GROUP BY time(1w), location
name: h2o_feet
tags: location = coyote_creek
time stddev
---- ------
2015-08-13T00:00:00Z 2.2437263080193985
2015-08-20T00:00:00Z 2.121276150144719
2015-08-27T00:00:00Z 3.0416122170786215
2015-09-03T00:00:00Z 2.5348065025435207
2015-09-10T00:00:00Z 2.584003954882673
2015-09-17T00:00:00Z 2.2587514836274414
name: h2o_feet
tags: location = santa_monica
time stddev
---- ------
2015-08-13T00:00:00Z 1.11156344587553
2015-08-20T00:00:00Z 1.0909849279082366
2015-08-27T00:00:00Z 1.9870116180096962
2015-09-03T00:00:00Z 1.3516778450902067
2015-09-10T00:00:00Z 1.4960573811500588
2015-09-17T00:00:00Z 1.075701669442093