loki grafana

{job=~"apache|syslog"} <- show me logs where the job is apache OR syslog

{job="apache"} |= "11.11.11.11"   # search the job=apache logs for 11.11.11.11

( |= "text", |~ "regex", …)

{app="loki"} |= "level=error" uses one label fewer but is not slow; it is roughly equivalent to {app="loki",level="error"}

{app="loki",level!="debug"} uses a not-equal matcher and may perform worse than {app="loki"} != "level=debug": the latter needs no level label, which means better disk performance and fewer chunks

`\w+` is the same as "\\w+"


{container="query-frontend",namespace="loki-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500

The query is composed of:

  • a log stream selector {container="query-frontend",namespace="loki-dev"} which targets the query-frontend container in the loki-dev namespace.
  • a log pipeline |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500 which keeps only the log lines that contain metrics.go, then parses each log line to extract more labels and filters on them.


Label matching operators:

  • =: exactly equal.
  • !=: not equal.
  • =~: regex matches.
  • !~: regex does not match.

Examples:

  • {name=~"mysql.+"}
  • {name!~"mysql.+"}
  • {name!~`mysql-\d+`}


Line filter examples:

  • {job="mysql"} |= "error"
  • {name="kafka"} |~ "tsdb-ops.*io:2003"
  • {name="cassandra"} |~ `error=\w+`
  • {instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"

When using |~ and !~, Go (as in Golang) RE2 syntax regex may be used. The matching is case-sensitive by default and can be switched to case-insensitive by prefixing the regex with (?i).
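
The four line filter operators can be illustrated with a small Python sketch. This is not how Loki implements them; the function name line_filter and the sample lines are made up for the illustration, and Python's re module stands in for RE2 (the patterns used here behave the same in both). The sketch assumes that regex line filters are unanchored, i.e. they match anywhere in the line.

```python
import re

def line_filter(line: str, op: str, pattern: str) -> bool:
    """Return True if `line` passes one LogQL-style line filter (illustrative only)."""
    if op == "|=":   # line contains the string
        return pattern in line
    if op == "!=":   # line does not contain the string
        return pattern not in line
    if op == "|~":   # line matches the regex anywhere
        return re.search(pattern, line) is not None
    if op == "!~":   # line does not match the regex
        return re.search(pattern, line) is None
    raise ValueError(f"unknown filter operator: {op}")

# Hypothetical log lines, only for the demonstration.
lines = [
    'ts=1 level=error msg="disk full"',
    'ts=2 level=ERROR msg="timeout"',
    'ts=3 level=info msg="ok"',
]

# Case-sensitive by default; the (?i) prefix makes the match case-insensitive.
case_sensitive = [l for l in lines if line_filter(l, "|~", "error")]
case_insensitive = [l for l in lines if line_filter(l, "|~", "(?i)error")]
```

Only the first line passes the case-sensitive filter, while the (?i) variant also keeps the line with level=ERROR.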





The json parser takes no parameters and can be added using the expression | json in your pipeline. It will extract all JSON properties as labels if the log line is a valid JSON document. Nested properties are flattened into label keys using the _ separator. Arrays are skipped.

For example the json parser will extract from the following document:

{
    "protocol": "HTTP/2.0",
    "servers": ["129.0.1.1","10.2.1.3"],
    "request": {
        "time": "6.032",
        "method": "GET",
        "host": "foo.grafana.net",
        "size": "55"
    },
    "response": {
        "status": 401,
        "size": "228",
        "latency_seconds": "6.031"
    }
}
The following list of labels:

"protocol" => "HTTP/2.0"
"request_time" => "6.032"
"request_method" => "GET"
"request_host" => "foo.grafana.net"
"request_size" => "55"
"response_status" => "401"
"response_size" => "228"
"response_latency_seconds" => "6.031"
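
The flattening rule (nested keys joined with _, arrays skipped) can be sketched in a few lines of Python. The function name flatten is an assumption of this sketch, not Loki's code, and every value is simply converted with str().

```python
import json

def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested JSON objects into label-like keys joined by '_'."""
    labels = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            labels.update(flatten(value, name))  # recurse into nested objects
        elif isinstance(value, list):
            continue                             # arrays are skipped
        else:
            labels[name] = str(value)
    return labels

document = json.loads("""
{
    "protocol": "HTTP/2.0",
    "servers": ["129.0.1.1", "10.2.1.3"],
    "request": {"time": "6.032", "method": "GET", "host": "foo.grafana.net", "size": "55"},
    "response": {"status": 401, "size": "228", "latency_seconds": "6.031"}
}
""")
json_labels = flatten(document)
```

Here json_labels holds eight entries, from protocol to response_latency_seconds, and has no servers key.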

 

The logfmt parser can be added using | logfmt and will extract all keys and values from the logfmt formatted log line.

For example the following log line:

at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" connect=4ms service=8ms status=200
will get those labels extracted:

"at" => "info"
"method" => "GET"
"path" => "/"
"host" => "grafana.net"
"fwd" => "124.133.124.161"
"connect" => "4ms"
"service" => "8ms"
"status" => "200"
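
A rough Python approximation of logfmt extraction is shown below. It borrows shlex.split from the standard library to handle the quoted value, which is close to, but not the same as, the real logfmt grammar; parse_logfmt is a name chosen for this sketch.

```python
import shlex

def parse_logfmt(line: str) -> dict:
    """Split a logfmt line into key/value pairs (approximation using shell-style quoting)."""
    labels = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            labels[key] = value
    return labels

logfmt_line = 'at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" connect=4ms service=8ms status=200'
logfmt_labels = parse_logfmt(logfmt_line)
```

The quotes around the fwd value are removed, and all eight keys of the line end up in logfmt_labels.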

 

Unlike logfmt and json, which implicitly extract all values and take no parameters, the regexp parser takes a single parameter | regexp "<re>", which is the regular expression using the Golang RE2 syntax.

The regular expression must contain at least one named sub-match (e.g. (?P<name>re)); each sub-match will extract a different label.

For example the parser | regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)" will extract from the following line:

POST /api/prom/api/v1/query_range (200) 1.5s
those labels:

"method" => "POST"
"path" => "/api/prom/api/v1/query_range"
"status" => "200"
"duration" => "1.5s"
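
The same extraction can be tried outside Loki. The sketch below uses Python's re module, whose (?P<name>...) syntax matches RE2 for this particular pattern; regexp_parser is a name made up for the example. The Python raw string plays the role of the backtick quoting in LogQL, so the backslashes do not need to be doubled.

```python
import re

# Same pattern as in the | regexp example, written without doubled backslashes.
PATTERN = re.compile(
    r"(?P<method>\w+) (?P<path>[\w|/]+) \((?P<status>\d+?)\) (?P<duration>.*)"
)

def regexp_parser(line: str) -> dict:
    """Return one label per named sub-match, or an empty dict if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}

regexp_labels = regexp_parser("POST /api/prom/api/v1/query_range (200) 1.5s")
```

Each named group becomes one key of regexp_labels: method, path, status and duration.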

 

Complex example:

level=debug ts=2020-10-02T10:10:42.092268913Z caller=logging.go:66 traceID=a9d4d8a928d8db1 msg="POST /api/prom/api/v1/query_range (200) 1.5s"
{job="cortex-ops/query-frontend"} | logfmt | line_format "{{.msg}}" | regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"

 

 
Comparison examples:
Durations: 300ms   1.5h   2h45m
logfmt | duration > 1m and bytes_consumed > 20MB
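
To see why a comparison such as duration > 1m works on values like 300ms or 2h45m, the following Python sketch converts Go-style duration strings to seconds. It is only an approximation of Go's duration parsing (no sign handling, for instance), and parse_duration is a hypothetical helper name.

```python
import re

# Nanoseconds per unit; "ms" must be tried before "m" and "s" in the regex below.
_UNIT_NS = {
    "ns": 1,
    "us": 1_000,
    "µs": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3600 * 1_000_000_000,
}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")

def parse_duration(text: str) -> float:
    """Convert a string such as '2h45m' to seconds; raise ValueError on malformed input."""
    position = 0
    total_ns = 0.0
    for part in _PART.finditer(text):
        if part.start() != position:  # unparsed characters between two parts
            raise ValueError(f"invalid duration: {text!r}")
        total_ns += float(part.group(1)) * _UNIT_NS[part.group(2)]
        position = part.end()
    if position == 0 or position != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total_ns / 1e9

# The predicate "duration > 1m" then becomes a plain numeric comparison.
is_slow = parse_duration("2h45m") > parse_duration("1m")
```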
 
 

 Pipeline examples (the four expressions below are equivalent):

| duration >= 20ms or size == 20kb and method!~"2.."
| duration >= 20ms or size == 20kb | method!~"2.."
| duration >= 20ms or size == 20kb , method!~"2.."
| duration >= 20ms or size == 20kb  method!~"2.."


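 Using parentheses to set precedence (predicates are evaluated left to right, so the two expressions below are equivalent):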

| duration >= 20ms or method="GET" and size <= 20KB
| ((duration >= 20ms or method="GET") and size <= 20KB)


line_format expression:
{container="frontend"} | logfmt | line_format "{{.query}} {{.duration}}"
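
What line_format does with the extracted labels can be mimicked with a simple substitution. The sketch below handles only plain {{.field}} references, not the template functions that Loki's Go templates support, and rendering a missing field as an empty string is an assumption of the sketch. The label values are invented for the demonstration.

```python
import re

_FIELD = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")

def line_format(template: str, labels: dict) -> str:
    """Replace every {{.name}} in the template with the value of that label."""
    return _FIELD.sub(lambda m: labels.get(m.group(1), ""), template)

# Hypothetical labels, as they might look after | logfmt.
frontend_labels = {"query": '{job="mysql"}', "duration": "1.5s", "status": "200"}
rendered = line_format("{{.query}} {{.duration}}", frontend_labels)
```

The rewritten line keeps only the two referenced fields; status is dropped from the output because the template does not mention it.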


Label replacement:
label_replace()
For each timeseries in v, label_replace(v instant-vector, dst_label string, replacement string, src_label string, regex string) matches the regular expression regex against the label src_label. 
If it matches, then the timeseries is returned with the label dst_label replaced by the expansion of replacement. $1 is replaced with the first matching subgroup, $2 with the second etc.
If the regular expression doesn’t match then the timeseries is returned unchanged.
This example will return a vector with each time series having a foo label with the value a added to it:
label_replace(rate({job="api-server",service="a:c"} |= "err" [1m]), "foo", "$1", "service", "(.*):.*")
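
The semantics of label_replace can be sketched for a single label set in Python. This is an illustration, not the Loki or Prometheus implementation: the label set is a plain dict, the regex is assumed to be fully anchored (hence re.fullmatch), and only $1-style references are expanded.

```python
import re

def label_replace(labels: dict, dst_label: str, replacement: str,
                  src_label: str, regex: str) -> dict:
    """Return a copy of `labels` with dst_label set when regex matches src_label."""
    match = re.fullmatch(regex, labels.get(src_label, ""))
    if match is None:
        return dict(labels)  # no match: the series is returned unchanged
    # Expand $1, $2, ... with the corresponding capture groups.
    value = re.sub(r"\$(\d+)",
                   lambda ref: match.group(int(ref.group(1))) or "",
                   replacement)
    result = dict(labels)
    result[dst_label] = value
    return result

series = {"job": "api-server", "service": "a:c"}
replaced = label_replace(series, "foo", "$1", "service", "(.*):.*")
```

For service="a:c" the first group captures a, so replaced gains foo="a" while the other labels stay as they were.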

 

Multiple filters:

{cluster="ops-tools1", namespace="loki-dev", job="loki-dev/query-frontend"} |= "metrics.go" !="out of order" | logfmt | duration > 30s or status_code!="200"
 

 

Original log:
level=debug ts=2020-10-02T10:10:42.092268913Z caller=logging.go:66 traceID=a9d4d8a928d8db1 msg="POST /api/prom/api/v1/query_range (200) 1.5s"

Multiple parsers:
{job="cortex-ops/query-frontend"} | logfmt | line_format "{{.msg}}" | regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"
 
 
 

 Output formatting:

{cluster="ops-tools1", name="querier", namespace="loki-dev"}
  |= "metrics.go" != "loki-canary"
  | logfmt
  | query != ""
  | label_format query="{{ Replace .query \"\\n\" \"\" -1 }}"
  | line_format "{{ .ts}}\t{{.duration}}\ttraceID = {{.traceID}}\t{{ printf \"%-100.100s\" .query }} "
 Original log:
level=info ts=2020-10-23T20:32:18.094668233Z caller=metrics.go:81 org_id=29 traceID=1980d41501b57b68 latency=fast query="{cluster=\"ops-tools1\", job=\"cortex-ops/query-frontend\"} |= \"query_range\"" query_type=filter range_type=range length=15m0s step=7s duration=650.22401ms status=200 throughput_mb=1.529717 total_bytes_mb=0.994659
level=info ts=2020-10-23T20:32:18.068866235Z caller=metrics.go:81 org_id=29 traceID=1980d41501b57b68 latency=fast query="{cluster=\"ops-tools1\", job=\"cortex-ops/query-frontend\"} |= \"query_range\"" query_type=filter range_type=range length=15m0s step=7s duration=624.008132ms status=200 throughput_mb=0.693449 total_bytes_mb=0.432718

 After formatting:

2020-10-23T20:32:18.094668233Z	650.22401ms	    traceID = 1980d41501b57b68	{cluster="ops-tools1", job="cortex-ops/query-frontend"} |= "query_range"
2020-10-23T20:32:18.068866235Z	624.008132ms	traceID = 1980d41501b57b68	{cluster="ops-tools1", job="cortex-ops/query-frontend"} |= "query_range"
 

 

 
  • rate(log-range): calculates the number of entries per second.
  • count_over_time(log-range): counts the entries for each log stream within the given range.
  • bytes_rate(log-range): calculates the number of bytes per second for each stream.
  • bytes_over_time(log-range): counts the amount of bytes used by each log stream for a given range.
  • absent_over_time(log-range): returns an empty vector if the range vector passed to it has any elements and a 1-element vector with the value 1 if the range vector passed to it has no elements. (absent_over_time is useful for alerting when no time series or log streams exist for a label combination for a certain amount of time.)
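
The relationship between count_over_time and rate can be made concrete with a small Python sketch over the timestamps of one stream. The timestamps are invented, and treating the range as the half-open window (now - range, now] is an assumption of the sketch.

```python
def count_over_time(timestamps: list, now: float, range_seconds: float) -> int:
    """Count the entries whose timestamp falls inside (now - range, now]."""
    start = now - range_seconds
    return sum(1 for ts in timestamps if start < ts <= now)

def rate(timestamps: list, now: float, range_seconds: float) -> float:
    """Entries per second over the range: the count divided by the range length."""
    return count_over_time(timestamps, now, range_seconds) / range_seconds

# Hypothetical entry timestamps in seconds for a single log stream.
entry_times = [5, 80, 100, 125]
entries_last_minute = count_over_time(entry_times, now=130, range_seconds=60)
per_second = rate(entry_times, now=130, range_seconds=60)
```

With these numbers three entries fall inside the last 60 seconds, so the rate is 3 / 60 entries per second.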

 

Log Examples
count_over_time({job="mysql"}[5m])

This example counts all the log lines within the last five minutes for the MySQL job.

sum by (host) (rate({job="mysql"} |= "error" != "timeout" | json | duration > 10s [1m]))

This example demonstrates a LogQL aggregation which includes filters and parsers. It returns the per-second rate of all non-timeout errors within the last minute per host for the MySQL job and only includes errors whose duration is above ten seconds.

 

 Grafana graphing (count of errors within 5 minutes):

sum (count_over_time({job="logstash"} |~ "(?i)error"[5m]))

sum (count_over_time({job="logstash",alexenv=~"$namespace",podName=~"$serviceName"} |~ "(?i)error"[5m]))

{alexenv=~"$namespace",podName=~"$serviceName"}     (show live logs)

Grafana Loki output formatting:
{job="logstash",alexerror="true"} |regexp `(?P<allmsg>(?s)(.+?)$)`|line_format "{{.allmsg}}"


Logstash configuration:
input {
  beats {
    port => 5044
  }
}
filter {
  # Add 8 hours to @timestamp and store the result as alextime.
  ruby {
    code => "event.set('alextime',event.get('@timestamp').time.localtime + 8*60*60)"
  }
  # Extract year, month, day and hour from alextime for the output file name.
  ruby {
    code => "event.set('alexyear',event.get('alextime').to_s.split('-')[0])"
  }
  ruby {
    code => "event.set('alexmonth',event.get('alextime').to_s.split('-')[1])"
  }
  ruby {
    code => "event.set('alexday',event.get('alextime').to_s.split('-')[2].slice(0..1))"
  }
  ruby {
    code => "event.set('alexhour',event.get('alextime').to_s.split(':')[0].slice(-2..-1))"
  }
  # Copy the log field into alexpath.
  ruby {
    code => "event.set('alexpath',event.get('log'))"
  }
  # Keep only the part of the file path after the last ':' and turn backslashes into forward slashes.
  ruby {
    #code => "event.set('blex',event.get('alexpath')['file']['path'])"
    #code => "puts event.get('alexpath')['file']['path'].split(':')"
    #code => "event.set('alexpath',event.get('alexpath')['file']['path'].split(':')[-1])"
    code => "event.set('alexpath',event.get('alexpath')['file']['path'].split(':')[-1].tr('\\','/'))"
  }
  mutate {
    split => { "shortHostname" => "-" }
    add_field => { "podName" => "%{[shortHostname][0]}"
                   "job" => "logstash" 
                 }
  }
}
output {
  file {
        path => "/nfs/%{[alexenv]}/%{podName}-%{alexyear}-%{alexmonth}-%{alexday}-%{alexhour}.log"
        codec => line { format => "%{message}"}
  }
#        stdout { }
  loki {
    url => "http://172.23.29.3:3100/loki/api/v1/push"
    batch_size => 112640
    retries => 5
    min_delay => 3
    max_delay => 500
  }
}

 

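  Regular expressions:
These are regular expression pattern modifiers.
  (?i): case-insensitive matching.

  (?s): single-line mode. Changes the meaning of . so that it matches every character, including the newline \n.

  (?m): multi-line mode. Changes the meaning of ^ and $ so that they match at the start and end of every line, not only at the start and end of the whole string. (In this mode, $ matches the position before a \n as well as the position at the end of the string.)
  (?x): whitespace in the expression is ignored unless it is escaped.
  (?e): applies only to the replacement; the replacement is evaluated as PHP code.
  (?A): the expression must match at the beginning of the subject string. For example, "/a/A" matches "abcd".
  (?E): the opposite of "m"; "$" matches only at the absolute end of the string, not before a newline. This mode is on by default.

  (?U): works much like the question mark; it switches the "greedy" behavior of quantifiers.

Note: of these, Go RE2 (the syntax Loki uses) accepts only the flags i, m, s and U; x, e, A and E come from PCRE/PHP and are not valid in LogQL.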
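
The effect of (?i), (?s) and (?m) is easy to check interactively. The sketch below uses Python's re module as a stand-in for RE2; the two engines agree on these three flags, but they differ elsewhere, so it should not be read as a general compatibility claim. The sample strings are made up.

```python
import re

# (?i): case-insensitive matching.
case_insensitive = re.search(r"(?i)error", "Level=ERROR") is not None

# (?s): the dot also matches a newline.
dot_default = re.search(r"start.end", "start\nend") is not None
dot_single_line = re.search(r"(?s)start.end", "start\nend") is not None

# (?m): ^ and $ match at the start and end of every line.
anchors_default = re.findall(r"^\w+$", "one\ntwo")
anchors_multi_line = re.findall(r"(?m)^\w+$", "one\ntwo")
```

Without (?s) the dot refuses the newline, and without (?m) the anchored pattern finds nothing in the two-line string.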

 

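?:  a question mark followed by a colon, (?:), marks a non-capturing group

Written as: (?:)

  How do you turn off the capturing behavior of parentheses
      and use them only for grouping? Put ?: right after the opening parenthesis.
Here the first pair of parentheses only groups and does not take up a capture variable: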

    "(?:\\w+\\s(\\w+))"
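
The difference between the two kinds of parentheses shows up in the number of groups a match reports. A minimal check in Python, whose (?:...) syntax is the same as in RE2; the input string is arbitrary:

```python
import re

non_capturing = re.compile(r"(?:\w+\s(\w+))")  # outer parentheses only group
capturing = re.compile(r"(\w+\s(\w+))")        # outer parentheses also capture

groups_non_capturing = non_capturing.search("hello world").groups()
groups_capturing = capturing.search("hello world").groups()
```

With (?:...) only the inner group is reported, so the second word becomes group 1 instead of group 2.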

 
posted @ 2021-01-13 10:48  alexhe