elasticsearch

Make config changes take effect immediately while debugging (automatic reload):

logstash -f beats.conf --config.reload.automatic

 

Modify Logstash event data:

filter {
  if [action] == "login" {
    mutate { remove_field => "secret" }
  }
}

You can specify multiple expressions in a single condition:

output {
  # Send production errors to pagerduty
  if [loglevel] == "ERROR" and [deployment] == "production" {
    pagerduty {
    ...
    }
  }
}

 

The in conditional tests whether a value is contained in another field, a string, or a list:

filter {
  if [foo] in [foobar] {
    mutate { add_tag => "field in field" }
  }
  if [foo] in "foo" {
    mutate { add_tag => "field in string" }
  }
  if "hello" in [greeting] {
    mutate { add_tag => "string in field" }
  }
  if [foo] in ["hello", "world", "foo"] {
    mutate { add_tag => "field in list" }
  }
  if [missing] in [alsomissing] {
    mutate { add_tag => "shouldnotexist" }
  }
  if !("foo" in ["hello", "world"]) {
    mutate { add_tag => "shouldexist" }
  }
}

You use the not in conditional the same way. For example, you could use not in to only route events to Elasticsearch when grok is successful:

output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch { ... }
  }
}

The @metadata field

In Logstash 1.5 and later, there is a special field called @metadata. The contents of @metadata will not be part of any of your events at output time, which makes it great to use for conditionals, or extending and building event fields with field reference and sprintf formatting.

The following configuration file will yield events from STDIN. Whatever is typed will become the message field in the event. The mutate events in the filter block will add a few fields, some nested in the @metadata field.

input { stdin { } }

filter {
  mutate { add_field => { "show" => "This data will be in the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
  mutate { add_field => { "[@metadata][no_show]" => "This data will not be in the output" } }
}

output {
  if [@metadata][test] == "Hello" {
    stdout { codec => rubydebug }
  }
}

Let’s see what comes out:

$ bin/logstash -f ../test.conf
Pipeline main started
asdf
{
    "@timestamp" => 2016-06-30T02:42:51.496Z,
      "@version" => "1",
          "host" => "example.com",
          "show" => "This data will be in the output",
       "message" => "asdf"
}
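
To make that behaviour concrete, here is a minimal Python sketch (a toy model, not Logstash code) of what happens to @metadata: the output conditional can read it, but it is left out of the printed event. The function names and the dict layout are assumptions made for illustration; host, @timestamp and @version are omitted.

```python
# Toy model of a Logstash event; not real Logstash code.
def apply_filters(message):
    """Mimic the three mutate/add_field calls from the config above."""
    return {
        "message": message,
        "show": "This data will be in the output",
        "@metadata": {
            "test": "Hello",
            "no_show": "This data will not be in the output",
        },
    }


def emit(event):
    """Return what an output would print, or None if the conditional fails."""
    if event.get("@metadata", {}).get("test") != "Hello":
        return None
    # @metadata is readable in conditionals but dropped from the output.
    return {k: v for k, v in event.items() if k != "@metadata"}


printed = emit(apply_filters("asdf"))
print(printed)
```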



grok example:
input { stdin { } }

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
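
A rough idea of what the grok and date filters do here, as a Python sketch. The regular expression is a simplified stand-in for %{COMBINEDAPACHELOG} (the real grok pattern is more permissive), and the sample log line is made up for illustration.

```python
import re
from datetime import datetime

# Simplified stand-in for %{COMBINEDAPACHELOG}; the real pattern accepts more variants.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse(line):
    """Extract named fields, then parse the timestamp as the date filter would."""
    match = COMBINED.match(line)
    if match is None:
        return {"message": line, "tags": ["_grokparsefailure"]}
    event = match.groupdict()
    # dd/MMM/yyyy:HH:mm:ss Z corresponds to %d/%b/%Y:%H:%M:%S %z
    event["@timestamp"] = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    return event


# Hypothetical sample line, made up for this sketch.
sample = ('127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] '
          '"GET /index.html HTTP/1.1" 200 3891 "-" "Mozilla/5.0"')
event = parse(sample)
print(event["clientip"], event["response"], event["@timestamp"].isoformat())
```

A line that does not match gets the _grokparsefailure tag, which is what the not in [tags] conditional earlier in this post checks for.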

 

mutate example:

mutate { replace => { "type" => "apache_access" } }

 

csv, or a custom separator:

filter {
  mutate { add_field => { "show" => "This data will be in the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
  mutate { add_field => { "[@metadata][no_show]" => "This data will not be in the output" } }
  csv {
    separator => "^^^"
    columns => ["time", "username", "age", "id"]
  }
}
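
What the csv block above is meant to do, sketched in Python: split the message on the literal separator and name the columns. The input line is hypothetical, and the values stay strings because no type conversion is configured.

```python
SEPARATOR = "^^^"
COLUMNS = ["time", "username", "age", "id"]


def parse_line(message):
    """Split on the literal separator and pair the values with column names."""
    values = message.split(SEPARATOR)
    return dict(zip(COLUMNS, values))


# Hypothetical input line, made up for this sketch.
row = parse_line("2017-08-04 09:59^^^alex^^^30^^^42")
print(row)
```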

 

 

 

List the indices to see how many there are:

curl 'localhost:9200/_cat/indices?v'

 

 

Count the documents:

curl -XGET 'http://localhost:9200/_count?pretty' -H 'Content-Type: application/json' -d '
{
    "query": {
        "match_all": {}
    }
}
'

 

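Deleting data

To delete data, just change the HTTP request method to DELETE:

# curl -XDELETE http://127.0.0.1:9200/logstash-2015.06.21/testlog/AU4ew3h2nBE6n0qcyVJK

Deletion is not limited to single documents: you can also delete an entire index, and even use wildcards (quote the URL so the shell does not expand the *).

# curl -XDELETE 'http://127.0.0.1:9200/logstash-2015.06.0*'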

 

 

 

 

Query:

 curl -XGET '192.168.1.118:9200/_count?pretty' 

 

Add a document to an index:

curl -XPUT 'localhost:9200/megacorp/employee/1?pretty' -H 'Content-Type: application/json' -d'
{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        25,
    "about" :      "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}
'

megacorp is the index name, employee is the type name, and 1 is the ID of this particular employee.

 

Add two more:

curl -XPUT 'localhost:9200/megacorp/employee/2?pretty' -H 'Content-Type: application/json' -d'
{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}
'

curl -XPUT 'localhost:9200/megacorp/employee/3?pretty' -H 'Content-Type: application/json' -d'
{
    "first_name" :  "Douglas",
    "last_name" :   "Fir",
    "age" :         35,
    "about":        "I like to build cabinets",
    "interests":  [ "forestry" ]
}
'

 

Retrieval:

curl -XGET 'localhost:9200/megacorp/employee/1?pretty'

 

Changing the HTTP verb from PUT to GET retrieves the document. In the same way, DELETE deletes the document and HEAD checks whether it exists. To update an existing document, just PUT it again.
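
The same URL with different verbs, sketched with Python's standard urllib. The requests are only built, not sent, so this runs without a cluster; the helper name build and the dict of calls are made up for illustration.

```python
import json
import urllib.request

BASE = "http://localhost:9200/megacorp/employee/1"  # same URL as the curl calls above


def build(method, body=None):
    """Build (but do not send) a request for the given HTTP verb."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE, data=data, headers=headers, method=method)


calls = {
    "create_or_replace": build("PUT", {"first_name": "John", "last_name": "Smith"}),
    "retrieve": build("GET"),
    "exists": build("HEAD"),
    "delete": build("DELETE"),
}
for name, req in calls.items():
    print(name, req.get_method(), req.full_url)

# To actually send one against a running node:
#   with urllib.request.urlopen(calls["retrieve"]) as resp:
#       print(resp.read().decode())
```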

 

 

Search Lite:

Search for all employees:

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty'

 

 

Search for employees whose last_name is Smith:

curl -XGET 'localhost:9200/megacorp/employee/_search?q=last_name:Smith&pretty'

 

Search with the query DSL:

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "match" : {
      "last_name" : "Smith"
    }
  }
}
'

 

A more complex search:

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "bool": {
      "must": {
        "match" : {
          "last_name" : "smith"
        }
      },
      "filter": {
        "range" : {
          "age" : { "gt" : 30 }
        }
      }
    }
  }
}
'
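
What this query selects, modelled in plain Python over the three employees indexed earlier. This is only a sketch of the selection logic (must plus filter), not of how Elasticsearch scores or executes the query.

```python
# The three employees indexed earlier in this post.
EMPLOYEES = [
    {"first_name": "John", "last_name": "Smith", "age": 25},
    {"first_name": "Jane", "last_name": "Smith", "age": 32},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35},
]


def search(docs, last_name, min_age_exclusive):
    """must: last_name matches (case-insensitively); filter: age > threshold."""
    return [
        d for d in docs
        if d["last_name"].lower() == last_name.lower()  # "match" on an analyzed field
        and d["age"] > min_age_exclusive                # "range": {"gt": ...}
    ]


hits = search(EMPLOYEES, "smith", 30)
print([h["first_name"] for h in hits])
```

Only Jane is left: John is a Smith but too young, and Douglas is old enough but not a Smith.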

 

 

Full-text search:

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "match" : {
      "about" : "rock climbing"
    }
  }
}
'

This returns two documents. Jane's record is returned too, because her about field contains the keyword rock.

 

 

Phrase search:

An exact phrase query: it matches only employee records that contain both "rock" and "climbing", adjacent to each other as the phrase "rock climbing".

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "match_phrase" : {
      "about" : "rock climbing"
    }
  }
}
'
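
The difference between match and match_phrase, sketched in Python on the three about fields. The tokenizer here is a crude stand-in for a real analyzer (lowercase, split on whitespace), and relevance scoring is ignored entirely.

```python
ABOUT = {
    "John": "I love to go rock climbing",
    "Jane": "I like to collect rock albums",
    "Douglas": "I like to build cabinets",
}


def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(text, query):
    """match: at least one query term appears (the default OR behaviour)."""
    return any(term in tokens(text) for term in tokens(query))


def match_phrase(text, query):
    """match_phrase: all query terms appear next to each other, in order."""
    words, phrase = tokens(text), tokens(query)
    return any(
        words[i:i + len(phrase)] == phrase
        for i in range(len(words) - len(phrase) + 1)
    )


match_hits = [name for name, about in ABOUT.items() if match(about, "rock climbing")]
phrase_hits = [name for name, about in ABOUT.items() if match_phrase(about, "rock climbing")]
print(match_hits, phrase_hits)
```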

 

Highlighted search: matching fragments come back with the hits wrapped in <em> tags, e.g. "I love to go <em>rock</em> <em>climbing</em>"

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "match_phrase" : {
      "about" : "rock climbing"
    }
  },
  "highlight": {
    "fields" : {
      "about" : {}
    }
  }
}
'
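
A small Python sketch of what the highlighted fragment looks like. It simply wraps whole-word occurrences of the query terms in tags; the real highlighter works on analyzed terms and is considerably smarter, so treat this as an illustration only.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every whole-word occurrence of a query term in pre/post tags."""
    terms = [re.escape(t) for t in query.split()]
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


fragment = highlight("I love to go rock climbing", "rock climbing")
print(fragment)
```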

 

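Analytics:

If you run into an error:

After some searching: since 5.x, operations such as sorting and aggregations on text fields rely on a separate in-memory data structure (fielddata), which is disabled by default and has to be enabled explicitly. The official explanation is in the fielddata documentation.

In short, run the following before aggregating: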

PUT megacorp/_mapping/employee/
{
  "properties": {
    "interests": { 
      "type":     "text",
      "fielddata": true
    }
  }
}

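The goal is to let managers run analytics over the employee directory. Elasticsearch has a feature called aggregations, which lets us generate sophisticated analytics over the data. It is similar to GROUP BY in SQL, but much more powerful.

For example, find the most popular interests among the employees:

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
'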

 

Ignore the syntax for now and just look at the results:

{
   ...
   "hits": { ... },
   "aggregations": {
      "all_interests": {
         "buckets": [
            {
               "key":       "music",
               "doc_count": 2
            },
            {
               "key":       "forestry",
               "doc_count": 1
            },
            {
               "key":       "sports",
               "doc_count": 1
            }
         ]
      }
   }
}
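
The buckets above are essentially a GROUP BY with a count. A Python sketch over the same three employees reproduces them; the tie-breaking by key is a choice made here to keep the output deterministic.

```python
from collections import Counter

EMPLOYEES = [
    {"first_name": "John", "last_name": "Smith", "interests": ["sports", "music"]},
    {"first_name": "Jane", "last_name": "Smith", "interests": ["music"]},
    {"first_name": "Douglas", "last_name": "Fir", "interests": ["forestry"]},
]


def terms_agg(docs, field):
    """Count documents per distinct value, largest bucket first."""
    counts = Counter(value for doc in docs for value in set(doc[field]))
    # Ties are ordered by key here so the output is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


buckets = terms_agg(EMPLOYEES, "interests")
print(buckets)

# Restricting the input first corresponds to adding a query to the aggregation.
smiths = [d for d in EMPLOYEES if d["last_name"] == "Smith"]
smith_buckets = terms_agg(smiths, "interests")
print(smith_buckets)
```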

To find the most popular interests among employees named Smith, just add the appropriate query to combine the two:

curl -XGET 'localhost:9200/megacorp/employee/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "last_name": "smith"
    }
  },
  "aggs": {
    "all_interests": {
      "terms": {
        "field": "interests"
      }
    }
  }
}
'

 

Check cluster health:

curl 'localhost:9200/_cat/health?v'

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1503907205 16:00:05  elasticsearch yellow          1         1     10  10    0    0       10             0                  -                 50.0%

 Green means everything is good (cluster is fully functional), yellow means all data is available but some replicas are not yet allocated (cluster is fully functional), and red means some data is not available for whatever reason. Note that even if a cluster is red, it still is partially functional (i.e. it will continue to serve search requests from the available shards) but you will likely need to fix it ASAP since you have missing data.
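
The three colours can be summarised as a small decision rule. This Python sketch is a simplification (it ignores initializing and relocating shards), and the function names are made up for illustration.

```python
def cluster_status(primaries_unassigned, replicas_unassigned):
    """Map unassigned shard counts to the green/yellow/red status."""
    if primaries_unassigned > 0:
        return "red"      # some data is not available
    if replicas_unassigned > 0:
        return "yellow"   # all data available, some replicas not allocated
    return "green"        # everything allocated


def active_percent(active, unassigned):
    """Share of shards that are active, as in active_shards_percent."""
    return 100.0 * active / (active + unassigned)


# Single-node case from the output above: 10 primaries active, 10 replicas unassigned.
status = cluster_status(primaries_unassigned=0, replicas_unassigned=10)
print(status, active_percent(10, 10))
```

On a single node the replicas have nowhere to go, which is why a fresh one-node install shows yellow.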

 

curl -XGET 'localhost:9200/_cat/nodes?v&pretty'

ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name

127.0.0.1            6          77   0    0.00    0.00     0.00 mdi       *      P-Qyj5L

 

 

List all indices:

curl -XGET 'localhost:9200/_cat/indices?v&pretty'

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   filebeat-2017.08.24 XPda9kB8QNuiNZyaRyhEkA   5   1        100            0     89.9kb         89.9kb
yellow open   .kibana             7NeS89YiRsWagmLlTNslOQ   5   1         58           17      188kb          188kb

 


Create an index, then list the indices:

curl -XPUT 'localhost:9200/customer?pretty'

curl -XGET 'localhost:9200/_cat/indices?v&pretty'

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   customer            PWngMKTrSgyAUXaC3zocyg   5   1          0            0       324b           324b

We now have 1 index named customer, with 5 primary shards and 1 replica (the defaults), and it contains 0 documents.

 

Create a document in the index above, with type external and ID 1:

curl -XPUT 'localhost:9200/customer/external/1?pretty' -H 'Content-Type: application/json' -d'
{
  "name": "John Doe"
}
'

 

Retrieve the document that was just indexed:

curl -XGET 'localhost:9200/customer/external/1?pretty'

 

Update the document, changing the name:

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -H 'Content-Type: application/json' -d'
{
  "doc": { "name": "Jane Doe" }
}
'

 

Update the document, adding an age field:

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -H 'Content-Type: application/json' -d'
{
  "doc": { "name": "Jane Doe", "age": 20 }
}
'

 

Update the document, adding 5 to the age:

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -H 'Content-Type: application/json' -d'
{
  "script" : "ctx._source.age += 5"
}
'

Check the document after the updates above:

curl -XGET 'localhost:9200/customer/external/1?pretty'

{
  "_index" : "customer",
  "_type" : "external",
  "_id" : "1",
  "_version" : 4,
  "found" : true,
  "_source" : {
    "name" : "Jane Doe",
    "age" : 25
  }
}
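
Why the document ends up at version 4 with age 25 can be traced with a toy model in Python. The Doc class and add_five function are made up for this sketch; it only mimics the merge-and-bump-version behaviour of the three updates above.

```python
import copy


class Doc:
    """Toy document with a source dict and a version counter."""

    def __init__(self, source):
        self.source = copy.deepcopy(source)
        self.version = 1  # the initial PUT creates version 1

    def update_doc(self, partial):
        """Partial update: merge fields; bump the version only if something changed."""
        merged = {**self.source, **partial}
        if merged != self.source:
            self.source = merged
            self.version += 1

    def update_script(self, fn):
        """Scripted update: fn mutates the source in place, like ctx._source."""
        fn(self.source)
        self.version += 1


def add_five(source):
    source["age"] += 5


doc = Doc({"name": "John Doe"})
doc.update_doc({"name": "Jane Doe"})             # version 2
doc.update_doc({"name": "Jane Doe", "age": 20})  # version 3
doc.update_script(add_five)                      # version 4
print(doc.source, doc.version)
```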

 

Delete a document:

curl -XDELETE 'localhost:9200/customer/external/2?pretty'

 

Bulk indexing: the following call indexes two documents (ID 1 - John Doe and ID 2 - Jane Doe) in one bulk operation:

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

 

Bulk update and delete:

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -H 'Content-Type: application/json' -d'
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
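
The bulk body is newline-delimited JSON: one action line, optionally followed by one source line. A Python sketch for generating it (the helper name bulk_body is made up for illustration):

```python
import json


def bulk_body(actions):
    """Serialize (action, optional source) pairs as newline-delimited JSON."""
    lines = []
    for action, source in actions:
        lines.append(json.dumps(action))
        if source is not None:      # delete actions have no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = bulk_body([
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "2"}}, None),
])
print(body, end="")
```

Writing the result to a file and sending it with --data-binary keeps the newlines intact.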

 

Load 1000 test documents: https://www.elastic.co/guide/en/elasticsearch/reference/current/_exploring_your_data.html

curl -H "Content-Type: application/json" -XPOST 'localhost:9200/bank/account/_bulk?pretty&refresh' --data-binary "@accounts.json"
curl 'localhost:9200/_cat/indices?v'

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   customer            PWngMKTrSgyAUXaC3zocyg   5   1          1            0      4.5kb          4.5kb
yellow open   .kibana             7NeS89YiRsWagmLlTNslOQ   5   1         58           17      188kb          188kb
yellow open   bank                0kR5rv9SRp-aUhnCVyhH4A   5   1       1000            0    640.3kb        640.3kb

 

 

Search using the request URI:

curl -XGET 'localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty'

Return only 1 hit:

curl -XGET 'localhost:9200/bank/_search?q=*&sort=account_number:asc&size=1&pretty'

took - how long the search took, in milliseconds

_shards - how many shards were searched

hits.total - total number of documents matching the search

hits.hits - the actual search results (10 documents by default)

hits.sort - sort key for results (missing if sorting by score)

 

 

 

 

 

 

Search using the request body:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
'

 

 

 

 

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} }
}
'

 

Return only 1 record:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "size": 1
}
'

 

Similar to LIMIT in MySQL: this example does a match_all and returns documents 11 through 20:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}
'
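
from is a zero-based offset, so page arithmetic is easy to get off by one. A small Python sketch of the mapping from a page number to from/size (the function names are made up for illustration):

```python
def page_params(page, per_page=10):
    """Translate a 1-based page number into from/size (like LIMIT offset, count)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"from": (page - 1) * per_page, "size": per_page}


def search_body(page, per_page=10):
    """Request body for one page of a match_all search."""
    return {"query": {"match_all": {}}, **page_params(page, per_page)}


body = search_body(page=2)
print(body)
```

Page 2 with 10 per page gives from 10 and size 10, that is, documents 11 through 20 as in the curl call above.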

 

Sort by balance in descending order:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}
'

 

 

 

Return only the account_number and balance fields in _source:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}
'

 

 

Conditional search, similar to WHERE in MySQL:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "account_number": 20 } }
}
'

 

All accounts whose address contains mill:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill" } }
}
'

 

Accounts whose address contains mill or lane:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill lane" } }
}
'

 

Accounts whose address contains the exact phrase mill lane:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
'

 

Accounts whose address must contain both mill and lane (like AND in MySQL):

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

 

Accounts whose address contains either mill or lane (like OR in MySQL):

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

 

Accounts whose address contains neither mill nor lane:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'
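
The three bool clauses map onto AND, OR and NOT. Here is a Python sketch of that logic over a few made-up addresses (they are hypothetical, not taken from accounts.json). It models only which documents are selected, not scoring.

```python
def contains(term):
    """Build a predicate: does the address contain this word (case-insensitive)?"""
    return lambda doc: term.lower() in doc["address"].lower().split()


def bool_query(docs, must=(), should=(), must_not=()):
    """must = all true (AND), should = at least one true (OR), must_not = none true."""
    hits = []
    for doc in docs:
        if not all(p(doc) for p in must):
            continue
        # In this sketch, should is only required when there is no must clause.
        if should and not must and not any(p(doc) for p in should):
            continue
        if any(p(doc) for p in must_not):
            continue
        hits.append(doc)
    return hits


# Hypothetical addresses, made up for this sketch.
DOCS = [
    {"id": 1, "address": "12 Mill Lane"},
    {"id": 2, "address": "34 Mill Road"},
    {"id": 3, "address": "56 Oak Lane"},
    {"id": 4, "address": "78 Elm Street"},
]
mill, lane = contains("mill"), contains("lane")

both = [d["id"] for d in bool_query(DOCS, must=[mill, lane])]
either = [d["id"] for d in bool_query(DOCS, should=[mill, lane])]
neither = [d["id"] for d in bool_query(DOCS, must_not=[mill, lane])]
print(both, either, neither)
```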

 

A combined example of the above: accounts of people aged 40 who do not live in ID (Idaho):

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
'

 

Filtering with a range query (range queries work on numbers and dates):

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
'

Combined example: age 31, balance between 20000 and 30000:

curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": { "match": {"age":31} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
'
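
When the age and balance bounds come from variables, it is less error-prone to build the body in code than to edit the JSON by hand. A Python sketch, with a function name made up for illustration, that produces the same body as the curl call above:

```python
import json


def age_and_balance_query(age, min_balance, max_balance):
    """bool query: match on age, plus a range on balance as a filter."""
    if min_balance > max_balance:
        raise ValueError("min_balance must not exceed max_balance")
    return {
        "query": {
            "bool": {
                "must": {"match": {"age": age}},
                "filter": {
                    "range": {"balance": {"gte": min_balance, "lte": max_balance}}
                },
            }
        }
    }


body = age_and_balance_query(31, 20000, 30000)
print(json.dumps(body, indent=2))
```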

 

posted @ 2017-08-04 09:59  alexhe