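I have studied Elasticsearch (ES) before, but every pass leaves a different impression, so today I am going through it again from the top.

  Gitee: https://gitee.com/juncaoit/fast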

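I: ELK overview

1. What it stands for

  Elasticsearch, Logstash, Kibana

2. Components

  Elasticsearch: a full-text search engine written in Java.

  Logstash: a data collection engine with real-time pipelining, used to gather data.

  Kibana: analysis and visualization. It can search and interact with the data in ES indices and build tables and charts along many dimensions.

3. Why use them

  Elasticsearch:

  The data volume is large.

  Searches need to be fast, accurate, and usable across multiple dimensions.

  Logstash:

  It supports many data sources: databases, logs, and other scattered data can all be collected.

  Kibana:

  Analysis and presentation that bring out the value of the data.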

 

II: Single-node ES installation

1. Upload

  Planned directory layout:

   Upload:

  

 

 

2. Extract

tar -zxvf elasticsearch-7.6.1-linux-x86_64.tar.gz -C ../software/

 

  

 

 

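3. Configure the JDK

  Remove any JDKs that are not needed:

  Configure:

export JAVA_HOME=/opt/software/elasticsearch-7.6.1/jdk
export PATH=$JAVA_HOME/bin:$PATH
# Optional: dt.jar and tools.jar no longer exist since JDK 9, so this line has no effect with the bundled JDK 13
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

  The ES directory ships with its own JDK, version 13.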

  

 

 

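4. Add a user

  ES refuses to start as root, so create a dedicated user and give it ownership of the installation directory: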

groupadd tedu
useradd tedu -g tedu
chown -R tedu:tedu elasticsearch-7.6.1/

  

 

 

5. Start

  su tedu

  ./bin/elasticsearch

 

6. Test

  curl "localhost:9200"

  

 

 

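III: ES cluster installation

1. Install ES on three machines

2. Edit the configuration file config/elasticsearch.yml

  Node es01 (the other nodes differ only in node.name):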

cluster.name: es-cluster
node.name: es01
path.data: /opt/software/elasticsearch-7.6.1/data
path.logs: /opt/software/elasticsearch-7.6.1/logs
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300
discovery.seed_hosts: ["192.168.19.132", "192.168.19.133", "192.168.19.134"]
cluster.initial_master_nodes: ["192.168.19.132"]
http.cors.enabled: true
http.cors.allow-origin: "*"

 

3. Fixing startup errors

[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

  Fix for [1]:

vi /etc/security/limits.conf

Add:
*                soft    nofile          65536
*                hard    nofile          65536

Edit the file as root. After logging in again, check the hard and soft limits with:
ulimit -Hn

ulimit -Sn

  Fix for [2]:

vi /etc/sysctl.conf

Add:
vm.max_map_count=262144

  Then reboot so that both changes take effect:

reboot

 

4. Test

  

 

IV: The head plugin

1. Install Node.js

  The head plugin runs on Node.js.

wget https://nodejs.org/dist/v15.0.0/node-v15.0.0-linux-x64.tar.gz

 

2. Extract

tar -zxvf node-v15.0.0-linux-x64.tar.gz -C ../software/

 

3. Configure environment variables

export NODE_HOME=/opt/software/node-v15.0.0-linux-x64
export PATH=$NODE_HOME/bin:$PATH

  Verify:

  

 

 

 

4. Download the head plugin

  Git repository: https://github.com/mobz/elasticsearch-head

yum -y install git
git clone https://github.com/mobz/elasticsearch-head.git

 

  The git clone never completed for me, so I downloaded the zip archive instead.

   Extract:

unzip elasticsearch-head-master.zip -d ../software/

 

5. Install with npm

  Run the npm installation from the root of the head plugin directory:

npm install -g grunt
yum install bzip2
npm install

 

6. Configure head

  In Gruntfile.js, around line 97, add a hostname setting:

   Remember to put the value in quotes.

7. Start from the head root directory

grunt server

 

8. Access

  192.168.19.132:9100

  

 

  Start two ES nodes and observe the cluster in head:

  

 

 

 

V: Kibana

1. Extract

  The Kibana version must match the ES version.

tar -zxvf kibana-7.6.1-linux-x86_64.tar.gz -C ../software/

 

2. Configure

  config/kibana.yml

server.host: "192.168.19.132"
elasticsearch.hosts: ["http://192.168.19.132:9200"]

 

3. Start

  bin/kibana --allow-root

 

4. Access

http://192.168.19.132:5601

  

 

 

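VI: Key concepts

1. Shards

  A single machine has limited storage, so ES can split the data of one index into multiple shards.

2. REST API

curl format: curl -H <request header> -d <request body> -X POST <endpoint URL>

  For example:

  Index a document and pretty-print the response (the URL is quoted so the shell does not interpret the ?):

curl -X PUT "http://localhost:9200/person/_doc/1?pretty" -H "Content-Type: application/json" -d '{"name":"laoshi"}'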
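
  The same request can be built from Java without any ES client library. The following is a minimal sketch, assuming Java 11 or later for java.net.http; the class name and the build helper are hypothetical names introduced only for illustration. It constructs the request but does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the same request as the curl command above.
final class PutDocumentRequestSketch {

    // host, index and id are illustrative parameters, not part of any ES client API.
    static HttpRequest build(String host, String index, String id, String json) {
        URI uri = URI.create("http://" + host + "/" + index + "/_doc/" + id + "?pretty");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")      // curl -H
                .PUT(HttpRequest.BodyPublishers.ofString(json))   // curl -X PUT -d
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("localhost:9200", "person", "1", "{\"name\":\"laoshi\"}");
        System.out.println(request.method() + " " + request.uri());
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```

  Keeping construction separate from sending makes it easy to inspect the method, URL and headers before anything touches the cluster.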

  Result:

  

  In head:

  

 

 

VII: Index management

1. Create an index

  A PUT request creates the index.

# Create an index
PUT /index01

 

2. Insert a document

  Also a PUT request.

# Add a document to the index
PUT /index01/_doc/1
{
  "name":"tom"
}

 

3. Get a document

# Get the document
GET /index01/_doc/1

 

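4. Update a document

  Sending PUT again with the same id replaces the document.

# Update
PUT /index01/_doc/1
{
  "name":"tom2"
}

  Compared with the first response, _version has increased.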

  

 

 

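5. Delete a document

# Delete the document
DELETE /index01/_doc/1

 

6. Delete an index

# Delete the index
DELETE /index01

 

7. Bulk indexing

  Sending many operations in one request reduces network round trips.

# Bulk indexing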
PUT /index05/_bulk
{"index":{"_id":"1"}}
{"id":"1", "name":"雅典娜", "job":"html", "age":"38", "salary":20000, "gender":"female", "like":"牛奶"}
{"index":{"_id":"2"}}
{"id":"2", "name":"马云云", "job":"html", "age":"22", "salary":35000, "gender":"male", "like":"香蕉"}
{"index":{"_id":"3"}}
{"id":"1", "name":"强东", "job":"go", "age":"24", "salary":10000, "gender":"male", "like":"苹果"}
{"index":{"_id":"4"}}
{"id":"1", "name":"小马", "job":"python", "age":"18", "salary":50000, "gender":"male", "like":"李子"}
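
  The bulk body is newline-delimited JSON: an action line followed by a source line for each document, and the last line must also end with a newline. A minimal sketch of assembling such a body in plain Java follows; the class and method names are hypothetical, and the documents are assumed to be already serialized to JSON.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BulkBodySketch {

    // Builds an NDJSON body: one action line plus one source line per document.
    // docsById maps a document id to its already-serialized JSON source.
    static String build(Map<String, String> docsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> doc : docsById.entrySet()) {
            body.append("{\"index\":{\"_id\":\"").append(doc.getKey()).append("\"}}\n");
            body.append(doc.getValue()).append('\n'); // the final newline is required too
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"name\":\"雅典娜\",\"job\":\"html\"}");
        docs.put("2", "{\"name\":\"马云云\",\"job\":\"html\"}");
        System.out.print(build(docs));
    }
}
```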

 

VIII: Search

1. match_all

  The query type goes under query.

# Search
GET /index05/_search
{
  "query": {
    "match_all": {}
  }
}

 

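2. term

  A term is the basic unit produced by analysis (tokenization).

  With the default analyzer, Chinese text is split into single characters; for example, 李老师 becomes 李, 老, 师.

  A term query returns only documents that contain the exact term supplied; documents without it are not returned.

#term query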
GET /index05/_search
{
  "query": {
    "term": {
      "name": {
        "value": "马"
      }
    }
  }
}

  Result (as expected):

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.7549127,
    "hits" : [
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.7549127,
        "_source" : {
          "id" : "1",
          "name" : "小马",
          "job" : "python",
          "age" : "18",
          "salary" : 50000,
          "gender" : "male",
          "like" : "李子"
        }
      },
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.6407243,
        "_source" : {
          "id" : "2",
          "name" : "马云云",
          "job" : "html",
          "age" : "22",
          "salary" : 35000,
          "gender" : "male",
          "like" : "香蕉"
        }
      }
    ]
  }
}

 

  One more example, this time with the term 马云:

#term query
GET /index05/_search
{
  "query": {
    "term": {
      "name": {
        "value": "马云"
      }
    }
  }
}

  Result: empty, because the index holds the single-character terms 马 and 云 but no term 马云:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

 

 

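3. boost

  With boost, the score of each matching document is multiplied by the boost value before it is returned.

  It is added alongside value.

  It is mainly useful in compound queries, where different clauses are given different weights.

#boost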
GET /index05/_search
{
  "query": {
    "term": {
      "name": {
        "value": "马",
        "boost": 2
      }
    }
  }
}

  Result (each score is twice that of the unboosted term query above):

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.5098253,
    "hits" : [
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.5098253,
        "_source" : {
          "id" : "1",
          "name" : "小马",
          "job" : "python",
          "age" : "18",
          "salary" : 50000,
          "gender" : "male",
          "like" : "李子"
        }
      },
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.2814486,
        "_source" : {
          "id" : "2",
          "name" : "马云云",
          "job" : "html",
          "age" : "22",
          "salary" : 35000,
          "gender" : "male",
          "like" : "香蕉"
        }
      }
    ]
  }
}

 

4. range

  Returns the documents whose field value lies within the given range.

#range
GET /index05/_search
{
  "query": {
    "range": {
      "salary": {
        "gte": 10000,
        "lte": 30000
      }
    }
  }
}

  

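5. exists

  Returns documents that contain an indexed value for the field.

  Reasons a field may have no indexed value in a document:

    The field value in the source JSON is null or [].

    The field has "index": false in its mapping, so it is not written to the index.

    The field has ignore_above set and the value exceeds that length, so it is not indexed.

# exists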
GET /index05/_search
{
  "query": {
    "exists": {
      "field": "name"
    }
  }
}

 

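6. match

  The query text is analyzed first.

  This is the difference from term: the search runs on the tokens produced by analysis.

  By default the tokens are combined with OR.

#match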
GET /index05/_search
{
  "query": {
    "match": {
      "name": "马云"
    }
  }
}

  Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.2080264,
    "hits" : [
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 2.2080264,
        "_source" : {
          "id" : "2",
          "name" : "马云云",
          "job" : "html",
          "age" : "22",
          "salary" : 35000,
          "gender" : "male",
          "like" : "香蕉"
        }
      },
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.7549127,
        "_source" : {
          "id" : "1",
          "name" : "小马",
          "job" : "python",
          "age" : "18",
          "salary" : 50000,
          "gender" : "male",
          "like" : "李子"
        }
      }
    ]
  }
}

  Change the operator to AND:

  Only documents containing every token match, so the results are more precise.

GET /index05/_search
{
  "query": {
    "match": {
      "name":{
        "query": "马云",
        "operator": "and"
      }
    }
  }
}

  Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 2.2080264,
    "hits" : [
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 2.2080264,
        "_source" : {
          "id" : "2",
          "name" : "马云云",
          "job" : "html",
          "age" : "22",
          "salary" : 35000,
          "gender" : "male",
          "like" : "香蕉"
        }
      }
    ]
  }
}
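
  The difference between term and match in the results above can be reproduced with a toy inverted index. This is a minimal sketch, not how ES is implemented: it assumes that analysis simply emits one token per character (as the default analyzer does for the Chinese names here), it ignores scoring, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TermVsMatchSketch {
    // term -> ids of the documents whose name field contains that term
    private final Map<String, Set<String>> index = new HashMap<>();

    // Stand-in for analysis of Chinese text: one token per character.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    void add(String id, String name) {
        for (String token : analyze(name)) {
            index.computeIfAbsent(token, t -> new TreeSet<>()).add(id);
        }
    }

    // term: the input is looked up as-is, without analysis.
    Set<String> term(String value) {
        return new TreeSet<>(index.getOrDefault(value, new TreeSet<>()));
    }

    // match: the input is analyzed first, then the per-token hits are combined with OR or AND.
    Set<String> match(String query, boolean and) {
        Set<String> result = null;
        for (String token : analyze(query)) {
            Set<String> hits = term(token);
            if (result == null) {
                result = hits;
            } else if (and) {
                result.retainAll(hits);
            } else {
                result.addAll(hits);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // The four name values from index05.
    static TermVsMatchSketch sample() {
        TermVsMatchSketch s = new TermVsMatchSketch();
        s.add("1", "雅典娜");
        s.add("2", "马云云");
        s.add("3", "强东");
        s.add("4", "小马");
        return s;
    }

    public static void main(String[] args) {
        TermVsMatchSketch s = sample();
        System.out.println(s.term("马"));            // ids 2 and 4
        System.out.println(s.term("马云"));          // empty: no such single term
        System.out.println(s.match("马云", false));  // OR: ids 2 and 4
        System.out.println(s.match("马云", true));   // AND: id 2 only
    }
}
```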

 

7. bool

  Combines multiple sub-queries.

GET /index05/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "马云"
          }
        },
        {
          "term": {
            "job": {
              "value": "python"
            }
          }
        }
      ]
    }
  }
}

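  How must and must_not combine logically:

  

 

  should: results may or may not satisfy this clause. When should is used together with must, its only effect is on the final relevance score.

  filter: results must satisfy this clause, but the clause itself is not scored, so the scores from the other clauses do not change because of the filter.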
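
  The four clause types can be pictured as predicates over documents. The sketch below is only an illustration under simplifying assumptions: scoring is reduced to "documents that also satisfy should are ranked first", the data mirrors index05, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class BoolQuerySketch {

    static final class Doc {
        final String id;
        final String job;
        final String gender;
        final int salary;

        Doc(String id, String job, String gender, int salary) {
            this.id = id; this.job = job; this.gender = gender; this.salary = salary;
        }
    }

    // must and filter both have to match, must_not excludes,
    // should only influences the order because a must clause is present.
    static List<String> search(List<Doc> docs, Predicate<Doc> must, Predicate<Doc> mustNot,
                               Predicate<Doc> filter, Predicate<Doc> should) {
        List<String> withShould = new ArrayList<>();
        List<String> withoutShould = new ArrayList<>();
        for (Doc d : docs) {
            boolean included = must.test(d) && filter.test(d) && !mustNot.test(d);
            if (!included) {
                continue;
            }
            if (should.test(d)) {
                withShould.add(d.id);
            } else {
                withoutShould.add(d.id);
            }
        }
        withShould.addAll(withoutShould);
        return withShould;
    }

    static List<String> demo() {
        List<Doc> docs = Arrays.asList(
                new Doc("1", "html", "female", 20000),
                new Doc("2", "html", "male", 35000),
                new Doc("3", "go", "male", 10000),
                new Doc("4", "python", "male", 50000));
        return search(docs,
                d -> d.gender.equals("male"),     // must
                d -> d.job.equals("go"),          // must_not
                d -> d.salary >= 20000,           // filter
                d -> d.job.equals("python"));     // should
    }

    public static void main(String[] args) {
        System.out.println(demo()); // document 4 first, then document 2
    }
}
```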

 

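IX: Index mappings

1. mapping

  A mapping determines how documents are stored and indexed and defines the field types.

  There are static (explicit) mappings and dynamic mappings.

 

2. Dynamic mapping

## Dynamic mapping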
PUT /index01/_doc/1
{
    "name":"小马",
    "job":"python",
    "age":"18",
    "salary":50000,
    "gender":"male",
    "like":"李子",
    "address":"北京市大兴区",
    "sorted":false,
    "emplotedTime":"2020-01-12",
    "location":
    {
      "lat":"41.43",
      "lon":"67.98"
    },
    "ip":"192.168.5.102"
}

GET /index01/_mapping

  Result:

{
  "index01" : {
    "mappings" : {
      "properties" : {
        "address" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "age" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "emplotedTime" : {
          "type" : "date"
        },
        "gender" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "ip" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "job" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "like" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "location" : {
          "properties" : {
            "lat" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "lon" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "salary" : {
          "type" : "long"
        },
        "sorted" : {
          "type" : "boolean"
        }
      }
    }
  }
}

 

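3. Structure

  String type:

  

 

   fields is an optional property that adds a sub-field to the current field; here it adds a keyword sub-field. The field therefore behaves as text and also as keyword.

  

  The first query below can match, because the text field is analyzed into tokens; the second cannot match the same value, because the keyword sub-field stores the whole string as a single term.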

GET /index01/_search
{
  "query": {
    "term": {
      "address": {
        "value": ""
      }
    }
  }
}

GET /index01/_search
{
  "query": {
    "term": {
      "address.keyword": {
        "value": ""
      }
    }
  }
}

 

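4. Integer type

  Defaults to long.

 

5. Floating-point type

  Defaults to float.

 

6. Date

  A string is mapped to date when it matches one of the recognized date formats.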
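
  The type choices seen in the mapping output above can be summarized as a small decision function. This is a rough sketch under stated assumptions, not the ES implementation: date detection is reduced to the single yyyy-MM-dd pattern, arrays and numeric detection are ignored, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

final class DynamicMappingSketch {

    // Returns the field type a dynamic mapping would plausibly pick for a parsed JSON value.
    static String infer(Object value) {
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof Integer || value instanceof Long) {
            return "long";
        }
        if (value instanceof Float || value instanceof Double) {
            return "float";
        }
        if (value instanceof String) {
            try {
                LocalDate.parse((String) value); // assumption: only yyyy-MM-dd is checked here
                return "date";
            } catch (DateTimeParseException e) {
                return "text"; // in ES this comes with a keyword sub-field
            }
        }
        return "object";
    }

    public static void main(String[] args) {
        System.out.println(infer("2020-01-12")); // emplotedTime
        System.out.println(infer("18"));         // age was sent as a string
        System.out.println(infer(50000));        // salary
        System.out.println(infer(false));        // sorted
    }
}
```

  This also explains why age, lat, lon and ip ended up as text in the example: they were sent as JSON strings, so only date detection applies to them.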

 

7. Object

  

 

 

8. Add a static mapping

# Add a mapping
PUT /index02
{
  "mappings": {
    "properties": {
      "email":{
        "type": "keyword"
      }
    }
  }
}

GET /index02/_mapping

PUT /index02/_doc/1
{
  "email":"1354488@qq.com",
  "name":"tom"
}

GET /index02/_doc/1

GET /index02/_mapping

  Result:

  Fields that are not in the static mapping are generated by dynamic mapping.

{
  "index02" : {
    "mappings" : {
      "properties" : {
        "email" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

 

9. Add a static mapping after the index has been created

# Add a mapping afterwards
PUT /index04
PUT /index04/_mapping
{
  "properties":{
    "name":{
      "type":"text"
    }
  }
}

GET /index04/_mapping

 

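X: Analyzers and hot-word configuration

1. Analysis

  Analysis consists mainly of tokenization and normalization.

  Tokenization: splits text into small pieces called tokens.

  Normalization: converts tokens to a standard form (for example, lowercasing) so that terms that are not written identically can still match and relevance queries can be used.

 

2. Analyzers

  The built-in default is the standard analyzer.

# Analyzer test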
POST /_analyze
{
  "text": ["王者荣耀"],
  "analyzer": "standard"
}

  Result:

{
  "tokens" : [
    {
      "token" : "王",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "者",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "荣",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    },
    {
      "token" : "耀",
      "start_offset" : 3,
      "end_offset" : 4,
      "type" : "<IDEOGRAPHIC>",
      "position" : 3
    }
  ]
}

 

3. IK analyzer

  Upload:

  

 

  Extract into the plugins/ik directory of ES:

unzip elasticsearch-analysis-ik-7.6.1.zip -d ../software/elasticsearch-7.6.1/plugins/ik

 

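  Restart ES.

    Nodes where IK was not installed do not need a restart.

    Check the startup log to confirm the plugin was loaded: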

[es01] try load config from /opt/software/elasticsearch-7.6.1/plugins/ik/config/IKAnalyzer.cfg.xml

 

  Verify:

#ik
POST /_analyze
{
  "text": ["疑是银河落九天"],
  "analyzer": "ik_max_word"
}

  Result:

{
  "tokens" : [
    {
      "token" : "疑是银河落九天",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "疑是",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "银河",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "落九天",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "九天",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "九",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "TYPE_CNUM",
      "position" : 5
    },
    {
      "token" : "天",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "COUNT",
      "position" : 6
    }
  ]
}

 

4. Improving tokenization

#ik
POST /_analyze
{
  "text": ["王者荣耀"],
  "analyzer": "ik_max_word"
}

  Result: IK does not recognize 王者荣耀 as a single word:

{
  "tokens" : [
    {
      "token" : "王者",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "荣耀",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    }
  ]
}

 

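5. Local IK dictionary configuration

  The extracted IK plugin directory has an XML configuration file under plugins/ik/config (IKAnalyzer.cfg.xml) that specifies which dictionaries to use.

  

 

  

  Without hot reloading (ES must be restarted for the change to load):

cd /opt/software/elasticsearch-7.6.1/plugins/ik/config
vi my_main.dic

  Add: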

  

 

  

 

 

  Test result:

{
  "tokens" : [
    {
      "token" : "王者荣耀",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "王者",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "荣耀",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}
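
  Why a dictionary entry changes the token list can be illustrated with a naive dictionary matcher. This is a deliberately simplified sketch and not IK's actual algorithm: it just emits every dictionary word found in the text, ordered by start offset with longer words first. The dictionary contents and all names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DictionarySegmenterSketch {

    // Emits every dictionary word found in the text, by start offset, longer words first.
    static List<String> segment(String text, Set<String> dictionary) {
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int end = text.length(); end > start; end--) {
                String candidate = text.substring(start, end);
                if (dictionary.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("王者", "荣耀"));
        System.out.println(segment("王者荣耀", dictionary)); // two tokens
        dictionary.add("王者荣耀"); // the kind of entry added to my_main.dic
        System.out.println(segment("王者荣耀", dictionary)); // the full word now comes first
    }
}
```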

 

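6. Remote (hot-reload) IK dictionary configuration

  Download and install Tomcat:

tar -zxvf apache-tomcat-9.0.64.tar.gz -C ../software/

 

  Then go to the ROOT webapp directory and create the dictionary file:

cd /opt/software/apache-tomcat-9.0.64/webapps/ROOT
vi hot.dic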

  

 

 

  Start Tomcat:

bin/startup.sh

  Access:

  

 

  Configure the remote dictionary in the IK configuration file, then start ES:

  

 

 

  Verification result:

{
  "tokens" : [
    {
      "token" : "王者荣耀",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "王者荣",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "王者",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "荣耀",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}

 

XI: Calling ES from Java

1.pom

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.6.2</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>7.6.2</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.6.2</version>
        </dependency>

 

2. Create an index

    /**
     * Create an index
     */
    @Override
    public CreateIndexResponse createIndex(String index) throws IOException {
        CreateIndexRequest createRequest = new CreateIndexRequest(index);
        createRequest.settings(Settings.builder()
                .put("number_of_shards", "3")
                .put("number_of_replicas", "2"));
        CreateIndexResponse createIndexResponse = highLevelClient.indices().create(createRequest, RequestOptions.DEFAULT);
        return createIndexResponse;
    }

 

3. Delete an index

    /**
     * Delete an index
     */
    @Override
    public AcknowledgedResponse deleteIndex(String index) throws IOException {
        DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest(index);
        return highLevelClient.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT);
    }

 

4. Index a document

    /**
     * Index (add) a document
     */
    @Override
    public boolean add(EsDto esDto, String index, String id){
        // Build and execute the index request
        IndexRequest indexRequest = new IndexRequest(index).id(id).source(esDto.getJsonStr(), XContentType.JSON);
        try {
            IndexResponse response = highLevelClient.index(indexRequest, RequestOptions.DEFAULT);
            log.info("Index response: {}", JSONObject.toJSON(response));
            return true;
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Report failure instead of returning true unconditionally
        return false;
    }

 

5. Get a document

    /**
     * Get a document
     */
    @Override
    public Map get(String index, String id){
        GetRequest getRequest = new GetRequest(index, id);
        try {
            GetResponse response = highLevelClient.get(getRequest, RequestOptions.DEFAULT);
            return response.getSource();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return Maps.newHashMap();
    }

 

6. Check whether a document exists

    /**
     * Check whether a document exists
     */
    @Override
    public Boolean exist(String index, String id){
        GetRequest getRequest = new GetRequest(index, id);
        try {
            boolean exists = highLevelClient.exists(getRequest, RequestOptions.DEFAULT);
            return exists;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return false;
    }

 

7. Delete a document

    /**
     * Delete a document
     */
    @Override
    public boolean delete(String index, String id){
        DeleteRequest deleteRequest = new DeleteRequest(index, id);
        try {
            DeleteResponse response = highLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
            log.info("Delete response: {}", JSONObject.toJSON(response));
            return true;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return false;
    }

 

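8. Update a document

  The index must already exist.

    /**
     * Update a document
     */
    @Override
    public boolean update(EsDto esDto, String index, String id){
        // Pass the content type explicitly; a lone String would be taken as the doc(Object...) key/value form
        UpdateRequest updateRequest = new UpdateRequest(index, id).doc(esDto.getJsonStr(), XContentType.JSON);
        try {
            UpdateResponse response = highLevelClient.update(updateRequest, RequestOptions.DEFAULT);
            log.info("Update response: {}", JSONObject.toJSON(response));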
            return true;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return false;
    }

 

 9. Bulk indexing

  The index does not need to exist beforehand.

    public void bulk(String index) throws IOException {
        BulkRequest request = new BulkRequest();
        request.add(new IndexRequest(index).id("2")
                .source(XContentType.JSON, "age", "18", "address", "长江中下游"));
        request.add(new IndexRequest(index).id("3")
                .source(XContentType.JSON, "age", "20", "address", "长江下游"));
        highLevelClient.bulk(request, RequestOptions.DEFAULT);
    }

 

10. bool query

    /**
     * bool query
     */
    @Override
    public List<String> searchBool(String index, String key, String value) {
        List<String> result = Lists.newArrayList();
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        // query1: exact term match on the field
        TermQueryBuilder query1 = QueryBuilders.termQuery(key, value);
        // query2: the field must exist
        ExistsQueryBuilder query2 = QueryBuilders.existsQuery(key);
        // Combine both as must clauses
        boolQueryBuilder.must(query1);
        boolQueryBuilder.must(query2);
        SearchResponse searchResponse = commonQuery(index, boolQueryBuilder, 0, 10);
        if(Objects.nonNull(searchResponse)){
            // Collect the _source of each hit as a JSON string
            SearchHit[] hits = searchResponse.getHits().getHits();
            for (SearchHit hit : hits){
                String sourceAsString = hit.getSourceAsString();
                result.add(sourceAsString);
            }
        }
        return result;
    }

    /**
     * Shared search helper
     */
    private SearchResponse commonQuery(String index, QueryBuilder queryBuilder, int from, int size){
        SearchRequest searchRequest = new SearchRequest(index);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(queryBuilder);
        searchSourceBuilder.from(from);
        searchSourceBuilder.size(size);
        searchRequest.source(searchSourceBuilder);
        try {
            return highLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

 

11. match query

    /**
     * match query
     */
    @Override
    public SearchResponse searchMatch(String index, String key, String value){
        // Build the match query
        MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery(key, value);
        // Run it through the shared search helper
        return commonQuery(index, matchQueryBuilder, 0, 10);
    }

 

XII: Aggregations

1. stats

GET /index01/_search
{
  "aggs": {
    "NAME": {
      "stats": {
        "field": "salary"
      }
    }
  }
}

  Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index01",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "小马",
          "job" : "python",
          "age" : "18",
          "salary" : 50000,
          "gender" : "male",
          "like" : "李子",
          "address" : "北京市大兴区",
          "sorted" : false,
          "emplotedTime" : "2020-01-12",
          "location" : {
            "lat" : "41.43",
            "lon" : "67.98"
          },
          "ip" : "192.168.5.102"
        }
      }
    ]
  },
  "aggregations" : {
    "NAME" : {
      "count" : 1,
      "min" : 50000.0,
      "max" : 50000.0,
      "avg" : 50000.0,
      "sum" : 50000.0
    }
  }
}

 

2. Number of documents per job (for example, how many have job html)

  This introduces the concept of buckets.

GET /index05/_search
{
  "aggs": {
    "NAME": {
      "terms": {
        "field": "job.keyword",
        "size": 10
      }
    }
  }
}

  Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : "1",
          "name" : "雅典娜",
          "job" : "html",
          "age" : "38",
          "salary" : 20000,
          "gender" : "female",
          "like" : "牛奶"
        }
      },
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "id" : "2",
          "name" : "马云云",
          "job" : "html",
          "age" : "22",
          "salary" : 35000,
          "gender" : "male",
          "like" : "香蕉"
        }
      },
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "id" : "1",
          "name" : "强东",
          "job" : "go",
          "age" : "24",
          "salary" : 10000,
          "gender" : "male",
          "like" : "苹果"
        }
      },
      {
        "_index" : "index05",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "id" : "1",
          "name" : "小马",
          "job" : "python",
          "age" : "18",
          "salary" : 50000,
          "gender" : "male",
          "like" : "李子"
        }
      }
    ]
  },
  "aggregations" : {
    "NAME" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "html",
          "doc_count" : 2
        },
        {
          "key" : "go",
          "doc_count" : 1
        },
        {
          "key" : "python",
          "doc_count" : 1
        }
      ]
    }
  }
}

 

3. Sub-aggregations

  Count the like values within each job.

GET /index05/_search
{
  "size": 0, 
  "aggs": {
    "job-info": {
      "terms": {
        "field": "job.keyword"
      },
      "aggs": {
        "jon-like-info": {
          "terms": {
            "field": "like.keyword",
            "size": 10
          }
        }
      }
    }
  }
}

  Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "job-info" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "html",
          "doc_count" : 2,
          "jon-like-info" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "牛奶",
                "doc_count" : 1
              },
              {
                "key" : "香蕉",
                "doc_count" : 1
              }
            ]
          }
        },
        {
          "key" : "go",
          "doc_count" : 1,
          "jon-like-info" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "苹果",
                "doc_count" : 1
              }
            ]
          }
        },
        {
          "key" : "python",
          "doc_count" : 1,
          "jon-like-info" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "李子",
                "doc_count" : 1
              }
            ]
          }
        }
      ]
    }
  }
}
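
  A terms aggregation with a nested terms sub-aggregation behaves like a two-level group-and-count. The following minimal sketch mirrors the job and like values of index05 in plain Java; the class and method names are hypothetical, and unlike ES it orders buckets by key rather than by doc_count.

```java
import java.util.Map;
import java.util.TreeMap;

final class NestedTermsSketch {

    // Outer bucket per job, inner bucket per like, value = doc_count.
    // Each row is {job, like}.
    static Map<String, Map<String, Long>> jobThenLike(String[][] jobAndLike) {
        Map<String, Map<String, Long>> buckets = new TreeMap<>();
        for (String[] row : jobAndLike) {
            buckets.computeIfAbsent(row[0], job -> new TreeMap<>()).merge(row[1], 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        String[][] docs = {{"html", "牛奶"}, {"html", "香蕉"}, {"go", "苹果"}, {"python", "李子"}};
        System.out.println(jobThenLike(docs));
    }
}
```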

 

4. Salary statistics per job

GET /index05/_search
{
  "size": 0, 
  "aggs": {
    "job-info": {
      "terms": {
        "field": "job.keyword"
      },
      "aggs": {
        "jon-salary": {
          "stats": {
            "field": "salary"
          }
        }
      }
    }
  }
}

 

5. Query first, then sum

GET /index05/_search
{
  "query": {
    "match": {
      "name": "马"
    }
  },
  "aggs": {
    "salary_sum": {
      "sum": {
        "field": "salary"
      }
    }
  }
}

 

6. Multi-level aggregations

  Further levels can only be nested under bucket aggregations.

# Three-level aggregation
GET /index05/_search
{
  "size": 0, 
  "aggs": {
    "job-info": {
      "terms": {
        "field": "job.keyword"
      },
      "aggs": {
        "jon-salary": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "NAME": {
              "stats": {
                "field": "salary"
              }
            }
          }
        }
      }
    }
  }
}

 

7.top-hits

# top-hits
GET /index05/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "top_hits": {
        "size": 2,
        "sort": [
          {
            "salary": {
              "order": "desc"
            }
          }  
        ],
        "_source": {
          "includes": ["id", "name"]
        }
      }
    }
  }
}

 

8.range

#range
GET /index05/_search
{
  "size": 0,
  "aggs": {
    "range_info": {
      "range": {
        "field": "salary",
        "ranges": [
          {
            "key": "D", 
            "from": 5000,
            "to": 10000
          },
          {
            "key": "C", 
            "from": 10000,
            "to": 20000
          }
        ]
      }
    }
  }
}
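
  In a range aggregation each bucket includes its from value and excludes its to value. The sketch below applies that rule to the four salaries in index05; the class and method names are hypothetical.

```java
final class RangeBucketSketch {

    // A value falls in a bucket when from <= value < to (from inclusive, to exclusive).
    static long count(long[] values, long from, long to) {
        long docCount = 0;
        for (long v : values) {
            if (v >= from && v < to) {
                docCount++;
            }
        }
        return docCount;
    }

    public static void main(String[] args) {
        long[] salaries = {20000, 35000, 10000, 50000}; // the index05 documents
        System.out.println("D: " + count(salaries, 5000, 10000));
        System.out.println("C: " + count(salaries, 10000, 20000));
    }
}
```

  With this data, the salary of exactly 10000 lands in bucket C rather than D, and the salary of exactly 20000 lands in neither bucket.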

 

9. Histogram

# Histogram
GET /index05/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "histogram": {
        "field": "salary",
        "interval": 5000
      }
    }
  }
}

 

  A more realistic histogram, with age stored as a number:

# Bulk indexing
PUT /index06/_bulk
{"index":{"_id":"1"}}
{"id":"1", "name":"雅典娜", "job":"html", "age":38, "salary":20000, "gender":"female", "like":"牛奶"}
{"index":{"_id":"2"}}
{"id":"2", "name":"马云云", "job":"html", "age":22, "salary":35000, "gender":"male", "like":"香蕉"}
{"index":{"_id":"3"}}
{"id":"1", "name":"强东", "job":"go", "age":24, "salary":10000, "gender":"male", "like":"苹果"}
{"index":{"_id":"4"}}
{"id":"1", "name":"小马", "job":"python", "age":18, "salary":50000, "gender":"male", "like":"李子"}
{"index":{"_id":"5"}}
{"id":"1", "name":"小军", "job":"java", "age":18, "salary":50000, "gender":"male", "like":"雪花"}

  

# Histogram
GET /index06/_search
{
  "size": 0,
  "aggs": {
    "age-info": {
      "histogram": {
        "field": "age",
        "interval": 5
      },
      "aggs": {
        "salary-info": {
          "avg": {
            "field": "salary"
          }
        }
      }
    }
  }
}

  Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age-info" : {
      "buckets" : [
        {
          "key" : 15.0,
          "doc_count" : 2,
          "salary-info" : {
            "value" : 50000.0
          }
        },
        {
          "key" : 20.0,
          "doc_count" : 2,
          "salary-info" : {
            "value" : 22500.0
          }
        },
        {
          "key" : 25.0,
          "doc_count" : 0,
          "salary-info" : {
            "value" : null
          }
        },
        {
          "key" : 30.0,
          "doc_count" : 0,
          "salary-info" : {
            "value" : null
          }
        },
        {
          "key" : 35.0,
          "doc_count" : 1,
          "salary-info" : {
            "value" : 20000.0
          }
        }
      ]
    }
  }
}
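
  The bucket keys in this response follow from rounding each age down to a multiple of the interval. A minimal sketch of that calculation with the index06 data follows; the class and method names are hypothetical. Unlike ES, the sketch does not emit the empty buckets 25 and 30, which ES fills in by default.

```java
import java.util.Map;
import java.util.TreeMap;

final class HistogramSketch {

    // Bucket key = floor(value / interval) * interval.
    static long bucketKey(long value, long interval) {
        return Math.floorDiv(value, interval) * interval;
    }

    // Returns bucket key -> average salary, for parallel arrays of ages and salaries.
    static Map<Long, Double> avgSalaryByAge(long[] ages, double[] salaries, long interval) {
        Map<Long, double[]> sumAndCount = new TreeMap<>();
        for (int i = 0; i < ages.length; i++) {
            double[] acc = sumAndCount.computeIfAbsent(bucketKey(ages[i], interval), k -> new double[2]);
            acc[0] += salaries[i]; // running sum
            acc[1] += 1;           // running doc_count
        }
        Map<Long, Double> averages = new TreeMap<>();
        for (Map.Entry<Long, double[]> e : sumAndCount.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        long[] ages = {38, 22, 24, 18, 18};
        double[] salaries = {20000, 35000, 10000, 50000, 50000};
        System.out.println(avgSalaryByAge(ages, salaries, 5));
    }
}
```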

 

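10. min_bucket

  Finds the bucket with the smallest value of the given metric (here, the job with the lowest average salary).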

#min_bucket
GET /index05/_search
{
  "size": 0, 
  "aggs": {
    "job-info": {
      "terms": {
        "field": "job.keyword"
      },
      "aggs": {
        "jon-salary": {
          "avg": {
            "field": "salary"
          }
        }
      }
    },
    "min_bulk_info":{
      "min_bucket": {
        "buckets_path": "job-info>jon-salary"
      }
    }
  }
}
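
  min_bucket is a pipeline aggregation: it works on the output of another aggregation rather than on documents. The two steps can be sketched in plain Java as below, using the job and salary values of index05; the class and method names are hypothetical.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.TreeMap;

final class MinBucketSketch {

    // Step 1 (terms + avg): average salary per job.
    // Step 2 (min_bucket): the job whose average is smallest.
    static Map.Entry<String, Double> minAvgSalary(String[] jobs, double[] salaries) {
        Map<String, double[]> sumAndCount = new TreeMap<>();
        for (int i = 0; i < jobs.length; i++) {
            double[] acc = sumAndCount.computeIfAbsent(jobs[i], k -> new double[2]);
            acc[0] += salaries[i];
            acc[1] += 1;
        }
        Map.Entry<String, Double> min = null;
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            double avg = e.getValue()[0] / e.getValue()[1];
            if (min == null || avg < min.getValue()) {
                min = new AbstractMap.SimpleEntry<String, Double>(e.getKey(), avg);
            }
        }
        return min;
    }

    public static void main(String[] args) {
        String[] jobs = {"html", "html", "go", "python"};
        double[] salaries = {20000, 35000, 10000, 50000};
        Map.Entry<String, Double> min = minAvgSalary(jobs, salaries);
        System.out.println(min.getKey() + " " + min.getValue());
    }
}
```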

 

 11. Aggregation summary

  bucket:

  

 

  metric:

  

  A metric can be a single-value aggregation such as avg or a multi-value aggregation such as stats.

 

  pipeline:

  

  

 

XIII: Logstash

 1. Extract

unzip logstash-7.6.1.zip -d ../software/

 

2. Edit the configuration

vi jvm.options

  Comment out:

  

 

3. Test

bin/logstash -e 'input {stdin{}} output {stdout{}}'

  

 

 

4. Getting-started example

  Put the pipeline definition that followed -e into a configuration file:

  

bin/logstash -f conf/demo1 

 

  Example 2: run Logstash with conf/demo2, whose content is shown after the command:

bin/logstash -f conf/demo2
input{
        exec{
                command => 'ls'
                interval => 30
        }
}
output{
        stdout{}
}

 

5. How it works

  

 

6. input plugin: exec

  Already covered in the getting-started example.

 

7. input plugin: file

  Watches a file for new events, similar to tail.

input{
        file{
                path => '/opt/software/logstash-7.6.1/conf/tomcat.log'
        }
}
output{
        stdout{}
}

 

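8. input plugin: jdbc

  There are two approaches; I will verify them later when time permits.

  

 

9. output plugin: stdout

  Covered above; a codec can be added to control the output format.

  

 

10. output plugin: elasticsearch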

input{
        stdin{}
}
output{
        elasticsearch{
                action => "index"
                hosts => ["192.168.19.132:9200"]
                index => "index111"
        }
}

  Result (the query, followed by its response):

#logstash
GET /index111/_search
{
  "query": {
    "match_all": {}
  }
}
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index111",
        "_type" : "_doc",
        "_id" : "LBJFg4IB0P-Q-DvAjHjt",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2022-08-09T15:42:29.192Z",
          "host" : "com.jun",
          "message" : "halou",
          "@version" : "1"
        }
      }
    ]
  }
}

 

11. filter plugin: grok

  

  

 

 

 12. filter plugin: grok with Oniguruma syntax

  

 

  

 

 

  

 

 

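XIV: Kibana

 1. Visualization

  [The screenshots of the Kibana visualizations and the notes on what they mean are not preserved here.]
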

 Posted on 2022-07-10 10:28 by 曹军