Elasticsearch Tutorial Notes: From Getting Started to Mastery

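Concepts

ELK
- Elasticsearch: storage and search; a distributed full-text search engine
- Logstash (Beats): data collection
- Kibana: visualization UI
Elasticsearch and Solr are both built on Lucene. Lucene itself is only a core library that provides full-text search functionality.
- ES is written in Java.
- 9300 is the port for communication between ES cluster nodes; 9200 is the HTTP port used by browsers and REST clients.
- Requests follow a RESTful style, with JSON request and response bodies.
ES is a document-oriented database: one record corresponds to one document.
Inverted index: maps each keyword to the ids of the documents that contain it.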
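
To make the inverted index idea concrete, here is a minimal sketch in plain Java. It is not how Lucene stores its index; the class name InvertedIndexSketch and the simple lowercase-and-split tokenization are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration class, not part of any ES or Lucene API.
class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Index one document: lowercase it, split on non-letter/digit characters,
    // and record the document id under every resulting term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Look up the ids of all documents that contain the keyword.
    SortedSet<Integer> search(String keyword) {
        return postings.getOrDefault(keyword.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Elasticsearch is a search engine");
        index.add(2, "Solr is also built on Lucene");
        index.add(3, "Lucene powers Elasticsearch");
        System.out.println(index.search("lucene"));        // [2, 3]
        System.out.println(index.search("elasticsearch")); // [1, 3]
    }
}
```

The lookup goes from keyword to document ids, which is the reverse of a forward index that goes from document id to content.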

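Getting started: installation

Installing from the official download

1. Point ES at the JDK installed locally on the Mac. By default macOS treats the JDK bundled with ES as software from an unknown developer and blocks it, so edit the startup script elasticsearch in the bin directory
and add export JAVA_HOME=/opt/homebrew/Cellar/openjdk@17
(ES 7.12 and later read ES_JAVA_HOME in preference to JAVA_HOME; if the setting appears to be ignored, export ES_JAVA_HOME instead.)
# Find the JDK path
# Show the path of the current default JDK
/usr/libexec/java_home
# Example output:
/Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home
# List the paths of all installed JDKs
/usr/libexec/java_home -V

# Set the heap size in jvm.options under the config directory
-Xms1g
-Xmx1g

2. Run ES
./elasticsearch -d   # run in the background
./elasticsearch      # run in the foreground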



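3. Verify that the installation succeeded
curl http://localhost:9200

# Problem 1: the test request fails with "curl: (52) Empty reply from server". This comes from the security configuration: with security enabled, ES 8.x serves HTTPS on 9200, so a plain-HTTP request gets no reply. To turn security off (development environments only), add the following to config/elasticsearch.yml:
xpack.security.enabled: false
xpack.security.http.ssl:
  enabled: false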


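4. Stop Elasticsearch
If it is running in the foreground, press Ctrl+C.

# If it is running as a background process:
# Find the process ID
ps aux | grep elasticsearch
# Terminate the process (plain kill sends SIGTERM so ES can shut down cleanly; use kill -9 only as a last resort)
kill <pid>

The startup script elasticsearch looks like this after the change: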

#!/bin/bash
export JAVA_HOME=/opt/homebrew/Cellar/openjdk@17
CLI_NAME=server
CLI_LIBS=lib/tools/server-cli
source "`dirname "$0"`"/elasticsearch-cli

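Installing with Homebrew

todo

Installing the IK analyzer

First find the IK analyzer release that matches your ES version on GitHub.
After downloading, unzip it into the plugins folder and delete the zip archive from that folder, otherwise ES fails to start.
If there is a version mismatch, you can try editing the version in plugin-descriptor.properties; if ES still does not start after that, the only option is to find the exactly matching release.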

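Details

CRUD

PUT requests must be idempotent; POST requests are not. For example, a POST to _doc without an id creates a new document with a generated id on every call.
Use PUT for a full overwrite of a document (POST also works). Use POST for a partial update, because a partial update is not guaranteed to be idempotent while a full overwrite is. The partial update must explicitly target _update; if the request goes to _doc, the body is treated as a whole new document and replaces the existing one.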
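
The difference between a full overwrite and a partial update can be sketched with an in-memory map. This is only an illustration of the semantics, not ES code: DocStoreSketch, index, update and fields are hypothetical names, and the ids and field values are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for an index, used only to show the semantics.
class DocStoreSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Comparable to PUT /index/_doc/{id}: the stored document is replaced wholesale.
    void index(String id, Map<String, Object> source) {
        docs.put(id, new TreeMap<>(source));
    }

    // Comparable to POST /index/_update/{id} with a "doc" body:
    // the given fields are merged into the existing document.
    void update(String id, Map<String, Object> partial) {
        Map<String, Object> existing = docs.get(id);
        if (existing == null) {
            throw new NoSuchElementException("document missing: " + id);
        }
        existing.putAll(partial);
    }

    Map<String, Object> get(String id) {
        return docs.get(id);
    }

    // Small helper to build a field map from key/value pairs.
    static Map<String, Object> fields(Object... kv) {
        Map<String, Object> m = new TreeMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        DocStoreSketch store = new DocStoreSketch();
        store.index("1001", fields("name", "Amy", "age", 18));
        store.update("1001", fields("age", 19));   // name is kept
        System.out.println(store.get("1001"));     // {age=19, name=Amy}
        store.index("1001", fields("age", 20));    // name is lost: full overwrite
        System.out.println(store.get("1001"));     // {age=20}
    }
}
```

Sending a partial body to the overwrite path is the mistake the note warns about: every field that is not in the body disappears.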
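Full-text queries are case-insensitive on text fields, because the default analyzer lowercases both the indexed text and the query terms.
Which documents match is decided by the analyzed terms in the inverted index: the query string is itself analyzed, and every document that contains any of the resulting terms is returned (the union of the matches).
Field mapping settings
- type: keyword means the value is not analyzed and must match in full, e.g. reqid, user id.
- index: true means the field is indexed and can be searched.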
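
A rough sketch of why text and keyword fields behave differently at query time. The analyze method below is a simplified stand-in for an analyzer (lowercase, split on non-letter/digit characters); FieldMatchSketch and its method names are hypothetical, and the request id value is made up.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of text vs keyword matching; not an ES API.
class FieldMatchSketch {
    // Simplified analysis: lowercase, then split into terms.
    static Set<String> analyze(String value) {
        Set<String> tokens = new LinkedHashSet<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // text field + match query: a hit if ANY analyzed query term occurs in the field.
    static boolean matchText(String fieldValue, String query) {
        Set<String> fieldTokens = analyze(fieldValue);
        for (String t : analyze(query)) {
            if (fieldTokens.contains(t)) {
                return true;
            }
        }
        return false;
    }

    // keyword field + term query: the whole value must be identical, case included.
    static boolean termKeyword(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    public static void main(String[] args) {
        System.out.println(matchText("Quick Brown Fox", "BROWN dog"));              // true
        System.out.println(termKeyword("REQ-20240101-001", "req-20240101-001"));    // false
        System.out.println(termKeyword("REQ-20240101-001", "REQ-20240101-001"));    // true
    }
}
```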

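Details

Get an index

Create a document

Get a document by id

Update a document by id

Delete a document by id

Delete documents by query

Create an index mapping

Get all documents in an index

Conditional query

Compound query

Range query

Sorting

Highlighting

Paginated query

Aggregation: max

Min

Sum

Average

Distinct count; the aggregation is named cardinality, i.e. the number of distinct values

Statistics for a field

Grouped counts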
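
As a reminder of what each aggregation listed above computes, here is the same arithmetic over a small in-memory list of ages, written with plain Java streams. This is a sketch of the semantics only: the sample ages and the class name AggregationSketch are made up, and the real ES cardinality aggregation is approximate (HyperLogLog++) whereas this count is exact.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration of aggregation semantics; not an ES API.
class AggregationSketch {
    // max / min / sum / avg / count in one pass, similar in spirit to a stats aggregation.
    static IntSummaryStatistics stats(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).summaryStatistics();
    }

    // Number of distinct values, the exact counterpart of cardinality.
    static long cardinality(List<Integer> ages) {
        return ages.stream().distinct().count();
    }

    // Document count per distinct value, similar in spirit to a terms aggregation.
    static Map<Integer, Long> terms(List<Integer> ages) {
        return ages.stream().collect(Collectors.groupingBy(a -> a, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(18, 20, 20, 30, 30, 32);
        IntSummaryStatistics s = stats(ages);
        System.out.println(s.getMax());        // 32
        System.out.println(s.getMin());        // 18
        System.out.println(s.getSum());        // 150
        System.out.println(s.getAverage());    // 25.0
        System.out.println(cardinality(ages)); // 4
        System.out.println(terms(ages));       // {18=1, 20=2, 30=2, 32=1}
    }
}
```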

Java API CRUD (version 7.8.0)

pom dependencies

    <properties>
        <!-- Note: ES 7.8.0 targets Java 8; for any 7.x.x version set the JDK version to 8 -->
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <es.version>7.8.0</es.version>
    </properties>
    <dependencies>
        <!-- Note: ES 7.8.0 targets Java 8 -->
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${es.version}</version>
        </dependency>
        <!-- Elasticsearch client. If the ES server is 8.0.0 or later, elasticsearch-rest-high-level-client is not recommended -->
        <!-- RestHighLevelClient has been deprecated since Elasticsearch 7.15; the new Java API Client is recommended instead -->
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>${es.version}</version>
        </dependency>
        <!-- Elasticsearch depends on log4j 2.x -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.15.3</version>
        </dependency>
        <!-- JUnit for unit tests -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>2.0.34</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.30</version>
        </dependency>
    </dependencies>

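Unit tests

Test classes

Entity class

package com.roy.test.bean;

import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.List;

// Document entity stored in the "person" index
@NoArgsConstructor
@Data
public class Person {
    private String name;
    private String sex;
    private Integer age;
    private String birthDate;
    private String about;
    private List<String> interests;
}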

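Index CRUD class

package com.roy.test;

import com.alibaba.fastjson2.JSON;
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.client.indices.GetIndexResponse;

import java.io.IOException;

public class ESIndexTest {

    public static void main(String[] args) throws IOException {
        // Create the client
        RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost",9200,"http")));

        // Create the index
        createIndex(client);

        // Get the index
//        getIndex(client);

        // Delete the index
//        deleteIndex(client);

        // Close the client
        client.close();

    }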

    private static void createIndex(RestHighLevelClient client) throws IOException {
        CreateIndexRequest indexRequest = new CreateIndexRequest("person");
        CreateIndexResponse person = client.indices().create(indexRequest,RequestOptions.DEFAULT);
        System.out.println(person.isAcknowledged());
    }

    private static void getIndex(RestHighLevelClient client) throws IOException {
        GetIndexRequest indexRequest = new GetIndexRequest("person");
        GetIndexResponse person = client.indices().get(indexRequest,RequestOptions.DEFAULT);
        System.out.println(person.getAliases());
        System.out.println(JSON.toJSONString(person.getMappings().entrySet()));
        System.out.println(person.getSettings());
    }

    private static void deleteIndex(RestHighLevelClient client) throws IOException {
        DeleteIndexRequest indexRequest = new DeleteIndexRequest("person");
        AcknowledgedResponse person = client.indices().delete(indexRequest,RequestOptions.DEFAULT);
        System.out.println(person.isAcknowledged());
    }

}

Document CRUD class

package com.roy.test;

import com.alibaba.fastjson2.JSON;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.roy.test.bean.Person;
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.client.indices.GetIndexResponse;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.*;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.sort.SortOrder;

import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;

/**
 * Document CRUD and query examples using RestHighLevelClient.
 *
 * @version 1.0
 */
public class ESDocumentTest {

    public static void main(String[] args) throws IOException {
        // Create the client
        RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));

        // Create a document (note: ES 7.8.0 is not supported on Mac M2 chips)
        createDocument(client);

        // Partial update
//        updateDocument(client);

        // Delete a document
//        deleteDocument(client);
        // Get a document
//        getDocument(client);
        // Bulk insert
        batchInsertDocument(client);
        // Bulk delete
        batchDeleteDocument(client);
        // Close the client
        client.close();

    }

    private static void createDocument(RestHighLevelClient client) throws IOException {
        IndexRequest indexRequest = new IndexRequest();
        indexRequest.index("person").id("3001");
        Person person = new Person();
        person.setName("艾米");
        person.setAbout("艾米帝国佣兵");
        person.setAge(18);
        person.setInterests(Collections.singletonList("贪财"));
        //Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=failed to parse date field [1880-02-02] with format [yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis]]]; nested: ElasticsearchException[Elasticsearch exception [type=date_time_parse_exception, reason=Failed to parse with all enclosed parsers]];
        // Note the date format: per the error above, this field only accepts yyyy/MM/dd HH:mm:ss, yyyy/MM/dd, or epoch_millis
        person.setBirthDate("1998/02/02 12:12:12");
//        ObjectMapper objectMapper = new ObjectMapper();
//        String value = objectMapper.writeValueAsString(person);
        String value = JSON.toJSONString(person);
        indexRequest.source(value, XContentType.JSON);
        IndexResponse response = client.index(indexRequest, RequestOptions.DEFAULT);
        System.out.println(response.getResult());
    }

    private static void updateDocument(RestHighLevelClient client) throws IOException {
        UpdateRequest updateRequest = new UpdateRequest();
        updateRequest.index("person").id("3001");
        // Partial update: only the name field is changed
        updateRequest.doc(XContentType.JSON,"name","大青山");
        UpdateResponse response = client.update(updateRequest, RequestOptions.DEFAULT);
        System.out.println(response.getResult());
    }

    private static void getDocument(RestHighLevelClient client) throws IOException {
        GetRequest request = new GetRequest();
        request.index("person").id("3001");
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsString());
    }

    private static void deleteDocument(RestHighLevelClient client) throws IOException {
        DeleteRequest request = new DeleteRequest();
        request.index("person").id("3001");
        DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
        System.out.println(response.getResult());
    }

    /**
     * Bulk insert.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void batchInsertDocument(RestHighLevelClient client) throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.add(new IndexRequest().index("person").id("5001").source(XContentType.JSON,"name","霍恩斯"));
        bulkRequest.add(new IndexRequest().index("person").id("5002").source(XContentType.JSON,"name","池傲天"));
        bulkRequest.add(new IndexRequest().index("person").id("5003").source(XContentType.JSON,"name","池长风"));
        BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
        // getItems() returns an array, so printing it directly only shows the array reference; report failures instead
        System.out.println("hasFailures=" + response.hasFailures());
    }

    private static void batchDeleteDocument(RestHighLevelClient client) throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.add(new DeleteRequest().index("person").id("5001"));
        bulkRequest.add(new DeleteRequest().index("person").id("5002"));
        bulkRequest.add(new DeleteRequest().index("person").id("5003"));
        BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
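        // getItems() returns an array, so printing it directly only shows the array reference; report failures instead
        System.out.println("hasFailures=" + response.hasFailures());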
    }

    /**
     * Query all documents.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void queryAllDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");
        request.source(new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()));
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        System.out.println(response.getHits());
        SearchHits hits = response.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hitsHit : hitsHits) {
            System.out.println(hitsHit.getSourceAsString());
        }
    }

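    /**
     * Query documents whose name is 艾米.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void queryConditionDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");
        // A term query is not analyzed, so target the exact-value sub-field; name.keyword assumes the default dynamic mapping
        request.source(new SearchSourceBuilder().query(QueryBuilders.termQuery("name.keyword","艾米")));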
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        System.out.println(response.getHits());
        SearchHits hits = response.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hitsHit : hitsHits) {
            System.out.println(hitsHit.getSourceAsString());
        }
    }

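    /**
     * Paginated query.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void queryConditionDocumentPage(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");
        // name.keyword assumes the default dynamic mapping; a term query on the analyzed name field would not match the whole value
        SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.termQuery("name.keyword", "艾米"));
        // First page: start at offset 0 and return up to 10 hits
        builder.from(0);
        builder.size(10);
        // Sort by age, ascending
        builder.sort("age", SortOrder.ASC);
        // Source filtering: fields to include and fields to exclude (one array element per field)
        String[] includes = {"name", "age"};
        String[] excludes = {};
        builder.fetchSource(includes,excludes);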

        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        System.out.println(response.getHits());
        SearchHits hits = response.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hitsHit : hitsHits) {
            System.out.println(hitsHit.getSourceAsString());
        }
    }


    /**
     * Compound (bool) query.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void queryCondiDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        // must: the name is 大青山 AND the age is 30; mustNot excludes age 30
//        boolQuery.must(QueryBuilders.matchQuery("name","大青山"));
//        boolQuery.must(QueryBuilders.matchQuery("age",30));
//        boolQuery.mustNot(QueryBuilders.matchQuery("age",30));


        // should: the age may be either 20 or 30
        boolQuery.should(QueryBuilders.matchQuery("age",20));
        boolQuery.should(QueryBuilders.matchQuery("age",30));

        SearchSourceBuilder builder = new SearchSourceBuilder().query(boolQuery);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        System.out.println(response.getHits());
        SearchHits hits = response.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hitsHit : hitsHits) {
            System.out.println(hitsHit.getSourceAsString());
        }
    }

    /**
     * Range query.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void queryRangeDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");
        // Ages in the range [30, 50): gte is inclusive, lt is exclusive
        RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("age");
        rangeQuery.gte(30);
        rangeQuery.lt(50);

        SearchSourceBuilder builder = new SearchSourceBuilder().query(rangeQuery);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        System.out.println(response.getHits());
        SearchHits hits = response.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hitsHit : hitsHits) {
            System.out.println(hitsHit.getSourceAsString());
        }
    }

    /**
     * Fuzzy query with highlighting.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void queryFuzzyDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");
        // Fuzzy query allowing an edit distance of 1 (Fuzziness.TWO allows two edits)
        FuzzyQueryBuilder fuzzyQueryBuilder = QueryBuilders.fuzzyQuery("name", "amy").fuzziness(Fuzziness.ONE);

        SearchSourceBuilder builder = new SearchSourceBuilder().query(fuzzyQueryBuilder);

        // Configure highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        // Tags wrapped around each highlighted fragment
        highlightBuilder.preTags("<font color='red'>");
        highlightBuilder.postTags("</font>");
        // Field to highlight
        highlightBuilder.field("name");
        // Attach the highlighter to the search; highlighter() without an argument only reads the current value
        builder.highlighter(highlightBuilder);

        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        System.out.println(response.getHits());
        SearchHits hits = response.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hitsHit : hitsHits) {
            System.out.println(hitsHit.getSourceAsString());
            // Highlighted fragments are returned separately from _source
            System.out.println(hitsHit.getHighlightFields());
        }
    }

    /**
     * Metric aggregation query.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void aggrDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");

        SearchSourceBuilder builder = new SearchSourceBuilder();
        // Average age; the argument of avg("avgAge") is the aggregation name
        AggregationBuilder aggregationBuilder = AggregationBuilders.avg("avgAge").field("age");
        // Max age
//        AggregationBuilder aggregationBuilder = AggregationBuilders.max("maxAge").field("age");
        builder.aggregation(aggregationBuilder);
        // Only the aggregation result is needed, so do not return any hits
        builder.size(0);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // Read the result by the aggregation name set above
        org.elasticsearch.search.aggregations.metrics.Avg avgAge = response.getAggregations().get("avgAge");
        System.out.println(avgAge.getValue());
    }

    /**
     * Grouping (terms aggregation) query.
     *
     * @param client the ES client
     * @throws IOException if the request fails
     */
    private static void groupDocument(RestHighLevelClient client) throws IOException {
        SearchRequest request = new SearchRequest();
        request.indices("person");

        SearchSourceBuilder builder = new SearchSourceBuilder();
        // Group by age
        AggregationBuilder aggregationBuilder = AggregationBuilders.terms("ageGroup").field("age");
        builder.aggregation(aggregationBuilder);
        // Only the buckets are needed, so do not return any hits
        builder.size(0);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // Print each bucket as "age -> document count"
        org.elasticsearch.search.aggregations.bucket.terms.Terms ageGroup = response.getAggregations().get("ageGroup");
        for (org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket bucket : ageGroup.getBuckets()) {
            System.out.println(bucket.getKeyAsString() + " -> " + bucket.getDocCount());
        }
    }
}

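Environment deployment

The default cluster name is elasticsearch. The name matters because a node can only join a cluster by specifying that cluster's name.

Environments

Single node

Cluster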

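Advanced

The essence of Elasticsearch index design: everything is designed to improve search performance.
The number of shards and replicas can be specified when an index is created. After creation you can change the number of replicas dynamically at any time, but you cannot change the number of shards afterwards.
By default (in ES 7.x and later) an index has 1 primary shard and 1 replica; both numbers can be set when the index is created.
Shard and replica allocation rules after new nodes join
- A primary shard cannot be on the same node as its own replica.
- Shards are spread evenly across the nodes.
The primary shard count is fixed at creation and cannot be adjusted, but the replica count can, and raising it increases query throughput. The largest useful number of nodes equals the total number of primary and replica shards, i.e. one shard per node.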
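
The scale-out ceiling described above is simple arithmetic, shown here as a small sketch. ShardCapacitySketch and totalShardCopies are hypothetical names, and the shard and replica counts are example values.

```java
// Hypothetical helper for the "one shard per node" ceiling; not an ES API.
class ShardCapacitySketch {
    // Every primary has replicasPerPrimary copies, so the total number of shard
    // copies (and the largest useful node count) is primaries * (1 + replicas).
    static int totalShardCopies(int primaries, int replicasPerPrimary) {
        return primaries * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        System.out.println(totalShardCopies(3, 1)); // 6
        System.out.println(totalShardCopies(3, 2)); // 9
    }
}
```

With 3 primaries, raising the replica count from 1 to 2 lifts the ceiling from 6 to 9 nodes without touching the primary shard count.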
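Routing: shard = hash(_routing) % number of primary shards, where _routing defaults to the document id. This is also why the primary shard count cannot change after creation: a different count would send the same id to a different shard.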
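
A minimal sketch of the routing formula. ES hashes the routing value with Murmur3; String.hashCode() is used here purely as a stand-in, so the shard numbers printed will not match a real cluster. RoutingSketch and shardFor are hypothetical names.

```java
// Hypothetical illustration of hash-mod routing; not the ES implementation.
class RoutingSketch {
    // floorMod keeps the result in [0, numPrimaryShards) even for negative hashes.
    static int shardFor(String routing, int numPrimaryShards) {
        return Math.floorMod(routing.hashCode(), numPrimaryShards);
    }

    public static void main(String[] args) {
        String[] ids = {"3001", "5001", "5002", "5003"};
        for (String id : ids) {
            // The same id always maps to the same shard for a fixed shard count,
            // but may map elsewhere once the shard count changes.
            System.out.println(id + ": 3 shards -> " + shardFor(id, 3)
                    + ", 5 shards -> " + shardFor(id, 5));
        }
    }
}
```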
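Shard control: a client can send a request to any node, and the node that receives it is called the coordinating node. It forwards the request to the nodes that hold the relevant shards, and for reads it spreads the load across the primary and replica copies rather than always hitting the same node.
Inverted index
- Analyzer (tokenizer)
- A field mapped as keyword is not analyzed
- ik_max_word: the finest-grained split
- ik_smart: the coarsest-grained split
- Term: the smallest unit of storage and querying in the index
- Term dictionary: the collection of terms, which can be implemented with a B+ tree or a hash map.
- Posting list
Document search
Near-real-time search
Document analysis (analyzers)
- Character filters
- Tokenizer
- Token filters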
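
The three analyzer stages listed above run in that order. The sketch below chains simplified versions of each stage; the tag-stripping regex, the stop-word list and the class name AnalyzerSketch are assumptions for illustration and do not reproduce any built-in ES analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical three-stage analysis pipeline; not an ES API.
class AnalyzerSketch {
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "is"));

    // Stage 1, character filter: clean the raw text (here: replace HTML-like tags with spaces).
    static String charFilter(String text) {
        return text.replaceAll("<[^>]+>", " ");
    }

    // Stage 2, tokenizer: split the cleaned text into tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stage 3, token filters: lowercase every token and drop stop words.
    static List<String> tokenFilter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            String lower = t.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    static List<String> analyze(String text) {
        return tokenFilter(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("<p>The Quick Brown Fox</p>")); // [quick, brown, fox]
    }
}
```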

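Kibana

Download from the official site
Watch for configuration changes: 8.x.x no longer takes an index name setting, and supplying it fails with an unknown-setting error
- kibana.index: ".kibana"

kibana.yml configuration

# Default port
server:
  port: 5601
# Address of the ES server
elasticsearch:
  hosts: ["http://localhost:9200"]
# Index name; not needed in newer versions
# kibana:
#   index: '.kibana'
# Chinese UI locale
i18n:
  locale: "zh-CN"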