Elasticsearch Tutorial Notes: From Beginner to Proficient
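Concepts
- ELK
  - Elasticsearch: storage and search; a distributed full-text search engine
  - Logstash (Beats): data collection
  - Kibana: visualization UI
- Elasticsearch and Solr are both built on Lucene. Lucene itself is only a core library that provides full-text search functionality.
  - ES is written in Java.
  - 9300 is the port for communication between cluster nodes; 9200 is the HTTP port used by browsers and REST clients.
  - Requests follow the RESTful style, with JSON request and response bodies.
- ES is a document-oriented database: one record is one document.
- Inverted index: looks up document ids by keyword.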
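To make the inverted index idea concrete, here is a minimal in-memory sketch. It is not how Lucene stores its index; the class name `InvertedIndexSketch` and the naive lowercase-and-split analysis are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy model of an inverted index: term -> ids of documents containing it.
class InvertedIndexSketch {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Naive analysis (an assumption): lowercase, then split on non-letter/non-digit characters.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Index a document: record its id under every term it contains.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Return the ids of all documents that contain at least one query term (a union).
    SortedSet<Integer> search(String query) {
        SortedSet<Integer> result = new TreeSet<>();
        for (String term : analyze(query)) {
            result.addAll(postings.getOrDefault(term, Collections.emptySortedSet()));
        }
        return result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Elasticsearch is a search engine");
        index.add(2, "Lucene is a search library");
        index.add(3, "Kibana draws charts");
        System.out.println(index.search("Search"));        // [1, 2]
        System.out.println(index.search("kibana engine")); // [1, 3]
    }
}
```

The lookup goes from keyword to document ids, which is the reverse of a forward index that goes from document id to content.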
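Getting started: installation
Installing from the official download
1. Point ES at the local macOS JDK. By default macOS flags the JDK bundled with ES as software from an unknown developer and blocks it, so edit the startup script bin/elasticsearch
and add export JAVA_HOME=/opt/homebrew/Cellar/openjdk@17
# Show the path of the current default JDK
/usr/libexec/java_home
# Example output:
/Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home
# List the paths of all installed JDKs
/usr/libexec/java_home -V
# Set the heap size in config/jvm.options
-Xms1g
-Xmx1g
2. Run ES
./elasticsearch -d # run in the background
./elasticsearch # run in the foreground
3. Verify the installation
curl http://localhost:9200
# Problem 1: the test request fails with "curl: (52) Empty reply from server". This is a security configuration conflict, typically because security (TLS on the HTTP layer) is enabled while the request uses plain HTTP. To disable security (development environments only), add the following to config/elasticsearch.yml:
xpack.security.enabled: false
xpack.security.http.ssl:
  enabled: false
4. Stop Elasticsearch
If it is running in the foreground, press Ctrl+C.
# If it is running in the background:
# Find the process ID
ps aux | grep elasticsearch
# Terminate the process (plain kill sends SIGTERM so ES can shut down cleanly; use kill -9 only if it does not exit)
kill <pid>
After the change, the startup script bin/elasticsearch looks like this: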
#!/bin/bash
export JAVA_HOME=/opt/homebrew/Cellar/openjdk@17
CLI_NAME=server
CLI_LIBS=lib/tools/server-cli
source "`dirname "$0"`"/elasticsearch-cli
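Installing with Homebrew
TODO
Installing the IK analyzer
- First find the IK analyzer release on GitHub that matches your ES version.
- After downloading, unzip it into the plugins directory and delete the zip archive; otherwise ES fails to start.
- If there is a version mismatch, you can try editing the version in plugin-descriptor.properties. If ES still does not start after that, the only option is to find the release that matches exactly.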
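Details
CRUD
- PUT requests must be idempotent; POST requests are not.
- Use PUT for a full overwrite of a document (POST also works). Use POST for a partial update, because a partial update is not idempotent while a full overwrite is. A partial update must explicitly target _update; a request to _doc is treated as indexing a whole document (create or overwrite).
- Query keywords are case-insensitive (for analyzed text fields, where the analyzer lowercases terms).
- Keyword matching is decided by the terms in the inverted index: any document that matches at least one analyzed query term is returned, so the result is the union of all matches.
- Analysis settings
  - type: keyword means the field is not analyzed and must match exactly, for example reqid or a user id.
  - index: true means the field is indexed and can be searched.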
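The difference between a full overwrite (PUT to `_doc`) and a partial update (POST to `_update`) can be illustrated with a toy in-memory model. This is a sketch, not the Elasticsearch API: `DocStoreSketch`, `index` and `update` are hypothetical names, and the real `_update` also merges nested objects, which this model does not attempt.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical toy model of full overwrite vs. partial update semantics.
class DocStoreSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Models PUT /{index}/_doc/{id}: the stored document is replaced as a whole.
    void index(String id, Map<String, Object> doc) {
        docs.put(id, new HashMap<>(doc));
    }

    // Models POST /{index}/_update/{id} with a "doc" body: only the given fields change.
    void update(String id, Map<String, Object> partial) {
        Map<String, Object> existing = docs.get(id);
        if (existing == null) {
            throw new NoSuchElementException("document missing: " + id);
        }
        existing.putAll(partial);
    }

    Map<String, Object> get(String id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        DocStoreSketch store = new DocStoreSketch();
        Map<String, Object> person = new HashMap<>();
        person.put("name", "艾米");
        person.put("age", 18);
        store.index("3001", person);

        // Partial update: age is kept, name changes.
        Map<String, Object> partial = new HashMap<>();
        partial.put("name", "大青山");
        store.update("3001", partial);
        System.out.println(store.get("3001").containsKey("age")); // true

        // Full overwrite with only a name: age is gone afterwards.
        store.index("3001", partial);
        System.out.println(store.get("3001").containsKey("age")); // false
    }
}
```

Repeating the same full overwrite always leaves the same stored document, which is why it fits the idempotent PUT verb.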
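Details
Get an index
Create a document
Get a document by id
Update a document by id
Delete a document by id
Delete documents by query
Create an index mapping
Get all documents in an index
Conditional query
Compound query
Range query
Sorting
Highlighting
Paginated query
Aggregation: maximum
Minimum
Sum
Average
Distinct count (cardinality means the number of distinct values)
Statistics on a field
Grouped statistics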
Java API CRUD (version 7.8.0)
pom dependencies
<properties>
<!-- Note: ES 7.8.0 targets Java 8; for any 7.x version, set the JDK version to 8 -->
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<es.version>7.8.0</es.version>
</properties>
<dependencies>
<!-- Note: ES 7.8.0 targets Java 8 -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${es.version}</version>
</dependency>
<!-- Elasticsearch client. If the ES server is 8.0.0 or later, elasticsearch-rest-high-level-client is not recommended -->
<!-- RestHighLevelClient has been marked as deprecated since Elasticsearch 7.15; the new Java API Client is recommended instead -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>${es.version}</version>
</dependency>
<!-- Elasticsearch depends on log4j 2.x -->
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
<version>2.8.2</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.8.2</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.15.3</version>
</dependency>
<!-- JUnit unit tests -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
</dependency>
<dependency>
<groupId>com.alibaba.fastjson2</groupId>
<artifactId>fastjson2</artifactId>
<version>2.0.34</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.30</version>
</dependency>
</dependencies>
Unit tests
Test classes
Entity class
@NoArgsConstructor
@Data
public class Person {
private String name;
private String sex;
private Integer age;
private String birthDate;
private String about;
private List<String> interests;
}
Index CRUD class
public class ESIndexTest {
public static void main(String[] args) throws IOException {
// Create the client
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost",9200,"http")));
// Create the index
createIndex(client);
// Get the index
// getIndex(client);
// Delete the index
// deleteIndex(client);
// Close the client
client.close();
}
private static void createIndex(RestHighLevelClient client) throws IOException {
CreateIndexRequest indexRequest = new CreateIndexRequest("person");
CreateIndexResponse person = client.indices().create(indexRequest,RequestOptions.DEFAULT);
System.out.println(person.isAcknowledged());
}
private static void getIndex(RestHighLevelClient client) throws IOException {
GetIndexRequest indexRequest = new GetIndexRequest("person");
GetIndexResponse person = client.indices().get(indexRequest,RequestOptions.DEFAULT);
System.out.println(person.getAliases());
System.out.println(JSON.toJSONString(person.getMappings().entrySet()));
System.out.println(person.getSettings());
}
private static void deleteIndex(RestHighLevelClient client) throws IOException {
DeleteIndexRequest indexRequest = new DeleteIndexRequest("person");
AcknowledgedResponse person = client.indices().delete(indexRequest,RequestOptions.DEFAULT);
System.out.println(person.isAcknowledged());
}
}
Document CRUD class
package com.roy.test;
import com.alibaba.fastjson2.JSON;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.roy.test.bean.Person;
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.client.indices.GetIndexResponse;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.*;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.sort.SortOrder;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
/***
* @ClassName: ESTest
* @Description:
* @version : 1.0
*/
public class ESDocumentTest {
public static void main(String[] args) throws IOException {
// Create the client
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
// Create a document (note: ES 7.8.0 is not supported on Macs with the M2 chip)
createDocument(client);
// Partial update
// updateDocument(client);
// Delete the document
// deleteDocument(client);
// Get the document
// getDocument(client);
// Bulk insert
batchInsertDocument(client);
// Bulk delete
batchDeleteDocument(client);
// Close the client
client.close();
}
private static void createDocument(RestHighLevelClient client) throws IOException {
IndexRequest indexRequest = new IndexRequest();
indexRequest.index("person").id("3001");
Person person = new Person();
person.setName("艾米");
person.setAbout("艾米帝国佣兵");
person.setAge(18);
person.setInterests(Collections.singletonList("贪财"));
//Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=failed to parse date field [1880-02-02] with format [yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis]]]; nested: ElasticsearchException[Elasticsearch exception [type=date_time_parse_exception, reason=Failed to parse with all enclosed parsers]];
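// The field is mapped with the format yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis (see the exception above), so the value must match one of these patterns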
person.setBirthDate("1998/02/02 12:12:12");
// ObjectMapper objectMapper = new ObjectMapper();
// String value = objectMapper.writeValueAsString(person);
String value = JSON.toJSONString(person);
indexRequest.source(value, XContentType.JSON);
IndexResponse response = client.index(indexRequest, RequestOptions.DEFAULT);
System.out.println(response.getResult());
}
private static void updateDocument(RestHighLevelClient client) throws IOException {
UpdateRequest updateRequest = new UpdateRequest();
updateRequest.index("person").id("3001");
// Partial update: only the name field is changed
updateRequest.doc(XContentType.JSON,"name","大青山");
UpdateResponse response = client.update(updateRequest, RequestOptions.DEFAULT);
System.out.println(response.getResult());
}
private static void getDocument(RestHighLevelClient client) throws IOException {
GetRequest request = new GetRequest();
request.index("person").id("3001");
GetResponse response = client.get(request, RequestOptions.DEFAULT);
System.out.println(response.getSourceAsString());
}
private static void deleteDocument(RestHighLevelClient client) throws IOException {
DeleteRequest request = new DeleteRequest();
request.index("person").id("3001");
DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
System.out.println(response.getResult());
}
/**
* Bulk insert
* @param client
* @throws IOException
*/
private static void batchInsertDocument(RestHighLevelClient client) throws IOException {
BulkRequest bulkRequest = new BulkRequest();
bulkRequest.add(new IndexRequest().index("person").id("5001").source(XContentType.JSON,"name","霍恩斯"));
bulkRequest.add(new IndexRequest().index("person").id("5002").source(XContentType.JSON,"name","池傲天"));
bulkRequest.add(new IndexRequest().index("person").id("5003").source(XContentType.JSON,"name","池长风"));
BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
System.out.println(response.getItems());
}
private static void batchDeleteDocument(RestHighLevelClient client) throws IOException {
BulkRequest bulkRequest = new BulkRequest();
bulkRequest.add(new DeleteRequest().index("person").id("5001"));
bulkRequest.add(new DeleteRequest().index("person").id("5002"));
bulkRequest.add(new DeleteRequest().index("person").id("5003"));
BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
System.out.println(response.getItems());
}
/**
* Query all documents
* @param client
* @throws IOException
*/
private static void queryAllDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
request.source(new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()));
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
* Query documents whose name is 艾米
* @param client
* @throws IOException
*/
private static void queryConditionDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
// termQuery does not analyze the search value; on an analyzed text field, matchQuery is usually the better fit
request.source(new SearchSourceBuilder().query(QueryBuilders.termQuery("name","艾米")));
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
* Paginated query
* @param client
* @throws IOException
*/
private static void queryConditionDocumentPage(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.termQuery("name", "艾米"));
// First page, 10 hits per page
builder.from(0);
builder.size(10);
// Sort by age in ascending order
builder.sort("age", SortOrder.ASC);
// Source filtering: fields to include and fields to exclude (one array element per field name)
String[] includes = {"name", "age"};
String[] excludes = {};
builder.fetchSource(includes,excludes);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
* Compound (bool) query
* @param client
* @throws IOException
*/
private static void queryCondiDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
// must: the name must match 大青山 and the age must be 30
// boolQuery.must(QueryBuilders.matchQuery("name","大青山"));
// boolQuery.must(QueryBuilders.matchQuery("age",30));
// boolQuery.mustNot(QueryBuilders.matchQuery("age",30));
// should: an age of either 20 or 30 matches
boolQuery.should(QueryBuilders.matchQuery("age",20));
boolQuery.should(QueryBuilders.matchQuery("age",30));
SearchSourceBuilder builder = new SearchSourceBuilder().query(boolQuery);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
* Range query
* @param client
* @throws IOException
*/
private static void queryRangeDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
// Query documents with 30 <= age < 50
RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("age");
rangeQuery.gte(30);
rangeQuery.lt(50);
SearchSourceBuilder builder = new SearchSourceBuilder().query(rangeQuery);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
* Fuzzy query
* @param client
* @throws IOException
*/
private static void queryFuzzyDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
// Fuzzy match that tolerates an edit distance of 1; Fuzziness.TWO allows a distance of 2
FuzzyQueryBuilder fuzzyQueryBuilder = QueryBuilders.fuzzyQuery("name", "amy").fuzziness(Fuzziness.ONE);
SearchSourceBuilder builder = new SearchSourceBuilder().query(fuzzyQueryBuilder);
// Configure highlighting
HighlightBuilder highlightBuilder = new HighlightBuilder();
// Tags placed before and after each highlighted fragment
highlightBuilder.preTags("<font color='red'>");
highlightBuilder.postTags("</font>");
// Field to highlight
highlightBuilder.field("name");
// Attach the highlighter to the search; calling highlighter() without an argument only reads the current value
builder.highlighter(highlightBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
* Aggregation query
* @param client
* @throws IOException
*/
private static void aggrDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
SearchSourceBuilder builder = new SearchSourceBuilder();
// Average age; the argument of avg("avgAge") is the aggregation name
AggregationBuilder aggregationBuilder = AggregationBuilders.avg("avgAge").field("age");
// Maximum age
// AggregationBuilder aggregationBuilder = AggregationBuilders.max("maxAge").field("age");
builder.aggregation(aggregationBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// The aggregation result is returned separately from the hits, under the aggregation name
org.elasticsearch.search.aggregations.metrics.Avg avgAge = response.getAggregations().get("avgAge");
System.out.println(avgAge.getValue());
System.out.println(response.getHits());
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
/**
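* Group-by query (terms aggregation)
* @param client
* @throws IOException
*/
private static void groupDocument(RestHighLevelClient client) throws IOException {
SearchRequest request = new SearchRequest();
request.indices("person");
SearchSourceBuilder builder = new SearchSourceBuilder();
// Group by age
AggregationBuilder aggregationBuilder = AggregationBuilders.terms("ageGroup").field("age");
builder.aggregation(aggregationBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// Each bucket holds one distinct age and the number of documents with that age
org.elasticsearch.search.aggregations.bucket.terms.Terms ageGroup = response.getAggregations().get("ageGroup");
for (org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket bucket : ageGroup.getBuckets()) {
System.out.println(bucket.getKeyAsString() + ": " + bucket.getDocCount());
}
System.out.println(response.getHits());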
SearchHits hits = response.getHits();
SearchHit[] hitsHits = hits.getHits();
for (SearchHit hitsHit : hitsHits) {
System.out.println(hitsHit.getSourceAsString());
}
}
}
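Environment deployment
- Default cluster name: elasticsearch. The name matters, because a node can only join a cluster by specifying that cluster's name.
Environments
Single node
Cluster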
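Advanced
- The essence of Elasticsearch indexing: every design decision is made to improve search performance.
- The numbers of shards and replicas can be specified when an index is created. After creation, the number of replicas can be changed dynamically at any time, but the number of shards cannot be changed.
- By default an index has 1 primary shard and 1 replica. Shard and replica counts are specified when the index is created.
- Rules for allocating primary and replica shards after a new node joins
  - A primary shard cannot be placed on the same node as its own replica.
  - Shards are distributed evenly.
- The number of primary shards is fixed at creation and cannot be adjusted, but the number of replicas can. Raising the replica count increases query throughput. The maximum number of nodes that scaling out can use is the total of primary and replica shards, that is, one shard per node.
- Routing calculation: hash(id) % number of primary shards
- Shard control: a client can send a request to any node; the node that receives it is called the coordinating node. The coordinating node forwards the request to the nodes that hold the relevant shard copies, which spreads the load when one node is busy.
- Inverted index
  - Analyzer
  - Fields of type keyword are not analyzed
  - ik_max_word: the finest-grained segmentation
  - ik_smart: the coarsest-grained segmentation
  - Term: the smallest unit of storage and query in an index
  - Term dictionary: the collection of terms, implemented with a B+ tree or a hash map
  - Posting list
- Document search
- Near real-time search
- Document analysis (analyzer)
  - Character filters
  - Tokenizer
  - Token filters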
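The three analyzer stages listed above (character filters, tokenizer, token filters) can be sketched as a small pipeline. This is an illustration under assumptions, not Elasticsearch's implementation: `AnalyzerSketch`, the tag-stripping regex, and the tiny stop-word set are all made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical three-stage analysis pipeline: char filter -> tokenizer -> token filters.
class AnalyzerSketch {
    // Assumed stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "an", "the", "is"));

    // Stage 1, character filter: clean the raw text (here: replace HTML-like tags with a space).
    static String charFilter(String text) {
        return text.replaceAll("<[^>]*>", " ");
    }

    // Stage 2, tokenizer: cut the text into tokens on non-letter/non-digit characters.
    static List<String> tokenize(String text) {
        return Arrays.stream(text.split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Stage 3, token filters: lowercase each token, then drop stop words.
    static List<String> tokenFilter(List<String> tokens) {
        return tokens.stream()
                .map(t -> t.toLowerCase(Locale.ROOT))
                .filter(t -> !STOP_WORDS.contains(t))
                .collect(Collectors.toList());
    }

    static List<String> analyze(String text) {
        return tokenFilter(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        // prints [elasticsearch, search, engine]
        System.out.println(analyze("<b>Elasticsearch</b> is a Search Engine"));
    }
}
```

The terms that come out of the last stage are what end up in the inverted index, which is why a query for "search" finds a document that contained "Search".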
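Kibana
- Download from the official site
- Note the change in configuration options: 8.x versions no longer take an index name setting; adding it fails with an unknown-setting error
  - kibana.index: ".kibana"
kibana.yml configuration
# Default port
server:
  port: 5601
# Address of the ES server
elasticsearch:
  hosts: ["http://localhost:9200"]
# Index name; not needed in newer versions
# kibana:
#   index: '.kibana'
# Chinese UI
i18n:
  locale: "zh-CN"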
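Detail diagrams
Routing calculation and allocation control
Write flow (example with three shards and one replica)
Write consistency settings
Query flow
Near real-time search
IK analyzer
Mapping
Shards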
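The sizing rule from the notes (scaling out tops out at one shard copy per node) is simple arithmetic. The sketch below assumes the hypothetical names `ShardMathSketch` and `totalShards`; it only counts shard copies and says nothing about how Elasticsearch actually places them.

```java
// Hypothetical helper: how many shard copies exist, and therefore how many nodes can each hold one.
class ShardMathSketch {
    // Total copies = primaries + primaries * replicas per primary.
    static int totalShards(int primaries, int replicasPerPrimary) {
        if (primaries < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("need at least 1 primary and a non-negative replica count");
        }
        return primaries * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        // 3 primaries with 1 replica each: 6 shard copies, so at most 6 nodes hold data for this index.
        System.out.println(totalShards(3, 1)); // 6
        // Raising replicas to 2 (allowed after creation) gives 9 copies to spread across nodes.
        System.out.println(totalShards(3, 2)); // 9
    }
}
```

Because the primary count is fixed at index creation, the replica count is the only one of the two inputs that can be raised later.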
Routing
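The routing rule from the notes, hash(id) % number of primary shards, can be sketched as follows. `String.hashCode()` is only a stand-in chosen to keep the example dependency-free (Elasticsearch hashes the routing value, which defaults to the document id, with Murmur3), and `RoutingSketch` and `shardFor` are hypothetical names.

```java
// Hypothetical sketch of document-to-shard routing.
class RoutingSketch {
    // Maps a routing value to a shard number in [0, numPrimaryShards).
    static int shardFor(String routing, int numPrimaryShards) {
        if (numPrimaryShards < 1) {
            throw new IllegalArgumentException("need at least one primary shard");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(routing.hashCode(), numPrimaryShards);
    }

    public static void main(String[] args) {
        String[] ids = {"3001", "5001", "5002", "5003"};
        for (String id : ids) {
            // The same id always lands on the same shard as long as the shard count is unchanged.
            System.out.println(id + " -> shard " + shardFor(id, 3));
        }
    }
}
```

The divisor is the primary shard count, so changing that count would send existing ids to different shards. That is the reason the number of primary shards cannot be changed after the index is created.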
Write
Read
Update
Refresh
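Integration
Then ask DeepSeek for an example with a prompt such as: "The Elasticsearch server version is 8.17.3 and a Spring Boot project needs to use Spring Data Elasticsearch for CRUD operations; give a code example."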
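Optimization
Hardware requirements
- SSD
- RAID 0
- Multiple disks
- Do not use remotely mounted storage such as NFS or SMB/CIFS
Allocation strategy
- A shard behaves like an independent search engine (it is a Lucene index underneath), so more shards consume more resources (file handles, memory, CPU). More initial shards is not always better.
- Every search request hits every shard of the index; shards on the same node compete for the same resources.
- The term statistics used to compute relevance are per shard, so too many shards make relevance scores less accurate.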
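A small calculation shows why per-shard term statistics distort relevance. The sketch assumes the BM25 inverse document frequency formula used by Lucene's BM25Similarity, idf = ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of documents and n the number containing the term. `ShardIdfSketch` and the document counts are made up for illustration.

```java
// Hypothetical comparison of idf computed per shard vs. over the whole index.
class ShardIdfSketch {
    // BM25 idf: rarer terms (small docFreq relative to docCount) get a higher weight.
    static double idf(long docCount, long docFreq) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        // The same term, unevenly spread over two shards of 100 documents each.
        double shardA = idf(100, 50);   // common on shard A, so low weight there
        double shardB = idf(100, 2);    // rare on shard B, so high weight there
        double global = idf(200, 52);   // what a single shard holding everything would use
        System.out.printf("shard A: %.3f, shard B: %.3f, whole index: %.3f%n", shardA, shardB, global);
    }
}
```

With few documents per shard, each shard sees a different document frequency for the same term, so equally good matches on different shards receive different scores.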
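Routing selection
Write speed optimization
- Replicas can be turned off (replica count set to 0) while loading data, then turned back on once writing is finished and the index is used for queries.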
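One way to toggle replicas around a bulk load is the index settings endpoint, `PUT /{index}/_settings` with `index.number_of_replicas`. The sketch below only builds the HTTP requests and does not send them; it uses `java.net.http`, which needs Java 11 or later (newer than the Java 8 target in the pom above). `ReplicaSettingsSketch`, `replicasRequest`, the base URL and the index name are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds (but does not send) an update-settings request.
class ReplicaSettingsSketch {
    static HttpRequest replicasRequest(String baseUrl, String index, int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replica count must not be negative");
        }
        // Body of the update-settings call: only the replica count is changed.
        String body = "{\"index\":{\"number_of_replicas\":" + replicas + "}}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Before the bulk load: no replicas, so every write goes to the primaries only.
        HttpRequest disable = replicasRequest("http://localhost:9200", "person", 0);
        // After the bulk load: restore one replica per primary.
        HttpRequest restore = replicasRequest("http://localhost:9200", "person", 1);
        System.out.println(disable.method() + " " + disable.uri());
        System.out.println(restore.method() + " " + restore.uri());
    }
}
```

Sending a request would be a call such as `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running cluster. While replicas are at 0 the index has no redundancy, so this suits loads that can be replayed from the source data.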
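Memory settings
- The default heap is 1 GB: Xms is the initial heap size and Xmx is the maximum heap size that can be allocated, and both default to 1 GB.
Important settings
Interview questions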