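Elasticsearch (Version 7.5.0) Learning Guide
Introduction: Elasticsearch is an open-source search engine built on Apache Lucene and written in Java. It provides a distributed, highly scalable, near-real-time engine for full-text search and data analysis. It can serve as a NoSQL data store, but it lacks distributed transactions. ES hides the complexity of Lucene behind a simple RESTful API, which makes full-text search easy to use. Together with Logstash and Kibana it forms the ELK stack, which covers log collection, log search, and log analysis.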
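1. Basic concepts
Name | Description |
Node | A single server running the ES service; nodes provide failover and scaling |
Cluster | One or more nodes working together, sharing the whole data set and balancing load; only one node acts as the master at any time |
Document | The basic unit of information that can be indexed |
Index | A collection of documents with somewhat similar characteristics |
Field | The smallest unit of data in ES, comparable to a column of a table |
Shards | ES splits an index into several parts and distributes them across nodes; each part is a shard |
Replicas | One or more copies of an index's shards; they improve fault tolerance and automatically load-balance search requests |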
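To make the Shards and Replicas rows concrete, the sketch below shows the arithmetic involved. It is a simplified illustration, not Elasticsearch's actual implementation: the class `ShardMath` and its methods are hypothetical, `String.hashCode()` stands in for the Murmur3 hash that Elasticsearch applies to the routing value (the document ID by default), and the plain modulo ignores the routing-shards factor used by real 7.x clusters.

```java
/** Hypothetical helper illustrating shard arithmetic; not part of any Elasticsearch API. */
final class ShardMath {

    private ShardMath() {
    }

    /**
     * Picks a primary shard for a routing value (by default the document ID).
     * Assumption: String.hashCode() replaces the Murmur3 hash used by Elasticsearch.
     */
    static int primaryShardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards <= 0) {
            throw new IllegalArgumentException("numberOfPrimaryShards must be positive");
        }
        // floorMod keeps the result in [0, numberOfPrimaryShards) even for negative hashes
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    /** Total shard copies in the cluster: each primary plus its replicas. */
    static int totalShardCopies(int numberOfPrimaryShards, int numberOfReplicas) {
        return numberOfPrimaryShards * (1 + numberOfReplicas);
    }

    public static void main(String[] args) {
        int shard = primaryShardFor("doc-42", 5);
        System.out.println("doc-42 is routed to primary shard " + shard);
        System.out.println("5 primaries with 1 replica each = "
                + totalShardCopies(5, 1) + " shard copies");
    }
}
```

Because the target shard is derived from the number of primary shards, that number cannot simply be changed on an existing index without reindexing or using the split/shrink APIs, whereas the number of replicas can be adjusted at any time.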
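Data types
keyword: for indexing structured fields that are matched exactly rather than analyzed, similar to a string column in MySQL;
Arrays: ES has no dedicated array type. By default any field can hold one or more values, but all values in an array must be of the same type. Arrays of strings, integers, nested arrays, and objects are supported.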
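2. Elasticsearch compared with Solr
A. Both are popular full-text search engines;
B. Elasticsearch focuses more on real-time data analysis, an area where Solr is less efficient;
C. Solr supports more document formats, such as HTML, PDF, Word, Excel, and CSV, whereas Elasticsearch only accepts JSON.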
3. Maven dependencies
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.5.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.5.0</version>
</dependency>
4. Elasticsearch configuration class that injects RestHighLevelClient
package com.ruhuanxingyun.elasticsearch.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.util.Arrays;

/**
 * @description: Elasticsearch configuration class
 * @author: ruphie
 * @date: Create in 2020/12/1 21:08
 * @company: ruhuanxingyun
 */
@Configuration
public class ElasticsearchConfig {

    // Comma-separated list of host:port entries
    @Value("${spring.data.elasticsearch.rest.hosts}")
    private String hosts;

    @Bean
    public RestClientBuilder restClientBuilder() {
        String[] hostsArr = hosts.split(",");
        // Turn each host:port entry into an HttpHost using the default scheme (http)
        HttpHost[] httpHosts = Arrays.stream(hostsArr)
                .map(host -> {
                    String[] hostArr = host.split(":");
                    String ip = hostArr[0];
                    int port = Integer.parseInt(hostArr[1]);
                    return new HttpHost(ip, port, HttpHost.DEFAULT_SCHEME_NAME);
                }).toArray(HttpHost[]::new);

        return RestClient.builder(httpHosts);
    }

    @Bean
    public RestHighLevelClient highLevelClient(RestClientBuilder restClientBuilder) {
        return new RestHighLevelClient(restClientBuilder);
    }

}
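The configuration class above expects `spring.data.elasticsearch.rest.hosts` to be a comma-separated list of `host:port` entries. The sketch below isolates just that parsing step using only the JDK, so it can be tried without Spring or Elasticsearch on the classpath. The names `HostListParser` and `Endpoint` are hypothetical, and falling back to port 9200 when an entry has no port is an assumption of this sketch (the original code would throw in that case).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone version of the host parsing done in ElasticsearchConfig. */
final class HostListParser {

    /** Port used when an entry omits its port (an assumption of this sketch). */
    static final int DEFAULT_PORT = 9200;

    /** Simple host/port pair standing in for org.apache.http.HttpHost. */
    static final class Endpoint {
        final String host;
        final int port;

        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    private HostListParser() {
    }

    /** Parses a value such as "10.0.0.1:9200,10.0.0.2:9200" into endpoints. */
    static List<Endpoint> parse(String hosts) {
        List<Endpoint> endpoints = new ArrayList<>();
        for (String entry : hosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate trailing commas and blank entries
            }
            String[] parts = trimmed.split(":");
            int port = parts.length > 1 ? Integer.parseInt(parts[1].trim()) : DEFAULT_PORT;
            endpoints.add(new Endpoint(parts[0].trim(), port));
        }
        return endpoints;
    }

    public static void main(String[] args) {
        // Example addresses are placeholders, not real servers
        System.out.println(parse("10.0.0.1:9200, 10.0.0.2"));
    }
}
```

A matching entry in application.properties might look like `spring.data.elasticsearch.rest.hosts=10.0.0.1:9200,10.0.0.2:9200`, where the addresses are placeholders.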
5. Official documentation
A. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/index.html
B. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/index.html
6. Environment setup
A. Elasticsearch setup: https://www.cnblogs.com/ruhuanxingyun/p/11399484.html
B. Filebeat setup: https://www.cnblogs.com/ruhuanxingyun/p/11414708.html
C. Logstash setup: https://www.cnblogs.com/ruhuanxingyun/p/11414719.html
7. Core HTTP APIs
A. Index APIs: create, delete, get, and otherwise manage indices;
I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_index_apis.html
II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/indices.html
III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/11429347.html
B. Document APIs: index (create), delete, and get documents; lookups are by doc_id;
I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-supported-apis.html
II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/docs.html
III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/11434385.html
C. Search APIs: search documents by query conditions;
I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_search_apis.html
II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/search.html
III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12201644.html
D. cat APIs: query various index-related information;
I. From Java code: performing requests - https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-requests.html, reading responses - https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-responses.html
II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/cat.html
III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12174465.html
E. Cluster APIs: query various cluster-related information;
I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_cluster_apis.html
II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/cluster.html
III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12193148.html
F. Query DSL: the structured query language
I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-query-builders.html
II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/query-dsl.html
III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/11322670.html
G. Text analysis: tokenization
I. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/analysis.html
II. Worked example:
H. Mapping
I. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/mapping.html
II. Worked example:
I. Aggregations
I. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/search-aggregations.html
II. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12304502.html
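The API families listed in this section all map onto plain HTTP endpoints. As a rough orientation, the sketch below builds (but does not send) one representative request per family using the JDK 11 `java.net.http` classes. The class name `EsRequests`, the base URL `http://localhost:9200`, the index name `report_log`, and the document ID are all assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

/** Hypothetical helper that builds one request per API family; nothing is sent. */
final class EsRequests {

    private final String baseUrl;

    EsRequests(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Index APIs: create an index with default settings. */
    HttpRequest createIndex(String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .PUT(BodyPublishers.noBody())
                .build();
    }

    /** Document APIs: fetch a single document by its ID. */
    HttpRequest getDocument(String index, String id) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc/" + id))
                .GET()
                .build();
    }

    /** Search APIs: run a Query DSL body against an index. */
    HttpRequest search(String index, String queryJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(BodyPublishers.ofString(queryJson))
                .build();
    }

    /** cat APIs: list indices as a text table (v adds column headers). */
    HttpRequest catIndices() {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cat/indices?v")).GET().build();
    }

    /** Cluster APIs: report overall cluster health. */
    HttpRequest clusterHealth() {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cluster/health")).GET().build();
    }

    public static void main(String[] args) {
        EsRequests es = new EsRequests("http://localhost:9200");
        HttpRequest[] requests = {
            es.createIndex("report_log"),
            es.getDocument("report_log", "1"),
            es.search("report_log", "{\"query\":{\"match_all\":{}}}"),
            es.catIndices(),
            es.clusterHealth()
        };
        // Print the HTTP method and URI of each request without contacting a cluster
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

To actually run one of these against a cluster, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. In application code the RestHighLevelClient from section 4 (or its low-level client, for the cat APIs) is the more usual route to the same endpoints.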
8. Exporting ES data
1. Using elasticsearch-dump via Docker
docker run --rm -ti -v /es:/tmp taskrabbit/elasticsearch-dump \
  --input=http://127.0.0.1:8200/report_log_2021.02 \
  --output=/tmp/report_log_2021.02.json \
  --searchBody='{"query":{"bool":{"filter":[{"term":{"type": 8}}, {"range":{"timestamp":{"gte":"2021-02-01 00:00:00.000", "lte":"2021-02-07 23:59:59.999"}}}]}}, "_source":["user.name","user.phone","timestamp","access.url","access.title"]}' \
  --type=data \
  --sourceOnly=true
2. Using elasticdump natively: npm install -g elasticdump
elasticdump \
  --input=http://username:password@10.10.10.10:9200/company \
  --output=company_data.json \
  --type=data \
  --searchBody='{"_source": ["company_name", "credit_code", "company_org_type", "category_name", "company_status", "legal_person_name", "registered_address", "registered_capital", "established_date", "business_scope", "registered_authority"], "query": {"terms": {"company_name": ["中市建设工程质量中心有限公司", "中市伟泰用品有限公司"]}}}'
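The `--searchBody` value in the Docker example combines a `term` filter, a `range` filter on `timestamp`, and a `_source` field list. When the date window or field list changes often, it can help to generate that JSON rather than edit it by hand. The sketch below does so with plain string concatenation. `SearchBodyBuilder` is a hypothetical helper, and it performs no JSON escaping, so it assumes the inputs contain no quotes or backslashes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that assembles a searchBody like the one in the Docker example. */
final class SearchBodyBuilder {

    private SearchBodyBuilder() {
    }

    /**
     * Builds a bool/filter query with a term filter on "type", a range filter on
     * "timestamp", and a _source field list. Inputs are not JSON-escaped.
     */
    static String build(int type, String from, String to, List<String> sourceFields) {
        // Quote each field name and join them into the contents of a JSON array
        String fields = sourceFields.stream()
                .map(field -> "\"" + field + "\"")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"bool\":{\"filter\":["
                + "{\"term\":{\"type\":" + type + "}},"
                + "{\"range\":{\"timestamp\":{\"gte\":\"" + from + "\",\"lte\":\"" + to + "\"}}}"
                + "]}},\"_source\":[" + fields + "]}";
    }

    public static void main(String[] args) {
        String body = build(8, "2021-02-01 00:00:00.000", "2021-02-07 23:59:59.999",
                Arrays.asList("user.name", "user.phone", "timestamp"));
        // The printed JSON can be pasted between the single quotes of --searchBody
        System.out.println(body);
    }
}
```

For anything beyond a quick script, a JSON library would be a safer choice than string concatenation, since it handles escaping.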
9. ES data migration or synchronization: https://cloud.tencent.com/developer/article/1621564