Elasticsearch (Version 7.5.0) Learning Guide

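Introduction: Elasticsearch is an open-source search engine built on Apache Lucene and written in Java. It provides a distributed, highly scalable, near-real-time engine for full-text search and data analytics. It can serve as a NoSQL data store, but it lacks distributed transactions. ES hides the complexity of Lucene behind a simple RESTful API, which makes full-text search easy to use. Combined with Logstash and Kibana it forms the ELK stack, which covers log collection, log search, and log analysis.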

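1. Basic concepts

Name	Description
Node	A single server running the ES service; nodes provide failover and scaling capacity
Cluster	One or more nodes working together that share all the data and balance load; only one node acts as the elected master at any time
Document	The basic unit of information that can be indexed
Index	A collection of documents with somewhat similar characteristics
Field	The smallest unit of data in ES, comparable to one column of a record
Shards	ES splits an index into several parts and distributes them across different nodes; each part is a shard
Replicas	One or more copies of an index's shards; they improve fault tolerance and automatically load-balance search requests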
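
To make the shard row above more concrete, here is a minimal sketch of how a document could be assigned to a primary shard. It is a simplification under stated assumptions: Elasticsearch hashes the routing value (the document ID by default) with Murmur3, while this sketch substitutes `String.hashCode()`, so the shard numbers it prints will not match a real cluster. The class and method names are hypothetical.

```java
// Simplified sketch of routing a document to a primary shard.
// Assumption: String.hashCode() stands in for the Murmur3 hash that
// Elasticsearch applies to the routing value (the document ID by default).
final class ShardRoutingSketch {

    /** Returns a shard number in the range [0, numberOfPrimaryShards). */
    static int shardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards <= 0) {
            throw new IllegalArgumentException("numberOfPrimaryShards must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            System.out.println(id + " -> shard " + shardFor(id, 5));
        }
    }
}
```

Because the result depends on the number of primary shards, the same ID would land on a different shard if that number changed, which is one reason the primary shard count is fixed when an index is created.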

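　　Data types

　　　　keyword: for indexing structured content that is matched as an exact, unanalyzed value, similar to a VARCHAR string column in MySQL;

　　　　Arrays: ES has no dedicated array type. By default any field can hold zero or more values, but all values in one array must be of the same type. String arrays, integer arrays, nested arrays, and arrays of objects are all possible.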
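
The same-type rule for arrays can be checked on the client before a document is sent. The sketch below is only an illustration of that rule, not how Elasticsearch validates documents internally; the class and method names are hypothetical, and it compares Java classes, which is stricter than the server (the server may coerce some values).

```java
import java.util.Arrays;
import java.util.List;

// Client-side illustration of the "all values in an array share one type" rule.
// Assumption: comparing Java classes is a good enough proxy for the field type.
final class ArrayFieldSketch {

    /** True when every non-null value has the same Java class; an empty list is valid. */
    static boolean sameType(List<?> values) {
        Class<?> first = null;
        for (Object value : values) {
            if (value == null) {
                continue; // null entries are skipped in this sketch
            }
            if (first == null) {
                first = value.getClass();
            } else if (!first.equals(value.getClass())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameType(Arrays.asList("one", "two")));      // a string array
        System.out.println(sameType(Arrays.asList(10, "some string"))); // mixed types
    }
}
```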

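2. Elasticsearch compared with Solr

　　A. Both are currently popular full-text search engines;

　　B. Elasticsearch focuses more on real-time data analysis, an area where Solr is less efficient than ES;

　　C. Solr supports more document formats, such as HTML, PDF, Word, Excel, and CSV, whereas Elasticsearch accepts documents only as JSON.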

3. Maven dependencies

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.5.0</version>
</dependency>

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.5.0</version>
</dependency>

4. Elasticsearch configuration class that registers the RestHighLevelClient bean

package com.ruhuanxingyun.elasticsearch.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.util.Arrays;

/**
 * @description: Elasticsearch configuration class
 * @author: ruphie
 * @date: Create in 2020/12/1 21:08
 * @company: ruhuanxingyun
 */
@Configuration
public class ElasticsearchConfig {

    @Value("${spring.data.elasticsearch.rest.hosts}")
    private String hosts;

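    /**
     * Builds the low-level client builder from a comma-separated list of
     * host:port entries, e.g. "10.10.10.10:9200,10.10.10.11:9200".
     */
    @Bean
    public RestClientBuilder restClientBuilder() {
        String[] hostsArr = hosts.split(",");
        HttpHost[] httpHosts = Arrays.stream(hostsArr)
                .map(String::trim) // tolerate spaces after the commas
                .map(host -> {
                    // Each entry must be "host:port"; an entry without a port fails here.
                    String[] hostArr = host.split(":");
                    String ip = hostArr[0];
                    int port = Integer.parseInt(hostArr[1]);

                    // DEFAULT_SCHEME_NAME is "http"
                    return new HttpHost(ip, port, HttpHost.DEFAULT_SCHEME_NAME);
                }).toArray(HttpHost[]::new);

        return RestClient.builder(httpHosts);
    }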

    @Bean
    public RestHighLevelClient highLevelClient(RestClientBuilder restClientBuilder) {
        return new RestHighLevelClient(restClientBuilder);
    }

}
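
The configuration class above expects the `spring.data.elasticsearch.rest.hosts` property to be a comma-separated list of `host:port` entries. The following standalone sketch shows one more forgiving way to parse such a value using only the JDK. It is an assumption-laden variant, not part of the original class: the class name `HostListSketch` is hypothetical, and falling back to port 9200 (the default Elasticsearch HTTP port) when no port is given is a design choice of this sketch. IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Parses "host1:9200, host2:9201, host3" into unresolved socket addresses.
// Assumption: entries without a port fall back to 9200.
final class HostListSketch {

    static final int DEFAULT_PORT = 9200;

    static List<InetSocketAddress> parse(String hosts) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : hosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0 ? DEFAULT_PORT : Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing configuration.
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        for (InetSocketAddress address : parse("10.10.10.10:9200, 10.10.10.11:9201, es-node-3")) {
            System.out.println(address.getHostString() + " port " + address.getPort());
        }
    }
}
```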

5. Official documentation

　　A. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/index.html

　　B. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/index.html

6. Environment setup

　　A. Elasticsearch setup: https://www.cnblogs.com/ruhuanxingyun/p/11399484.html

　　B. Filebeat setup: https://www.cnblogs.com/ruhuanxingyun/p/11414708.html

　　C. Logstash setup: https://www.cnblogs.com/ruhuanxingyun/p/11414719.html

7. Core HTTP APIs

　　A. Index APIs: create, delete, get, and otherwise manage indices;

　　　　I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_index_apis.html

　　　　II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/indices.html

　　　　III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/11429347.html
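
As a rough illustration of what the Index APIs look like on the wire, the sketch below builds (but does not send) the HTTP requests for creating and deleting an index with the JDK's `java.net.http` package (Java 11 or later). This is not the RestHighLevelClient approach used elsewhere in this post; the class name, the base URL, and the index name in `main` are hypothetical, and the index name is assumed to be URL-safe.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds raw HTTP requests for the create-index and delete-index endpoints.
// Assumption: no authentication and a plain "http://host:port" base URL.
final class IndexApiSketch {

    /** PUT /{index} with the shard and replica counts in the settings body. */
    static HttpRequest createIndex(String baseUrl, String index, int shards, int replicas) {
        String body = "{\"settings\":{\"number_of_shards\":" + shards
                + ",\"number_of_replicas\":" + replicas + "}}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** DELETE /{index} */
    static HttpRequest deleteIndex(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createIndex("http://localhost:9200", "report_log", 3, 1);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

A request built this way could be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.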

　　B. Document APIs: index (create), delete, get, and other operations on the documents in an index; lookups are by document ID (doc_id);

　　　　I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-supported-apis.html

　　　　II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/docs.html

　　　　III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/11434385.html
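
Since the Document APIs address a document by its ID, every operation targets the same `/{index}/_doc/{id}` path and differs only in the HTTP method. The sketch below builds those requests with the JDK's `java.net.http` package (Java 11 or later) without sending them. The class name and the values in `main` are hypothetical, and IDs are assumed to be URL-safe.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds raw HTTP requests for indexing, fetching, and deleting one document.
// Assumption: index names and document IDs need no URL encoding.
final class DocumentApiSketch {

    private static URI documentUri(String baseUrl, String index, String id) {
        return URI.create(baseUrl + "/" + index + "/_doc/" + id);
    }

    /** PUT /{index}/_doc/{id} creates the document or replaces an existing one. */
    static HttpRequest indexDocument(String baseUrl, String index, String id, String json) {
        return HttpRequest.newBuilder(documentUri(baseUrl, index, id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** GET /{index}/_doc/{id} */
    static HttpRequest getDocument(String baseUrl, String index, String id) {
        return HttpRequest.newBuilder(documentUri(baseUrl, index, id)).GET().build();
    }

    /** DELETE /{index}/_doc/{id} */
    static HttpRequest deleteDocument(String baseUrl, String index, String id) {
        return HttpRequest.newBuilder(documentUri(baseUrl, index, id)).DELETE().build();
    }

    public static void main(String[] args) {
        HttpRequest request = indexDocument("http://localhost:9200", "report_log", "1", "{\"type\":8}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```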

　　C. Search APIs: search the documents in an index; lookups are by query conditions;

　　　　I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_search_apis.html

　　　　II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/search.html

　　　　III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12201644.html
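
Unlike a document lookup, a search sends its conditions in a JSON body to `/{index}/_search`. The sketch below assembles such a body with paging (`from` and `size`) and wraps it in a POST request using the JDK's `java.net.http` package (Java 11 or later). The class name and the values in `main` are hypothetical, and the query fragment is passed through as-is without validation.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds a search request body and the matching HTTP request (not sent here).
// Assumption: queryJson is already a valid Query DSL fragment.
final class SearchApiSketch {

    /** Wraps a query fragment with paging parameters. */
    static String searchBody(String queryJson, int from, int size) {
        return "{\"from\":" + from + ",\"size\":" + size + ",\"query\":" + queryJson + "}";
    }

    /** POST /{index}/_search */
    static HttpRequest search(String baseUrl, String index, String queryJson, int from, int size) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(searchBody(queryJson, from, size)))
                .build();
    }

    public static void main(String[] args) {
        System.out.println(searchBody("{\"match_all\":{}}", 0, 10));
        System.out.println(search("http://localhost:9200", "report_log", "{\"match_all\":{}}", 0, 10).uri());
    }
}
```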

　　D. cat APIs: query various kinds of index-related information;

　　　　I. From Java code (low-level REST client): performing requests: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-requests.html; reading responses: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-responses.html

　　　　II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/cat.html

　　　　III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12174465.html
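
The cat APIs return plain text tables, and a request such as `GET /_cat/indices?v` adds a header row. The sketch below shows one simple way to turn that text into rows keyed by column name. It is a sketch under a strong assumption: every cell is non-empty and contains no spaces, otherwise columns shift. Requesting `format=json` instead avoids that problem. The class name and the sample text in `main` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Parses the whitespace-separated text of a cat API response that has a header row.
// Assumption: no empty cells and no spaces inside cells.
final class CatResponseSketch {

    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> rows = new ArrayList<>();
        String[] lines = text.strip().split("\\R");
        if (lines[0].isBlank()) {
            return rows; // empty response
        }
        String[] header = lines[0].trim().split("\\s+");
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isBlank()) {
                continue;
            }
            String[] cells = lines[i].trim().split("\\s+");
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < header.length && c < cells.length; c++) {
                row.put(header[c], cells[c]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample in the shape of a "GET /_cat/indices?v&h=health,status,index" response.
        String sample = "health status index\ngreen  open   report_log_2021.02\n";
        System.out.println(parse(sample));
    }
}
```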

　　E. Cluster APIs: query various kinds of cluster-related information;

　　　　I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_cluster_apis.html

　　　　II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/cluster.html

　　　　III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12193148.html
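
A common use of the Cluster APIs is a health check before running other requests. The sketch below builds a `GET /_cluster/health` request that waits for a given status, using the JDK's `java.net.http` package (Java 11 or later); the request is built but not sent. The class name and the base URL are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds a cluster health request that blocks until the cluster reaches a status.
// Status values: green (all shards allocated), yellow (all primaries allocated,
// some replicas not), red (at least one primary shard not allocated).
final class ClusterHealthSketch {

    /** GET /_cluster/health?wait_for_status={status}&timeout={timeout} */
    static HttpRequest health(String baseUrl, String waitForStatus, String timeout) {
        URI uri = URI.create(baseUrl + "/_cluster/health?wait_for_status=" + waitForStatus
                + "&timeout=" + timeout);
        return HttpRequest.newBuilder(uri).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = health("http://localhost:9200", "yellow", "30s");
        System.out.println(request.method() + " " + request.uri());
    }
}
```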

　　F. Query DSL: the structured query language

　　　　I. From Java code: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-query-builders.html

　　　　II. From elasticsearch-head / Postman / Kibana: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/query-dsl.html

　　　　III. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/11322670.html
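
Query DSL bodies are nested JSON, so it can help to see one assembled piece by piece. The sketch below composes a `bool` query whose `filter` list holds a `term` clause and a `range` clause, the same shape as the `--searchBody` used in the export command in section 8. It concatenates strings with the JDK only and performs no escaping, so field names and values are assumed to contain no quotes; the class and method names are hypothetical.

```java
// Composes a bool/filter query from term and range clauses as a JSON string.
// Assumption: field names and values contain no characters that need JSON escaping.
final class BoolFilterSketch {

    /** {"term":{field:value}} for a numeric value. */
    static String term(String field, long value) {
        return "{\"term\":{\"" + field + "\":" + value + "}}";
    }

    /** {"range":{field:{"gte":...,"lte":...}}} with both bounds inclusive. */
    static String range(String field, String gte, String lte) {
        return "{\"range\":{\"" + field + "\":{\"gte\":\"" + gte + "\",\"lte\":\"" + lte + "\"}}}";
    }

    /** Wraps the clauses in {"query":{"bool":{"filter":[...]}}}. */
    static String boolFilter(String... clauses) {
        return "{\"query\":{\"bool\":{\"filter\":[" + String.join(",", clauses) + "]}}}";
    }

    public static void main(String[] args) {
        System.out.println(boolFilter(
                term("type", 8),
                range("timestamp", "2021-02-01 00:00:00.000", "2021-02-07 23:59:59.999")));
    }
}
```

Clauses in `filter` must all match and do not contribute to the relevance score, which suits exact conditions such as a type code or a time window.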

　　G. Text analysis: tokenization and analyzers

　　　　I. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/analysis.html

　　　　II. Worked example:

　　H. Mapping: field mappings

　　　　I. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/mapping.html

　　　　II. Worked example:

　　I. Aggregations: aggregation queries

　　　　I. https://www.elastic.co/guide/en/elasticsearch/reference/7.5/search-aggregations.html

　　　　II. Worked example: https://www.cnblogs.com/ruhuanxingyun/p/12304502.html
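
For a sense of what an aggregation request contains, the sketch below builds the body of a `terms` aggregation, which groups documents by the distinct values of one field and counts each group. Setting the top-level `size` to 0 asks for the buckets only, without any search hits. The class name, the aggregation name, and the field name in `main` are hypothetical, and no JSON escaping is done.

```java
// Builds the JSON body of a terms aggregation request.
// Assumption: the names contain no characters that need JSON escaping.
final class TermsAggregationSketch {

    /**
     * @param name    label under which the buckets appear in the response
     * @param field   field whose distinct values become buckets
     * @param buckets maximum number of buckets to return
     */
    static String termsAggregation(String name, String field, int buckets) {
        return "{\"size\":0,\"aggs\":{\"" + name + "\":{\"terms\":{\"field\":\"" + field
                + "\",\"size\":" + buckets + "}}}}";
    }

    public static void main(String[] args) {
        System.out.println(termsAggregation("by_type", "type", 10));
    }
}
```

The body would be sent to the same `/{index}/_search` endpoint as an ordinary search.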

8. Exporting ES data

　　1. Using elasticsearch-dump via Docker

docker run --rm -ti -v /es:/tmp taskrabbit/elasticsearch-dump --input=http://127.0.0.1:8200/report_log_2021.02 --output=/tmp/report_log_2021.02.json --searchBody='{"query":{"bool":{"filter":[{"term":{"type": 8}}, {"range":{"timestamp":{"gte":"2021-02-01 00:00:00.000", "lte":"2021-02-07 23:59:59.999"}}}]}}, "_source":["user.name","user.phone","timestamp","access.url","access.title"]}' --type=data --sourceOnly=true

　　2. Using elasticdump natively: npm install -g elasticdump

elasticdump \
  --input=http://username:password@10.10.10.10:9200/company \
  --output=company_data.json \
  --type=data \
  --searchBody='{
    "_source": ["company_name", "credit_code", "company_org_type", "category_name", "company_status", "legal_person_name", "registered_address", "registered_capital", "established_date", "business_scope", "registered_authority"],
    "query": {
      "terms": {
        "company_name": ["中市建设工程质量中心有限公司", "中市伟泰用品有限公司"]
      }
    }
  }'
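
After an export it is useful to confirm how many documents ended up in the file. The sketch below counts the records in a dump, assuming the output file holds one JSON object per line, which is how elasticdump writes `--type=data` output as far as I know. The class name is hypothetical, and the file name in `main` is taken from the command above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Counts the exported documents in a line-delimited JSON dump.
// Assumption: one JSON object per non-empty line.
final class DumpFileSketch {

    static long countRecords(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines()
                    .filter(line -> !line.trim().isEmpty())
                    .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dump = Paths.get(args.length > 0 ? args[0] : "company_data.json");
        try (Reader reader = Files.newBufferedReader(dump, StandardCharsets.UTF_8)) {
            System.out.println(countRecords(reader) + " records in " + dump);
        }
    }
}
```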

9. ES data migration or synchronization: https://cloud.tencent.com/developer/article/1621564

　　See also: Elasticsearch data export methods

posted @ 2020-01-10 08:16 如幻行云
