Spring Boot + Elasticsearch 7.7.0 in Practice

Part 1: What Is Elasticsearch?

1. Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface. Elasticsearch is written in Java and released as open source under the Apache License, and it is a popular enterprise-grade search engine. It is widely used in cloud environments because it offers near-real-time search and is stable, reliable, fast, and easy to install and use. Official clients are available for Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages. According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine, followed by Apache Solr, which is also based on Lucene.

1.1 What Is ELK?

ELK = Elasticsearch + Logstash + Kibana
Elasticsearch: distributed storage and full-text search backend
Logstash: log processing and shipping (the "porter" of the stack)
Kibana: data visualization
The ELK architecture forms a powerful pipeline for distributed storage, visual querying, and log parsing. The three components complement one another and together handle distributed big-data processing.

2. More Than a Full-Text Search Engine

  • Distributed real-time document store in which every field is indexed and searchable.
  • Distributed search engine with real-time analytics.
  • Scales out to hundreds of servers and petabytes of structured or unstructured data.

Part 2: Installing and Running Elasticsearch

1. Download from the official site: https://www.elastic.co/cn/downloads/elasticsearch

2. Unzip the archive and run bin\elasticsearch.bat.

3. Edit config/elasticsearch.yml to enable CORS; this is needed for the elasticsearch-head plugin to connect across origins:

http.cors.enabled: true
http.cors.allow-origin: "*"

4. Install the elasticsearch-head plugin: find it on GitHub at https://github.com/mobz/elasticsearch-head, download/clone it locally (elasticsearch-head), then run from that directory:

# Install dependencies; this downloads packages, so it may take a while.
npm install
# Start head (serves on http://localhost:9100 by default)
npm run start

5. Verify that Elasticsearch is running: open http://localhost:9200 in a browser. A JSON summary of the node means ES started successfully.
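A typical response looks like the following (the node name and cluster UUID vary per machine; ES 7.7.0 ships Lucene 8.5.1):

{
  "name" : "YOUR-HOSTNAME",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "7.7.0",
    "lucene_version" : "8.5.1"
  },
  "tagline" : "You Know, for Search"
}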


6. Run Kibana (edit config/kibana.yml as follows):

# Kibana port
server.port: 5601
# Kibana bind address
server.host: "127.0.0.1"
# Elasticsearch endpoint (IP + port)
elasticsearch.hosts: ["http://localhost:9200"]

Start command (run from Kibana's bin directory):

.\kibana.bat

7. Kibana success page: http://localhost:5601/app/kibana#/home


8. Run Logstash: in E:\ELK\logstash-7.7.0\config, create a file logstash.conf with the following content:

input {
    file {
        type => "nginx_access"
        # watch the directory logback writes to (the original pointed at a single file,
        # but the logback config below writes debug.log/info.log/... into this directory)
        path => "E:/ELK/logs/*.log"
    }
}
output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        # one index per day
        index => "access-%{+YYYY.MM.dd}"
    }
    stdout {
        codec => json_lines
    }
}
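Once Logstash is running and the application is writing logs, you can confirm from Kibana Dev Tools that the daily indices are being created (the exact index name depends on the current date):

GET _cat/indices/access-*?v
GET access-*/_search?size=1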

9. Start Logstash (from the bin directory):

.\logstash.bat -f ../config/logstash.conf

10. logback-spring.xml (place it in src/main/resources; Spring Boot picks up the -spring variant automatically):

<?xml version="1.0" encoding="UTF-8"?>

<!-- scan: when true, the configuration file is reloaded if it changes; default true
    scanPeriod: how often to check the file for changes; defaults to milliseconds when no unit is given. Effective only when scan=true; default interval is 1 minute
    debug: when true, logback prints its internal status messages; default false -->
<configuration scan="true" scanPeriod="10 seconds" debug="false">

    <contextName>logback</contextName>

    <!-- Property values are inserted into the context and referenced below with "${}" -->
    <!-- Output pattern: %d date, %thread thread name, %-5level level padded to 5 chars, %msg message, %n newline -->
    <!-- <property name="LOG.PATTERN" value="[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%-5level] [%15.15thread] [%45.45file{15}\(%-4line\\) - %-45.45logger{15}] : %msg%n" /> -->
    <property name="LOG.PATTERN" value="[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%-5level] %15.15thread -- %-45.45logger{15} : %msg%n"/>
    <!-- Log directory and files (LOG.PATH must be a directory; the original used a file path here) -->
    <property name="LOG.PATH" value="E:/ELK/logs"/>
    <property name="LOG.FILE.DEBUG" value="${LOG.PATH}/debug.log"/>
    <property name="LOG.FILE.INFO" value="${LOG.PATH}/info.log"/>
    <property name="LOG.FILE.WARN" value="${LOG.PATH}/warn.log"/>
    <property name="LOG.FILE.ERROR" value="${LOG.PATH}/error.log"/>
    <property name="LOG.FILE.NAME" value="springboot.%d{yyyy-MM-dd}.%i.log"/>

    <!-- Console appender -->
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <Pattern>${LOG.PATTERN}</Pattern>
            <charset>UTF-8</charset>
        </encoder>
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>INFO</level>
        </filter>
    </appender>


    <!-- File appender: DEBUG -->
    <appender name="FILE-DEBUG" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG.FILE.DEBUG}</file>
        <append>true</append>
        <encoder>
            <Pattern>${LOG.PATTERN}</Pattern>
            <charset>UTF-8</charset>
        </encoder>

        <!-- Accept only DEBUG events -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>DEBUG</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>

        <!-- Size- and time-based rolling policy -->
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>

            <!-- Days of rolled files to keep -->
            <maxHistory>7</maxHistory>

            <!-- The original appended to ${LOG.FILE.DEBUG} as if it were a directory; roll next to it instead -->
            <fileNamePattern>${LOG.PATH}/debug.${LOG.FILE.NAME}</fileNamePattern>
        </rollingPolicy>
    </appender>

    <!-- File appender: INFO -->
    <appender name="FILE-INFO" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG.FILE.INFO}</file>
        <append>true</append>
        <encoder>
            <Pattern>${LOG.PATTERN}</Pattern>
            <charset>UTF-8</charset>
        </encoder>

        <!-- Accept only INFO events -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>INFO</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>

        <!-- Size- and time-based rolling policy -->
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>

            <!-- Days of rolled files to keep -->
            <maxHistory>7</maxHistory>

            <fileNamePattern>${LOG.PATH}/info.${LOG.FILE.NAME}</fileNamePattern>
        </rollingPolicy>
    </appender>

    <!-- File appender: ERROR -->
    <appender name="FILE-ERROR" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG.FILE.ERROR}</file>
        <append>true</append>
        <encoder>
            <Pattern>${LOG.PATTERN}</Pattern>
            <charset>UTF-8</charset>
        </encoder>

        <!-- Accept only ERROR events -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>ERROR</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>

        <!-- Size- and time-based rolling policy -->
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>

            <!-- Days of rolled files to keep -->
            <maxHistory>7</maxHistory>

            <fileNamePattern>${LOG.PATH}/error.${LOG.FILE.NAME}</fileNamePattern>
        </rollingPolicy>
    </appender>

    <!-- File appender: WARN -->
    <appender name="FILE-WARN" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG.FILE.WARN}</file>
        <append>true</append>
        <encoder>
            <Pattern>${LOG.PATTERN}</Pattern>
            <charset>UTF-8</charset>
        </encoder>

        <!-- Accept only WARN events -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>WARN</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>

        <!-- Size- and time-based rolling policy -->
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>

            <!-- Days of rolled files to keep -->
            <maxHistory>7</maxHistory>

            <fileNamePattern>${LOG.PATH}/warn.${LOG.FILE.NAME}</fileNamePattern>
        </rollingPolicy>
    </appender>

    <!-- Log levels for third-party packages, to reduce noise -->
    <logger name="org.springframework" level="INFO"/>
    <logger name="org.springframework.beans.factory.aspectj" level="WARN"/>
    <logger name="org.springframework.transaction.interceptor" level="WARN"/>
    <logger name="org.springframework.beans.factory.support" level="WARN"/>
    <logger name="org.springframework.web.servlet" level="DEBUG"/>
    <logger name="org.springframework.transaction" level="WARN"/>
    <logger name="org.springframework.jdbc.datasource.DataSourceTransactionManager" level="WARN"/>
    <logger name="org.springframework.transaction.support.AbstractPlatformTransactionManager" level="WARN"/>
    <logger name="org.springframework.security" level="WARN"/>
    <logger name="org.apache.commons" level="WARN"/>
    <logger name="org.apache.kafka" level="WARN"/>
    <logger name="org.apache.http" level="ERROR"/>
    <logger name="org.logicalcobwebs" level="WARN"/>
    <logger name="httpclient" level="ERROR"/>
    <logger name="net.sf.ehcache" level="WARN"/>
    <logger name="org.apache.zookeeper" level="WARN"/>
    <logger name="org.I0Itec" level="WARN"/>
    <logger name="com.meetup.memcached" level="WARN"/>
    <logger name="org.mongodb.driver" level="INFO"/>
    <logger name="org.quartz.core" level="INFO"/>
    <logger name="io.netty" level="INFO"/>
    <logger name="net.rubyeye.xmemcached" level="WARN"/>
    <logger name="com.google" level="WARN"/>
    <logger name="com.ibatis" level="DEBUG"/>
    <logger name="com.ibatis.common.jdbc.SimpleDataSource" level="DEBUG"/>
    <logger name="com.ibatis.common.jdbc.ScriptRunner" level="DEBUG"/>
    <logger name="com.ibatis.sqlmap.engine.impl.SqlMapClientDelegate" level="DEBUG"/>
    <logger name="java.sql.Connection" level="DEBUG"/>
    <logger name="java.sql.Statement" level="DEBUG"/>
    <logger name="java.sql.PreparedStatement" level="DEBUG"/>
    <logger name="com.es.elsaticsearch" level="DEBUG"/>

    <!-- In production, set this to an appropriate level to avoid excessive log volume and performance impact.
        Levels: TRACE < DEBUG < INFO < WARN < ERROR, plus ALL and OFF. Note: with the root level at INFO,
        DEBUG events never reach FILE-DEBUG; lower the root level to DEBUG if debug.log should be populated. -->
    <root level="INFO">
        <appender-ref ref="FILE-DEBUG"/>
        <appender-ref ref="FILE-INFO"/>
        <appender-ref ref="FILE-ERROR"/>
        <appender-ref ref="FILE-WARN"/>

        <!-- Remove the CONSOLE appender in production -->
        <appender-ref ref="CONSOLE"/>
    </root>

</configuration>
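With this configuration in place, anything logged through SLF4J lands in E:/ELK/logs and is shipped onward by Logstash; a minimal sketch of producing log events (the controller class and endpoint are hypothetical):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DemoController {

    private static final Logger log = LoggerFactory.getLogger(DemoController.class);

    @GetMapping("/hello")
    public String hello() {
        // accepted by FILE-INFO, written to info.log, then picked up by Logstash
        log.info("hello endpoint called");
        // accepted by FILE-ERROR and written to error.log
        log.error("example error event");
        return "hello";
    }
}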

ES Core Concepts

Index

Mapping (field types)

Document

Elasticsearch is document-oriented: everything it stores is JSON. Compared with a relational database:

Relational DB         Elasticsearch
database              index
table                 type
row                   document
column                field

(Note: types are deprecated in Elasticsearch 7.x; each index effectively holds a single type, _doc.)
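The analogy in concrete terms, as Kibana Dev Tools commands (the index name "ywb" and the fields are just examples):

# create an index
PUT /ywb

# index a document (a row, in relational terms)
PUT /ywb/_doc/1
{
  "name": "ywb",
  "age": 18
}

# fetch it back
GET /ywb/_doc/1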

Part 3: Integrating Spring Boot with Elasticsearch 7.7.0

1. Add the Maven dependencies. Overriding elasticsearch.version makes the transitive client libraries match the 7.7.0 server:

   <properties>
        <java.version>1.8</java.version>
        <elasticsearch.version>7.7.0</elasticsearch.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
    </dependencies>    
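The test and service code below serializes entities with fastjson (JSON.toJSONString), so the project also needs that dependency; a sketch, with the version as an assumption:

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.70</version>
        </dependency>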

2. Write the configuration class. (ES can run as a single standalone search server; to handle large data sets with fault tolerance and high availability, it can also run as a cluster of cooperating servers. This example uses a single node.)

/**
 * @author : ywb
 * @createdDate : 2020/5/31
 * @updatedDate
 */
@Configuration
public class ElasticSearchConfig {
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        RestHighLevelClient restHighLevelClient = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http")
                )
        );
        return restHighLevelClient;
    }
}

3. Create, read, update, and delete indices and documents (the basic Java API for ES):
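The tests reference a User entity that the original post does not show; a minimal sketch matching how it is used (the field names are assumptions):

public class User {

    private String name;
    private int age;

    public User() {
    }

    public User(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // getters/setters are needed for fastjson serialization
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}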

import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.io.IOException;
import java.util.ArrayList;

@RunWith(SpringRunner.class)
@SpringBootTest
public class ElasticsearchApplicationTests {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    // Create an index
    @Test
    public void testCreateIndex() throws IOException {
        CreateIndexRequest createIndexRequest = new CreateIndexRequest("ywb");
        CreateIndexResponse response = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
        System.out.println(response);
    }

    /**
     * Check whether an index exists
     *
     * @throws IOException
     */
    @Test
    public void testExistIndex() throws IOException {
        GetIndexRequest request = new GetIndexRequest("ywb");
        boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
        System.out.println(exists);
    }

    /**
     * Delete an index
     */
    @Test
    public void deleteIndex() throws IOException {
        DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest("ywb");
        AcknowledgedResponse delete = restHighLevelClient.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT);
        System.out.println(delete.isAcknowledged());
    }

    /**
     * Add a document
     *
     * @throws IOException
     */
    @Test
    public void createDocument() throws IOException {
        User user = new User("ywb", 18);
        IndexRequest request = new IndexRequest("ywb");
        request.id("1");
        request.timeout(TimeValue.timeValueSeconds(1));
        // put our data into the request as JSON
        request.source(JSON.toJSONString(user), XContentType.JSON);
        // the client sends the request
        IndexResponse index = restHighLevelClient.index(request, RequestOptions.DEFAULT);
        System.out.println(index.toString());
        // the returned status (CREATED on first insert, OK on overwrite)
        System.out.println(index.status());
    }

    // Check whether a document exists
    @Test
    public void testIsExist() throws IOException {
        GetRequest getRequest = new GetRequest("ywb", "1");
        // do not fetch the _source content; only check existence
        getRequest.fetchSourceContext(new FetchSourceContext(false));
        getRequest.storedFields("_none_");
        boolean exists = restHighLevelClient.exists(getRequest, RequestOptions.DEFAULT);
        System.out.println(exists);
    }

    // Get a document
    @Test
    public void testGetDocument() throws IOException {
        GetRequest getRequest = new GetRequest("ywb", "1");
        GetResponse response = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
        // print the document source
        System.out.println(response.getSourceAsString());
        System.out.println(response);
    }

    // Update a document
    @Test
    public void testUpdateDocument() throws IOException {
        UpdateRequest request = new UpdateRequest("ywb", "1");
        request.timeout("1s");
        User user = new User("ywb java", 19);
        request.doc(JSON.toJSONString(user), XContentType.JSON);
        UpdateResponse update = restHighLevelClient.update(request, RequestOptions.DEFAULT);
        System.out.println(update);
        System.out.println(update.status());
    }
    // Delete a document
    @Test
    public void testDeleteDocument() throws IOException {
        DeleteRequest request = new DeleteRequest("ywb", "1");
        request.timeout("10s");
        DeleteResponse update = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
        System.out.println(update.status());
    }
    // Bulk insert
    @Test
    public void testBulkRequest() throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.timeout("10s");
        ArrayList<User> users = new ArrayList<>();
        users.add(new User("zhangsan", 1));
        users.add(new User("lishi", 12));
        users.add(new User("wangwu", 13));
        users.add(new User("zhaoliu", 14));
        users.add(new User("tianqi", 15));
        for (int i = 0; i < users.size(); i++) {
            bulkRequest.add(
                    new IndexRequest("ywb")
                            // use String.valueOf(i + 1); the original "" + i + 1 concatenated strings and produced ids "01", "11", "21", ...
                            .id(String.valueOf(i + 1))
                            .source(JSON.toJSONString(users.get(i)), XContentType.JSON)
            );
        }
        BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        // hasFailures() is false when every item succeeded
        System.out.println(bulk.hasFailures());

    }
}

Part 4: Hands-On (scrape JD.com search results and index them into Elasticsearch)

1. Add jsoup to parse the search result pages:

        <!-- HTML parsing -->
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.10.2</version>
        </dependency>

2. Utility class:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/**
 * @author : ywb
 * @createdDate : 2020/5/31
 * @updatedDate
 */
@Component
public class HtmlParseUtils {

    /**
     * Scrape JD.com search results for the given keyword.
     *
     * @param keyword search term
     * @return parsed product entries
     * @throws IOException on network failure
     */
    public List<Content> listGoods(String keyword) throws IOException {
        String url = "https://search.jd.com/Search?keyword=" + keyword;
        // 30-second timeout; JD may change its markup, so this parsing is best-effort
        Document document = Jsoup.parse(new URL(url), 30000);
        Element goodsList = document.getElementById("J_goodsList");
        Elements items = goodsList.getElementsByTag("li");
        List<Content> list = new ArrayList<>();
        for (Element item : items) {
            String img = item.getElementsByTag("img").eq(0).attr("src");
            String price = item.getElementsByClass("p-price").eq(0).text();
            String title = item.getElementsByClass("p-name").eq(0).text();
            Content content = new Content();
            content.setImg(img);
            content.setPrices(price);
            content.setTitle(title);
            list.add(content);
        }
        return list;
    }
}
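The Content entity is not shown in the original post either; a minimal sketch matching the setters used above (field names are assumptions):

public class Content {

    private String title;
    private String img;
    private String prices;

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getImg() { return img; }
    public void setImg(String img) { this.img = img; }
    public String getPrices() { return prices; }
    public void setPrices(String prices) { this.prices = prices; }
}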

3. Service (fetch JD data, store it in ES, and query it for display; a simple controller sketch follows the class):

import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/**
 * @author : ywb
 * @createdDate : 2020/5/31
 * @updatedDate
 */
@Service
public class ContentService {
    @Autowired
    private HtmlParseUtils htmlParseUtils;

    @Autowired
    @Qualifier("restHighLevelClient")
    private RestHighLevelClient client;

    /**
     * Fetch JD search results and bulk-index them into ES.
     * @param keyWords search term
     * @return true when every bulk item succeeded
     * @throws IOException
     */
    public Boolean parseContent(String keyWords) throws IOException {
        // pass the caller's keyword through (the original hard-coded "java" here)
        List<Content> contents = htmlParseUtils.listGoods(keyWords);
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.timeout("2m");
        for (int i = 0; i < contents.size(); i++) {
            bulkRequest.add(
                    new IndexRequest("jd")
                    .source(JSON.toJSONString(contents.get(i)), XContentType.JSON));

        }
        BulkResponse bulk = client.bulk(bulkRequest, RequestOptions.DEFAULT);
        return !bulk.hasFailures();
    }

    /**
     * Paged query against Elasticsearch with highlighted matches.
     * @param keyWords search term
     * @param pageNum  1-based page number
     * @param pageSize page size
     * @return hits with the title replaced by its highlighted fragment
     * @throws IOException
     */
    public List<Map<String, Object>> searchPage(String keyWords, Integer pageNum, Integer pageSize) throws IOException {
        if (pageNum < 1) {
            pageNum = 1;
        }
        SearchRequest searchRequest = new SearchRequest("jd");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // paging: from is a document offset, not a page number
        sourceBuilder.from((pageNum - 1) * pageSize);
        sourceBuilder.size(pageSize);
        // term query matches a single analyzed token; consider matchQuery for full-text matching
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyWords);
        sourceBuilder.query(termQueryBuilder);
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));

        // highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("title");
        // requireFieldMatch(false) allows highlighting fields other than the one queried
        highlightBuilder.requireFieldMatch(false);
        highlightBuilder.preTags("<span style = 'color:red'>");
        highlightBuilder.postTags("</span>");
        sourceBuilder.highlighter(highlightBuilder);
        // execute the search
        searchRequest.source(sourceBuilder);
        SearchResponse search = client.search(searchRequest, RequestOptions.DEFAULT);
        // parse the results
        List<Map<String, Object>> maps = new ArrayList<>();
        for (SearchHit hit : search.getHits().getHits()) {
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            HighlightField title = highlightFields.get("title");
            // the original _source as a map
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            // if the title was highlighted, replace the original value with the highlighted fragments
            if (title != null) {
                Text[] fragments = title.fragments();
                StringBuilder newTitle = new StringBuilder();
                for (Text fragment : fragments) {
                    newTitle.append(fragment);
                }
                sourceAsMap.put("title", newTitle.toString());
            }
            maps.add(sourceAsMap);
        }
        return maps;
    }
}
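The post mentions displaying the results on a page but omits the web layer; a minimal controller sketch, assuming a JSON REST API (the paths and class name are hypothetical):

import java.io.IOException;
import java.util.List;
import java.util.Map;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ContentController {

    @Autowired
    private ContentService contentService;

    // scrape JD for the keyword and bulk-index the results, e.g. GET /parse/java
    @GetMapping("/parse/{keyword}")
    public Boolean parse(@PathVariable("keyword") String keyword) throws IOException {
        return contentService.parseContent(keyword);
    }

    // paged, highlighted search, e.g. GET /search/java/1/10
    @GetMapping("/search/{keyword}/{pageNum}/{pageSize}")
    public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                            @PathVariable("pageNum") Integer pageNum,
                                            @PathVariable("pageSize") Integer pageSize) throws IOException {
        return contentService.searchPage(keyword, pageNum, pageSize);
    }
}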

Result:

Part 5: Run the Stack with docker-compose for Simpler Deployment

5.1: Write docker-compose.yaml

version: '3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.12.1   # note: 7.12.1 here; pin to 7.7.0 if it must match the Java client above
    container_name: elasticsearch_server
    restart: always
    environment:
      - discovery.type=single-node
      # discovery.zen.minimum_master_nodes is deprecated and ignored in 7.x, so it is omitted here
      - ES_JAVA_OPTS=-Xms3g -Xmx3g
    volumes:
      - ./elasticsearch/data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      elk_net:     # the network this container joins
        aliases:
          - elasticsearch     # other containers on elk_net can reach this container via the alias "elasticsearch"

  kibana:
    image: docker.elastic.co/kibana/kibana:7.12.1
    container_name: kibana_server
    ports:
      - "5601:5601"
    restart: always
    networks:
      elk_net:
        aliases:
          - kibana
    environment:
      # Kibana 7.x reads ELASTICSEARCH_HOSTS (ELASTICSEARCH_URL was the 6.x name)
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
      - SERVER_NAME=kibana
    # For detailed settings, create ./config/kibana.yml and mount it:
    # volumes:
    #   - ./config/kibana.yml:/usr/share/kibana/config/kibana.yml
    depends_on:
     - elasticsearch

  logstash:
    image: docker.elastic.co/logstash/logstash:7.12.1
    container_name: logstash_server
    restart: always
    environment:
      - LS_JAVA_OPTS=-Xmx256m -Xms256m
    volumes:
      - ./config/logstash.conf:/etc/logstash/conf.d/logstash.conf
    networks:
      elk_net:
        aliases:
          - logstash
    depends_on:
      - elasticsearch
    entrypoint:
      - logstash
      - -f
      - /etc/logstash/conf.d/logstash.conf
    logging:
      driver: "json-file"
      options:
        max-size: "200m"
        max-file: "3"

networks:
  elk_net:
    external:
      name: elk_net
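The compose file declares elk_net as an external network, so it must exist before the first docker-compose up; create it once with:

docker network create elk_net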

5.2: docker-compose up -d

5.3: docker ps (verify that the three containers are up)


<<Why can a compass draw a circle? Because its feet move while its center stays fixed. Why can't you fulfill your dream? Because your heart is unsettled and your feet don't move! Firm conviction and persistent choice matter more than anything.>>
