Straw Q&A: Searching with Elasticsearch

1 Elasticsearch Search Engine

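When it comes to data retrieval, most developers first think of SQL queries against a database. With index support, SQL queries perform very well: a lookup across tens of millions of rows can return in a dozen or so milliseconds. But database indexes are not a cure-all. A full-text search, such as a substring match, cannot use an ordinary index.

For example:

SELECT * FROM question WHERE content LIKE '%Java%'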

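This SQL statement retrieves the questions whose content contains Java. When the LIKE pattern starts with %, the database cannot use the index to optimize the query and has to compare the rows one by one. With a small amount of data the performance is acceptable, but with a large amount, say tens of millions of rows, a single query can take minutes, which is unacceptable. Traditional relational databases are a poor fit for full-text search.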

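Full-text search technology was designed to solve exactly this performance problem of searching inside content. A full-text search server packages specialized tokenization and indexing techniques, so it can match content across hundreds of millions of records and often return results within seconds.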

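Notes:

For exact matches on a whole field, a relational database performs well as long as an index supports the query;
Relational databases perform poorly at full-text search and are not suited to it;
Dedicated search engines implement tokenized indexes internally and are well suited to full-text search.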
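The contrast between scanning rows and looking up tokens can be illustrated with a toy inverted index. This is a minimal sketch in plain Java, not how Elasticsearch or Lucene is actually implemented; the class InvertedIndexSketch, its simple tokenizer, and the sample documents are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // token -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Toy tokenizer: lower-case, then split on anything that is not a letter or digit.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index time: record every token of the document in the postings map.
    public void add(int docId, String content) {
        for (String token : tokenize(content)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Search time: one map lookup instead of a comparison against every document.
    public Set<Integer> search(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Java basic types");
        index.add(2, "How to create a thread in Java");
        index.add(3, "SQL index tuning");
        System.out.println(index.search("java"));  // prints [1, 2]
        System.out.println(index.search("index")); // prints [3]
    }
}
```

The work is moved from query time to index time: the cost of a search no longer grows with the number of documents that do not contain the word.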

Elasticsearch is one of the most popular full-text search services today, and it can be used to implement full-text search.

1.1 Elasticsearch

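Elasticsearch is a search server built on Lucene. Lucene is the core technology and Elasticsearch is the complete application. To use a computer as an analogy, Lucene is like the chip and Elasticsearch is like a complete computer built around that chip, ready to use out of the box.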

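It provides a distributed, multi-tenant full-text search engine; deployed as a cluster, it can offer highly concurrent, high-performance, highly available search;
It exposes a RESTful web interface, so it can be called from any programming language;
Elasticsearch is developed in Java and was released as open source under the terms of the Apache License; it is a popular enterprise search engine.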

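To install Elasticsearch, follow these steps:

First, Elasticsearch is written in Java and needs a Java environment of JDK 8 or later. Install and configure Java first; if the Java environment variables are wrong, Elasticsearch will most likely fail to start.

Then download the installation package:

Download the Windows build on Windows, the Mac build on a Mac, and the Linux build on Linux. On Windows, extract the download into a folder:

Then go into the bin folder and run elasticsearch.bat to start Elasticsearch: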

You can also start Elasticsearch from the Windows command line:

D:
cd \opt\elasticsearch-7.6.2\bin
elasticsearch.bat

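Do not close the window after startup; closing it stops Elasticsearch:

Test: open http://localhost:9200 in a browser. If it shows the server information, the installation succeeded. Elasticsearch also supports cluster deployment, but that is an operations concern with little bearing on application development, so it is not covered in detail here.

Installation on Linux and Mac is essentially the same: download the archive, extract it to disk, and start it from the command line.

Mac: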

tar -xvf elasticsearch-7.6.2-darwin-x86_64.tar.gz
cd elasticsearch-7.6.2/bin
./elasticsearch

Linux:

tar -xvf elasticsearch-7.6.2-linux-x86_64.tar.gz
cd elasticsearch-7.6.2/bin
./elasticsearch
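
The browser check against http://localhost:9200 can also be done from code. The following is a minimal sketch, assuming Java 11 or later (the java.net.http package is not available on JDK 8) and a local Elasticsearch listening on port 9200. The class name EsPing and the helper buildPing are hypothetical names used only for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EsPing {
    // Build a plain GET request against the root URL of the server.
    public static HttpRequest buildPing(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl)).GET().build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Assumption: Elasticsearch was started locally as described above.
        HttpResponse<String> response = client.send(
                buildPing("http://localhost:9200"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // 200 when the server is up
        System.out.println(response.body());       // JSON with the server information
    }
}
```

If the server is not running, send throws a connection exception instead of returning a response.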

1.2 Accessing Elasticsearch with a RESTful Tool

First, create a straw-search module in the straw project to hold the search functionality.

Configure the POM of the straw-search project:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>cn.tedu</groupId>
        <artifactId>straw</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <artifactId>straw-search</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>straw-search</name>
    <description>Search module of the Straw question-and-answer project</description>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
        </dependency>
        <dependency>
            <groupId>org.junit.vintage</groupId>
            <artifactId>junit-vintage-engine</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

Register the module in the straw parent project:

<modules>
    <module>straw-portal</module>
    <module>straw-generator</module>
    <module>straw-resource</module>
    <module>straw-eureka</module>
    <module>straw-gateway</module>
    <module>straw-sys</module>
    <module>straw-commons</module>
    <module>straw-faq</module>
    <module>straw-search</module>
</modules>

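IDEA provides a RESTful API testing tool that gives direct, visual access to any REST API. To use it:

Create an HTTP Request file in the project:

Write the HTTP requests in this file and run them from there:

The file is plain text and can hold a large number of debugging requests, for example: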

### First test
GET http://localhost:9200

### Create an index in ES
PUT http://localhost:9200/demo_index

### Delete an index from ES
DELETE http://localhost:9200/demo_index

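1.3 Installing the IK Analysis Plugin

When a document is stored, ES runs an analyzer over it to extract a number of tokens, which support index storage and search.

ES ships with many built-in analyzers, but they handle Chinese poorly. The examples below show how the built-in analyzer behaves. From the HTTP client, send the following REST request to tokenize an English sentence: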

### English tokenization test
POST http://localhost:9200/_analyze
Content-Type: application/json

{
    "text":"Hello World!"
}

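The result shows that the sentence "Hello World!" is split into two words, hello and world. English words are naturally separated by spaces, so splitting at those boundaries works without any problem.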

POST http://localhost:9200/_analyze

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "tokens": [
            {
                "token": "hello",
                "start_offset": 0,
                "end_offset": 5,
                "type": "<ALPHANUM>",
                "position": 0
            },
            {
                "token": "world",
                "start_offset": 6,
                "end_offset": 11,
                "type": "<ALPHANUM>",
                "position": 1
            }
        ]
    }

Response code: 200 (OK); Time: 287ms; Content length: 179 bytes

Now look at a Chinese sentence. The REST request is:

POST http://localhost:9200/_analyze
Content-Type:application/json

{
    "text": "世界你好!"
}

The response is:

POST http://localhost:9200/_analyze

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "tokens": [
            {
                "token": "世",
                "start_offset": 0,
                "end_offset": 1,
                "type": "<IDEOGRAPHIC>",
                "position": 0
            },
            {
                "token": "界",
                "start_offset": 1,
                "end_offset": 2,
                "type": "<IDEOGRAPHIC>",
                "position": 1
            },
            {
                "token": "你",
                "start_offset": 2,
                "end_offset": 3,
                "type": "<IDEOGRAPHIC>",
                "position": 2
            },
            {
                "token": "好",
                "start_offset": 3,
                "end_offset": 4,
                "type": "<IDEOGRAPHIC>",
                "position": 3
            }
        ]
    }

Response code: 200 (OK); Time: 222ms; Content length: 340 bytes

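As the result shows, this analyzer splits every Chinese character into its own token, which makes the tokenization meaningless for Chinese, so the default ES analyzer is not suitable for Chinese text. Fortunately there are several good third-party Chinese analyzers that integrate well with ES. In ES, every analyzer, built-in or third-party, has a name. The default used above is the analyzer named standard, so the previous request is equivalent to:

### Chinese tokenization test
POST http://localhost:9200/_analyze
Content-Type: application/json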

{
    "analyzer": "standard",
    "text": "世界你好!"
}

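To switch analyzers, just set the "analyzer" field to the name of the analyzer you want. ES supports third-party analyzers through plugins. Among third-party Chinese analyzers, the commonly used ones are smartcn, based on ICTCLAS from the Chinese Academy of Sciences, and IKAnalyzer. This section covers IKAnalyzer (called ik below).

First download it:

Then create a subfolder named ik inside the plugins folder of Elasticsearch, extract the downloaded file elasticsearch-analysis-ik-7.6.2.zip, and copy its contents into that folder. The plugin version has to match the Elasticsearch version, 7.6.2 here:

Restart Elasticsearch and the plugin is installed. Reference commands for Linux and Mac:

cd elasticsearch-7.6.2/plugins
mkdir ik
cd ik
unzip ~/elasticsearch-analysis-ik-7.6.2.zip

On Linux and Mac you also need to restart Elasticsearch. To check the ik plugin: ik provides two analyzers, ik_max_word and ik_smart, which are tested in turn below.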

First test ik_smart with the following request:

### Chinese tokenization test with the ik analyzers: ik_smart, ik_max_word
POST http://localhost:9200/_analyze
Content-Type: application/json

{
    "analyzer": "ik_smart",
    "text": "世界你好!"
}

Result:

POST http://localhost:9200/_analyze

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "tokens": [
            {
                "token": "世界",
                "start_offset": 0,
                "end_offset": 2,
                "type": "CN_WORD",
                "position": 0
            },
            {
                "token": "你好",
                "start_offset": 2,
                "end_offset": 4,
                "type": "CN_WORD",
                "position": 1
            }
        ]
    }

Response code: 200 (OK); Time: 202ms; Content length: 166 bytes

Then test ik_max_word:

### Chinese tokenization test with the ik analyzers: ik_smart, ik_max_word
POST http://localhost:9200/_analyze
Content-Type: application/json

{
    "analyzer": "ik_max_word",
    "text": "世界你好!"
}

Result (if the difference is not obvious, try another text, for example 世界如此之大):

POST http://localhost:9200/_analyze

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "tokens": [
            {
                "token": "世界",
                "start_offset": 0,
                "end_offset": 2,
                "type": "CN_WORD",
                "position": 0
            },
            {
                "token": "你好",
                "start_offset": 2,
                "end_offset": 4,
                "type": "CN_WORD",
                "position": 1
            }
        ]
    }

Response code: 200 (OK); Time: 181ms; Content length: 166 bytes

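Comparing the two analyzers on the same Chinese sentence, ik_max_word generally produces more words than ik_smart, as the names suggest, although for this short sentence the two outputs happen to be identical. The trade-off is that ik_max_word takes up more storage space.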

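1.4 Searching with Elasticsearch

Before searching with ES you must create an index structure in ES; searching only works once the index exists. The index structure in ES 7.x is as follows:

An ES server node can store multiple indexes
Each index contains multiple documents
Each document is a JSON data structure

The REST API can be used to build this storage structure:

First create the questions index:

### Create an index
PUT http://localhost:9200/questions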

The response is shown below; acknowledged means the request was confirmed:

PUT http://localhost:9200/questions

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "acknowledged": true,
        "shards_acknowledged": true,
        "index": "questions"
    }

Response code: 200 (OK); Time: 595ms; Content length: 68 bytes

If the index was created incorrectly, it can be deleted:

### Delete an index
DELETE http://localhost:9200/questions

Set the analyzer for the document fields in the questions index:

### Define the structure of the questions index: title and content use the ik analyzer
POST http://localhost:9200/questions/_mapping
Content-Type: application/json

{
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
        },
        "content": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
        }
    }
}

Response:

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8
    {
        "acknowledged": true
    }

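Store a document in the questions index. The URL ends with the unique id of the document, which can be a number or a string:

### Add a document to the questions index in ES
POST http://localhost:9200/questions/_create/1
Content-Type: application/json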

{
    "id":1,
    "title": "Java的基本类型有哪些?",
    "content": "每次面试的时候都有人问Java的基本类型，为啥呀!"
}

The response shows that the document is stored at /questions/_doc/1:

POST http://localhost:9200/questions/_create/1

HTTP/1.1 201 Created
    Location: /questions/_doc/1
        content-type: application/json; charset=UTF-8

        {
            "_index": "questions",
            "_type": "_doc",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "_seq_no": 0,
            "_primary_term": 1
        }

Response code: 201 (Created); Time: 740ms; Content length: 157 bytes

Add more documents to the questions index:

### Add more data
POST http://localhost:9200/questions/_create/2
Content-Type: application/json

{
    "id":2,
    "title": "Java中int类型的数据范围?",
    "content": "int类型的有效范围是啥，为啥要记住这个范围?"

}

### Add more data
POST http://localhost:9200/questions/_create/3
Content-Type: application/json

{
    "id":3,
    "title": "Java中double类型的数据范围?",
    "content": "double类型的有效范围是啥，为啥要记住这个范围?"

}

### Add more data
POST http://localhost:9200/questions/_create/4
Content-Type: application/json

{
    "id":4,
    "title": "线程创建方式有哪些?",
    "content": "如何创建线程,每一种创建方式适合哪些情况?"

}

Update a document in the questions index, changing the title field of the document with id 4:

### Update a document
POST http://localhost:9200/questions/_update/4
Content-Type: application/json

{
    "doc": {
        "title": "Java线程创建方式有哪些?"
    }
}

Delete a document from the questions index:

DELETE http://localhost:9200/questions/_doc/2

Read a document from the questions index:

### Get a document
GET http://localhost:9200/questions/_doc/4

Response:

GET http://localhost:9200/questions/_doc/4

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "_index": "questions",
        "_type": "_doc",
        "_id": "4",
        "_version": 1,
        "_seq_no": 1,
        "_primary_term": 1,
        "found": true,
        "_source": {
            "id": 4,
            "title": "线程创建方式有哪些?",
            "content": "如何创建线程,每一种创建方式适合哪些情况?"
        }
    }

Response code: 200 (OK); Time: 201ms; Content length: 190 bytes

With the questions index in place, ES can be used for full-text search:

### Search ES
POST http://localhost:9200/questions/_search
Content-Type: application/json

{
    "query": {"match": {"title": "类型"}}
}

Result:

POST http://localhost:9200/questions/_search

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "took": 264,
        "timed_out": false,
        "_shards": {
            "total": 1,
            "successful": 1,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": {
                "value": 2,
                "relation": "eq"
            },
            "max_score": 0.7290471,
            "hits": [
                {
                    "_index": "questions",
                    "_type": "_doc",
                    "_id": "1",
                    "_score": 0.7290471,
                    "_source": {
                        "id": 1,
                        "title": "Java的基本类型有哪些?",
                        "content": "每次面试的时候都有人问Java的基本类型，为啥呀!"
                    }
                },
                {
                    "_index": "questions",
                    "_type": "_doc",
                    "_id": "3",
                    "_score": 0.6983144,
                    "_source": {
                        "id": 3,
                        "title": "Java中double类型的数据范围?",
                        "content": "double类型的有效范围是啥，为啥要记住这个范围?"
                    }
                }
            ]
        }
    }

Response code: 200 (OK); Time: 283ms; Content length: 493 bytes

Multi-field search:

### Multi-field search
POST http://localhost:9200/questions/_search
Content-Type: application/json

{
    "query": {
        "bool": {
            "should": [
                {"match": {"title": "java类型"}},
                {"match": {"content": "java类型"}}
            ]
        }
    }
}

Result:

POST http://localhost:9200/questions/_search

HTTP/1.1 200 OK
    content-type: application/json; charset=UTF-8

    {
        "took": 43,
        "timed_out": false,
        "_shards": {
            "total": 1,
            "successful": 1,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": {
                "value": 2,
                "relation": "eq"
            },
            "max_score": 2.9808133,
            "hits": [
                {
                    "_index": "questions",
                    "_type": "_doc",
                    "_id": "1",
                    "_score": 2.9808133,
                    "_source": {
                        "id": 1,
                        "title": "Java的基本类型有哪些?",
                        "content": "每次面试的时候都有人问Java的基本类型，为啥呀!"
                    }
                },
                {
                    "_index": "questions",
                    "_type": "_doc",
                    "_id": "3",
                    "_score": 1.7646317,
                    "_source": {
                        "id": 3,
                        "title": "Java中double类型的数据范围?",
                        "content": "double类型的有效范围是啥，为啥要记住这个范围?"
                    }
                }
            ]
        }
    }

Response code: 200 (OK); Time: 384ms; Content length: 492 bytes

2 Integrating Elasticsearch with Spring Boot

2.1 spring-data-elasticsearch

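The Java client provided by Elasticsearch has some inconveniences:

Many places require concatenating JSON strings, which is tedious and error-prone in Java
You have to serialize objects to JSON yourself before storing them
You also have to deserialize query results back into objects yourself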
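
To make the first point concrete, here is a minimal sketch in plain Java, with no Elasticsearch client involved, of what hand-assembling a match query body looks like. The class MatchQuerySketch and the helpers escape and matchQuery are hypothetical names for illustration, and the escaping deliberately covers only a few characters, so this is not a complete JSON encoder.

```java
public class MatchQuerySketch {
    // Escape the characters most likely to break a JSON string literal.
    // Assumption: only quotes, backslashes, and newlines are handled here.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':
                    sb.append("\\\"");
                    break;
                case '\\':
                    sb.append("\\\\");
                    break;
                case '\n':
                    sb.append("\\n");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    // Concatenate the body of a match query for one field by hand.
    public static String matchQuery(String field, String text) {
        return "{\"query\":{\"match\":{\"" + escape(field) + "\":\"" + escape(text) + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println(matchQuery("title", "java"));
        // prints {"query":{"match":{"title":"java"}}}
    }
}
```

Even this single-field query needs careful quoting and escaping, and every additional clause makes the concatenation harder to read and easier to get wrong.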

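Using the Elasticsearch client directly is somewhat cumbersome. Spring offers Spring-Data-Elasticsearch, which is concise and convenient, so the rest of this article uses Spring-Data-Elasticsearch to access Elasticsearch.

The Spring family integrates a great many software components, and Elasticsearch is among them. This tool is a subproject of the Spring Data family, and underneath it wraps the official client provided by Elasticsearch.

The first step in using Spring-Data-Elasticsearch is to add the dependency to the straw-search project: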

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

Then configure the parameters in application.properties, including the location of the ES server and the logging output:

server.port=8099
spring.elasticsearch.rest.uris=http://localhost:9200

logging.level.cn.tedu.straw.search=debug
logging.level.org.elasticsearch.client.RestClient=debug

Write a test case:

package cn.tedu.straw.search;

import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;

import javax.annotation.Resource;

@SpringBootTest
@Slf4j
class StrawSearchApplicationTests {

    @Resource
    ElasticsearchOperations elasticsearchOperations;

    @Test
    void contextLoads() {
        log.debug("{}",elasticsearchOperations);
    }
}

2.2 Using ElasticsearchRepository

Spring recommends ElasticsearchRepository as the data access interface. To use it, first define an entity class that maps to the data stored in ES:

public class Item {
    Long id;
    String title;    // title
    String category; // category
    String brand;    // brand
    Double price;    // price
    String image;    // image URL
}

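Then declare the field mappings with Spring Data annotations. There are three of them:

@Document applies to the class and marks the entity class as a document object
- indexName: the name of the corresponding index
@Id applies to a member variable and marks the field as the id (primary key)
@Field applies to a member variable, marks it as a document field, and specifies its mapping:
- type: the field type, a value of the FieldType enum
- index: whether the field is indexed, boolean, default true
- store: whether the field is stored, boolean, default false
- analyzer: the analyzer name

After adding the annotations: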

package cn.tedu.straw.search.vo;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.Accessors;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.io.Serializable;

@Data
@Slf4j
@Accessors(chain=true)
@NoArgsConstructor  // generate a no-argument constructor
@AllArgsConstructor // generate a constructor with all fields
// The Spring Data ES API automatically creates the items index in ES
@Document(indexName="items")
public class Item implements Serializable{

    @Id
    private Long id;

    @Field(type= FieldType.Text, analyzer="ik_smart", searchAnalyzer = "ik_smart")
    private String title;

    @Field(type = FieldType.Keyword)
    private String category;

    @Field(type = FieldType.Keyword)
    private String brand;

    @Field(type = FieldType.Keyword)
    private String image;

    @Field(type = FieldType.Double)
    private Double price;
}

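Note: FieldType.Text marks a text field, and analyzer = "ik_smart" means a tokenized index is built for this field.
@Field(type = FieldType.Keyword) means the whole field value is treated as a single search keyword and is not tokenized.
@Field(index = false) means the field takes no part in building the index: it is not indexed and occupies no index space. Use it for properties that are never queried.

Then define the data access interface: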

@Repository
public interface ItemRepository extends ElasticsearchRepository<Item, Long> {
}

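The strength of Spring Data is that you do not have to write any data-layer logic: CRUD operations are derived automatically from method names and type information. Define an interface that extends one of the sub-interfaces of Repository, and you get all the basic CRUD functions.

Defining the interface and extending the base interface is all it takes.

When an interface extends ElasticsearchRepository, Spring automatically supplies the implementation class, and the interface inherits the basic CRUD methods. Create an Item object, call the save method of ItemRepository, and the data is stored in ES: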

@SpringBootTest
@Slf4j
public class ItemRepositoryTests {

    @Resource
    ItemRepository itemRepository;

    @Test
    void save(){
        Item item=new Item()
                .setId(1L)
                .setTitle("小米K20手机")
                .setBrand("小米")
                .setCategory("手机")
                .setImage("image/1.png")
                .setPrice(2345.00);
        Item obj=itemRepository.save(item);
        log.debug("{}",obj);
    }
}

ElasticsearchRepository can do more than insert a single record; it also provides saveAll for storing data in bulk:

@Test
void saveAll(){
    List<Item> list=new ArrayList<>();
    list.add(new Item(2L, "小米手机10",
                      "手机" ,"小米", "2.png", 2000.00));
    list.add(new Item(3L, "小米手机9",
                      "手机" ,"小米", "3.png", 1800.00));
    list.add(new Item(4L, "小米手机8",
                      "手机" ,"小米", "4.png", 1600.00));
    list.add(new Item(5L, "小米手机7",
                      "手机" ,"小米", "5.png", 1500.00));
    list.add(new Item(6L, "小米手机6",
                      "手机" ,"小米", "6.png", 135000.00));
    list.add(new Item(7L, "华为手机10",
                      "手机" ,"华为", "7.png", 3000.00));
    list.add(new Item(8L, "华为手机9",
                      "手机" ,"华为", "8.png", 2800.00));
    itemRepository.saveAll(list);
    log.debug("OK");
}

ElasticsearchRepository provides a method for querying all Item objects, which can be used to verify the inserts above:

@Test
void findAll(){
    Iterable<Item> items=itemRepository.findAll();
    items.forEach(item -> log.debug("{}",item));
}

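2.3 Full-Text Search

Another powerful feature of Spring Data is implementing functionality automatically from method names.

For example, if your method is named findByTitle, Spring Data knows you are querying by title and implements it for you, with no implementation class to write. Underneath, these methods all end up executing Elasticsearch REST search requests.

When relying on generated queries, method names have to follow certain conventions. The Spring plugin built into IDEA supports these conventions, so you can assemble a suitable query name from the auto-completion hints.

For example, define a query in the ItemRepository interface that performs a full-text search on title using the query parameter:

/**
 * Search for Item objects whose title contains the given keywords.
 * @param title keywords for the title; they are tokenized automatically for the search
 * @return the Item objects whose title contains the keywords
 */
Iterable<Item> queryItemsByTitleMatches(String title);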

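Note: when a Spring Data query method returns multiple records, use Iterable<element type> as the return type. Iterable<Item> is the iterable interface provided by Java; it acts like an iterable collection and can be traversed with a for-each loop or the forEach method. Choose the number of parameters according to the method's logic: here there is one condition, title, so there is one parameter, title, and Spring uses it for the query automatically.

It has the same effect as the following native query:

POST http://localhost:9200/items/_search
Content-Type: application/json

{
    "query": {"match": {"title": "手机"}}
}

Test case: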

@Test
void queryItemsByTitleMatches(){
    Iterable<Item> items=itemRepository.queryItemsByTitleMatches("手机");
    items.forEach(item -> log.debug("{}",item));
}

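Here is another example with two query parameters. To find certain products under a certain brand, run a full-text search on both title and brand:

/**
 * Find all products of a given brand whose title contains the keywords.
 */
Iterable<Item> queryItemsByTitleMatchesAndBrandMatches(
    String title, String brand
);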

This query is equivalent to the native REST query:

POST http://localhost:9200/items/_search
Content-Type: application/json

{
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "手机"}},
                {"match": {"brand": "小米"}}
            ]
        }
    }
}

Test case:

@Test
void queryItemsByTitleMatchesAndBrandMatches(){
    Iterable<Item> items=itemRepository.queryItemsByTitleMatchesAndBrandMatches(
        "手机","小米"
    );
    items.forEach(item -> log.debug("{}",item));
}

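The results above come back in no particular order, and in practice you often want them sorted. To sort by price, for example, define the following query; OrderByPriceAsc means ordering by price ascending:

/**
 * Find all products of a given brand whose title contains the keywords,
 * sorted by ascending price.
 */
Iterable<Item> queryItemsByTitleMatchesAndBrandMatchesOrderByPriceAsc(
    String title, String brand
);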

Equivalent to the native REST query:

POST http://localhost:9200/items/_search
Content-Type: application/json

{
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "手机"}},
                {"match": {"brand": "小米"}}
            ]
        }
    },"sort":[{"price":"asc"}]
}

Test case:

@Test
void queryItemsByTitleMatchesAndBrandMatchesOrderByPriceAsc(){
    Iterable<Item> items=itemRepository.queryItemsByTitleMatchesAndBrandMatchesOrderByPriceAsc(
        "手机","小米"
    );
    items.forEach(item -> log.debug("{}",item));
}

Large result sets usually need to be shown page by page. Spring Data supports paging: wrap the return value in Page<type> and add a Pageable parameter to the method:

/**
 * Paged query
 */
Page<Item> queryItemsByTitleMatchesAndBrandMatchesOrderByPriceAsc(
    String title, String brand, Pageable pageable
);

Test case:

@Test
void pageable(){
    int pageNum=1; // page numbers in Spring Data start at 0, so 1 is the second page
    int pageSize=3; // page size: number of items per page
    Page<Item> page=itemRepository.queryItemsByTitleMatchesAndBrandMatchesOrderByPriceAsc(
        "手机","小米", PageRequest.of(pageNum,pageSize)
    );
    // print the items on the current page
    page.getContent().forEach(item -> log.debug("{}",item));
    // print the paging metadata
    log.debug("当前页号:{}",page.getNumber()); // current page number
    log.debug("页面大小:{}",page.getSize()); // page size
    log.debug("是否是首页:{}",page.isFirst()); // is this the first page
    log.debug("是否是最后一页:{}",page.isLast()); // is this the last page
    log.debug("前一页号:{}",page.previousOrFirstPageable().getPageNumber()); // previous page number
    log.debug("后一页号:{}",page.nextOrLastPageable().getPageNumber()); // next page number
}

Result:

request [POST http://localhost:9200/items/_search?pre_filter_shard_size=128&typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=dfs_query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true] returned [HTTP/1.1 200 OK]
2022-04-16 00:25:24.497 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : Item(id=2, title=小米手机10, category=手机, brand=小米, image=2.png, price=2000.0)
2022-04-16 00:25:24.498 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : Item(id=1, title=小米K20手机, category=手机, brand=小米, image=image/1.png, price=2345.0)
2022-04-16 00:25:24.498 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : Item(id=6, title=小米手机6, category=手机, brand=小米, image=6.png, price=135000.0)
2022-04-16 00:25:24.498 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : 当前页号:1
2022-04-16 00:25:24.498 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : 页面大小:3
2022-04-16 00:25:24.499 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : 是否是首页:false
2022-04-16 00:25:24.499 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : 是否是最后一页:true
2022-04-16 00:25:24.499 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : 前一页号:0
2022-04-16 00:25:24.499 DEBUG 11020 --- [           main] c.tedu.straw.search.ItemRepositoryTests  : 后一页号:1
2022-04-16 00:25:24.522  WARN 11020 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient   : [Consumer clientId=consumer-straw-1, groupId=straw] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.
2022-04-16 00:25:24.523  WARN 11020 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient   : [Consumer clientId=consumer-straw-1, groupId=straw] Bootstrap broker localhost:9092 (id: -1 rack: null) disconnected
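
The page contents in the log follow from zero-based page arithmetic. The sketch below reproduces that arithmetic in plain Java without Spring; it is an illustration, not the Spring Data implementation. The class PagingSketch and its methods are hypothetical names, and the id list assumes that items 1 to 6 (the six 小米 phones saved earlier) all matched and are already ordered by ascending price.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PagingSketch {
    // Zero-based page slice: page 0 is the first page.
    public static <T> List<T> page(List<T> sorted, int pageNum, int pageSize) {
        int from = Math.min(pageNum * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    // A page is the last one when no page follows it.
    public static boolean isLast(int total, int pageNum, int pageSize) {
        int totalPages = (total + pageSize - 1) / pageSize; // ceiling division
        return pageNum + 1 >= totalPages;
    }

    public static void main(String[] args) {
        // Item ids ordered by price: 1500, 1600, 1800, 2000, 2345, 135000
        List<Integer> ids = Arrays.asList(5, 4, 3, 2, 1, 6);
        System.out.println(page(ids, 1, 3));          // prints [2, 1, 6]
        System.out.println(isLast(ids.size(), 1, 3)); // prints true
    }
}
```

With six matches and a page size of 3 there are two pages, so page number 1 holds items 2, 1 and 6 and is the last page, which agrees with the log output.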

Posted 2022-04-16 00:29 by 指尖上的未来
