Exploring Elasticsearch in Practice (Part 1)

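Elasticsearch is an open-source, distributed, RESTful search and data analytics engine built on top of the open-source library Apache Lucene.
Lucene is arguably the most advanced, high-performance, full-featured search engine library available today, open source or proprietary, but it is only a library. To use its full power, you have to work in Java and integrate Lucene directly into your application. Worse, Lucene is complex enough that you may need a degree in information retrieval to understand how it works.
Elasticsearch was created to remove that complexity. It is written in Java and uses Lucene internally for indexing and searching, but its goal is to make full-text search simple: in short, it wraps Lucene and exposes a simple, consistent RESTful API for storing and retrieving data.
Of course, Elasticsearch is more than Lucene, and more than a full-text search engine. It can be accurately described as: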

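A distributed real-time document store in which every field can be indexed and searched;
A distributed search engine with real-time analytics;
Capable of scaling to hundreds of server nodes and handling petabytes of structured or unstructured data.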

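Because Elasticsearch is powerful and easy to use, Wikipedia, The Guardian, Stack Overflow, GitHub, and others have adopted it for search. Elasticsearch is now one of the mainstream tools in the full-text search field.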

2. Elasticsearch and relational database terminology

Relational database    Elasticsearch
Database               Index
Table                  Type
Row                    Document
Column                 Field

Aggregations

POST http://localhost:9200/person/man/_search

{
    "aggs": {
        "max_aggs": {
            "max": {
                "field": "age"
            }
        },
        "min_aggs": {
            "min": {
                "field": "age"
            }
        }
    }
}
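
The same request body can also be assembled in code. Below is a minimal sketch in Python; the helper name build_min_max_aggs is hypothetical (it is not part of any Elasticsearch client), and the script only builds and prints the JSON without contacting a server.

```python
import json


def build_min_max_aggs(field):
    """Build a request body with a max and a min aggregation on one field.

    The names max_aggs and min_aggs mirror the request above; they are
    arbitrary labels chosen by the caller, not reserved keywords.
    """
    return {
        "aggs": {
            "max_aggs": {"max": {"field": field}},
            "min_aggs": {"min": {"field": field}},
        }
    }


if __name__ == "__main__":
    body = build_min_max_aggs("age")
    # Serialize to the JSON text that would be sent as the POST body.
    print(json.dumps(body, indent=4))
```

Keeping the body as a plain dict and serializing it with json.dumps avoids hand-balanced braces, which are easy to get wrong in nested aggregation requests.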

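3. List all Indexes on the current node

$ curl -X GET 'http://localhost:9200/_cat/indices?v' # _cat is a fixed, built-in endpoint name used for viewing information, just as _update is a fixed endpoint name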
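
The ?v flag makes _cat/indices return a plain-text table with a header row. As a rough sketch, that text could be turned into dictionaries in Python as shown below. The SAMPLE_OUTPUT text is made up for illustration (the uuid and sizes are not real), the exact column set may differ between versions, and parse_cat_indices is a hypothetical helper name.

```python
# Made-up example of what the command above might print; values are illustrative.
SAMPLE_OUTPUT = """
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   accounts aaaaaaaaaaaaaaaaaaaaaa   5   1          2            0      9.1kb          9.1kb
"""


def parse_cat_indices(text):
    """Parse whitespace-separated _cat output (with a header row) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Each following line is one index; values contain no spaces.
    return [dict(zip(header, line.split())) for line in lines[1:]]


if __name__ == "__main__":
    for row in parse_cat_indices(SAMPLE_OUTPUT):
        print(row["index"], row["health"], row["docs.count"])
```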

4. List the Types contained in each Index

http://localhost:9200/_mapping?pretty=true

{
    "person": {
        "mappings": {
            "man": {
                "properties": {
                    "age": {
                        "type": "integer"
                    },
                    "doc": {
                        "properties": {
                            "name": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            }
                        }
                    },
                    "name": {
                        "type": "text"
                    }
                }
            }
        }
    },
    "conference": {
        "mappings": {
            "event": {
                "properties": {
                    "attendees": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "date": {
                        "type": "date"
                    },
                    "description": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "host": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "reviews": {
                        "type": "long"
                    },
                    "title": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        }
    },
    "lxw_chome": {
        "mappings": {}
    },
    "lxw": {
        "mappings": {}
    },
    "lxw_index": {
        "mappings": {
            "doc": {
                "properties": {
                    "age": {
                        "type": "text"
                    },
                    "description": {
                        "type": "text"
                    },
                    "name": {
                        "type": "text"
                    },
                    "pic": {
                        "type": "text",
                        "index": false
                    },
                    "price": {
                        "type": "scaled_float",
                        "scaling_factor": 100.0
                    },
                    "studymodel": {
                        "type": "keyword"
                    },
                    "timestamp": {
                        "type": "date",
                        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd"
                    }
                }
            }
        }
    }
}
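
To pull just the Type names out of a response shaped like the one above, a small helper is enough. The sketch below assumes the response has already been parsed into a Python dict; SAMPLE_MAPPING is a trimmed copy of the output above, and types_per_index is a hypothetical name.

```python
# Trimmed copy of the _mapping response above (field definitions shortened).
SAMPLE_MAPPING = {
    "person": {"mappings": {"man": {"properties": {"age": {"type": "integer"}}}}},
    "conference": {"mappings": {"event": {"properties": {"date": {"type": "date"}}}}},
    "lxw": {"mappings": {}},
    "lxw_index": {"mappings": {"doc": {"properties": {"name": {"type": "text"}}}}},
}


def types_per_index(mapping_response):
    """Return {index name: sorted list of Type names} for a _mapping response."""
    return {
        index: sorted(info.get("mappings", {}).keys())
        for index, info in mapping_response.items()
    }


if __name__ == "__main__":
    for index, types in types_per_index(SAMPLE_MAPPING).items():
        # An index with no documents yet has an empty mappings object.
        print(index, types)
```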

5. Add a record

Send a PUT request to /Index/Type/Id to add a record to the Index. For example, sending the request below to /accounts/person/1 adds a person record whose Id is 1.

$ curl -X PUT 'localhost:9200/accounts/person/1' -H 'Content-Type: application/json' -d '
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}'

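Note that if the Index (accounts in this example) has not been created first, Elastic does not report an error when the command above runs; it simply creates the specified Index. So type carefully and do not misspell the Index name.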
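
The PUT request in this section can also be prepared from Python with the standard library alone. This is a sketch under assumptions: build_put_request is a hypothetical helper, the base URL http://localhost:9200 matches the local node used throughout, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_put_request(base_url, index, doc_type, doc_id, document):
    """Prepare (but do not send) a PUT request for /Index/Type/Id."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    # ensure_ascii=False keeps the Chinese text readable; the body is UTF-8.
    data = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_put_request(
        "http://localhost:9200",
        "accounts",
        "person",
        "1",
        {"user": "张三", "title": "工程师", "desc": "数据库管理"},
    )
    print(req.get_method(), req.full_url)
    # To actually send it against a running node:
    # urllib.request.urlopen(req)
```

Building the URL from separate index, type, and id arguments makes a misspelled Index name easier to spot, which matters because a wrong name silently creates a new Index.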

6. Return all records

Send a GET request to /Index/Type/_search to return all records.


$ curl 'localhost:9200/accounts/person/_search'

{
  "took":2,
  "timed_out":false,
  "_shards":{"total":5,"successful":5,"failed":0},
  "hits":{
    "total":2,
    "max_score":1.0,
    "hits":[
      {
        "_index":"accounts",
        "_type":"person",
        "_id":"AV3qGfrC6jMbsbXb6k1p",
        "_score":1.0,
        "_source": {
          "user": "李四",
          "title": "工程师",
          "desc": "系统管理"
        }
      },
      {
        "_index":"accounts",
        "_type":"person",
        "_id":"1",
        "_score":1.0,
        "_source": {
          "user" : "张三",
          "title" : "工程师",
          "desc" : "数据库管理，软件开发"
        }
      }
    ]
  }
}

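In the result above, the took field is the time the operation took (in milliseconds), the timed_out field indicates whether it timed out, and the hits field holds the matched records. Its subfields mean the following.

total: the number of records returned, 2 in this example.
max_score: the highest match score, 1.0 in this example.
hits: an array of the returned records.

Each returned record has a _score field indicating how well it matches; by default, results are sorted by this field in descending order.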
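
The fields described above can be read from a parsed response as in the sketch below. It assumes the response shape shown in this section, where hits.total is a plain number; SAMPLE_RESPONSE is a trimmed copy of that output and summarize_hits is a hypothetical helper name.

```python
# Trimmed copy of the search response shown above.
SAMPLE_RESPONSE = {
    "took": 2,
    "timed_out": False,
    "hits": {
        "total": 2,
        "max_score": 1.0,
        "hits": [
            {"_id": "AV3qGfrC6jMbsbXb6k1p", "_score": 1.0,
             "_source": {"user": "李四"}},
            {"_id": "1", "_score": 1.0, "_source": {"user": "张三"}},
        ],
    },
}


def summarize_hits(response):
    """Collect the top-level fields and the record ids ordered by _score."""
    hits = response["hits"]
    # Results normally arrive sorted already; sorting again makes the
    # descending-by-_score order explicit (the sort is stable for ties).
    records = sorted(hits["hits"], key=lambda h: h["_score"], reverse=True)
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": hits["total"],
        "max_score": hits["max_score"],
        "ids": [h["_id"] for h in records],
    }


if __name__ == "__main__":
    print(summarize_hits(SAMPLE_RESPONSE))
```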

By default, Elastic returns 10 results at a time; the size field changes this.


$ curl 'localhost:9200/accounts/person/_search' -H 'Content-Type: application/json' -d '
{
  "query" : { "match" : { "desc" : "管理" }},
  "size": 1
}'

The code above returns only one result per request.

The from field specifies an offset into the results.


$ curl 'localhost:9200/accounts/person/_search' -H 'Content-Type: application/json' -d '
{
  "query" : { "match" : { "desc" : "管理" }},
  "from": 1,
  "size": 1
}'

The code above starts at position 1 (the default is position 0) and returns only one result.
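
Because from is a zero-based offset, paging through results is a matter of simple arithmetic. The sketch below shows one way to map a 1-based page number to from and size; both helper names are hypothetical, and nothing is sent to a server.

```python
def page_to_from_size(page, page_size):
    """Convert a 1-based page number into from/size values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    # Page 1 starts at offset 0, page 2 at offset page_size, and so on.
    return {"from": (page - 1) * page_size, "size": page_size}


def build_paged_search(field, text, page, page_size):
    """Build a match query body that returns one page of results."""
    body = {"query": {"match": {field: text}}}
    body.update(page_to_from_size(page, page_size))
    return body


if __name__ == "__main__":
    # Page 2 with one result per page reproduces the request above.
    print(build_paged_search("desc", "管理", page=2, page_size=1))
```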

6.3 Logical operators

When there are multiple search keywords, Elastic treats them as an or relationship.


$ curl 'localhost:9200/accounts/person/_search' -H 'Content-Type: application/json' -d '
{
  "query" : { "match" : { "desc" : "软件 系统" }}
}'

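The code above searches for 软件 or 系统.

To run an and search across multiple keywords, use a bool query.

Note that once the -d parameter is added, curl sends a POST request instead of a GET.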


$ curl 'localhost:9200/accounts/person/_search' -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "desc": "软件" } },
        { "match": { "desc": "系统" } }
      ]
    }
  }
}'
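
When the keyword list is not known in advance, the must array can be generated rather than written by hand. A minimal sketch follows; build_and_query is a hypothetical helper, and it only builds the request body.

```python
import json


def build_and_query(field, keywords):
    """Build a bool query in which every keyword must match the field."""
    return {
        "query": {
            "bool": {
                # One match clause per keyword; must requires all of them.
                "must": [{"match": {field: keyword}} for keyword in keywords]
            }
        }
    }


if __name__ == "__main__":
    body = build_and_query("desc", ["软件", "系统"])
    print(json.dumps(body, ensure_ascii=False, indent=2))
```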

7. An and search across multiple keywords does not strictly require a bool query. A match query with operator set to and also works:

"match" : {
    "message" : {
        "query" : "this is a test",
        "operator" : "and"
    }
}
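
The fragment above is only the match clause; in a full request it sits under the top-level query key. The sketch below wraps it accordingly. The helper name build_match_operator_query is hypothetical, and the message field and text are taken from the fragment.

```python
import json


def build_match_operator_query(field, text, operator="or"):
    """Build a match query body with an explicit operator (or / and)."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    return {
        "query": {
            "match": {
                field: {"query": text, "operator": operator}
            }
        }
    }


if __name__ == "__main__":
    body = build_match_operator_query("message", "this is a test", "and")
    print(json.dumps(body, indent=2))
```

Compared with a bool query, this form keeps the keywords in a single string, which is convenient when the text comes straight from a search box.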

References: https://www.jianshu.com/p/689344d5109d

Full-text search engine Elasticsearch: a getting-started tutorial - Ruan Yifeng's blog (阮一峰的网络日志)

posted @ 2021-09-17 15:06 码农编程进阶笔记
