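05 - Elasticsearch DSL Advanced Search [pagination, analysis, boosting, multiple conditions, filtering, sorting, keyword highlighting, deep pagination, scroll search, bulk mget]
DSL Search
Dictionary preparation (custom words for the IK analyzer)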
骚年
帅气
新闻网
新闻
闻网
新
闻
网
Index preparation
PUT /shop
{
  "settings": { "number_of_shards": 5, "number_of_replicas": 0 }
}

POST /shop/_mapping
{
  "properties": {
    "id": { "type": "long" },
    "age": { "type": "integer" },
    "username": { "type": "keyword" },
    "nickname": { "type": "text", "analyzer": "ik_max_word" },
    "money": { "type": "float" },
    "desc": { "type": "text", "analyzer": "ik_max_word" },
    "sex": { "type": "byte" },
    "birthday": { "type": "date" },
    "face": { "type": "text", "index": false }
  }
}
Data preparation
POST /shop/_doc/
{ "id": 1001, "age": 18, "username": "chinanewsAmazing", "nickname": "中国新闻网", "money": 88.8, "desc": "我在中国新闻网到了很多新闻", "sex": 0, "birthday": "2022-09-01", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/527bc4b462d946be81eb900d7c8e63fe.jpg" }

POST /shop/_doc/
{ "id": 1002, "age": 19, "username": "justbuy", "nickname": "周杰棍", "money": 77.8, "desc": "今天上下班都很堵,车流量很大", "sex": 1, "birthday": "1993-01-24", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1003, "age": 20, "username": "bigFace", "nickname": "飞翔的巨鹰", "money": 66.8, "desc": "中国新闻网团队和导游坐飞机去海外旅游,去了新马泰和欧洲", "sex": 1, "birthday": "1996-01-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1004, "age": 22, "username": "flyfish", "nickname": "水中鱼", "money": 55.8, "desc": "昨天在学校的池塘里,看到有很多鱼在游泳,然后就去中国新闻网学习了", "sex": 0, "birthday": "1988-02-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1005, "age": 25, "username": "gotoplay", "nickname": "ps游戏机", "money": 155.8, "desc": "今年生日,女友送了我一台play station游戏机,非常好玩,非常不错", "sex": 1, "birthday": "1989-03-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1006, "age": 19, "username": "missimooc", "nickname": "我叫小髦", "money": 156.8, "desc": "我叫髦髦,今年20岁,是一名律师,我在琦䯲星球做演讲", "sex": 1, "birthday": "1993-04-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1007, "age": 19, "username": "msgame", "nickname": "gamexbox", "money": 1056.8, "desc": "明天去进货,最近微软处理很多游戏机,还要买xbox游戏卡带", "sex": 1, "birthday": "1985-05-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1008, "age": 19, "username": "muke", "nickname": "新闻学习", "money": 1056.8, "desc": "大学毕业后,可以到i2.chinanews.com.cn进修", "sex": 1, "birthday": "1995-06-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1009, "age": 22, "username": "shaonian", "nickname": "骚年轮", "money": 96.8, "desc": "骚年在大学毕业后,考研究生去了", "sex": 1, "birthday": "1998-07-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1010, "age": 30, "username": "tata", "nickname": "隔壁老王", "money": 100.8, "desc": "隔壁老外去国外出差,带给我很多好吃的", "sex": 1, "birthday": "1988-07-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1011, "age": 31, "username": "sprder", "nickname": "皮特帕克", "money": 180.8, "desc": "它是一个超级英雄", "sex": 1, "birthday": "1989-08-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

POST /shop/_doc/
{ "id": 1012, "age": 31, "username": "super hero", "nickname": "super hero", "money": 188.8, "desc": "BatMan, GreenArrow, SpiderMan, IronMan... are all Super Hero", "sex": 1, "birthday": "1980-08-14", "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg" }

Data preparation is complete.
Query-string search
URL query-string query (conditions are combined with AND inside a single q parameter)
GET /shop/_search?q=desc:新闻网+AND+age:18
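A q expression that contains spaces or Chinese characters has to be URL-encoded when the request is sent from code rather than typed into the console. The sketch below uses only the Python standard library; the helper name build_search_path is made up for illustration and is not part of any Elasticsearch client.

```python
from urllib.parse import quote, unquote


def build_search_path(index: str, expression: str) -> str:
    """Return '/<index>/_search?q=...' with the expression percent-encoded."""
    # safe=":" keeps the field:value separator readable; spaces and
    # non-ASCII characters are encoded as UTF-8 percent escapes.
    return f"/{index}/_search?q={quote(expression, safe=':')}"


# Hypothetical expression combining two conditions with AND.
path = build_search_path("shop", "desc:新闻网 AND age:18")
print(path)
# Decoding restores the original, human-readable expression.
print(unquote(path))
```

The encoded path can be appended to the cluster's base URL by whatever HTTP client is in use.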
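Searching a keyword field (username is not analyzed, so only the exact value matches; "super" does not match "super hero")
GET /shop/_search?q=username:super
Searching a non-keyword (text) field (nickname is analyzed, so "super" matches "super hero")
GET /shop/_search?q=nickname:super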
DSL queries
Basic search
POST /shop/_search
{
  "query": { "match": { "desc": "新闻网" } }
}
Check whether a field exists (returns the documents that contain the field)
POST /shop/_search
{
  "query": { "exists": { "field": "desc" } }
}
Query all documents
POST /shop/_search
{
  "query": { "match_all": {} }
}
Limit the fields returned
POST /shop/_search
{
  "query": { "match_all": {} },
  "_source": [ "id", "nickname", "age" ]
}
Pagination
POST /shop/_search
{
  "query": { "match_all": {} },
  "from": 0,
  "size": 5
}
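from is a zero-based offset, not a page number, so a UI that works with page numbers has to convert. A minimal sketch of that arithmetic follows; the function name page_to_from_size and the 1-based page convention are assumptions made for this example.

```python
def page_to_from_size(page: int, page_size: int) -> dict:
    """Convert a 1-based page number into the from/size pair of a search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must both be at least 1")
    # Page 1 starts at offset 0, page 2 at offset page_size, and so on.
    return {"from": (page - 1) * page_size, "size": page_size}


body = {"query": {"match_all": {}}, **page_to_from_size(page=3, page_size=5)}
print(body)
```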
Term-level search on a single term (the search text is not analyzed)
POST /shop/_search
{
  "query": { "term": { "desc": "新闻网" } },
  "_source": [ "id", "nickname", "desc" ]
}
Term-level search on multiple terms
POST /shop/_search
{
  "query": { "terms": { "desc": [ "新闻网", "我" ] } },
  "_source": [ "id", "nickname", "desc" ]
}
Full-text search (the search text is analyzed)
POST /shop/_search
{
  "query": { "match": { "nickname": "新闻网" } },
  "_source": [ "id", "nickname", "desc" ]
}
Phrase match: the terms must appear in order (毕业 must come directly after 大学)
POST /shop/_search
{
  "query": { "match_phrase": { "desc": { "query": "大学 毕业" } } },
  "_source": [ "id", "nickname", "desc" ]
}
Phrase match with slop: the terms may be separated by other tokens (a document matches only when slop is at least the number of tokens between the two terms)
POST /shop/_search
{
  "query": { "match_phrase": { "desc": { "query": "大学 研究生", "slop": 3 } } },
  "_source": [ "id", "nickname", "desc" ]
}
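To make the effect of slop concrete, here is a toy model in Python. It is a deliberate simplification: it only handles two terms that appear in order, whereas Lucene's sloppy phrase matching also allows reordering at extra cost. The token list is hypothetical; the real ik_max_word output for this sentence may differ.

```python
from typing import List


def phrase_matches(tokens: List[str], first: str, second: str, slop: int = 0) -> bool:
    """True if `second` follows `first` with at most `slop` tokens in between."""
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    # q - p - 1 is the number of tokens sitting between the two terms.
    return any(
        0 <= q - p - 1 <= slop
        for p in first_positions
        for q in second_positions
    )


# Hypothetical token stream for "大学毕业后,考研究生".
tokens = ["大学", "毕业", "后", "考", "研究生"]
print(phrase_matches(tokens, "大学", "毕业"))           # adjacent terms, slop 0
print(phrase_matches(tokens, "大学", "研究生", slop=2))  # three tokens in between
print(phrase_matches(tokens, "大学", "研究生", slop=3))
```

With this token stream, three tokens separate the two terms, so the phrase is found with slop 3 but not with slop 2.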
Relationship between the matched terms: and | or (default: or)
POST /shop/_search
{
  "query": { "match": { "desc": { "query": "新闻网学习", "operator": "and" } } },
  "_source": [ "id", "nickname", "desc" ]
}
Match by the percentage of query terms that must match
POST /shop/_search
{
  "query": { "match": { "desc": { "query": "女友生日送我好玩的xbox游戏机", "minimum_should_match": "60%" } } },
  "_source": [ "id", "nickname", "desc" ]
}
Match by the number of query terms that must match
POST /shop/_search
{
  "query": { "match": { "desc": { "query": "女友生日送我好玩的xbox游戏机", "minimum_should_match": 6 } } },
  "_source": [ "id", "nickname", "desc" ]
}
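A percentage in minimum_should_match is turned into a term count, and the computed number is rounded down. The sketch below reproduces just that arithmetic for non-negative values; negative values and combined expressions are not covered. The term counts are hypothetical, since the real count depends on how the analyzer splits the query text.

```python
from typing import Union


def required_matches(term_count: int, minimum_should_match: Union[int, str]) -> int:
    """Number of query terms that must match, for non-negative settings only."""
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        # Integer division rounds the computed number down.
        return term_count * percent // 100
    return int(minimum_should_match)


print(required_matches(10, "60%"))  # 10 terms in the analyzed query
print(required_matches(7, "60%"))   # 4.2 is rounded down
print(required_matches(10, 6))      # a plain integer is used as-is
```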
Query by document ID (the documents above were indexed without an explicit ID, so these values are auto-generated; substitute the _id values from your own index)
POST /shop/_search
{
  "query": { "ids": { "values": [ "KFuMO4MBrxUdfMwL-_sV", "J1uMO4MBrxUdfMwL8_v7" ] } },
  "_source": [ "id", "nickname", "desc" ]
}
Multi-field match query
POST /shop/_search
{
  "query": { "multi_match": { "query": "皮特新闻网", "fields": [ "nickname", "desc" ] } },
  "_source": [ "id", "nickname", "desc" ]
}
Boost a field's weight by appending ^weight to the field name
POST /shop/_search
{
  "query": { "multi_match": { "query": "皮特新闻网", "fields": [ "nickname^10", "desc" ] } },
  "_source": [ "id", "nickname", "desc" ]
}
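When the weights come from configuration rather than being typed by hand, the field^weight strings can be generated. A small sketch, with helper names (boosted_fields, multi_match) invented for this example:

```python
from typing import Dict, List


def boosted_fields(weights: Dict[str, float]) -> List[str]:
    """Render {'nickname': 10, 'desc': 1} as ['nickname^10', 'desc']."""
    # A weight of 1 is the default, so the suffix is omitted for it.
    return [name if weight == 1 else f"{name}^{weight}" for name, weight in weights.items()]


def multi_match(query: str, weights: Dict[str, float]) -> dict:
    """Build a multi_match search body with per-field boosts."""
    return {"query": {"multi_match": {"query": query, "fields": boosted_fields(weights)}}}


body = multi_match("皮特新闻网", {"nickname": 10, "desc": 1})
print(body)
```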
bool queries
# bool
# must      every clause must match (and)
# must_not  no clause may match (!)
# should    at least one clause should match (or)
Multiple conditions: and
POST /shop/_search
{
  "query": {
    "bool": {
      "must": [
        { "multi_match": { "query": "新闻网", "fields": [ "desc", "nickname" ] } },
        { "term": { "sex": { "value": "1" } } },
        { "term": { "birthday": { "value": "1996-01-14" } } }
      ]
    }
  },
  "_source": [ "id", "age", "sex", "nickname", "desc", "birthday" ]
}
Multiple conditions: or
POST /shop/_search
{
  "query": {
    "bool": {
      "should": [
        { "multi_match": { "query": "新闻网", "fields": [ "desc", "nickname" ] } },
        { "term": { "sex": { "value": "0" } } },
        { "term": { "birthday": { "value": "1996-01-14" } } }
      ]
    }
  },
  "_source": [ "id", "age", "sex", "nickname", "desc", "birthday" ]
}
Multiple conditions: not
POST /shop/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "multi_match": { "query": "新闻网", "fields": [ "desc", "nickname" ] } },
        { "term": { "sex": { "value": "0" } } },
        { "term": { "birthday": { "value": "1996-01-14" } } }
      ]
    }
  },
  "_source": [ "id", "age", "sex", "nickname", "desc", "birthday" ]
}
Multiple conditions: combined
# (sex = 1 or desc matches "新闻网" [analyzed])
# and (age = 19)
# and (desc does not match "大学" [analyzed])
# minimum_should_match is needed because should clauses become optional once must is present
POST /shop/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "sex": { "value": "1" } } },
        { "match": { "desc": "新闻网" } }
      ],
      "minimum_should_match": 1,
      "must": [
        { "multi_match": { "query": "19", "fields": [ "age" ] } }
      ],
      "must_not": [
        { "match": { "desc": "大学" } }
      ]
    }
  },
  "_source": [ "id", "age", "sex", "nickname", "desc", "birthday" ]
}
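One detail is easy to miss when combining clauses: once a bool query contains must (or filter) clauses, its should clauses are optional by default and only affect scoring. Requiring at least one of them takes an explicit minimum_should_match. The sketch below assembles such a body in Python; bool_query is a hypothetical helper, not a client API, and it uses plain term clauses where the console example uses other query types.

```python
import json
from typing import Optional, Sequence, Union


def bool_query(
    must: Sequence[dict] = (),
    should: Sequence[dict] = (),
    must_not: Sequence[dict] = (),
    minimum_should_match: Optional[Union[int, str]] = None,
) -> dict:
    """Assemble a search body around a bool query, skipping empty sections."""
    clause: dict = {}
    if must:
        clause["must"] = list(must)
    if should:
        clause["should"] = list(should)
    if must_not:
        clause["must_not"] = list(must_not)
    if minimum_should_match is not None:
        clause["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": clause}}


# (sex = 1 or desc matches 新闻网) and age = 19 and desc does not match 大学
body = bool_query(
    must=[{"term": {"age": 19}}],
    should=[{"term": {"sex": 1}}, {"match": {"desc": "新闻网"}}],
    must_not=[{"match": {"desc": "大学"}}],
    minimum_should_match=1,
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```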
Weighting individual match clauses with boost
POST /shop/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "desc": { "query": "新闻网", "boost": 2 } } },
        { "match": { "desc": { "query": "律师", "boost": 10 } } }
      ]
    }
  },
  "_source": [ "id", "age", "sex", "nickname", "desc", "birthday" ]
}
Filtering results with range: gt, lt, gte, lte (post_filter is applied to the hits after the query has run)
POST /shop/_search
{
  "query": { "multi_match": { "query": "新闻网", "fields": [ "nickname", "desc" ] } },
  "post_filter": { "range": { "age": { "gte": 19, "lte": 20 } } },
  "_source": [ "id", "nickname", "age", "desc" ]
}
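The four range operators are plain comparisons, and a post filter simply narrows the hits the query already produced. A toy Python model of that behaviour follows; the hit list is hand-written for illustration (ages taken from the sample data), not a real search response.

```python
from typing import List, Optional


def in_range(
    value: float,
    gt: Optional[float] = None,
    gte: Optional[float] = None,
    lt: Optional[float] = None,
    lte: Optional[float] = None,
) -> bool:
    """Apply the same comparisons as a range clause; unset bounds are ignored."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


# Hypothetical hits that the query part is assumed to have returned.
hits: List[dict] = [
    {"id": 1001, "age": 18},
    {"id": 1003, "age": 20},
    {"id": 1004, "age": 22},
]
kept = [hit for hit in hits if in_range(hit["age"], gte=19, lte=20)]
print(kept)
```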
Sorting: desc / asc (text fields cannot be sorted on; keyword fields can)
POST /shop/_search
{
  "query": { "multi_match": { "query": "新闻网", "fields": [ "nickname", "desc" ] } },
  "post_filter": { "range": { "age": { "gte": 0 } } },
  "sort": [ { "age": { "order": "asc" } } ],
  "_source": [ "id", "nickname", "age", "desc" ]
}
Using a keyword sub-field so that a text field can be sorted
PUT /shop2
{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 }
}

# define the mapping with a keyword sub-field
POST /shop2/_mapping
{
  "properties": {
    "id": { "type": "long" },
    "nickname": {
      "type": "text",
      "analyzer": "ik_max_word",
      "fields": { "keyword": { "type": "keyword" } }
    }
  }
}

POST /shop2/_doc
{ "id": 1001, "nickname": "美丽的风景" }

POST /shop2/_doc
{ "id": 1002, "nickname": "漂亮的小哥哥" }

POST /shop2/_doc
{ "id": 1003, "nickname": "飞翔的巨鹰" }

POST /shop2/_doc
{ "id": 1004, "nickname": "完美的天空" }

POST /shop2/_doc
{ "id": 1005, "nickname": "广阔的海域" }

POST /shop2/_search
{
  "query": { "match_all": {} },
  "sort": [ { "nickname.keyword": { "order": "asc" } } ]
}
Keyword highlighting (highlight)
POST /shop/_search
{
  "query": { "match": { "desc": "新闻网" } },
  "highlight": {
    "pre_tags": "<span>",
    "post_tags": "</span>",
    "fields": { "desc": {} }
  }
}
Deep pagination
# a request with from + size > 10000 is rejected with an error
POST /shop/_search
{
  "query": { "match_all": {} },
  "from": 9991,
  "size": 10
}
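The limit applies to the sum from + size, and the boundary value itself is still allowed: a request is rejected only when the sum is strictly greater than index.max_result_window (10000 by default). A client can check this before sending the request. A minimal sketch, with the function name exceeds_result_window made up for this example:

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window


def exceeds_result_window(
    from_: int, size: int, max_result_window: int = DEFAULT_MAX_RESULT_WINDOW
) -> bool:
    """True when from + size goes past the window and the search would be rejected."""
    return from_ + size > max_result_window


print(exceeds_result_window(9990, 10))  # sum is exactly 10000: still allowed
print(exceeds_result_window(9991, 10))  # sum is 10001: rejected
print(exceeds_result_window(9991, 10, max_result_window=100_000))
```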
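Solution 1: cap pagination at 50-100 pages, since users rarely browse further than that
Solution 2: raise the limit
Get the index settings
GET /shop/_settings
Raise the limit
PUT /shop/_settings
{
  "index.max_result_window": 100000
}
Run the query again and it no longer fails.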
Scroll search
POST /shop/_search?scroll=1m
{
  "query": { "match_all": {} },
  "sort": [ "_doc" ],
  "size": 5
}

# take the scroll_id returned by the first request and put it into the request below
POST /_search/scroll
{
  "scroll_id": "***",
  "scroll": "1m"
}
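In application code the two requests become a loop: fetch the first batch, then keep asking for the next one with the latest scroll_id until a batch comes back empty. The sketch below shows only that control flow. first_page and next_page are hypothetical callables standing in for the two HTTP requests, and the in-memory backend exists purely so the example runs without a cluster.

```python
from typing import Callable, Iterator, List, Tuple

Page = Tuple[str, List[dict]]  # (scroll_id, hits)


def scroll_all(
    first_page: Callable[[], Page],
    next_page: Callable[[str], Page],
) -> Iterator[dict]:
    """Yield every hit by following scroll_ids until an empty batch arrives."""
    scroll_id, hits = first_page()
    while hits:
        yield from hits
        # Always pass on the most recent scroll_id, as it may change between calls.
        scroll_id, hits = next_page(scroll_id)


def make_fake_backend(docs: List[dict], size: int):
    """In-memory stand-in for the search and scroll endpoints (illustration only)."""
    state = {"offset": 0}

    def take() -> Page:
        chunk = docs[state["offset"]: state["offset"] + size]
        state["offset"] += size
        return "fake-scroll-id", chunk

    return take, (lambda scroll_id: take())


docs = [{"id": 1001 + i} for i in range(12)]
first, nxt = make_fake_backend(docs, size=5)
collected = list(scroll_all(first, nxt))
print(len(collected))
```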
Bulk lookup with mget
POST /shop/_mget
{
  "ids": [ "KFuMO4MBrxUdfMwL-_sV", "PFuNO4MBrxUdfMwLLvvc" ]
}
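An mget response contains one entry per requested ID in its docs array, and each entry carries a found flag, so missing documents do not cause an error and have to be checked for explicitly. A small sketch of separating the two cases follows; the sample response is hand-written with placeholder IDs, and split_mget is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def split_mget(response: dict) -> Tuple[Dict[str, dict], List[str]]:
    """Return ({_id: _source} for found documents, [_id, ...] for missing ones)."""
    found: Dict[str, dict] = {}
    missing: List[str] = []
    for doc in response.get("docs", []):
        if doc.get("found"):
            found[doc["_id"]] = doc.get("_source", {})
        else:
            missing.append(doc["_id"])
    return found, missing


# Hand-written sample; the IDs are placeholders, not values from a real index.
sample = {
    "docs": [
        {"_index": "shop", "_id": "id-a", "found": True, "_source": {"id": 1001}},
        {"_index": "shop", "_id": "id-b", "found": False},
    ]
}
found, missing = split_mget(sample)
print(found, missing)
```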
These queries are enough for the vast majority of searches needed in day-to-day business use.