Elasticsearch的SQL查询

使用SQL查询ES有一定的局限性,没有原生的Query DSL那么强大,对于嵌套属性和某些函数的支持并不怎么好,但是平时用来查询下数据基本够用了。

在线sql转dsl:http://www.atotoa.com/

elasticsearch-sql插件

注:本人安装的是es 7.x(7.10.2),默认是支持SQL的,所以不用安装elasticsearch-sql插件。

1、下载 https://github.com/NLPchina/elasticsearch-sql

2、下载elasticsearch-sql,修改名字为sql。放在es解压目录的plugins文件中即可。重启es则sql安装完毕

支持的SQL语法

首先我们创建一个索引:

{
    "settings": {
        "index": {
            "number_of_shards": 3,
            "number_of_replicas": 2
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
		"fields":{"keyword": {"type": "keyword","ignore_above": 256}} //可以对一个字段提供多种索引模式,同一个字段的值,一个分词,一个不分词  
            },
            "author": {
                "type": "text",
		"fields":{"keyword": {"type": "keyword","ignore_above": 256}} 
            },
            "release_date": {
                "type": "date",
		"format":"yyyy-MM-dd"
				
            },
            "page_count": {
                "type": "integer"
            }
        }
    }
}

这里要特别注意:如果要用到like这种模糊操作,但是type是text的,那么就还需要为该字段设置不分词的索引方式,即type=keyword,如上面的name和author字段。

批量插入数据(使用Postman)

PUT /book_info/_bulk
{ "index" : {  "_id" : "101" } }
{ "name" : "hello","author":"tom","release_date":"2020-01-26","page_count": 500 }

{ "index" : {  "_id" : "102" } }
{ "name" : "world","author":"Marry","release_date":"2018-01-26","page_count": 352 }

{ "index" : {  "_id" : "103" } }
{ "name" : "test","author":"test","release_date":"2019-03-26","page_count": 436 }

{ "index" : {  "_id" : "104" } }
{ "name" : "demo","author":"demo","release_date":"2020-06-26","page_count": 337 }

{ "index" : {  "_id" : "105" } }
{ "name" : "zhangsan","author":"zhangsan","release_date":"2015-05-21","page_count": 562 }

{ "index" : {  "_id" : "106" } }
{ "name" : "zhangsan02","author":"zhangsan","release_date":"2015-05-21","page_count": 562 }
#这里需要换行

 

语法格式

语法格式:/_sql?format=txt

format 返回格式,不设置则默认返回JSON (支持,csv,txt,json,yaml。

请求体body参数:

  • query:要查询的sql参数,如:"select * from user"
  • fetch_size:一个响应中要返回的最大行(或条目)数,默认是1000
  • filter:可选的Elasticsearch查询DSL,以进行其他过滤。
  • request_timeout:请求失败之前的超时,默认是90s
  • page_timeout:分页请求失败之前的超时,默认是45s
  • columnar:以列方式而不是基于行的方式返回结果。有效期为json,yaml,cbor和smile。默认是false。
  • field_multi_value_leniency:当遇到一个字段的多个值(默认值)或宽大处理时抛出异常,并从列表中返回第一个值(不保证将要返回的值-通常以自然的升序方式第一个)。默认是false。
  • index_include_frozen:是否在查询执行中包括冻结索引(默认值)。默认是false。

条件查询

1、单条件查询

POST /_sql?format=txt
{
  "query": "SELECT * FROM book_info where name = 'hello'"
}

2、多条件查询

POST /_sql?format=txt
{
  "query": "SELECT * FROM book_info where name = 'hello' and author='tom'"
}

排序+限制行数

POST /_sql?format=txt
{
  "query": "SELECT * FROM book_info ORDER BY release_date DESC LIMIT 2"
}

排序+限制返回字段

POST /_sql?format=txt
{
  "query": "SELECT name,release_date,page_count FROM book_info ORDER BY release_date DESC  LIMIT 3"
}

时间比较查询

POST /_sql?format=txt
{
  "query": "SELECT * FROM book_info WHERE release_date < '2020-01-01'  LIMIT 3"
}

聚合查询

POST /_sql?format=txt
{
  "query": "SELECT sum(page_count) FROM book_info"
}

POST /_sql?format=txt
{
  "query": "SELECT sum(page_count),name FROM book_info group by name"
}

POST /_sql?format=txt
{
  "query": "SELECT avg(page_count) FROM book_info"
}

POST /_sql?format=txt
{
  "query": "SELECT min(page_count) FROM book_info"
}

POST /_sql?format=txt
{
  "query": "SELECT max(page_count) FROM book_info"
}

count查询

POST /_sql?format=txt
{
  "query": "SELECT count(*),count(distinct author) FROM book_info "
}

分组查询

POST /_sql?format=txt
{
  "query": "SELECT author,count(*) as count FROM book_info group by author"
}

SQL转DSL

普通sql

POST /_sql/translate 
{ 
  "query": "SELECT name,author FROM book_info where name ='hello'", 
  "fetch_size": 1
} 

右模糊

POST /_sql/translate 
{ 
  "query": "SELECT name,author FROM book_info where name like 'hello%'", 
  "fetch_size": 10 
} 

时间区间

POST /_sql/translate 
{
  "query": "SELECT name,author,release_date FROM book_info WHERE release_date >'2019-01-01' and release_date < '2020-01-01'  LIMIT 3"
}

常用的SQL总结

在ES中使用SQL查询的语法与在数据库中使用基本一致,具体格式如下:

SELECT select_expr [, ...]
[ FROM table_name ]
[ WHERE condition ]
[ GROUP BY grouping_element [, ...] ]
[ HAVING condition]
[ ORDER BY expression [ ASC | DESC ] [, ...] ]
[ LIMIT [ count ] ]
[ PIVOT ( aggregation_expr FOR column IN ( value [ [ AS ] alias ] [, ...] ) ) ]

WHERE

可以使用WHERE语句设置查询条件,比如查询state字段为VA的记录,查询语句如下:

POST /_sql?format=txt
{
  "query": "SELECT account_number,address,age,balance,state FROM account WHERE state='VA' LIMIT 10 "
}

GROUP BY

我们可以使用GROUP BY语句对数据进行分组,统计出分组记录数量,最大age和平均balance等信息,查询语句如下:

POST /_sql?format=txt
{
  "query": "SELECT state,COUNT(*),MAX(age),AVG(balance) FROM account GROUP BY state LIMIT 10"
}

HAVING

我们可以使用HAVING语句对分组数据进行二次筛选,比如筛选分组记录数量大于15的信息,查询语句如下:

POST /_sql?format=txt
{
  "query": "SELECT state,COUNT(*),MAX(age),AVG(balance) FROM account GROUP BY state HAVING COUNT(*)>15 LIMIT 10"
}

ORDER BY

我们可以使用ORDER BY语句对数据进行排序,比如按照balance字段从高到低排序,查询语句如下:

POST /_sql?format=txt
{
  "query": "SELECT account_number,address,age,balance,state FROM account ORDER BY balance DESC LIMIT 10 "
}

DESCRIBE

我们可以使用DESCRIBE语句查看表(ES中为索引)中有哪些字段,比如查看account表的字段,查询语句如下:

POST /_sql?format=txt
{
  "query": "DESCRIBE account"
}

SHOW TABLES

我们可以使用SHOW TABLES查看所有的表(ES中为索引)。

POST /_sql?format=txt
{
  "query": "SHOW TABLES"
}

支持的函数

使用SQL查询ES中的数据,不仅可以使用一些SQL中的函数,还可以使用一些ES中特有的函数。

1、我们可以使用SHOW FUNCTIONS语句查看所有支持的函数,比如搜索所有带有DATE字段的函数可以使用如下语句:

POST /_sql?format=txt
{
  "query": "SHOW FUNCTIONS LIKE '%DATE%'"
}

2、全文搜索函数是ES中特有的,当使用MATCH或QUERY函数时,会启用全文搜索功能,SCORE函数可以用来统计搜索评分。

(1)使用MATCH函数查询address中包含Street的记录。

POST /_sql?format=txt
{
  "query": "SELECT account_number,address,age,balance,SCORE() FROM account WHERE MATCH(address,'Street') LIMIT 10"
}

(2)使用QUERY函数查询address中包含Street的记录。

POST /_sql?format=txt
{
  "query": "SELECT account_number,address,age,balance,SCORE() FROM account WHERE QUERY('address:Street') LIMIT 10"
}

 

posted @ 2022-02-13 22:12  残城碎梦  阅读(3168)  评论(0编辑  收藏  举报