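1 Introduction
After the upgrade to 5.0, the Elastic Stack introduced a new scripting language, Painless. "New" here is relative to the existing Groovy support. Recall the Groovy scripting vulnerabilities: once Groovy scripting was enabled, misuse could expose a range of security holes. The main reason is that such general-purpose script engines are too powerful; they can do almost anything, so careless use or misconfiguration creates security risks. For reasons of both security and performance, elastic.co developed a new script engine named Painless. As the name suggests, it aims to be simple, safe, and painless to use. Unlike Groovy's sandbox mechanism, Painless uses an allowlist to restrict access to functions and fields. It is optimized for Elasticsearch use cases and only operates on Elasticsearch data, which makes it more lightweight and, according to Elastic, several times faster. It also supports Java-style static types, keeps a syntax similar to Groovy, and supports Java lambda expressions.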
2 Official documentation
1) Documentation entry point
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-guide.html
2) Walkthrough examples
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-walkthrough.html
3) Painless language specification
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-lang-spec.html
4) Painless API reference
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-api-reference-shared.html
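3 Features
1) High performance: according to the official documentation, Painless scripts run several times faster in Elasticsearch than the alternative scripting languages.
2) Security: an allowlist restricts access to functions and fields, which avoids potential security holes.
3) Optional typing: scripts can use static types or dynamic typing (the def type).
4) Syntax: Painless extends a subset of Java syntax with Groovy-style scripting features, which makes it easy to read and write. If you know Java, so much the better, because Painless syntax is modeled closely on Java. You can write Painless almost as if it were Java, or paste Java code into a script and run it, provided it only uses allowlisted classes and methods.
5) Targeted optimization: the language was designed specifically for Elasticsearch.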
4 Basic syntax structure
POST /seats/_update/3
{
"script": {
"source": "ctx._source.sold = true; ctx._source.cost = params.sold_cost",
"lang": "painless",
"params": {
"sold_cost": 26
}
}
}
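source: the script body. This is where the logic for processing each document goes; whatever you want to do to the data is written here and then executed.
lang: the scripting language. Painless has been the default since Elasticsearch 5.0, and Groovy, JavaScript, and Python scripting were removed in 6.0. It is not the only option, though: expression and mustache remain available for specific purposes. Because Painless syntax is Java-like, the source often reads like a fragment of Java code.
params: the input arguments, in effect a Map<String,Object>. The source can read them as params.field, which behaves like calling .get() on the map.
If you know Java, you can think of source as a method body and params as its arguments.
ctx: ctx appears inside source and can look puzzling at first, but keep it in mind because it shows up constantly in scripts. In an update script it represents the document being processed, and ctx._source is normally used to reach the data to modify.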
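To make the roles of source, params, and ctx concrete, here is a minimal Python sketch of what the update script above does to one document. It is only an analogy that runs outside Elasticsearch: run_update_script is a hypothetical helper, and the starting contents of the document are made up for illustration.

```python
# Analogy only: mimics the effect of the Painless update script in plain Python.
# run_update_script is a hypothetical helper, not an Elasticsearch API.

def run_update_script(ctx, params):
    """Python counterpart of:
    ctx._source.sold = true; ctx._source.cost = params.sold_cost
    """
    ctx["_source"]["sold"] = True                  # ctx._source.sold = true
    ctx["_source"]["cost"] = params["sold_cost"]   # ctx._source.cost = params.sold_cost
    return ctx


# Assumed starting state of document 3 in the seats index (invented for this sketch).
ctx = {"_source": {"sold": False, "cost": 0}}
params = {"sold_cost": 26}

updated = run_update_script(ctx, params)
print(updated["_source"])
```

The point of the analogy is that source plays the part of the function body, params is the argument map, and ctx._source is the mutable document the script writes to.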
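5 Official walkthrough examples
For actual use it is best to read the official documentation, which covers this in detail.
1) Add data
# Add data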
PUT hockey/_bulk?refresh
{"index":{"_id":1}}
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1],"born":"1993/08/13"}
{"index":{"_id":2}}
{"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82],"born":"1994/10/12"}
{"index":{"_id":3}}
{"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79],"born":"1984/01/04"}
{"index":{"_id":4}}
{"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82],"born":"1988/02/17"}
{"index":{"_id":5}}
{"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0],"born":"1996/06/20"}
{"index":{"_id":6}}
{"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82],"born":"1983/03/20"}
{"index":{"_id":7}}
{"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34],"born":"1984/08/10"}
{"index":{"_id":8}}
{"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82],"born":"1990/06/07"}
{"index":{"_id":39}}
{"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63],"born":"1983/10/03"}
{"index":{"_id":10}}
{"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82],"born":"1989/03/17"}
{"index":{"_id":11}}
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}
2) Query the added data
GET hockey/_search
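3) Example 1
Customize the score at query time, using the sum of the goals array as the score.
function_score gives fine-grained control over scoring; for more, see: https://www.cnblogs.com/jthr/p/17141470.html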
GET hockey/_search
{
"query": {
"function_score": {
"script_score": {
"script": {
"lang": "painless",
"source": """
int total = 0;
for (int i = 0; i < doc['goals'].length; ++i) {
total += doc['goals'][i];
}
return total;
"""
}
}
}
}
}
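As a quick way to check the scores the query above should produce, the loop in the script can be reproduced in plain Python over a few of the sample players. This is a sketch that runs without a cluster; total_goals is a hypothetical helper name, and the expectation that _score equals this total assumes the default boost_mode and no explicit query.

```python
# Local re-implementation of the Painless loop, for checking expected scores.
# total_goals is a hypothetical helper; nothing here calls Elasticsearch.

def total_goals(goals):
    """Sum the goals array the same way the script does."""
    total = 0
    for g in goals:
        total += g
    return total


# goals arrays copied from the first three sample documents.
sample_goals = {
    "gaudreau": [9, 27, 1],
    "monohan": [7, 54, 26],
    "hudler": [5, 34, 36],
}

scores = {last: total_goals(goals) for last, goals in sample_goals.items()}
print(scores)
```

One detail worth knowing: doc values for a numeric field come back in sorted order rather than source order, which makes no difference to a sum but would matter for position-dependent logic.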
4) Example 2
Use script_fields to return an extra, custom field computed at query time, here total_goals.
For more on runtime fields, see: https://blog.csdn.net/laoyang360/article/details/120574142
GET hockey/_search
{
"query": {
"match_all": {}
},
"script_fields": {
"total_goals": {
"script": {
"lang": "painless",
"source": """
int total = 0;
for (int i = 0; i < doc['goals'].length; ++i) {
total += doc['goals'][i];
}
return total;
"""
}
}
}
}
5) Example 3
The following example uses a Painless script to sort players by their combined first and last names.
GET hockey/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script": {
"type": "string",
"order": "asc",
"script": {
"lang": "painless",
"source": "doc['first.keyword'].value + ' ' + doc['last.keyword'].value"
}
}
}
}
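The sort key built by the script above is simply the first name, a space, and the last name. A small Python sketch of the same ordering on four of the sample players, run locally as an illustration (sort_key is a hypothetical helper name):

```python
# Illustration of the sort key produced by the Painless sort script.
# sort_key is a hypothetical helper; nothing here calls Elasticsearch.

def sort_key(first, last):
    """Same concatenation as doc['first.keyword'].value + ' ' + doc['last.keyword'].value."""
    return first + " " + last


# (first, last) pairs copied from the sample data.
players = [
    ("johnny", "gaudreau"),
    ("sean", "monohan"),
    ("jiri", "hudler"),
    ("micheal", "frolik"),
]

ordered = sorted(players, key=lambda p: sort_key(p[0], p[1]))
for first, last in ordered:
    print(sort_key(first, last))
```

Because the first name leads the key, the last name only matters when two players share a first name.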
6) Example 4
Change the last field of document 1 to hockey. The new value is passed in through params.
POST hockey/_update/1
{
"script": {
"lang": "painless",
"source": "ctx._source.last = params.last",
"params": {
"last": "hockey"
}
}
}
7) Example 5
Add a nick field with the value hockey to document 1. The script also sets last back to gaudreau.
POST hockey/_update/1
{
"script": {
"lang": "painless",
"source": """
ctx._source.last = params.last;
ctx._source.nick = params.nick
""",
"params": {
"last": "gaudreau",
"nick": "hockey"
}
}
}
8) Example 6
Return a custom field birth_year at query time, taken from the year of the born field.
GET hockey/_search
{
"script_fields": {
"birth_year": {
"script": {
"source": "doc.born.value.year"
}
}
}
}
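The script above relies on born being mapped as a date, so that doc.born.value exposes a year. As a local illustration of the value it should return, the same extraction can be done in Python on the raw strings. This is a sketch under the assumption that the dates follow the yyyy/MM/dd pattern seen in the sample data; birth_year is a hypothetical helper.

```python
from datetime import datetime

# Local illustration of doc.born.value.year; birth_year is a hypothetical helper.
# Assumes the yyyy/MM/dd pattern used by the sample documents.

def birth_year(born):
    """Parse a born string such as '1993/08/13' and return its year."""
    return datetime.strptime(born, "%Y/%m/%d").year


# born values copied from sample documents 1, 2 and 3.
for born in ["1993/08/13", "1994/10/12", "1984/01/04"]:
    print(born, birth_year(born))
```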
6 A few simple examples
6.1 Data preparation
DELETE /product
PUT /product/_doc/1
{
"name" : "xiaomi phone",
"desc" : "shouji zhong de zhandouji",
"price" : 3999,
"tags": [ "xingjiabi", "fashao", "buka" ]
}
PUT /product/_doc/2
{
"name" : "xiaomi nfc phone",
"desc" : "zhichi quangongneng nfc,shouji zhong de jianjiji",
"price" : 4999,
"tags": [ "xingjiabi", "fashao", "gongjiaoka" ]
}
PUT /product/_doc/3
{
"name" : "nfc phone",
"desc" : "shouji zhong de hongzhaji",
"price" : 2999,
"tags": [ "xingjiabi", "fashao", "menjinka" ]
}
PUT /product/_doc/4
{
"name" : "xiaomi erji",
"desc" : "erji zhong de huangmenji",
"price" : 999,
"tags": [ "low", "bufangshui", "yinzhicha" ]
}
PUT /product/_doc/5
{
"name" : "hongmi erji",
"desc" : "erji zhong de kendeji",
"price" : 399,
"tags": [ "lowbee", "xuhangduan", "zhiliangx" ]
}
6.2 Bucket aggregation query
Group the documents by the values in tags and count each group.
GET /product/_search
{
"size": 0,
"aggs": {
"group_by_aggs": {
"terms": {
"field": "tags.keyword",
"size": 10
}
}
}
}
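To see which buckets the terms aggregation above should return for the five products, the counting can be reproduced locally. This is a plain-Python sketch, not an Elasticsearch call: terms_buckets is a hypothetical helper, and breaking ties by term in ascending order is an assumption about how equal counts are ordered.

```python
from collections import Counter

# Local emulation of a terms aggregation on tags.keyword.
# terms_buckets is a hypothetical helper; tie-breaking by term is an assumption.

# tags arrays copied from the five product documents.
product_tags = [
    ["xingjiabi", "fashao", "buka"],
    ["xingjiabi", "fashao", "gongjiaoka"],
    ["xingjiabi", "fashao", "menjinka"],
    ["low", "bufangshui", "yinzhicha"],
    ["lowbee", "xuhangduan", "zhiliangx"],
]


def terms_buckets(docs, size=10):
    """Count each tag once per document and keep the top `size` buckets."""
    counts = Counter(tag for tags in docs for tag in set(tags))
    # Highest count first, then term ascending for equal counts.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]


buckets = terms_buckets(product_tags)
for term, doc_count in buckets:
    print(term, doc_count)
```

Since the sample data contains 11 distinct tags and size is 10, one tag with a count of 1 is left out of the response.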
6.3 Filter first, then run the bucket aggregation
Keep only the products priced below 1000, then group them by the values in tags.
GET /product/_search
{
"query": {
"bool": {
"filter": [
{
"range": {
"price": {
"lt": 1000
}
}
}
]
}
},
"aggs": {
"group_by_aggs": {
"terms": {
"field": "tags.keyword",
"size": 10
}
}
}
}
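6.4 Filter first, then total the prices with Painless
Find the products tagged fashao. Prices of 1000 or less are not discounted, prices above 1000 get 50% off, and the results are summed into a total.
1) The doc[] form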
GET /product/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"tags": "fashao"
}
}
]
}
},
"aggs": {
"sum_aggs": {
"sum": {
"script": {
"lang": "painless",
"source":
"""
if(doc['price'].value <= 1000){
return doc['price'].value;
}else{
return doc['price'].value * 0.5;
}
"""
}
}
}
}
}
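2) The params['_source'][xxx] form
Note: if the field holds a complex type such as an object, it cannot be read through doc and has to be read through params['_source'][xxx] instead.
# Second form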
GET /product/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"tags": "fashao"
}
}
]
}
},
"aggs": {
"sum_aggs": {
"sum": {
"script": {
"lang": "painless",
"source":
"""
if(params['_source']['price'] <= 1000){
return params['_source']['price'];
}else{
return params['_source']['price'] * 0.5;
}
"""
}
}
}
}
}
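Both forms above should produce the same sum. The expected value can be worked out locally with a short Python sketch of the same rule; discounted_price is a hypothetical helper, and only the price and tags fields of the sample products are reproduced.

```python
# Local check of the discounted total; discounted_price is a hypothetical helper.
# Only the fields the script needs are copied from the product documents.

products = [
    {"price": 3999, "tags": ["xingjiabi", "fashao", "buka"]},
    {"price": 4999, "tags": ["xingjiabi", "fashao", "gongjiaoka"]},
    {"price": 2999, "tags": ["xingjiabi", "fashao", "menjinka"]},
    {"price": 999, "tags": ["low", "bufangshui", "yinzhicha"]},
    {"price": 399, "tags": ["lowbee", "xuhangduan", "zhiliangx"]},
]


def discounted_price(price):
    """1000 or less: full price. Above 1000: 50% off."""
    if price <= 1000:
        return price
    return price * 0.5


# Filter step: keep products tagged fashao. Aggregation step: sum the script result.
total = sum(discounted_price(p["price"]) for p in products if "fashao" in p["tags"])
print(total)
```

All three fashao products cost more than 1000, so every price in the sum is halved; the two cheaper products never reach the script because the filter removes them first.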
6.5 Sorting with Painless
# Sort by the length of the desc field
GET /product/_search
{
"sort": {
"_script": {
"type": "number",
"order": "asc",
"script": {
"lang": "painless",
"source":
"""
return doc['desc.keyword'].value.length()
"""
}
}
}
}
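The sort above orders products by the character length of desc. A local Python sketch of the same ordering shows what to expect; it assumes desc.keyword holds the full, unmodified desc string, and the order of documents with equal lengths is not something to rely on.

```python
# Local illustration of sorting by desc length; nothing here calls Elasticsearch.
# Assumes desc.keyword holds the full desc string for each product.

# (document id, desc) pairs copied from the product documents.
products = [
    (1, "shouji zhong de zhandouji"),
    (2, "zhichi quangongneng nfc,shouji zhong de jianjiji"),
    (3, "shouji zhong de hongzhaji"),
    (4, "erji zhong de huangmenji"),
    (5, "erji zhong de kendeji"),
]

# Same key as the script: doc['desc.keyword'].value.length()
ordered = sorted(products, key=lambda p: len(p[1]))
for doc_id, desc in ordered:
    print(doc_id, len(desc), desc)
```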