Elasticsearch: error when frequently updating field values with UpdateByQuery scripts

Here is the error output:

WARNING:elasticsearch:POST http://es-cn-09k1o69vj0006jcz9.public.elasticsearch.aliyuncs.com:9200/crawl_basis_pn/_update_by_query [status:500 request:0.015s]
DEBUG:elasticsearch:> {"query":{"term":{"_id":"bQlgboYBwWirVBbOLVBj"}},"script":{"source":"ctx._source.ProductUrl='https://www.bom2buy.com/partIntelligence/TL431AIYDT/';ctx._source.SubStatus=1"}}
DEBUG:elasticsearch:< {"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[script] Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting","bytes_wanted":0,"bytes_limit":0,"durability":"TRANSIENT"}],"type":"general_script_exception","reason":"Failed to compile inline script [ctx._source.ProductUrl='https://www.bom2buy.com/partIntelligence/TL431AIYDT/';ctx._source.SubStatus=1] using lang [painless]","caused_by":{"type":"circuit_breaking_exception","reason":"[script] Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting","bytes_wanted":0,"bytes_limit":0,"durability":"TRANSIENT"}},"status":500}
ERROR:scrapy.core.engine:Error while obtaining start requests

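  Elasticsearch was asked to compile more than 75 scripts within 5 minutes, so it refused to compile any more. Script compilation is expensive, and this limit is one of Elasticsearch's self-protection mechanisms.

  Here is the source code. This function is called continuously, and about 2 million documents need to be updated: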

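    import time
    import urllib.parse
    from elasticsearch_dsl import UpdateByQuery

    # Method of the spider/pipeline class; esclient() and index_name are defined elsewhere.
    def update_producturl(self, item):
        time.sleep(0.5)
        productUrl = "https://www.bom2buy.com/partIntelligence/" + urllib.parse.quote(item['PN'], safe='') + "/"
        # The URL is interpolated into the script text, so every call sends a new source to compile.
        ubq = UpdateByQuery(using=esclient(), index=index_name) \
            .query("term", _id=item['Id']) \
            .script(source=f"ctx._source.ProductUrl='{productUrl}';ctx._source.SubStatus=1")
        res = ubq.execute()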
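
To see why this trips the limit, here is a minimal sketch that models the script cache as a set keyed by source text. That model is a simplifying assumption, and the `PN-…` part numbers and helper names are hypothetical. With a 0.5 s sleep, at most 300 / 0.5 = 600 calls fit into a 5-minute window, and since each call embeds a different URL, each one produces a source the cache has never seen:

```python
from urllib.parse import quote

BASE_URL = "https://www.bom2buy.com/partIntelligence/"
WINDOW_SECONDS = 5 * 60   # the 5-minute window from the error message
SLEEP_SECONDS = 0.5       # the sleep in update_producturl
LIMIT = 75                # max compilations per window from the error message


def inline_source(pn):
    """Build the script text the way the original code does (value inlined)."""
    product_url = BASE_URL + quote(pn, safe="") + "/"
    return f"ctx._source.ProductUrl='{product_url}';ctx._source.SubStatus=1"


def count_compilations(sources):
    """Count distinct script sources; each distinct source needs one compilation."""
    seen = set()
    for source in sources:
        seen.add(source)
    return len(seen)


# Upper bound on calls per window, ignoring the time each request takes.
calls_per_window = int(WINDOW_SECONDS / SLEEP_SECONDS)

# Hypothetical part numbers, one per call.
part_numbers = [f"PN-{i}" for i in range(calls_per_window)]
compilations = count_compilations(inline_source(pn) for pn in part_numbers)

print(calls_per_window, compilations, compilations > LIMIT)
```

Under these assumptions every call counts as a new compilation, so the 75-per-5-minutes budget can be used up in well under a minute.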

 

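Attempted fix:

  Pass the changing value through params. The script source then stays identical across calls, so it does not need to be recompiled each time.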

    def update_producturl(self, item):
        time.sleep(0.5)
        productUrl = "https://www.bom2buy.com/partIntelligence/" + urllib.parse.quote(item['PN'], safe='') + "/"
        # The script source is now a constant string; only params changes between calls.
        ubq = UpdateByQuery(using=esclient(), index=index_name) \
            .query("term", _id=item['Id']) \
            .script(source="ctx._source.ProductUrl=params.productUrl;ctx._source.SubStatus=1",
                    params={
                        'productUrl': productUrl
                    })
        res = ubq.execute()
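
As a rough check of what changes on the wire, the sketch below builds the request body by hand with plain dicts, in the shape of the request logged above plus a `params` object, and confirms that the script source is the same string for every document. The helper name `build_update_body` and the sample ids and part numbers are hypothetical:

```python
import json
from urllib.parse import quote

BASE_URL = "https://www.bom2buy.com/partIntelligence/"

# Constant script text: the per-document value is read from params at run time.
SCRIPT_SOURCE = "ctx._source.ProductUrl=params.productUrl;ctx._source.SubStatus=1"


def build_update_body(doc_id, pn):
    """Build an _update_by_query body whose script source never varies."""
    product_url = BASE_URL + quote(pn, safe="") + "/"
    return {
        "query": {"term": {"_id": doc_id}},
        "script": {
            "source": SCRIPT_SOURCE,
            "params": {"productUrl": product_url},
        },
    }


# Hypothetical documents: three ids with three different part numbers.
bodies = [build_update_body(f"id-{i}", f"PN/{i}") for i in range(3)]

# Every body carries the same source, so only one compilation is needed.
distinct_sources = {body["script"]["source"] for body in bodies}
print(len(distinct_sources))
print(json.dumps(bodies[0]))
```

Only the `params` object differs between requests, which is what the error message means by "scripts with parameters".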

 

   

posted on 2023-06-05 16:41  花阴偷移