Learning to Build a Search Engine with Node.js and Elasticsearch (7): Updating Index Settings or Migrating an Index with Zero Downtime
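The previous article noted that once an index's mapping has been set, changing a field's type or analyzer usually means creating a new index, defining the mapping again, and then copying the data over.
So how do you update index settings or migrate an index with zero downtime? The answer is index aliases.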
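The approach:
1. Suppose our index is demo_v1. We define an alias, demo, and from then on every operation goes through the alias demo rather than the index name.
2. At some point the mapping or other settings of demo_v1 no longer meet our needs and have to change. We create a new index, demo_v2, with the updated configuration.
3. Copy the data from demo_v1 into demo_v2 and wait until the copy has finished.
4. Remove the alias demo from demo_v1 and, in the same request, add the alias demo to demo_v2.
5. Delete the index demo_v1.
6. The migration is done. If the settings change again later, repeat the same steps with demo_v3, demo_v4, and so on.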
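The versioned naming scheme in the steps above can be sketched as a small helper. This is only an illustration: nextIndexName is a hypothetical function, not part of any Elasticsearch client, and it assumes index names follow the <alias>_v<number> pattern used in this article.

```javascript
// Hypothetical helper: derive the alias and the next versioned index name
// from the current one, assuming the "<alias>_v<number>" convention.
function nextIndexName(currentIndex) {
  const match = /^(.+)_v(\d+)$/.exec(currentIndex);
  if (!match) {
    throw new Error(
      `Index name "${currentIndex}" does not follow the <alias>_v<N> pattern`
    );
  }
  const [, alias, version] = match;
  return { alias, next: `${alias}_v${Number(version) + 1}` };
}

console.log(nextIndexName("demo_v1")); // { alias: 'demo', next: 'demo_v2' }
```

Deriving the names in one place keeps the alias and the index versions from drifting apart when the migration is repeated.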
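The example below walks through the whole process; it is the same approach I use in real projects. If any of the commands are unfamiliar, see the previous article.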
1. Create the index demo_v1
> curl -XPUT 'localhost:9200/demo_v1'
{"acknowledged":true,"shards_acknowledged":true}
2. Add a few documents to demo_v1
# Add 3 documents of type fruit to demo_v1; each document has two fields, name and tag
> curl -XPOST 'localhost:9200/_bulk?pretty' -d'
{ "index" : { "_index" : "demo_v1", "_type" : "fruit", "_id" : "1" }}
{ "name" : "苹果","tag":"苹果,水果,红富士"}
{ "create" : { "_index" : "demo_v1", "_type" : "fruit", "_id" : "2" }}
{ "name" : "香蕉","tag":"香蕉,水果,海南,弯弯,小黄人"}
{ "index" : { "_index" : "demo_v1", "_type" : "fruit", "_id" : "3" }}
{ "name" : "西瓜","tag":"西瓜,水果,圆形,绿,闰土"}
'
# Response
{
  "took" : 34,
  "errors" : false,
  "items" : [
    { "index" : { "_index" : "demo_v1", "_type" : "fruit", "_id" : "1", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "created" : true, "status" : 201 } },
    { "create" : { "_index" : "demo_v1", "_type" : "fruit", "_id" : "2", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "created" : true, "status" : 201 } },
    { "index" : { "_index" : "demo_v1", "_type" : "fruit", "_id" : "3", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "created" : true, "status" : 201 } }
  ]
}
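The bulk request body is newline-delimited JSON: one action line followed by one source line per document, with a newline after the last line. A minimal sketch of building that body in Node.js might look like this. buildBulkBody is a hypothetical helper, the id property on each input object is my own convention for carrying the document ID, and the _type field assumes the Elasticsearch 5.x API used in this article.

```javascript
// Hypothetical helper: build the newline-delimited body for POST /_bulk.
// Each input object carries its document ID in `id` (an assumption of this
// sketch); every other property becomes part of the document source.
function buildBulkBody(index, type, docs) {
  const lines = [];
  for (const { id, ...source } of docs) {
    // Action line: which index, type, and ID the next line belongs to.
    lines.push(JSON.stringify({ index: { _index: index, _type: type, _id: id } }));
    // Source line: the document itself.
    lines.push(JSON.stringify(source));
  }
  // The bulk API requires the body to end with a newline.
  return lines.join("\n") + "\n";
}

const body = buildBulkBody("demo_v1", "fruit", [
  { id: "1", name: "苹果", tag: "苹果,水果,红富士" },
  { id: "2", name: "香蕉", tag: "香蕉,水果,海南,弯弯,小黄人" },
  { id: "3", name: "西瓜", tag: "西瓜,水果,圆形,绿,闰土" },
]);
console.log(body);
```

The sketch uses the index action for every document, whereas the curl command above mixes index and create; either works for documents that do not exist yet.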
3. Give demo_v1 the alias demo
# Set the alias
> curl -XPUT 'localhost:9200/demo_v1/_alias/demo'
{"acknowledged":true}
4. Inspect the index through the alias
# Query the data through the alias; the documents are returned
> curl -XGET 'localhost:9200/demo/fruit/_search?pretty'
# View the mapping
> curl -XGET 'localhost:9200/demo/fruit/_mapping?pretty'
# Response
{
  "demo_v1" : {
    "mappings" : {
      "fruit" : {
        "properties" : {
          "name" : {
            "type" : "text",
            "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } }
          },
          "tag" : {
            "type" : "text",
            "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } }
          }
        }
      }
    }
  }
}
# Search the data
> curl -XGET 'http://localhost:9200/demo/fruit/_search?pretty' -d '{
  "query" : {
    "term" : { "tag" : "水" }
  }
}'
# Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 0.28582606,
    "hits" : [
      { "_index" : "demo_v1", "_type" : "fruit", "_id" : "1", "_score" : 0.28582606, "_source" : { "name" : "苹果", "tag" : "苹果,水果,红富士" } },
      { "_index" : "demo_v1", "_type" : "fruit", "_id" : "3", "_score" : 0.27233246, "_source" : { "name" : "西瓜", "tag" : "西瓜,水果,圆形,绿,闰土" } },
      { "_index" : "demo_v1", "_type" : "fruit", "_id" : "2", "_score" : 0.24257512, "_source" : { "name" : "香蕉", "tag" : "香蕉,水果,海南,弯弯,小黄人" } }
    ]
  }
}
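Because no mapping was specified when the index was created, all of these settings are defaults, and the analyzer is the default standard analyzer.
The search above for tags containing 水 returned every document, which shows that the default analyzer split the word 水果 (fruit) into separate characters.
What if we want the tag field to be split on commas, so that 水果 stays a single, unsplit term?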
5. Create a new index, demo_v2, define a custom comma analyzer, and apply it to the tag field
# Create the index demo_v2 (douhao is pinyin for "comma").
# name is mapped as keyword, an exact value that is not analyzed;
# tag is mapped as text and uses the comma analyzer for both indexing and searching.
> curl -XPUT 'http://localhost:9200/demo_v2/' -d'{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "douhao_analyzer": {
            "type": "pattern",
            "pattern": ","
          }
        }
      },
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "fruit": {
      "properties": {
        "name": { "type": "keyword" },
        "tag": {
          "type": "text",
          "analyzer": "douhao_analyzer",
          "search_analyzer": "douhao_analyzer"
        }
      }
    }
  }
}'
# Response
{"acknowledged":true,"shards_acknowledged":true}
For details on mapping and analyzer settings, see the official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/mapping.html#mapping-type
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-analyzers.html
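To make the difference between the default analyzer seen in step 4 and the comma analyzer defined in step 5 more concrete, here is a rough simulation in plain JavaScript. It does not call Elasticsearch: approxStandardTokens and commaTokens are hypothetical stand-ins, and the first one only approximates the standard analyzer for Chinese text like the sample tags (one token per Han character), not for text in general.

```javascript
// Rough stand-in for the default standard analyzer on Chinese text:
// each Han character becomes its own token and punctuation is dropped.
// This is an approximation for the sample data only.
function approxStandardTokens(text) {
  return Array.from(text).filter((ch) => /\p{Script=Han}/u.test(ch));
}

// Stand-in for a pattern analyzer with pattern ",": split on commas,
// lowercase each token (the pattern analyzer lowercases by default),
// and drop empty tokens.
function commaTokens(text) {
  return text
    .split(",")
    .map((token) => token.toLowerCase())
    .filter((token) => token.length > 0);
}

const tag = "苹果,水果,红富士";
console.log(approxStandardTokens(tag).includes("水")); // true
console.log(commaTokens(tag).includes("水"));          // false
console.log(commaTokens(tag).includes("水果"));        // true
```

A term query matches only when the search term equals one of the indexed tokens, which is why 水 matches under the default mapping but only the whole word 水果 matches once the comma analyzer is applied.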
6. Copy the data from demo_v1 into demo_v2
I use elasticdump to copy the data. elasticdump is an open-source tool for importing and exporting Elasticsearch data.
Project page: https://github.com/taskrabbit/elasticsearch-dump
The command is:
> elasticdump --input='http://localhost:9200/demo_v1' --output='http://localhost:9200/demo_v2' --type=data
Wed, 21 Jun 2017 09:53:15 GMT | starting dump
Wed, 21 Jun 2017 09:53:15 GMT | got 3 objects from source elasticsearch (offset: 0)
Wed, 21 Jun 2017 09:53:15 GMT | sent 3 objects to destination elasticsearch, wrote 3
Wed, 21 Jun 2017 09:53:15 GMT | got 0 objects from source elasticsearch (offset: 3)
Wed, 21 Jun 2017 09:53:15 GMT | Total Writes: 3
Wed, 21 Jun 2017 09:53:15 GMT | dump complete
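elasticdump is the tool used in this article. As an alternative that is not part of the original walkthrough, Elasticsearch also has a built-in _reindex API that copies documents from one index to another inside the cluster. The sketch below only builds the request; buildReindexRequest is a hypothetical helper, and sending the request is left to whatever HTTP client you already use.

```javascript
// Hypothetical helper: describe a POST /_reindex request that copies
// every document from sourceIndex into destIndex.
function buildReindexRequest(sourceIndex, destIndex) {
  return {
    method: "POST",
    path: "/_reindex",
    body: {
      source: { index: sourceIndex },
      dest: { index: destIndex },
    },
  };
}

const request = buildReindexRequest("demo_v1", "demo_v2");
console.log(request.method, request.path);
console.log(JSON.stringify(request.body, null, 2));
```

As with elasticdump, the destination index should already exist with the new mapping, so the copied documents are analyzed with the new settings.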
7. Verify the data in demo_v2
# Search for tags containing 水; finding nothing is the expected result
> curl -XGET 'http://localhost:9200/demo_v2/fruit/_search?pretty' -d '{
  "query" : {
    "term" : { "tag" : "水" }
  }
}'
# Search for tags containing 水果; all three documents are returned
> curl -XGET 'http://localhost:9200/demo_v2/fruit/_search?pretty' -d '{
  "query" : {
    "term" : { "tag" : "水果" }
  }
}'
8. Remove the alias demo from demo_v1 and, in the same request, add the alias demo to demo_v2
> curl -XPOST 'localhost:9200/_aliases?pretty' -d'{
  "actions" : [
    { "remove" : { "index" : "demo_v1", "alias" : "demo" } },
    { "add" : { "index" : "demo_v2", "alias" : "demo" } }
  ]
}'
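Putting the remove and add actions in a single _aliases request matters: Elasticsearch applies the actions of one request atomically, so the alias never points at neither index or at both. A minimal sketch of building that request body in Node.js, where buildAliasSwap is a hypothetical helper name:

```javascript
// Hypothetical helper: build the body for POST /_aliases that moves
// `alias` from oldIndex to newIndex in a single atomic request.
function buildAliasSwap(alias, oldIndex, newIndex) {
  return {
    actions: [
      { remove: { index: oldIndex, alias } },
      { add: { index: newIndex, alias } },
    ],
  };
}

console.log(JSON.stringify(buildAliasSwap("demo", "demo_v1", "demo_v2"), null, 2));
```

Sending the two actions as separate requests would leave a short window in which searches against demo fail, which is exactly the downtime this approach is meant to avoid.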
9. Delete the index demo_v1
curl -XDELETE 'localhost:9200/demo_v1'
That completes the migration.
ok!