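Getting Started with Elasticsearch (4): Indices, Installing the IK Analyzer, and Creating, Updating, and Deleting Data
1. Creating and viewing indices
Create an index named stu: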
curl -X PUT 'localhost:9200/stu'
{"acknowledged":true,"shards_acknowledged":true,"index":"stu"}
2. Viewing indices: http://192.168.56.101:9200/_cat/indices?v (replace the IP address with your own)
pri: number of primary shards; rep: number of replicas per primary shard
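To make the pri and rep columns concrete, here is a minimal Python sketch that parses the plain-text table returned by _cat/indices?v. The sample text is a hypothetical response (the uuid and sizes are placeholders, not real output), and parse_cat_indices is a helper name introduced only for this illustration.

```python
def parse_cat_indices(text):
    """Parse the whitespace-aligned table returned by _cat/indices?v."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    # Each data row carries one whitespace-separated value per header column.
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Hypothetical sample output; the uuid and sizes are placeholders.
SAMPLE = """
health status index uuid             pri rep docs.count docs.deleted store.size pri.store.size
yellow open   stu   placeholder-uuid   5   1          0            0      1.1kb          1.1kb
"""

rows = parse_cat_indices(SAMPLE)
for row in rows:
    print(row["index"], "primary shards:", row["pri"], "replicas:", row["rep"])
```

The sketch assumes every row has a value in every column, which may not hold for closed indices.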
3. Deleting an index
curl -X DELETE 'localhost:9200/stu'
{"acknowledged":true}
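4. Installing the IK analyzer 6.2.2. The IK plugin version should match the Elasticsearch version, since a plugin built for a different version normally fails to install.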
cd /
/usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.2/elasticsearch-analysis-ik-6.2.2.zip
Restart Elasticsearch:
systemctl restart elasticsearch
Test whether the IK Chinese analyzer was installed successfully:
curl -XGET -H 'Content-Type: application/json' 'http://localhost:9200/_analyze?pretty' -d '{ "analyzer" : "ik_max_word", "text": "中华人民共和国国歌" }'
The JSON response:
{
"tokens" : [
{ "token" : "中华人民共和国", "start_offset" : 0, "end_offset" : 7, "type" : "CN_WORD", "position" : 0 },
{ "token" : "中华人民", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 1 },
{ "token" : "中华", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 2 },
{ "token" : "华人", "start_offset" : 1, "end_offset" : 3, "type" : "CN_WORD", "position" : 3 },
{ "token" : "人民共和国", "start_offset" : 2, "end_offset" : 7, "type" : "CN_WORD", "position" : 4 },
{ "token" : "人民", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 5 },
{ "token" : "共和国", "start_offset" : 4, "end_offset" : 7, "type" : "CN_WORD", "position" : 6 },
{ "token" : "共和", "start_offset" : 4, "end_offset" : 6, "type" : "CN_WORD", "position" : 7 },
{ "token" : "国", "start_offset" : 6, "end_offset" : 7, "type" : "CN_CHAR", "position" : 8 },
{ "token" : "国歌", "start_offset" : 7, "end_offset" : 9, "type" : "CN_WORD", "position" : 9 }
]
}
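The start_offset and end_offset values in the response index into the original text. As a small Python check, using the tokens copied from the response above, each token should equal the slice of the input between its two offsets. This relies on the characters being in the Basic Multilingual Plane, as they are here; offset_mismatches is a helper name made up for this sketch.

```python
TEXT = "中华人民共和国国歌"

# (token, start_offset, end_offset) copied from the _analyze response.
TOKENS = [
    ("中华人民共和国", 0, 7),
    ("中华人民", 0, 4),
    ("中华", 0, 2),
    ("华人", 1, 3),
    ("人民共和国", 2, 7),
    ("人民", 2, 4),
    ("共和国", 4, 7),
    ("共和", 4, 6),
    ("国", 6, 7),
    ("国歌", 7, 9),
]


def offset_mismatches(text, tokens):
    """Return the tokens whose offsets do not slice back to the token text."""
    return [
        (token, start, end)
        for token, start, end in tokens
        if text[start:end] != token
    ]


mismatches = offset_mismatches(TEXT, TOKENS)
print(mismatches)  # an empty list means every offset pair is consistent
```

Overlapping offsets, such as "中华" and "华人", are expected with ik_max_word, because it emits every dictionary word it can find rather than a single segmentation.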
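5. Defining the index mapping
Suppose this is our data structure; it covers a fairly broad set of data types.
stu: the index name
person: the type name
analyzer: the analyzer applied to the field's text at index time; defaults to the standard analyzer
search_analyzer: the analyzer applied to the query text at search time; defaults to the field's analyzer
ik_max_word: the IK Chinese analyzer with the finest-grained segmentation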
curl -XPUT -H 'Content-Type: application/json' 'http://127.0.0.1:9200/stu' -d '
{
"mappings": {
"person": {
"dynamic":true,
"dynamic_date_formats":["yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd" ],
"properties": {
"id": { "type": "integer", "store":true },
"name": { "type": "text", "store": true },
"cnname": { "type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word", "store": true },
"age": { "type": "integer" },
"score": { "type": "float" },
"email": { "type": "text", "store":true },
"birthday": { "type": "date", "format":"yyyy-MM-dd","store":true },
"regdate": { "type": "date", "format":"yyyy-MM-dd HH:mm:ss||strict_date_optional_time","store":true },
"city": { "type": "keyword", "store":true },
"address": { "type": "text", "analyzer": "ik_max_word" }
}
}
}
}'
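For readers who would rather script this step, the following Python sketch builds an equivalent create-index request with the standard library. It uses only a trimmed subset of the fields for brevity, and build_create_index_request is a hypothetical helper, not part of any Elasticsearch client. Sending the request is left commented out so the sketch runs without a cluster.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, type_name, properties):
    """Build (but do not send) a PUT request that creates an index with a mapping."""
    body = {"mappings": {type_name: {"properties": properties}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# A trimmed subset of the mapping, for illustration only.
properties = {
    "name": {"type": "text", "store": True},
    "age": {"type": "integer"},
    "address": {"type": "text", "analyzer": "ik_max_word"},
}

request = build_create_index_request(
    "http://127.0.0.1:9200", "stu", "person", properties
)
print(request.get_method(), request.full_url)

# To actually send it (requires a running node):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```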
6. Adding data
curl -XPOST -H 'Content-Type: application/json' '127.0.0.1:9200/stu/person' -d '
{
"id": "11",
"name": "zhang san",
"cnname": "张三",
"age": 20,
"score":80.8,
"email":"zhang.san@163.com",
"birthday":"2000-03-03",
"regdate":"2018-03-03T15:33:33Z",
"city":"PEK",
"address":"上海市闸北区保德路389号"
}'
If no ID is given after '127.0.0.1:9200/stu/person', Elasticsearch generates one, as in "_id":"aiO0EWIB1IWtAj8my_8s" below:
{"_index":"stu","_type":"person","_id":"aiO0EWIB1IWtAj8my_8s","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
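Note: if an ID such as abc123 is given after person, the document with that ID is created, or replaced if it already exists; the response reports result=created or result=updated accordingly.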
curl -XPOST -H 'Content-Type: application/json' '127.0.0.1:9200/stu/person/abc123' -d '
{
"id": "111",
"name": "li si",
"cnname": "李四",
"age": 21,
"score":98.9,
"email":"lisi@qq.com",
"birthday":"2008-03-03",
"regdate":"2019-03-03T15:33:33Z",
"city":"SHA",
"address":"江苏省苏州市园区现代大道188号"
}'
{"_index":"stu","_type":"person","_id":"abc123","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
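The created/updated behaviour can be pictured with a toy in-memory model. The Python sketch below mimics only the result and _version bookkeeping described above; TinyIndex is a made-up class, and the generated ID is a stand-in that does not follow Elasticsearch's real ID format.

```python
import uuid


class TinyIndex:
    """In-memory toy that mimics only the result/_version bookkeeping."""

    def __init__(self):
        self.docs = {}  # _id -> (version, source)

    def index(self, source, doc_id=None):
        if doc_id is None:
            doc_id = uuid.uuid4().hex  # stand-in for the generated _id
        exists = doc_id in self.docs
        version = self.docs[doc_id][0] + 1 if exists else 1
        self.docs[doc_id] = (version, dict(source))
        return {
            "_id": doc_id,
            "_version": version,
            "result": "updated" if exists else "created",
        }


stu = TinyIndex()
first = stu.index({"name": "li si"}, doc_id="abc123")
second = stu.index({"name": "li si", "age": 21}, doc_id="abc123")
auto = stu.index({"name": "zhang san"})

print(first["result"], first["_version"])    # created 1
print(second["result"], second["_version"])  # updated 2
print(auto["result"], auto["_id"] != "abc123")
```

Sending the same ID a second time replaces the whole stored document and bumps the version, which is why a full update by ID looks identical to an insert.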
7. Deleting data
Delete a document by its _id:
curl -X DELETE 'localhost:9200/stu/person/Dpts-2EBY6Wnp0_K3NkH'
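Delete by Query DSL; for the syntax, see the official Delete By Query API documentation. Because city is mapped as keyword, the term value must match the stored value exactly, including case.
curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/_delete_by_query?pretty' -d '
{
"query": {
"bool":{
"filter": [
{"term":{ "city": "PEK" }}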
]
}
}
}'
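Before running a delete by query it can help to preview which documents a term filter would select. The Python sketch below builds the same request body and applies a local approximation of the filter. It assumes city is indexed as a keyword field without a normalizer, so matching is an exact, case-sensitive comparison; both function names are hypothetical.

```python
import json


def term_filter_query(field, value):
    """Build a bool/filter/term body like the one sent to _delete_by_query."""
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}


def preview_exact_matches(docs, field, value):
    """Local approximation: exact comparison, as for a keyword field."""
    return [doc for doc in docs if doc.get(field) == value]


docs = [
    {"name": "zhang san", "city": "PEK"},
    {"name": "li si", "city": "SHA"},
]

body = term_filter_query("city", "PEK")
print(json.dumps(body))
print([doc["name"] for doc in preview_exact_matches(docs, "city", "PEK")])
print([doc["name"] for doc in preview_exact_matches(docs, "city", "pek")])
```

The last line prints an empty list: under the keyword assumption, a lowercase "pek" does not match the stored "PEK". An analyzed text field would behave differently, because its tokens are usually lowercased at index time.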
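8. Updating data
Note: an update reindexes the whole document.
- Full update by ID: replaces every field (works the same way as adding data with an ID above), so it is omitted here.
- Partial update by ID: see Elasticsearch Reference [6.2] » Document APIs » Update API
Note: doc and script cannot both be sent in the same request.
For id=abc123, set name="lisi" and add a new field "bugs": 0: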
curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/abc123/_update?pretty' -d '
{
"doc" : {
"name" : "lisi",
"bugs": 0
}
}'
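A partial update merges the doc object into the stored source: nested objects are merged recursively, while scalars and arrays are replaced. The Python sketch below is a local model of that merge, not the server's implementation, and merge_partial_doc is a name chosen for this illustration.

```python
import copy


def merge_partial_doc(source, partial):
    """Return a new dict with `partial` merged into `source` recursively."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Objects are merged key by key.
            merged[key] = merge_partial_doc(merged[key], value)
        else:
            # Scalars and arrays replace the old value; new keys are added.
            merged[key] = copy.deepcopy(value)
    return merged


source = {"name": "li si", "age": 21, "city": "SHA"}
updated = merge_partial_doc(source, {"name": "lisi", "bugs": 0})
print(updated)
```

Fields that are absent from doc, such as age and city here, keep their stored values.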
For id=abc123, increase age by 4:
curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/abc123/_update?pretty' -d '
{
"script" : {
"source": "ctx._source.age += 4"
}
}'
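Since an update request carries either doc or script, a small builder can enforce that choice before anything is sent. This Python sketch is a client-side convenience under that assumption; build_update_body is a hypothetical helper name.

```python
import json


def build_update_body(doc=None, script_source=None):
    """Build an _update body from exactly one of doc or script_source."""
    if (doc is None) == (script_source is None):
        raise ValueError("provide exactly one of doc or script_source")
    if doc is not None:
        return {"doc": doc}
    return {"script": {"source": script_source}}


doc_body = build_update_body(doc={"name": "lisi", "bugs": 0})
script_body = build_update_body(script_source="ctx._source.age += 4")
print(json.dumps(doc_body))
print(json.dumps(script_body))
```

Passing both arguments, or neither, raises ValueError locally instead of producing a request the server would have to reject.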
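- Conditional update with _update_by_query: see Elasticsearch Reference [6.2] » Document APIs » Update By Query API
In its simplest form, _update_by_query reindexes every document in the index without changing the source, which is useful for picking up a newly added field or another mapping change.
_update_by_query accepts a script for modifying documents. It has no doc-style partial update, so every field change has to be made inside the script.
_update_by_query takes a snapshot of the index when it starts and indexes what it finds using internal versioning. If a document changes between the time the snapshot was taken and the time its update is processed, a version conflict occurs; otherwise the document is updated and its version number is incremented.
0 is not a valid internal version number, so documents whose version is 0 cannot be updated with _update_by_query.
Any update or query failure aborts _update_by_query. To count version conflicts instead of aborting on them, add conflicts=proceed to the request parameters.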
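The snapshot and version-conflict behaviour can be sketched as a toy simulation in Python. Versions are plain integers keyed by _id, and the aborted flag is invented for this illustration; only updated and version_conflicts correspond to counters in the real response.

```python
def simulate_update_by_query(snapshot, current, conflicts="abort"):
    """snapshot/current map _id -> version at snapshot time / at write time."""
    stats = {"updated": 0, "version_conflicts": 0, "aborted": False}
    for doc_id, seen_version in snapshot.items():
        if current[doc_id] != seen_version:
            # The document changed after the snapshot was taken.
            stats["version_conflicts"] += 1
            if conflicts != "proceed":
                stats["aborted"] = True
                break
            continue
        current[doc_id] = seen_version + 1  # update and bump the version
        stats["updated"] += 1
    return stats


snapshot = {"a": 1, "b": 3, "c": 2}
aborting = simulate_update_by_query(snapshot, {"a": 1, "b": 4, "c": 2})
proceeding = simulate_update_by_query(
    snapshot, {"a": 1, "b": 4, "c": 2}, conflicts="proceed"
)
print(aborting)
print(proceeding)
```

In the first run, document "b" changed after the snapshot, so processing stops there and "c" is never touched. With conflicts="proceed" the conflict is only counted and "c" is still updated.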
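URL parameters
In addition to standard parameters such as pretty, the Update By Query API supports refresh, wait_for_completion, wait_for_active_shards, timeout, scroll, and requests_per_second.
refresh: refreshes all shards of the updated index once the request completes. This differs from the Index API's refresh parameter, which refreshes only the shard that received the new data.
wait_for_completion: with wait_for_completion=false, Elasticsearch runs some preflight checks, starts the request, and returns a task that can be cancelled or queried for status through the Tasks API.
wait_for_active_shards: controls how many copies of a shard must be active before the request proceeds.
timeout: controls how long each write request waits for unavailable shards to become available.
scroll: _update_by_query uses a scroll search, and this parameter controls how long the search context is kept alive. The default is 5 minutes; for example, ?scroll=10m raises it to 10 minutes.
requests_per_second: a positive decimal number (for example 1.4, 6, or 1000) that throttles the rate at which batches of index operations are issued by padding each batch with a wait time.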
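As a rough illustration of how these parameters fit together, the Python sketch below assembles the query string and estimates the padding that requests_per_second adds per batch. The wait-time formula (batch size divided by requests_per_second, minus the time spent writing) reflects the throttling model as it is commonly documented and should be treated as an assumption here; both helper names are hypothetical.

```python
from urllib.parse import urlencode


def update_by_query_url(base_url, index, **params):
    """Assemble an _update_by_query URL from keyword parameters."""
    return f"{base_url}/{index}/_update_by_query?{urlencode(params)}"


def throttle_wait_seconds(batch_size, requests_per_second, write_seconds):
    """Estimated padding after one batch; never negative."""
    target_seconds = batch_size / requests_per_second
    return max(0.0, target_seconds - write_seconds)


url = update_by_query_url(
    "http://localhost:9200",
    "stu",
    conflicts="proceed",
    refresh="true",
    scroll="10m",
)
print(url)
print(throttle_wait_seconds(1000, 500, 0.5))  # 1.5
```

With a batch of 1000 documents and requests_per_second=500, the target time per batch is 2 seconds, so a batch that took 0.5 seconds to write would be followed by roughly 1.5 seconds of waiting.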
Complete example:
POST stu/_update_by_query?pretty&conflicts=proceed&refresh=true&timeout=1s
{
"script": {
"source": "ctx._source.name=\"lisi2\";ctx._source.bugs=10"
},
"query": {
"bool": {
"filter": [
{"term": { "id": "111" } }
]
}
}
}
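The following is an incorrect example: bugs is not updated, because "bugs": 10 sits beside source inside script instead of being assigned in the script itself.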
curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/stu/person/_update_by_query?conflicts=proceed&pretty' -d '
{
"script": {
"source": "ctx._source.age++",
"bugs": 10
},
"query": {
"bool":{
"filter": [
{"term":{ "id": "111" }},
{"term":{ "name": "lisi" }}
]
}
}
}'
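One possible correction, assuming the intent was to increment age and set bugs to 10 for the same documents, is to move the assignment into the script source. The Python sketch below only builds and prints that request body; corrected_update_by_query_body is a name made up for this example.

```python
import json


def corrected_update_by_query_body():
    """Body with both field changes expressed inside the script source."""
    return {
        "script": {
            "source": "ctx._source.age++; ctx._source.bugs = 10",
        },
        "query": {
            "bool": {
                "filter": [
                    {"term": {"id": "111"}},
                    {"term": {"name": "lisi"}},
                ]
            }
        },
    }


body = corrected_update_by_query_body()
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to the same curl command in place of the body above.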
9. Querying data
For queries, see Getting Started with Elasticsearch (3): Query DSL