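Elastic Stack: Getting Started with Elasticsearch Indices
1. Index operations
If you PUT a document directly, for example PUT index/_doc/1, Elasticsearch creates the index automatically and builds a dynamic mapping for it.
In production, create indices and mappings by hand so that they are easier to manage, much like writing a CREATE TABLE statement for a database.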
Syntax for creating an index:
PUT /index
{
  "settings": { ... any settings ... },
  "mappings": {
    "properties": {
      "field1": { "type": "text" }
    }
  },
  "aliases": {
    "default_index": {}
  }
}
Example:
PUT /my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" },
      "field2": { "type": "text" }
    }
  },
  "aliases": {
    "default_index": {}
  }
}
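The same request can also be built from a script instead of the Kibana console. The following is a minimal sketch using only the Python standard library. The cluster address http://localhost:9200 and the helper name build_create_index_request are assumptions made for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumed address of a local, unsecured development cluster (hypothetical).
ES_URL = "http://localhost:9200"


def build_create_index_request(index, shards=1, replicas=1, text_fields=()):
    """Build (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            # Every listed field is mapped as "text", as in the example above.
            "properties": {name: {"type": "text"} for name in text_fields}
        },
        "aliases": {"default_index": {}},
    }
    return urllib.request.Request(
        url=f"{ES_URL}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("my_index", text_fields=("field1", "field2"))
print(request.get_method(), request.full_url)
print(json.dumps(json.loads(request.data), indent=2))
# To actually send it against a running cluster: urllib.request.urlopen(request)
```

Keeping the body as a plain dictionary makes it easy to review or version the index definition before anything is sent to the cluster.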
Inspect an index:
GET /my_index/_mapping
GET /my_index/_settings
Change the number of replicas:
PUT /my_index/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}
Delete an index:
DELETE /my_index
DELETE /my_index*
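For safety, to guard against accidental or malicious deletion, you can require an explicit index name for destructive operations, so that wildcard expressions such as my_index* and _all are rejected. Set this in elasticsearch.yml: action.destructive_requires_name: true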
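2. Custom analyzers
Default analyzer:
standard
An analyzer is made up of three components:
character filter: preprocesses the text before it is tokenized
tokenizer: splits the text into tokens
token filter: normalizes the tokens, e.g. lowercase, stop words, synonyms
The standard analyzer is built from:
standard tokenizer: splits on word boundaries
standard token filter: does nothing (it was removed in Elasticsearch 7.0)
lowercase token filter: converts all letters to lowercase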
Enable the English stop word token filter (it is disabled by default in the standard analyzer):
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "es_std": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  }
}
Define your own analyzer. All three components (char_filter, tokenizer, filter) can be customized:
PUT /my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and": {
          "type": "mapping",
          "mappings": ["&=> and"]
        }
      },
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": ["the", "a"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip", "&_to_and"],
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stopwords"]
        }
      }
    }
  }
}
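To make the order of the three stages concrete, here is a rough pure-Python model of what my_analyzer does to a piece of text. It is only a sketch: the regular expressions are simplified stand-ins for html_strip and the standard tokenizer, not Elasticsearch's actual implementation, and the function names are invented for illustration.

```python
import html
import re


def char_filter(text):
    """Rough stand-in for html_strip followed by the "&_to_and" mapping."""
    text = re.sub(r"<[^>]+>", " ", text)  # drop HTML tags (simplified)
    text = html.unescape(text)            # decode entities such as &amp;
    return text.replace("&", "and")       # the "& => and" mapping


def tokenizer(text):
    """Approximate the standard tokenizer by extracting runs of word characters."""
    return re.findall(r"\w+", text)


def token_filters(tokens, stopwords=("the", "a")):
    """Apply lowercase first, then remove the custom stop words."""
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in stopwords]


def analyze(text):
    # Stages run in a fixed order: char filters, then tokenizer, then token filters.
    return token_filters(tokenizer(char_filter(text)))


print(analyze("<p>The Tom & Jerry show is a classic</p>"))
# ['tom', 'and', 'jerry', 'show', 'is', 'classic']
```

Because lowercase runs before my_stopwords, "The" is lowercased to "the" and then removed. With the filters in the opposite order it would survive, since the stop list only contains lowercase entries.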
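3. Customizing dynamic mapping
The dynamic setting controls what happens when a document contains a field that is not in the mapping:
true: an unknown field is mapped dynamically
false: newly detected fields are ignored. They are not indexed and therefore cannot be searched, but they still appear in the _source field of returned hits. They are not added to the mapping, so new fields must be added explicitly.
strict: an unknown field causes an error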
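The three modes can be modelled in a few lines of Python. This is a simplified illustration rather than how Elasticsearch is implemented: apply_dynamic, guess_type and StrictDynamicMappingError are hypothetical names, and the type guessing covers only the most basic JSON types.

```python
class StrictDynamicMappingError(Exception):
    """Hypothetical stand-in for the error returned in strict mode."""


def guess_type(value):
    # Very rough approximation of dynamic type detection.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    return "text"


def apply_dynamic(properties, doc, dynamic):
    """Return (new_properties, indexed_fields) for `doc` under the given mode.

    If the document is accepted, it is kept whole in _source; only the mapping
    and the set of searchable fields depend on the `dynamic` setting.
    """
    new_properties = dict(properties)
    indexed = []
    for field, value in doc.items():
        if field in new_properties:
            indexed.append(field)
        elif dynamic == "true":
            new_properties[field] = {"type": guess_type(value)}
            indexed.append(field)
        elif dynamic == "false":
            continue  # ignored: not mapped, not searchable, still in _source
        elif dynamic == "strict":
            raise StrictDynamicMappingError(f"unmapped field [{field}]")
        else:
            raise ValueError(f"unknown dynamic setting: {dynamic!r}")
    return new_properties, indexed


properties = {"title": {"type": "text"}}
doc = {"title": "hello", "views": 3}

print(apply_dynamic(properties, doc, "true"))
print(apply_dynamic(properties, doc, "false"))
try:
    apply_dynamic(properties, doc, "strict")
except StrictDynamicMappingError as err:
    print("rejected:", err)
```

With "true" the views field is added to the mapping, with "false" the mapping is left untouched and only title is searchable, and with "strict" the whole document is rejected.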
Set it when creating the mapping:
PUT /my_index
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "title": {
        "type": "text"
      },
      "address": {
        "type": "object",
        "dynamic": "true"
      }
    }
  }
}
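date_detection: date detection. It is enabled by default, so string values that match certain formats, such as yyyy-MM-dd, are mapped as date. If you need different behavior, explicitly map the field as date yourself.
numeric_detection: numeric detection. It is disabled by default.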
Define your own dynamic mapping template:
PUT /my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "en": {
          "match": "*_en",
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    ]
  }
}
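The matching logic of such a template can be sketched in Python as follows. This is an approximation for illustration only: resolve_mapping and detected_type are hypothetical helpers, and fnmatchcase accepts more pattern syntax than the simple * wildcard used by match.

```python
from fnmatch import fnmatchcase

# The template from the request above, written as a Python structure.
DYNAMIC_TEMPLATES = [
    {
        "en": {
            "match": "*_en",
            "match_mapping_type": "string",
            "mapping": {"type": "text", "analyzer": "english"},
        }
    }
]


def detected_type(value):
    """Simplified JSON type detection: only strings are distinguished."""
    return "string" if isinstance(value, str) else "other"


def resolve_mapping(field, value, templates=DYNAMIC_TEMPLATES):
    """Return the mapping of the first matching template, or None."""
    for template in templates:
        for _name, rule in template.items():
            name_ok = fnmatchcase(field, rule.get("match", "*"))
            type_ok = rule.get("match_mapping_type") in (None, detected_type(value))
            if name_ok and type_ok:
                return rule["mapping"]
    return None  # no template applies; default dynamic mapping would be used


print(resolve_mapping("title_en", "Hello world"))  # the english text mapping
print(resolve_mapping("title", "Hello world"))     # None: name does not match
print(resolve_mapping("count_en", 42))             # None: value is not a string
```

Templates are checked in order and the first match wins, so more specific templates should be listed before more general ones.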