Elasticsearch Ingest pipelines
Ingest pipelines let you transform data before it is indexed, for example by filtering or converting fields.
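Prerequisites:
At least one node in the cluster must have the ingest role. As noted in the Elasticsearch Node Roles chapter, dedicated ingest nodes are recommended when the ingest workload is heavy.
If Elasticsearch security features are enabled, you need the manage_pipeline cluster privilege to manage ingest pipelines.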
Create a pipeline
For the supported processors, see the Processor reference.
PUT _ingest/pipeline/my-pipeline
{
  "description": "My optional pipeline description",
  "processors": [
    {
      "set": {
        "description": "My optional processor description",
        "field": "my-long-field",
        "value": 10
      }
    },
    {
      "set": {
        "description": "Set 'my-boolean-field' to true",
        "field": "my-boolean-field",
        "value": true
      }
    },
    {
      "lowercase": {
        "field": "my-keyword-field"
      }
    }
  ]
}
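To make the effect of the three processors concrete, here is a minimal local model in Python. It is only a sketch for reasoning about the result, not how Elasticsearch implements processors: the helper names apply_set, apply_lowercase and run_my_pipeline are hypothetical, fields are treated as flat top-level keys, and only the override and ignore_missing options are modeled.

```python
import copy


def apply_set(doc, field, value, override=True):
    """Rough model of the set processor: write `value` into `field`.

    With override=False an existing non-null value is left untouched.
    """
    if override or doc.get(field) is None:
        doc[field] = value


def apply_lowercase(doc, field, ignore_missing=False):
    """Rough model of the lowercase processor for a string field."""
    if doc.get(field) is None:
        if ignore_missing:
            return  # quietly skip documents without the field
        raise KeyError(f"field [{field}] not present in document")
    doc[field] = doc[field].lower()


def run_my_pipeline(source):
    """Apply the processors of my-pipeline, in order, to a copy of `source`."""
    doc = copy.deepcopy(source)
    apply_set(doc, "my-long-field", 10)
    apply_set(doc, "my-boolean-field", True)
    apply_lowercase(doc, "my-keyword-field")
    return doc


if __name__ == "__main__":
    print(run_my_pipeline({"my-keyword-field": "FOO"}))
```

Processors run in the order they are listed, so a later processor sees the changes made by earlier ones.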
Test a pipeline
POST _ingest/pipeline/my-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "my-keyword-field": "FOO"
      }
    },
    {
      "_source": {
        "my-keyword-field": "BAR"
      }
    }
  ]
}
Or simulate a pipeline defined inline in the request body:
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "lowercase": {
          "field": "my-keyword-field"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "my-keyword-field": "FOO"
      }
    },
    {
      "_source": {
        "my-keyword-field": "BAR"
      }
    }
  ]
}
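The two simulate variants differ only in the URL path and in whether the body carries an inline pipeline. The Python sketch below builds either request with the standard library. The function names, the base URL http://localhost:9200 and the absence of authentication are assumptions for illustration; a secured cluster would also need credentials and TLS settings.

```python
import json
import urllib.request


def build_simulate_request(base_url, docs, pipeline_id=None, processors=None):
    """Build a POST request for the simulate pipeline API.

    Pass pipeline_id to test a stored pipeline, or processors to test
    an inline pipeline definition. Exactly one of the two is required.
    """
    if (pipeline_id is None) == (processors is None):
        raise ValueError("pass exactly one of pipeline_id or processors")

    # Each sample document is wrapped in {"_source": ...}.
    body = {"docs": [{"_source": doc} for doc in docs]}
    if pipeline_id is not None:
        path = f"/_ingest/pipeline/{pipeline_id}/_simulate"
    else:
        path = "/_ingest/pipeline/_simulate"
        body["pipeline"] = {"processors": processors}

    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, send(build_simulate_request("http://localhost:9200", [{"my-keyword-field": "FOO"}], pipeline_id="my-pipeline")) would run the stored pipeline against one sample document without indexing anything.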
Use a pipeline
With the pipeline query parameter:
POST my-data-stream/_doc?pipeline=my-pipeline
PUT my-data-stream/_bulk?pipeline=my-pipeline
POST my-data-stream/_update_by_query?pipeline=my-pipeline
For reindex, set pipeline in dest:
POST _reindex
{
  "source": {
    "index": "my-data-stream"
  },
  "dest": {
    "index": "my-new-data-stream",
    "op_type": "create",
    "pipeline": "my-pipeline"
  }
}
In index settings or index template settings, set index.default_pipeline. It applies when the request has no pipeline parameter.
"settings": {
  "index": {
    "default_pipeline": "sw_segment_pipeline"
  }
}
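In index settings or index template settings, you can also set index.final_pipeline. The final pipeline always runs, after the request pipeline (if one is specified) or the default pipeline (if one is set); it is not skipped when either of those is present.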
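Per the Elasticsearch index settings documentation, the pipeline parameter on a request overrides index.default_pipeline, the final pipeline runs after whichever of the two applied, and the special name _none disables a pipeline. The Python sketch below models that selection order. The function name pipelines_to_run and the pipeline name my-final-pipeline are hypothetical, and the model is simplified: it ignores cases where a pipeline changes the target index.

```python
NONE_PIPELINE = "_none"  # special name meaning "run no pipeline"


def pipelines_to_run(request_pipeline=None, default_pipeline=None, final_pipeline=None):
    """Return the pipeline names that run for one index request, in order.

    request_pipeline: value of the ?pipeline= query parameter, if any
    default_pipeline: value of index.default_pipeline, if set
    final_pipeline:   value of index.final_pipeline, if set
    """
    # The request parameter, when present, overrides the index default.
    first = request_pipeline if request_pipeline is not None else default_pipeline

    order = []
    if first is not None and first != NONE_PIPELINE:
        order.append(first)
    # The final pipeline runs regardless of which pipeline ran first.
    if final_pipeline is not None and final_pipeline != NONE_PIPELINE:
        order.append(final_pipeline)
    return order


if __name__ == "__main__":
    print(pipelines_to_run(
        request_pipeline="my-pipeline",
        default_pipeline="sw_segment_pipeline",
        final_pipeline="my-final-pipeline",
    ))
```

In this model a request with pipeline=my-pipeline skips sw_segment_pipeline but still ends with the final pipeline.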