Post filter

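The post_filter is applied to the search hits at the very end of a search request, after aggregations have already been calculated. Its purpose is best explained by example: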

Imagine that you are selling shirts that have the following properties:

PUT /shirts
{
    "mappings": {
        "_doc": {
            "properties": {
                "brand": { "type": "keyword"},
                "color": { "type": "keyword"},
                "model": { "type": "keyword"}
            }
        }
    }
}

PUT /shirts/_doc/1
{
  "brand":"gucci",
  "color":"red",
  "model":"slim"
}
PUT /shirts/_doc/2
{
  "brand":"senma",
  "color":"blue",
  "model":"shirt"
}
PUT /shirts/_doc/3
{
  "brand":"gucci",
  "color":"red",
  "model":"shirt"
}

PUT /shirts/_doc/4
{
  "brand":"gucci",
  "color":"blue",
  "model":"shirt"
}

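Suppose a user has specified two filters:
color:red and brand:gucci. You only want to show red shirts made by Gucci in the search results. Normally you would do this with a bool query: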

GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    }
  }
}
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "1",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "slim"
        }
      },
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "3",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "shirt"
        }
      }
    ]
  }
}

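However, you would also like to use faceted navigation to display a list of other options that the user can click on. Perhaps you have a model field that lets the user narrow the results to red Gucci shirts of model shirt or slim.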

This can be done with a terms aggregation:

GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "model" } 
    }
  }
}
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "1",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "slim"
        }
      },
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "3",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "shirt"
        }
      }
    ]
  },
  "aggregations": {
    "models": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "shirt",
          "doc_count": 1
        },
        {
          "key": "slim",
          "doc_count": 1
        }
      ]
    }
  }
}

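But perhaps you would also like to tell the user how many Gucci shirts are available in other colors. If you simply add a terms aggregation on the color field, you will only get back the color red, because your query returns only red shirts by Gucci. (The aggregation in the request below is still named models, although it now groups by color.)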

GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "color" } 
    }
  }
}

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "1",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "slim"
        }
      },
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "3",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "shirt"
        }
      }
    ]
  },
  "aggregations": {
    "models": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 2
        }
      ]
    }
  }
}
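
Why only red comes back can be illustrated with a small in-memory sketch in Python. This is not how Elasticsearch is implemented; the helpers matches and terms_agg are hypothetical names introduced here, and the sketch only assumes that an aggregation sees exactly the documents matched by the query.

```python
from collections import Counter

# The four sample shirts indexed earlier in this post.
SHIRTS = [
    {"brand": "gucci", "color": "red", "model": "slim"},
    {"brand": "senma", "color": "blue", "model": "shirt"},
    {"brand": "gucci", "color": "red", "model": "shirt"},
    {"brand": "gucci", "color": "blue", "model": "shirt"},
]


def matches(doc, terms):
    """Hypothetical helper: True when doc satisfies every exact-match term."""
    return all(doc.get(field) == value for field, value in terms.items())


def terms_agg(docs, field):
    """Hypothetical helper: count documents per distinct value of field."""
    return dict(Counter(doc[field] for doc in docs))


# The query filters on color AND brand, so only red Gucci shirts remain.
query_hits = [d for d in SHIRTS if matches(d, {"color": "red", "brand": "gucci"})]

# The aggregation runs over the query's matches, so blue never appears.
color_buckets = terms_agg(query_hits, "color")
print(color_buckets)  # {'red': 2}
```

The blue Gucci shirt is dropped by the query before the aggregation ever sees it, which matches the single red bucket in the response above.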

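Instead, you want to include shirts of all colors during aggregation, and then apply the color filter only to the search hits. This is the purpose of post_filter: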

GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "brand": "gucci"   }}
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "color" } 
    }
  },
  "post_filter": {
    "term":{
        "color":"red"
      }
  }
}
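  • The main query now finds all shirts by Gucci, regardless of color.
  • The terms aggregation on color (still named models in this request) returns the colors available for Gucci shirts.
  • Because aggregations are computed before post_filter runs, the buckets still count the blue Gucci shirt.
  • Finally, the post_filter removes colors other than red from the search hits.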
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "1",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "slim"
        }
      },
      {
        "_index": "shirts",
        "_type": "_doc",
        "_id": "3",
        "_score": 0,
        "_source": {
          "brand": "gucci",
          "color": "red",
          "model": "shirt"
        }
      }
    ]
  },
  "aggregations": {
    "models": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 2
        },
        {
          "key": "blue",
          "doc_count": 1
        }
      ]
    }
  }
}
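
The order of operations behind this response can be sketched in plain Python. This is a simplified in-memory model under stated assumptions, not Elasticsearch's actual implementation: the search and matches functions are hypothetical names, and only exact-match term filters and a single terms aggregation are modeled.

```python
from collections import Counter

# The four sample shirts indexed earlier in this post.
SHIRTS = [
    {"brand": "gucci", "color": "red", "model": "slim"},
    {"brand": "senma", "color": "blue", "model": "shirt"},
    {"brand": "gucci", "color": "red", "model": "shirt"},
    {"brand": "gucci", "color": "blue", "model": "shirt"},
]


def matches(doc, terms):
    """Hypothetical helper: True when doc satisfies every exact-match term."""
    return all(doc.get(field) == value for field, value in terms.items())


def search(docs, query, agg_field, post_filter):
    """Hypothetical model of query -> aggregation -> post_filter."""
    # 1. The main query decides which documents are considered at all.
    matched = [d for d in docs if matches(d, query)]
    # 2. The aggregation is computed over everything the query matched.
    buckets = dict(Counter(d[agg_field] for d in matched))
    # 3. post_filter narrows the returned hits only; buckets are untouched.
    hits = [d for d in matched if matches(d, post_filter)]
    return hits, buckets


hits, buckets = search(SHIRTS, {"brand": "gucci"}, "color", {"color": "red"})
print(len(hits), buckets)  # 2 {'red': 2, 'blue': 1}
```

Two hits come back, both red, while the buckets still report one blue Gucci shirt, which is the same shape as the response above.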
posted @ 2022-10-30 23:22  寒小韩