Displaying and ranking hot search keywords with Elasticsearch

Product requirement: count the hot keywords that app users type into the search box and display the top ten; when a user taps one of those keywords, jump straight to the search page for it.

There are many ways to implement this; feel free to point out any problems.

I use Elasticsearch to store and count the hot keywords.

First, every time a user searches something in the search box, I store the input in a dedicated ES index.

The stored document can be very simple, with the following fields (a minimal sketch of the document class follows the list):

the text the user typed into the search box;

a tracking field for the next click, which can hold either the raw input or the exact content to search for;

an id for each record;

and, optionally, a timestamp.
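A minimal sketch of such a document class, assuming the field names used by the indexing code below (the real HotKeyWordsSearchPojo in the project may carry more fields, for example the tracking content):

public class HotKeyWordsSearchPojo {

    private String id;          // unique id of the record
    private String searchInput; // the text the user typed into the search box
    private long createDate;    // epoch-second timestamp of the search

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getSearchInput() { return searchInput; }
    public void setSearchInput(String searchInput) { this.searchInput = searchInput; }

    public long getCreateDate() { return createDate; }
    public void setCreateDate(long createDate) { this.createDate = createDate; }
}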

Here is the Java API code:

public ResultPoJo addIndexSearchOtherThye(HotKeyWordsPo hotKeyWordsPo) {
    ResultPoJo<?> poJo = new ResultPoJo<>(ConstantError.NOMAL);
    // assemble the search-word document
    HotKeyWordsSearchPojo hotKeyWordsSearchPojo = new HotKeyWordsSearchPojo();
    hotKeyWordsSearchPojo.setSearchInput(hotKeyWordsPo.getSearchInput());
    hotKeyWordsSearchPojo.setId(ConverterUtil.getUUID());
    hotKeyWordsSearchPojo.setCreateDate(OffsetDateTimeUtils.getDateNow().toEpochSecond());
    // write the document into the index
    if (ConverterUtil.isNotEmpty(hotKeyWordsSearchPojo)) {
        elasticsearchRestHighLevelClient.index("search-hotkey", SearchType.HotKeyWords.toString(),
                ConverterUtil.toString(hotKeyWordsSearchPojo.getId()), hotKeyWordsSearchPojo);
    }
    return poJo;
}
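The elasticsearchRestHighLevelClient above is a project-specific wrapper, so its index(...) signature is not part of the official client. For orientation, a rough equivalent using the plain 6.x RestHighLevelClient might look like the sketch below (the HotKeyWordIndexer class and the Jackson serialization are assumptions, not the original code):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class HotKeyWordIndexer {

    private final RestHighLevelClient client;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public HotKeyWordIndexer(RestHighLevelClient client) {
        this.client = client;
    }

    /** Index one search-word document; the record id doubles as the ES document id. */
    public void indexSearchWord(HotKeyWordsSearchPojo pojo) throws Exception {
        IndexRequest request = new IndexRequest("search-hotkey", "HotKeyWords", pojo.getId())
                .source(objectMapper.writeValueAsString(pojo), XContentType.JSON);
        IndexResponse response = client.index(request, RequestOptions.DEFAULT);
        // response.getResult() is CREATED for a new document, UPDATED if the id already existed
    }
}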

Once the search words are stored, the next step is to query and aggregate the index:

/**
 * Search the hot-keyword list.
 *
 * @param pojo paging parameters (page number and page size)
 * @return the aggregated hot keywords plus paging information
 * @throws JsonProcessingException
 */
public ResultPoJo<Map<String, Object>> restSearchHotWords(CommditySearchPojo pojo) throws JsonProcessingException {

    ResultPoJo<Map<String, Object>> resultPoJo = new ResultPoJo<>(ConstantCodeMsg.NOMAL);
    int start = (pojo.getPageNo() - 1) * pojo.getPageSize();
    String s = "searchInput";
    String index = "search-hotkey";

    // Query for the records to aggregate over; a match-all query is assumed here,
    // since every stored record is a search the user actually performed.
    QueryBuilder bqb = QueryBuilders.matchAllQuery();

    // Group by the keyword sub-field of the input word; each bucket's docCount is the
    // number of documents containing that word, i.e. how often it was searched.
    // Order by count descending so the hottest words come first.
    TermsAggregationBuilder teamAgg = AggregationBuilders.terms("hotWords_count").field(s.concat(".").concat("keyword"));
    teamAgg.order(BucketOrder.count(false));
    SearchResponse searchResp;
    SearchResponse allResp;
    try {
        // First request: fetch all matching records, used only for the paging totals.
        SearchRequest searchRequest = new SearchRequest(index);
        searchRequest.types(SearchType.HotKeyWords.toString());
        searchRequest.searchType(org.elasticsearch.action.search.SearchType.DFS_QUERY_THEN_FETCH);
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.query(bqb);
        searchRequest.source(sourceBuilder);
        allResp = elasticsearchRestHighLevelClient.client.search(searchRequest, COMMON_OPTIONS);

        // Second request: the terms aggregation that produces the hot-word buckets.
        // from/size paginate the hits; the terms aggregation itself returns its
        // default number of buckets (10), which matches the top-ten requirement.
        SearchRequest searchRequests = new SearchRequest(index);
        searchRequests.types(SearchType.HotKeyWords.toString());
        searchRequests.searchType(org.elasticsearch.action.search.SearchType.DFS_QUERY_THEN_FETCH);
        SearchSourceBuilder sourceBuilders = new SearchSourceBuilder();

        sourceBuilders.aggregation(teamAgg);
        sourceBuilders.from(start);
        sourceBuilders.size(pojo.getPageSize());
        sourceBuilders.explain(true);
        sourceBuilders.query(bqb);
        searchRequests.source(sourceBuilders);
        searchResp = elasticsearchRestHighLevelClient.client.search(searchRequests, COMMON_OPTIONS);
    } catch (Exception e) {
        System.out.println(e);
        Map<String, Object> stringObjectMap = elasticsearchRestHighLevelClient.responseToJson(pojo.getPageNo(), pojo.getPageSize());
        resultPoJo.setResult(stringObjectMap);
        return resultPoJo;
    }
    Map<String, Object> stringObjectMap = elasticsearchRestHighLevelClient.responseToJsonHotKey(searchResp, allResp, pojo.getPageNo(), pojo.getPageSize());
    resultPoJo.setResult(stringObjectMap);
    return resultPoJo;
}
Here, hotWords_count is the terms aggregation over the input-word field in the index.
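The aggregation targets searchInput.keyword, i.e. the keyword sub-field of the text field. Elasticsearch's default dynamic mapping creates that sub-field automatically, but it can also be declared explicitly when the index is created. A sketch for the 6.x high-level client, with an assumed mapping that is not taken from the original project:

import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class HotKeyIndexSetup {

    /** Create the search-hotkey index with an explicit mapping for the searchInput field. */
    public static void createIndex(RestHighLevelClient client) throws Exception {
        CreateIndexRequest request = new CreateIndexRequest("search-hotkey");
        // text for full-text search plus a keyword sub-field for the terms aggregation
        request.mapping("HotKeyWords",
                "{\n" +
                "  \"properties\": {\n" +
                "    \"searchInput\": {\n" +
                "      \"type\": \"text\",\n" +
                "      \"fields\": { \"keyword\": { \"type\": \"keyword\", \"ignore_above\": 256 } }\n" +
                "    },\n" +
                "    \"createDate\": { \"type\": \"long\" }\n" +
                "  }\n" +
                "}",
                XContentType.JSON);
        client.indices().create(request, RequestOptions.DEFAULT);
    }
}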
/**
 * Serialize the query result, including paging information.
 *
 * @param searchResponse the query result for the current page (carries the aggregation)
 * @param allResponse    the query result for all matching documents
 * @param pageNo         current page number
 * @param pageSize       page size
 * @return
 * @throws JsonProcessingException
 */
public Map<String, Object> responseToJsonHotKey(SearchResponse searchResponse, SearchResponse allResponse, int pageNo, int pageSize) throws JsonProcessingException {
    Map<String, Object> jsonMap = new HashMap<String, Object>();
    // total number of stored search records
    long count = allResponse.getHits().getTotalHits();

    // read the buckets of the "hotWords_count" terms aggregation
    Map<String, Aggregation> asMap = searchResponse.getAggregations().getAsMap();
    ParsedStringTerms hotWordsCount = (ParsedStringTerms) asMap.get("hotWords_count");
    List<? extends Terms.Bucket> buckets = hotWordsCount.getBuckets();
    long totalCount = buckets.size();
    // total number of pages
    int totalPageNum = ((int) count + pageSize - 1) / pageSize;
    Map<String, Object> pageMap = new HashMap<String, Object>();
    pageMap.put("pageNo", pageNo);
    pageMap.put("pageSize", pageSize);
    pageMap.put("count", count);
    pageMap.put("totalPageNum", totalPageNum);
    pageMap.put("totalCount", totalCount);
    jsonMap.put("page", pageMap);
    // one entry per hot word: the word itself and how many times it was searched
    List<Map<String, Object>> list = new ArrayList<Map<String, Object>>();
    for (Terms.Bucket bucket : buckets) {
        Map<String, Object> listMap = new HashMap<String, Object>();
        listMap.put("count", bucket.getDocCount());
        listMap.put("hotWordsName", bucket.getKeyAsString());
        list.add(listMap);
    }
    jsonMap.put("list", list);
    return jsonMap;
}
The documents are grouped by the specified field; each bucket's docCount is the number of documents in which the word appears, i.e. how many times it was searched.
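Putting the pieces together, a minimal standalone sketch of the top-ten query against the plain 6.x RestHighLevelClient could look like this (index name, field name and aggregation name follow the code above; the TopHotWords class itself is an assumption):

import java.util.LinkedHashMap;
import java.util.Map;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class TopHotWords {

    /** Return the ten most frequent search words and how often each was searched. */
    public static Map<String, Long> topTen(RestHighLevelClient client) throws Exception {
        SearchSourceBuilder source = new SearchSourceBuilder()
                .size(0) // only the aggregation is needed, not the hits themselves
                .aggregation(AggregationBuilders.terms("hotWords_count")
                        .field("searchInput.keyword")
                        .size(10)                          // top ten buckets
                        .order(BucketOrder.count(false))); // most frequent first

        SearchRequest request = new SearchRequest("search-hotkey").source(source);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        Terms terms = response.getAggregations().get("hotWords_count");
        Map<String, Long> result = new LinkedHashMap<>();
        for (Terms.Bucket bucket : terms.getBuckets()) {
            result.put(bucket.getKeyAsString(), bucket.getDocCount());
        }
        return result;
    }
}

Because size(0) skips the hits entirely, this returns just the ten hottest words and their counts, which is all the app needs to render the list.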

 
