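Summary: The overall architecture is shown in the figure below. In it, systems such as the Search Index and Read Replicas are Databus consumers. When a write occurs on the primary OLTP database, the relay attached to it pulls the changed data into the relay. The Databus consumer client embedded in the Search Index or cache then pulls the data from the relay and updates the index or cache. Dat Read more
posted @ 2017-02-27 20:29 bonelee Views (3668) Comments (0) Recommended (1)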
Summary: Ignoring TF/IDF. Sometimes we just don’t care about TF/IDF. All we want to know is that a certain word Read more
posted @ 2017-02-27 19:38 bonelee Views (5531) Comments (0) Recommended (0)
Summary: Query-Time Boosting. In Prioritizing Clauses, we explained how you could use the boost para Read more
posted @ 2017-02-27 19:23 bonelee Views (9791) Comments (0) Recommended (0)
Summary: For multiterm queries, Lucene takes the Boolean model, TF/IDF, and the vector space model and combines them in a single efficient package that collect Read more
posted @ 2017-02-27 19:16 bonelee Views (727) Comments (1) Recommended (0)
Summary: Vector Space Model. The vector space model provides a way of comparing a multiterm query against a document. The output is a single Read more
posted @ 2017-02-27 14:52 bonelee Views (530) Comments (1) Recommended (0)
Summary: Theory Behind Relevance Scoring. Lucene (and thus Elast Read more
posted @ 2017-02-27 14:46 bonelee Views (590) Comments (1) Recommended (0)
Summary: Field-length norm. How long is the field? The shorter the field, the higher the weight. If a term appears in a short field, such as a title field, it i Read more
posted @ 2017-02-27 14:45 bonelee Views (1725) Comments (1) Recommended (0)
Summary: When we run a simple term query with explain set to true (see Understanding the Score), we see that the only factors involved in calculating the Read more
posted @ 2017-02-27 12:21 bonelee Views (901) Comments (0) Recommended (0)
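Summary: Changing Lucene's scoring model. With the release of Apache Lucene 4.0 in 2012, this great full-text search toolkit finally allowed users to change the default TF/IDF-based scoring algorithm. The Lucene API made the scoring formula easier to modify and extend. However, when it comes to scoring documents, Lucene does not merely let users tinker with the scoring formula; Luc Read more
posted @ 2017-02-27 11:27 bonelee Views (5252) Comments (0) Recommended (0)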
Summary: Tuning BM25. One of the nice features of BM25 is that, unlike TF/IDF, it has two parameters that allow it to be tuned: k1: This parameter con Read more
posted @ 2017-02-27 11:14 bonelee Views (5316) Comments (0) Recommended (0)
Summary: Pluggable Similarity Algorithms. Before we move on from relevance and scoring, we will finish this chapter with a more advanced subject: pluggable simi Read more
posted @ 2017-02-27 11:13 bonelee Views (3595) Comments (0) Recommended (0)
Summary: Elasticsearch allows you to configure a scoring algorithm or similarity per field. The similarity setting provides a simple way of choosing a similarit Read more
posted @ 2017-02-27 11:00 bonelee Views (2173) Comments (0) Recommended (1)