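Lucene (supplement)
7. Deleting from the index
IndexWriter provides deleteDocuments(Term term); // deletes every Document in the index that contains the given Term.
IndexReader also provides deleteDocuments(Term term); (in the Lucene 2.9.x API used in these notes)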
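8. Updating the index
IndexWriter provides updateDocument(Term term, Document doc); // in effect, it first deletes every Document containing the Term, then adds doc to the index.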
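The delete-then-add behaviour can be illustrated with a small in-memory model. This is only a sketch in plain Java, not Lucene code: the class ToyIndex and its methods are hypothetical, and each "document" is reduced to an id plus a contents string, with the id standing in for the Term.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy index (NOT Lucene): each document is {id, contents}.
class ToyIndex {
    private final List<String[]> docs = new ArrayList<>();

    void addDocument(String id, String contents) {
        docs.add(new String[] {id, contents});
    }

    // Removes every document with the given id; returns how many were removed.
    int deleteDocuments(String id) {
        int before = docs.size();
        docs.removeIf(d -> d[0].equals(id));
        return before - docs.size();
    }

    // Update = delete all documents matching the id, then add the new one.
    void updateDocument(String id, String contents) {
        deleteDocuments(id);
        addDocument(id, contents);
    }

    int size() {
        return docs.size();
    }

    // Number of documents currently stored under the given id.
    int count(String id) {
        int n = 0;
        for (String[] d : docs) {
            if (d[0].equals(id)) {
                n++;
            }
        }
        return n;
    }

    // Contents of the first document with the given id, or null if none exists.
    String contentsOf(String id) {
        for (String[] d : docs) {
            if (d[0].equals(id)) {
                return d[1];
            }
        }
        return null;
    }
}
```

In this model, updating an id that matches several documents leaves exactly one, and updating an id that matches nothing simply adds the new document.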
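9. Common queries
1) TermQuery: matches documents by a single Term (keyword). Constructor: TermQuery(Term t)
// Match documents whose "contents" field contains the exact term
Query query = new TermQuery(new Term("contents", keyword));
// true = open the index read-only
IndexSearcher isearcher = new IndexSearcher(FSDirectory.open(indexDir), true);
// No filter (null); return at most the top 100 hits
TopDocs ts = isearcher.search(query, null, 100);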
2) BooleanQuery: a boolean query that combines several queries.
Query query1 = new TermQuery(new Term("contents", keyword));
Query query2 = new TermQuery(new Term("contents", keyword2));
BooleanQuery query = new BooleanQuery();
// Occur.SHOULD on both clauses: a document matches if at least one clause matches (logical OR)
query.add(query1, Occur.SHOULD);
query.add(query2, Occur.SHOULD);
IndexSearcher isearcher = new IndexSearcher(directory, true);
TopDocs ts = isearcher.search(query, null, 100);
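To make the meaning of combining two clauses with Occur.SHOULD concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class ShouldClauses is hypothetical, and splitting on whitespace is an assumed stand-in for a real analyzer. It only shows the matching rule: with nothing but SHOULD clauses, a document matches when at least one clause matches.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of SHOULD-only matching (NOT Lucene).
class ShouldClauses {
    // Lower-case and split on whitespace; a crude stand-in for an analyzer.
    static Set<String> terms(String text) {
        String normalized = text.toLowerCase(Locale.ROOT).trim();
        return new HashSet<>(Arrays.asList(normalized.split("\\s+")));
    }

    // Number of keywords (clauses) that occur in the text.
    static int matchedClauses(String text, String... keywords) {
        Set<String> present = terms(text);
        int matched = 0;
        for (String keyword : keywords) {
            if (present.contains(keyword.toLowerCase(Locale.ROOT))) {
                matched++;
            }
        }
        return matched;
    }

    // SHOULD-only query: at least one clause must match.
    static boolean matches(String text, String... keywords) {
        return matchedClauses(text, keywords) >= 1;
    }
}
```

For example, ShouldClauses.matches("Lucene in Action", "lucene", "solr") is true because one of the two keywords is present, while a text containing neither keyword does not match.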
3) MultiFieldQueryParser: searches across multiple Fields.
// Parse the keyword against both the "path" and "contents" fields
QueryParser parser = new MultiFieldQueryParser(Version.LUCENE_CURRENT, new String[]{"path", "contents"}, analyzer);
Query query = parser.parse(keyword);
IndexSearcher isearcher = new IndexSearcher(FSDirectory.open(indexDir), true);
TopDocs ts = isearcher.search(query, null, 100);
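10. Highlighter: highlights the matched terms when search results are shown in a web page.
1) Add contrib/highlighter/lucene-highlighter-2.9.1.jar to the classpath
2) Example pseudocode
SimpleHTMLFormatter shf = new SimpleHTMLFormatter("<span style=\"color:red\">", "</span>"); // the default is <b>..</b>
// Build the highlighter: specify the highlight format and the query scorer
Highlighter highlighter = new Highlighter(shf, new QueryScorer(query));
// Set the fragmenter; a fragment size of Integer.MAX_VALUE keeps the whole text as a single fragment
highlighter.setTextFragmenter(new SimpleFragmenter(Integer.MAX_VALUE));
// analyzer is an Analyzer instance; "fieldValue" is the text to highlight
String content = highlighter.getBestFragment(analyzer, "fieldName", "fieldValue");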
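What the formatter contributes can be sketched without Lucene: every occurrence of the query term is wrapped in a pre-tag and a post-tag. The following plain-Java sketch is an assumption-laden simplification, not the Highlighter implementation: HighlightSketch and highlight are hypothetical names, matching is by whole word and case-insensitive rather than analyzer-driven, and returning null when nothing matches is a choice made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a highlighter (NOT Lucene's Highlighter).
class HighlightSketch {
    // Wraps each whole-word, case-insensitive occurrence of term in preTag/postTag.
    // Returns null when the text does not contain the term.
    static String highlight(String text, String term, String preTag, String postTag) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        if (!matcher.find()) {
            return null;
        }
        // $0 is the matched text itself, so the original casing is preserved.
        String replacement = Matcher.quoteReplacement(preTag) + "$0"
                + Matcher.quoteReplacement(postTag);
        return matcher.replaceAll(replacement);
    }
}
```

Calling HighlightSketch.highlight("Lucene is fast; lucene is free", "lucene", "<b>", "</b>") wraps both occurrences in bold tags. The sketch does not HTML-escape the surrounding text, which real page output would need.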
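11. Optimization
1) Notes on using IndexWriter
After modifying the index, call commit() or close() for the changes to take effect (older releases used flush(), which was deprecated in favor of commit())
2) Notes on using IndexSearcher
Once opened, it does not see documents added to the index afterwards
It is thread-safe; multiple threads need only one instance
3) Best practices
Share one IndexSearcher among all threads, and reopen it only after the index has been modified
Share one IndexWriter among all threads, with strict synchronization
Modify the index asynchronously to improve performance (for example via JMS)
Create a separate index directory for each Document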
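The "share one searcher, reopen only after the index changes" practice can be sketched as a small holder class. This is a minimal sketch in plain Java under stated assumptions: SharedSearcherHolder, indexChanged and the Supplier-based opener are hypothetical names, the type parameter S stands in for IndexSearcher, and closing the old searcher once in-flight searches finish is deliberately left out.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical holder: hands out one shared searcher and reopens it lazily
// after the index has been reported as changed.
class SharedSearcherHolder<S> {
    private final Supplier<S> opener;                 // how to open a searcher
    private final AtomicLong indexVersion = new AtomicLong();
    private S current;
    private long openedAtVersion;

    SharedSearcherHolder(Supplier<S> opener) {
        this.opener = opener;
        this.current = opener.get();
        this.openedAtVersion = indexVersion.get();
    }

    // The writing side calls this after committing changes to the index.
    void indexChanged() {
        indexVersion.incrementAndGet();
    }

    // All searching threads call this; a new searcher is opened only if the
    // index changed since the current one was opened.
    synchronized S get() {
        long version = indexVersion.get();
        if (version != openedAtVersion) {
            current = opener.get();
            openedAtVersion = version;
        }
        return current;
    }
}
```

Repeated calls to get() return the same instance until indexChanged() is called; the next get() then opens a fresh one. In real code the opener would construct an IndexSearcher over the index directory, and the previous searcher would need to be closed safely.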