Lucene (supplement)

7. Deleting from the index
      IndexWriter provides deleteDocuments(Term term);  // deletes every Document in the index that contains the given Term.
      IndexReader also provides deleteDocuments(Term term);

8. Updating the index
      IndexWriter provides updateDocument(Term term, Document doc); // internally it deletes the Documents containing the Term, then adds the new Document.
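
The delete-then-add behaviour can be sketched with a toy in-memory model. This is plain Java, not Lucene code: ToyIndex and its methods are hypothetical names, and each "document" is assumed to be just a set of terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Toy in-memory model (NOT Lucene): each "document" is a set of terms.
class ToyIndex {
    private final List<Set<String>> docs = new ArrayList<>();

    // Add a document made of the given terms.
    void addDocument(String... terms) {
        docs.add(new HashSet<>(Arrays.asList(terms)));
    }

    // Remove every document that contains the term; return how many were removed.
    int deleteDocuments(String term) {
        int removed = 0;
        for (Iterator<Set<String>> it = docs.iterator(); it.hasNext();) {
            if (it.next().contains(term)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Update = delete all documents containing the term, then add the new one.
    void updateDocument(String term, String... newTerms) {
        deleteDocuments(term);
        addDocument(newTerms);
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("id:1", "lucene", "search");
        index.addDocument("id:2", "lucene", "index");
        index.updateDocument("id:1", "id:1", "lucene", "highlighter");
        System.out.println(index.size());                    // prints 2
        System.out.println(index.deleteDocuments("lucene")); // prints 2
        System.out.println(index.size());                    // prints 0
    }
}
```

Because every document containing the Term is deleted, the Term passed to updateDocument is usually one that identifies a single document, such as a unique id field. A broad Term like "lucene" above would remove many documents at once.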

9. Common query types
  1) TermQuery: queries by a single Term (keyword). Constructor: TermQuery(Term t)
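      // Match documents whose "contents" field contains the keyword
      Query query = new TermQuery(new Term("contents", keyword));
      // Open a read-only searcher over the index directory
      isearcher = new IndexSearcher(FSDirectory.open(indexDir), true);
      // Fetch at most the top 100 hits, with no filter
      TopDocs ts = isearcher.search(query, null, 100);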

  2) BooleanQuery: Boolean query. Combines several queries into one.
      Query query1 = new TermQuery(new Term("contents", keyword));
      Query query2 = new TermQuery(new Term("contents", keyword2));
      // Both clauses are Occur.SHOULD: a document matches if it contains either keyword
      BooleanQuery query = new BooleanQuery();
      query.add(query1, Occur.SHOULD);
      query.add(query2, Occur.SHOULD);

      isearcher = new IndexSearcher(directory, true);

      TopDocs ts = isearcher.search(query, null, 100);
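
To make the effect of Occur.SHOULD concrete, here is a toy sketch in plain Java (not Lucene code). It assumes each clause has already produced a set of matching document ids. The ids and the name ShouldClauseSketch are made up for illustration. With only SHOULD clauses, a document matches when at least one clause matches, so the result is the union of the two sets.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model (NOT Lucene): a query result is a set of document ids.
class ShouldClauseSketch {
    // Two SHOULD clauses: keep every id that appears in either result set.
    static Set<Integer> anyOf(Set<Integer> hitsA, Set<Integer> hitsB) {
        Set<Integer> result = new TreeSet<>(hitsA);
        result.addAll(hitsB);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> hitsForKeyword = Set.of(1, 3);  // hypothetical ids matching keyword
        Set<Integer> hitsForKeyword2 = Set.of(3, 7); // hypothetical ids matching keyword2
        System.out.println(anyOf(hitsForKeyword, hitsForKeyword2)); // prints [1, 3, 7]
    }
}
```

Lucene also ranks documents that match more clauses higher, which this sketch ignores.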
 

  3) MultiFieldQueryParser: parses a query string and searches it across several Fields.
      // Search the keyword in both the "path" and "contents" fields
      QueryParser parser = new MultiFieldQueryParser(Version.LUCENE_CURRENT, new String[]{"path", "contents"}, analyzer);
      // parse() throws ParseException if the query string is malformed
      Query query = parser.parse(keyword);
      isearcher = new IndexSearcher(FSDirectory.open(indexDir), true);
      TopDocs ts = isearcher.search(query, null, 100);
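
The idea of searching several Fields at once can be sketched in plain Java. This is a toy model, not how MultiFieldQueryParser is implemented: a document is assumed to be a map from field name to text, tokens are split on whitespace only, and MultiFieldSketch is a hypothetical name.

```java
import java.util.List;
import java.util.Map;

// Toy model (NOT Lucene): a document is a map of field name -> field text.
class MultiFieldSketch {
    // True if the keyword equals a whitespace-separated token in any listed field.
    static boolean matchesAnyField(Map<String, String> doc, List<String> fields, String keyword) {
        String wanted = keyword.toLowerCase();
        for (String field : fields) {
            String value = doc.get(field);
            if (value == null) {
                continue; // the document has no such field
            }
            for (String token : value.toLowerCase().split("\\s+")) {
                if (token.equals(wanted)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> doc = Map.of(
                "path", "/docs/readme.txt",
                "contents", "Lucene in action");
        List<String> fields = List.of("path", "contents");
        System.out.println(matchesAnyField(doc, fields, "lucene")); // prints true
        System.out.println(matchesAnyField(doc, fields, "solr"));   // prints false
    }
}
```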

10. Highlighter: highlights the matched terms when search results are shown on a web page.
   1) Add contrib/highlighter/lucene-highlighter-2.9.1.jar to the classpath
   2) Sample pseudocode
       SimpleHTMLFormatter shf = new SimpleHTMLFormatter("<span style=\"color:red\">", "</span>"); // the default is <b>..</b>
       // Build the highlighter: specify the highlight format and the query scorer
       Highlighter highlighter = new Highlighter(shf, new QueryScorer(query));
       // Set the fragmenter; Integer.MAX_VALUE keeps the whole text as a single fragment
       highlighter.setTextFragmenter(new SimpleFragmenter(Integer.MAX_VALUE));
       String content = highlighter.getBestFragment(analyzer, "fieldName", "fieldValue");
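
What the formatter does to the text can be illustrated without Lucene. The sketch below is a simplification in plain Java: it wraps raw, case-insensitive substring matches in the given tags, whereas the real Highlighter works on the tokens produced by the Analyzer. HighlightSketch and highlight are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model (NOT Lucene): wrap each occurrence of a term in pre/post tags.
class HighlightSketch {
    static String highlight(String text, String term, String preTag, String postTag) {
        // Pattern.quote treats the term as a literal, not as a regular expression
        Pattern pattern = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            // Copy the text before the match, then the match wrapped in tags
            out.append(text, last, matcher.start())
               .append(preTag).append(matcher.group()).append(postTag);
            last = matcher.end();
        }
        out.append(text, last, text.length()); // copy the remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        String result = highlight("Lucene makes search easy", "search",
                "<span style=\"color:red\">", "</span>");
        // prints Lucene makes <span style="color:red">search</span> easy
        System.out.println(result);
    }
}
```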

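11. Optimization
  1) Notes on using IndexWriter
      Changes to the index take effect only after flush() or close(); in Lucene 2.9, flush() is deprecated in favor of commit()
  2) Notes on using IndexSearcher
      Once opened, it does not see documents added to the index afterwards
      It is thread-safe; multiple threads need only one instance
  3) Best practices
      Share one IndexSearcher across threads, and reopen it only after the index has changed
      Share one IndexWriter across threads, with strict synchronization
      Modify the index asynchronously to improve performance (e.g. via JMS)
      Create a separate index directory for each Document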
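
The "share one searcher, reopen only after a change" practice can be sketched as follows. This is one possible shape in plain Java, not Lucene code: Snapshot stands in for an opened IndexSearcher, the version counter is assumed to be bumped by the writer after each commit, and all names here are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model (NOT Lucene): all threads share one snapshot until the index changes.
class SharedSearcherHolder {
    // Stand-in for an opened IndexSearcher; remembers the index version it was opened on.
    static final class Snapshot {
        final long version;

        Snapshot(long version) {
            this.version = version;
        }
    }

    private final AtomicLong indexVersion = new AtomicLong(); // bumped after each commit
    private volatile Snapshot current = new Snapshot(0);

    // Writer side: call once a change to the index has been committed.
    void indexChanged() {
        indexVersion.incrementAndGet();
    }

    // Reader side: return the shared snapshot, reopening it only if it is stale.
    Snapshot acquire() {
        Snapshot snapshot = current;
        if (snapshot.version != indexVersion.get()) {
            synchronized (this) {
                long latest = indexVersion.get();
                if (current.version != latest) {
                    current = new Snapshot(latest); // "reopen" on the new index state
                }
                snapshot = current;
            }
        }
        return snapshot;
    }

    public static void main(String[] args) {
        SharedSearcherHolder holder = new SharedSearcherHolder();
        Snapshot first = holder.acquire();
        System.out.println(first == holder.acquire()); // prints true: instance is reused
        holder.indexChanged();
        System.out.println(first == holder.acquire()); // prints false: reopened after a change
    }
}
```

A real implementation would also have to close the old IndexSearcher once no thread is still using it, which this sketch leaves out.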

posted @ 2012-04-18 11:51  狼里格朗