FuzzyQuery and WildcardQuery (wildcards)

Reposted from: http://blog.csdn.net/caoxu1987728/article/details/2324644

 

 

FuzzyQuery:

Building the index:

 

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// The third argument is `create`: false opens an existing index at `path`,
// true creates a new one.
IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), false);
writer.setUseCompoundFile(false);

// One document per word, so every hit corresponds to exactly one term.
Document doc1 = new Document();
Document doc2 = new Document();
Document doc3 = new Document();
Document doc4 = new Document();
Document doc5 = new Document();
Document doc6 = new Document();

// All six values go into the same "content" field, stored and tokenized.
Field f1 = new Field("content", "word", Field.Store.YES, Field.Index.TOKENIZED);
Field f2 = new Field("content", "work", Field.Store.YES, Field.Index.TOKENIZED);
Field f3 = new Field("content", "seed", Field.Store.YES, Field.Index.TOKENIZED);
Field f4 = new Field("content", "sword", Field.Store.YES, Field.Index.TOKENIZED);
Field f5 = new Field("content", "world", Field.Store.YES, Field.Index.TOKENIZED);
Field f6 = new Field("content", "ford", Field.Store.YES, Field.Index.TOKENIZED);

doc1.add(f1);
doc2.add(f2);
doc3.add(f3);
doc4.add(f4);
doc5.add(f5);
doc6.add(f6);

writer.addDocument(doc1);
writer.addDocument(doc2);
writer.addDocument(doc3);
writer.addDocument(doc4);
writer.addDocument(doc5);
writer.addDocument(doc6);

writer.close();

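Note: the create argument of IndexWriter (the third constructor argument) is usually set to true when building an index from scratch. With true, a new index is created and any existing index at that path is overwritten; with false, documents are appended to an index that must already exist.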

Searching:

 

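import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;

IndexSearcher searcher = new IndexSearcher(path);

// Build a Term, then run a fuzzy search on it.
Term t = new Term("content", "work");
FuzzyQuery query = new FuzzyQuery(t);

// FuzzyQuery has two more constructors that limit how fuzzy a match may be.
// The default minimum similarity is 0.5. The lower the value, the looser the
// match and the more documents are returned, and vice versa.
FuzzyQuery query1 = new FuzzyQuery(t, 0.1f);
// Same threshold, but the first character must also match exactly.
FuzzyQuery query2 = new FuzzyQuery(t, 0.1f, 1);

Hits hits = searcher.search(query2);
for (int i = 0; i < hits.length(); i++) {
    System.out.println(hits.doc(i));
}
searcher.close();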

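FuzzyQuery has three constructors. The parameters are explained here using the third one as the example.

The first argument is the Term to search for, the second is the minimum similarity used by the Levenshtein algorithm, and the third is the number of leading characters (the prefix) that must match exactly.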
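To make the two numeric parameters concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes a simplified score of 1 - (edit distance / length of the shorter string); this is an illustrative assumption, not Lucene's own code, and Lucene's scoring may differ in detail. The class name FuzzySketch and its methods are hypothetical helpers invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FuzzySketch {
    // Classic dynamic-programming Levenshtein distance:
    // insertions, deletions and substitutions each cost 1.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION: simplified similarity, 1 - distance / shorter length.
    static float similarity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        if (shorter == 0) {
            return a.length() == b.length() ? 1f : 0f;
        }
        return 1f - (float) editDistance(a, b) / shorter;
    }

    // Keep terms that share the first prefixLength characters with the query
    // and whose similarity is strictly above minSimilarity.
    static List<String> fuzzyMatch(String query, float minSimilarity,
                                   int prefixLength, List<String> terms) {
        String prefix = query.substring(0, Math.min(prefixLength, query.length()));
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix) && similarity(query, term) > minSimilarity) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("word", "work", "seed", "sword", "world", "ford");
        // Low threshold, no prefix requirement.
        System.out.println(fuzzyMatch("work", 0.1f, 0, terms)); // [word, work, sword, world, ford]
        // Same threshold, but the first letter must be "w".
        System.out.println(fuzzyMatch("work", 0.1f, 1, terms)); // [word, work, world]
        // Stricter threshold: only near-identical terms remain.
        System.out.println(fuzzyMatch("work", 0.6f, 0, terms)); // [word, work]
    }
}
```

Under this assumed score, lowering the threshold widens the result set, while a prefix length of 1 drops "sword" and "ford" even though they are as close to "work" as "world" is.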

Wildcards are even simpler. All you need to know is that "*" matches zero or more characters and "?" matches exactly one character:

 

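import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.WildcardQuery;

IndexSearcher searcher = new IndexSearcher(path);

// "?o*": any single character, then "o", then anything (including nothing).
Term t1 = new Term("content", "?o*");
WildcardQuery query = new WildcardQuery(t1);

Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++) {
    System.out.println(hits.doc(i));
}
searcher.close();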
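The wildcard rules can also be tried without an index. The following plain-Java sketch translates a wildcard pattern into a regular expression and applies it to the six words indexed earlier. It only illustrates the "*" and "?" semantics and says nothing about how Lucene evaluates a WildcardQuery internally; the class name WildcardSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WildcardSketch {
    // Translate a wildcard pattern into a regex:
    // "*" becomes ".*", "?" becomes ".", everything else is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Return the terms that the wildcard pattern matches in full.
    static List<String> filter(String wildcard, List<String> terms) {
        Pattern pattern = toRegex(wildcard);
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("word", "work", "seed", "sword", "world", "ford");
        // Second character must be "o"; the rest is free.
        System.out.println(filter("?o*", terms));  // [word, work, world, ford]
        // Exactly four characters, starting with "wor".
        System.out.println(filter("wor?", terms)); // [word, work]
    }
}
```

Note that "?o*" rejects "sword" because its second character is "w", and "wor?" rejects "world" because "?" stands for exactly one character.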

That's all!

Posted on 2013-04-24 11:48 by 阿文
