Lucene.Net - Post category - 周骏

TOKENIZED and UN_TOKENIZED explained

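Summary: Many examples online use Lucene 1.4.3. In newer versions of Lucene, calls such as doc.add(new Field("content",curArt.getContent(),Field.Store.NO,Field.Index.TOKENIZED)); differ considerably from the old version. A Field has two configurable attributes: storage and indexing. The storage attribute controls whether the Field is stored; the indexing attribute controls whether the Field...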

posted @ 2009-12-02 11:33 周骏 Views(1973) Comments(0) Recommended(0)

Lucene core features in detail

Summary: 1. Note the difference between false and true: IndexWriter writer = new IndexWriter(indexpath, getAnalyzer(),false); IndexWriter writer = new IndexWriter(indexpath, getAnalyzer(),true); IndexReader ir=IndexReader.open(indexpath); ...

posted @ 2009-11-26 13:03 周骏 Views(1159) Comments(1) Recommended(0)

lucene.net2.0 search

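Summary: #region Build index public void CreateIndex() { DataSet ds = /* fetch the data from the database */; /* get the index writer */ IndexWriter writer = new IndexWriter(/* index save path goes here */, true); writer.SetMergeFactor(20); /* adjust how often and how large segment merges are */ /* create the index fields */ if (ds.Tables[0...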

posted @ 2009-07-27 09:31 周骏 Views(486) Comments(0) Recommended(0)

Lucene.Net2.0 search result sorting issues

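Summary: For large indexes (index files over 50 MB), avoid sorting by a field in the index wherever possible and sort by index ID (INDEXORDER) instead; the two differ in efficiency by nearly 10x. The comparison below covers memory usage and CPU processing time. Memory usage: Figure 1: sorting by an integer unique-identifier field. Figure 2: sorting by index ID (INDEXORDER). Comparing the objects that use the most memory, Figure 1 uses 2,900,766 bytes more than Figure 2 (index file size: 61 MB). Processing time...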

posted @ 2009-07-27 09:30 周骏 Views(538) Comments(0) Recommended(0)

Summary of common Lucene queries

Summary: First, search by term - TermQuery: query = new TermQuery(new Term("name","word1")); hits = searcher.search(query); This retrieves every document whose name field contains word1. Second, AND/OR search - BooleanQuery: it is actually a composite que...

posted @ 2009-07-27 09:25 周骏 Views(556) Comments(0) Recommended(0)

Lucene.net search result sorting (single and multiple criteria)

Summary: string INDEX_STORE_PATH = Server.MapPath("index"); /* INDEX_STORE_PATH is the index storage directory */ string keyword = TextBox2.Text; /* search text */ Hits myhit = null; IndexSearcher mysea = new IndexSearcher(INDEX_STORE_PATH); QueryP...

posted @ 2009-07-27 09:24 周骏 Views(913) Comments(1) Recommended(1)

Complete guide to using lucene.net

Summary: 1. Basic usage using System;using System.Collections.Generic;using System.Text;using Lucene.Net;using Lucene.Net.Analysis;using Lucene.Net.Analysis.Standard;using Lucene.Net.Documents;using Lucene.Net.Index;u...

posted @ 2009-07-27 09:23 周骏 Views(533) Comments(0) Recommended(0)

Basic usage of Lucene.Net

Summary: 1. Basic usage using System; using System.Collections.Generic; using System.Text; using Lucene.Net; using Lucene.Net.Analysis; using Lucene.Net.Analysis.Standard; using Lucene.Net.Documents; using Lucene.Net...

posted @ 2009-07-27 09:22 周骏 Views(567) Comments(0) Recommended(1)

Introduction to lucene full-text retrieval

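Summary: I. Overview of the information retrieval process. The biggest difference between full-text retrieval and a database application is the goal: the 100 most relevant results should satisfy more than 98% of users. 1. Building the text collection. Before any features are developed, an information retrieval system needs some preparation. First, a text database must be built to hold all the information users might search for. Based on that information, the text type used in the index is determined; a text type is an information format recognized by the system, and it should be identifiable and low in redundancy. Once the text model has been settled...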

posted @ 2009-07-27 09:20 周骏 Views(670) Comments(0) Recommended(0)

A brief analysis of lucene.net index file storage

Summary: In lucene.net, typical code for working with index files looks like this: IndexWriter writer = new IndexWriter(@"c:\index", new StandardAnalyzer(), true); try { Document doc = new Document(); doc.Add(Field.Keyword("name", "name name")); doc.Add(Fie...

posted @ 2009-07-27 09:19 周骏 Views(674) Comments(0) Recommended(0)

Lucene.net: searching multiple fields (Fields) and multiple index directories (IndexSearcher)

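Summary: Lucene.net is an open-source full-text indexing project that is widely used in the .NET environment, and we used it for full-text indexing in this project too. During development we ran into a couple of small problems: searching across multiple fields and across multiple index directories. 1. Multi-field search means matching the query against the content of more than one field at the same time; the analogous concept in SQL is select * from Table where a like '%query%' or b like '%query%'. Lucen...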

posted @ 2009-07-27 09:17 周骏 Views(672) Comments(0) Recommended(1)

lucene notes

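Summary: 1. Sometimes, for a given Document, some Fields are operated on frequently while others are not. In that case the frequently operated Fields can be stored separately from the other Fields, and both parts are searched together at query time to reassemble a complete Document. This requires the two indexes to contain the same number of Documents. When creating the index, you can create several IndexWriters at once and split a Document, as needed, into several parts that each contain some of its Field...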

posted @ 2009-07-27 09:15 周骏 Views(532) Comments(0) Recommended(0)
