Full-Text Search and Data Statistics with RavenDB

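Goal: the project stores its data in RavenDB. RavenDB searches fields through Lucene, and its default analyzer is a lowercase analyzer, which does not segment Chinese text into words. After looking at what was available online, I considered two Chinese segmenters: ICTCLAS from the Chinese Academy of Sciences and PanGu. I chose to combine PanGu with Lucene.

The latest RavenDB release, 2.0, uses Lucene 3.0, and none of the analyzer code I found online supported 3.0 well, so I ended up solving the problem by using an older RavenDB version. A second problem was that the project referenced a newer Json.NET version; the fix was to remove the existing newer reference and compile against the older version. A third problem is that RavenDB cannot use GroupBy directly in a query to produce grouped statistics; a map-reduce index has to be created to generate them.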

1. Create the word-segmentation index

public class ByFormNameIndex : AbstractIndexCreationTask<MobileForm>
{
    public ByFormNameIndex()
    {

        Map = mobileForms => from form in mobileForms
                             select new
                             {
                                 form.FormName,
                                 form.BelongTo,
                                 form.RequestType
                             };
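        // Tokenize FormName with the PanGu analyzer, referenced by its assembly-qualified type name.
        Analyzers.Add(x => x.FormName, "Lucene.Net.Analysis.PanGu.PanGuAnalyzer,PanGu.Lucene.Analyzer, Version=1.3.1.0, Culture=neutral, PublicKeyToken=null");
        // FormName is analyzed for full-text search; the other two fields are indexed as exact values.
        Indexes.Add(x => x.FormName, FieldIndexing.Analyzed);
        Indexes.Add(x => x.BelongTo, FieldIndexing.NotAnalyzed);
        Indexes.Add(x => x.RequestType, FieldIndexing.NotAnalyzed);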
    }
}
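
The post does not show how the index is queried. The following is a minimal sketch of one way to do it, not the author's code: FormSearch and SearchByFormName are hypothetical names, MobileForm is assumed to be the document class used by the index above, and the client build in use is assumed to expose the Search LINQ extension. On a build that lacks it, a Lucene query through session.Advanced.LuceneQuery would be the alternative.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;

// Hypothetical helper, not part of the original post.
public static class FormSearch
{
    // Runs a full-text query on FormName against ByFormNameIndex.
    // Because FormName is an analyzed field, the keyword is matched
    // against the tokens produced by the analyzer configured on the index.
    public static List<MobileForm> SearchByFormName(IDocumentSession session, string keyword)
    {
        return session.Query<MobileForm, ByFormNameIndex>()
                      .Search(x => x.FormName, keyword)
                      .ToList();
    }
}
```

BelongTo and RequestType are indexed as NotAnalyzed, so they are suited to exact-match filters rather than to full-text search.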

2. Create the statistics index

public class StatisticResult
{
    public string UserName { get; set; }
    public int Count { get; set; }
}
public class GroupUserIndex : AbstractIndexCreationTask<MobileForm, StatisticResult>
{
    public GroupUserIndex()
    {

        Map = mobileForms => from form in mobileForms
                             select new StatisticResult
                             {
                                 UserName = form.BelongTo,
                                 Count = 1
                             };
        Reduce = results => from result in results
                            group result by result.UserName
                                into g
                                select new StatisticResult
                                {
                                    UserName = g.Key,
                                    Count = g.Sum(x => x.Count)
                                };
    }
}
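
To make the map and reduce steps concrete, here is a small sketch that runs the same two LINQ expressions in memory, without RavenDB. GroupUserSketch and CountByOwner are hypothetical names used only for illustration, and the input is reduced to a plain list of owner names standing in for the BelongTo values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory illustration, not part of the original post.
public static class GroupUserSketch
{
    public static Dictionary<string, int> CountByOwner(IEnumerable<string> owners)
    {
        // Map: emit one (UserName, Count = 1) row per form.
        var mapped = from owner in owners
                     select new { UserName = owner, Count = 1 };

        // Reduce: group the rows by UserName and sum their counts.
        var reduced = from row in mapped
                      group row by row.UserName into g
                      select new { UserName = g.Key, Count = g.Sum(x => x.Count) };

        return reduced.ToDictionary(r => r.UserName, r => r.Count);
    }

    public static void Main()
    {
        var counts = CountByOwner(new[] { "alice", "bob", "alice" });
        foreach (var pair in counts)
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

With this input, "alice" maps to 2 and "bob" maps to 1. Against the server, the grouped rows would presumably be read back with a query such as session.Query<StatisticResult, GroupUserIndex>(), which returns one StatisticResult per user.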

3. Configure in Global.asax

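// Initialize the project's document store wrapper and the PanGu segmenter.
DataDocumentStore.Initialize();
PanGu.Segment.Init();
// Register the indexes on the server.
IndexCreation.CreateIndexes(typeof(ByFormNameIndex).Assembly, DataDocumentStore.Instance);
// CreateIndexes registers every index class in the given assembly, so this second
// call only matters if GroupUserIndex lives in a different assembly.
IndexCreation.CreateIndexes(typeof(GroupUserIndex).Assembly, DataDocumentStore.Instance);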

Posted 2013-02-23 20:49 by sdhjl2000
