Building a Simple ORM on Top of Redis

Redis references

http://www.tuicool.com/articles/bURJRj [Use cases for the various Redis data types]

https://github.com/antirez/redis [GitHub Redis]

https://github.com/StackExchange/StackExchange.Redis [GitHub StackExchange.Redis]

Goals

Provide a strongly typed access layer on top of Redis
Paging support
Primary key support

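Analysis of the candidate approaches [data types]

To meet the goals above, I considered the following data types:

[String-based]

Store the whole collection of objects in a single string value. This approach has the following problems:

Every update touches the entire collection object
Serialization/deserialization becomes a performance bottleneck
No real paging (only in-memory paging, so the application server has to load all the data every time)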
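
The cost of the string-based approach can be sketched as follows. This is only an illustration, not the author's code: `SampleCustomer` and `StringStoreSketch` are hypothetical names, an in-memory dictionary stands in for Redis GET/SET so the sketch runs without a server, and `System.Text.Json` is an assumed choice of serializer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical, trimmed-down model used only for this illustration.
public sealed class SampleCustomer
{
    public string Code { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class StringStoreSketch
{
    // In-memory stand-in for Redis string storage (GET/SET).
    private static readonly Dictionary<string, string> Store = new Dictionary<string, string>();

    // The whole collection is serialized into one string value.
    public static void SaveAll(string key, List<SampleCustomer> items)
    {
        Store[key] = JsonSerializer.Serialize(items);
    }

    // Every read deserializes the whole collection.
    public static List<SampleCustomer> LoadAll(string key)
    {
        return Store.TryGetValue(key, out var json)
            ? JsonSerializer.Deserialize<List<SampleCustomer>>(json) ?? new List<SampleCustomer>()
            : new List<SampleCustomer>();
    }

    // Changing a single item still reads and rewrites everything.
    public static void Rename(string key, string code, string newName)
    {
        var all = LoadAll(key);
        foreach (var item in all)
        {
            if (item.Code == code) item.Name = newName;
        }
        SaveAll(key, all);
    }

    // Paging can only happen in memory, after loading all items.
    public static List<SampleCustomer> Page(string key, int pageSize, int pageNumber)
    {
        return LoadAll(key).Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList();
    }
}
```

Both `Rename` and `Page` go through `LoadAll`, so their cost grows with the size of the whole collection rather than with the amount of data actually needed.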

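[Collection-based]

Store the collection of objects in a Redis collection type (LIST/SET). Compared with the string approach, this brings the following improvements:

An update no longer affects the entire collection
Serialization/deserialization is no longer a performance bottleneck
Paging is supported without loading all the data

However, the following problems remain:

No primary key support (an item cannot be fetched by its key)
The update granularity is a whole data "row"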
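
Server-side paging over a LIST comes down to translating a page into an index range. Redis `LRANGE` (and `ZRANGE` for sorted sets) takes zero-based, inclusive start and stop indexes, so a minimal sketch of the arithmetic could look like this. `PageRange` and its members are hypothetical names, and page numbers are assumed to be 1-based.

```csharp
using System;

public static class PageRange
{
    // Zero-based, inclusive (start, stop) indexes for a 1-based page number,
    // in the form expected by LRANGE / ZRANGE.
    public static (long Start, long Stop) ForPage(int pageSize, int pageNumber)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (pageNumber <= 0) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        long start = (long)(pageNumber - 1) * pageSize;
        return (start, start + pageSize - 1);
    }

    // Number of pages needed to hold `total` items.
    public static long PageCount(long total, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return total <= 0 ? 0 : (total - 1) / pageSize + 1;
    }
}
```

With StackExchange.Redis, the resulting pair would be passed to `IDatabase.ListRange(key, start, stop)`, so only the requested page travels over the network.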

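[Hash-based]

Store each FIELD of an object in a Redis Hash, and access the object through a corresponding KEY. This solves the following problems:

Data can be accessed by key
Data can be updated field by field

However, a Hash on its own provides no collection support.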

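[Hybrid approach]

Use a SortedSet to record all the KEYs of a data collection, and store each element's data in the Hash that its KEY points to. This approach meets all of the requirements above and is the one currently adopted. It still has the following problem:

Reading each object costs one network round trip (one Hash access)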
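
The shape of the hybrid approach can be sketched with in-memory stand-ins: a sorted index of keys plus one field dictionary per key. This is not the author's implementation. `HybridStoreSketch` and its members are hypothetical, and the score is assumed to be a simple insertion counter so that the default order is insertion order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class HybridStoreSketch
{
    // Stand-in for the SortedSet index: (score, member), ordered by score.
    private readonly SortedSet<(double Score, string Member)> _index =
        new SortedSet<(double Score, string Member)>();

    // Stand-in for one Hash per object: key -> (field -> value).
    private readonly Dictionary<string, Dictionary<string, string>> _hashes =
        new Dictionary<string, Dictionary<string, string>>();

    private double _nextScore;

    // Insert a new object or update only the given fields of an existing one.
    public void Upsert(string key, IDictionary<string, string> fields)
    {
        if (!_hashes.TryGetValue(key, out var hash))
        {
            hash = new Dictionary<string, string>();
            _hashes[key] = hash;
            _index.Add((_nextScore++, key)); // new keys are appended to the index
        }
        foreach (var field in fields)
        {
            hash[field.Key] = field.Value;   // untouched fields keep their values
        }
    }

    // Primary key access; throws KeyNotFoundException for an unknown key.
    public IReadOnlyDictionary<string, string> Get(string key) => _hashes[key];

    // Paging: take a slice of keys from the index, then read one hash per key.
    // Against Redis, each of these reads would be a separate Hash access.
    public List<IReadOnlyDictionary<string, string>> Page(int pageSize, int pageNumber)
    {
        return _index
            .Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .Select(entry => (IReadOnlyDictionary<string, string>)_hashes[entry.Member])
            .ToList();
    }
}
```

In this stand-in, `Skip` walks the index linearly. In Redis, the index slice would come from a rank-based `ZRANGE`, whose documented complexity is O(log(N)+M). The per-object read in `Page` is where the round-trip cost listed above shows up.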

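Key design

To keep the key-value pairs stored in Redis logically unique, this implementation uses fairly long KEYs. A KEY consists of the following parts:

WellKnownReidsKeys: a functional partition that states what the value behind the key is used for
TypeSpecifiedKey: reflects the type information of the value once it has been "structured"
CustomizedKey: a user-defined key that makes the scheme easy to extend

In Redis, a KEY should have the form [WellKnownReidsKeys][TypeSpecifiedKey][CustomizedKey]. The CustomizedKey part allows a collection of same-typed data to be split into separate partitions that are managed independently.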
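
A minimal sketch of composing such a key is shown below. The source does not say how the three parts are joined, so the `:` separator, the use of the type's full name for the TypeSpecifiedKey part, and the names `RedisKeySketch` and `Compose` are all assumptions made for illustration.

```csharp
using System;

public static class RedisKeySketch
{
    // Assumed separator; the original key layout is not specified.
    public const char Separator = ':';

    // Joins the three parts described above into one Redis key.
    public static string Compose(string wellKnownKey, Type modelType, string customizedKey)
    {
        if (string.IsNullOrEmpty(wellKnownKey)) throw new ArgumentException("required", nameof(wellKnownKey));
        if (modelType == null) throw new ArgumentNullException(nameof(modelType));

        // Assumption: the type part is the model's full type name.
        var typeKey = modelType.FullName ?? modelType.Name;

        // The customized part is optional; when present it selects a partition.
        return string.IsNullOrEmpty(customizedKey)
            ? wellKnownKey + Separator + typeKey
            : wellKnownKey + Separator + typeKey + Separator + customizedKey;
    }
}
```

Under these assumptions, `RedisKeySketch.Compose("Index", typeof(string), "PerformanceTest")` yields `Index:System.String:PerformanceTest`, and two different CustomizedKey values for the same type yield two independent partitions.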

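Some performance issues

[Converting strongly typed objects to dictionaries]

To reduce reflection overhead, an expression tree is built at runtime and compiled into a delegate. The code is as follows: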

    public Func<T, IDictionary<string, string>> Compile(string key)
    {
        var outType = typeof (Dictionary<string, string>);
        var func = ConcurrentDic.GetOrAdd(key, k =>
        {
            var tType = typeof (T);
            var properties = tType.GetProperties();
            var expressions = new List<Expression>();
            // Generated lambda: IDictionary<string, string> xxx(T param) {
            var param = Expression.Parameter(typeof (T));

            // var instance = new Dictionary<string, string>();
            var newExp = Expression.New(outType);
            var varExp = Expression.Variable(outType, "instance");
            var varAssExp = Expression.Assign(varExp, newExp);
            expressions.Add(varAssExp);

            var indexProp = typeof (IDictionary<string, string>).GetProperties().Last(p => p.Name == "Item");

            var strConvertMethod = typeof (object).GetMethod("ToString");
            foreach (var property in properties)
            {
                var propExp = Expression.PropertyOrField(param, property.Name);
                Expression indexAccessExp = Expression.MakeIndex(varExp, indexProp,
                    new Expression[] {Expression.Constant(property.Name)});
                var strConvertExp = Expression.Condition(Expression.Equal(Expression.Constant(null), Expression.Convert(propExp,typeof(object))),
                    Expression.Constant(string.Empty), Expression.Call(propExp, strConvertMethod));
                var valueAssignExp = Expression.Assign(indexAccessExp, strConvertExp);
                expressions.Add(valueAssignExp);
            }

            //return instance; 
            var retarget = Expression.Label(outType);
            var returnExp = Expression.Return(retarget, varExp);
            expressions.Add(returnExp);
            //}
            var relabel = Expression.Label(retarget, Expression.Default(outType));
            expressions.Add(relabel);

            var blockExp = Expression.Block(new[] {varExp}, expressions);
            var expression = Expression.Lambda<Func<T, IDictionary<string, string>>>(blockExp, param);
            return expression.Compile();
        });
        return func;
    }

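For a single conversion, the compiled delegate is cached by type information and dictionary KEY, which improves performance. For collection conversions, the same delegate is reused for every element of the collection, which avoids repeated dictionary lookups. Below is a performance comparison that uses hand-written code as the baseline: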

    public void ModelStringDicTransfer()
    {
        var customer = new ExpressionFuncTest.Customer
        {
            Id = Guid.NewGuid(),
            Name = "TestMap",
            Age = 25,
            Nick = "Test",
            Sex = 1,
            Address = "Hello World Street",
            Tel = "15968131264"
        };
        const int RunCount = 10000000;
        GetDicByExpression(customer);

        var time = StopwatchHelper.Timing(() =>
        {
            int count = RunCount;
            while (count-- > 0)
            {
                GetDicByExpression(customer);
            }
        });
        var baseTime = StopwatchHelper.Timing(() =>
        {
            int count = RunCount;
            while (count-- > 0)
            {
                GetDicByHardCode(customer);
            }
        });

        Console.WriteLine("time:{0}\tbasetime:{1}", time, baseTime);
        Assert.IsTrue(baseTime * 3 >= time);
    }

    private Func<ExpressionFuncTest.Customer, IDictionary<string, string>> _dicMapper;
    private IDictionary<string, string> GetDicByExpression(ExpressionFuncTest.Customer customer)
    {
        _dicMapper = _dicMapper ?? ModelStringDicTransfer<ExpressionFuncTest.Customer>.Instance.Compile(
                         typeof(ExpressionFuncTest.Customer).FullName);
        return _dicMapper(customer);
    }

    private Dictionary<string, string> GetDicByHardCode(ExpressionFuncTest.Customer customer)
    {
        var dic = new Dictionary<string, string>();
        dic.Add("Name", customer.Name);
        dic.Add("Address", customer.Address);
        dic.Add("Nick", customer.Nick);
        dic.Add("Tel", customer.Tel);
        dic.Add("Id", customer.Id.ToString());
        dic.Add("Age", customer.Age.ToString());
        dic.Add("Sex", customer.Sex.ToString());
        return dic;
    }

For 10M conversions, the hard-coded version takes about 6 s and the dynamic version takes about 10 s.
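
The source only shows the object-to-dictionary direction. For the opposite direction (turning the fields read from a Hash back into a typed object), a plain reflection-based sketch could look like the following. It is an assumption about how this might be done, not the author's code: `DictionaryToModelSketch`, `ToModel` and `SampleModel` are hypothetical names, and a production version would presumably be compiled and cached in the same way as the mapper above.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical model used only for this illustration.
public sealed class SampleModel
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class DictionaryToModelSketch
{
    // Builds a T from field/value strings, matching fields to properties by name.
    public static T ToModel<T>(IDictionary<string, string> fields) where T : class, new()
    {
        var model = new T();
        foreach (var property in typeof(T).GetProperties())
        {
            // Skip read-only properties and indexers.
            if (!property.CanWrite || property.GetIndexParameters().Length > 0) continue;
            if (!fields.TryGetValue(property.Name, out var text)) continue;

            // The forward mapping stores null as string.Empty,
            // so empty values are left at the property's default.
            if (string.IsNullOrEmpty(text)) continue;

            // TypeConverter handles int, Guid, string and similar simple types.
            // Note: this parses with the invariant culture, while ToString() in the
            // forward mapping uses the current culture for culture-sensitive types.
            var converter = TypeDescriptor.GetConverter(property.PropertyType);
            property.SetValue(model, converter.ConvertFromInvariantString(text));
        }
        return model;
    }
}
```

Fields that have no matching property are ignored, and properties with no matching field keep their default values.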

[Overall performance test]

Below is a test against the finished implementation:

    public void PerformanceTest()
    {
        var amount = 1000000;
        var key = "PerformanceTest";
        Fill(amount, key);
        PageGetFirst(1, key);

        int i = 1;
        while (i <= 100000)
        {
            var count = i;
            var fTime = StopwatchHelper.Timing(() => PageGetFirst(count, key));
            var lTime = StopwatchHelper.Timing(() => PageGetLast(count, key));
            Console.WriteLine("{0}: first page: {1}\tlast page: {2}", count, fTime, lTime);
            i = i*10;
        }
    }

    private void Fill(int count,string partKey)
    {
        var codes = Enumerable.Range(1000, count).Select(i => i.ToString());
        codes.Foreach(i =>
        {
            var customer = new Customer
            {
                Id = i == "1000" ? Guid.Empty : Guid.NewGuid(),
                Name = "Customer" + i,
                Code = i,
                Address = string.Format("No. {0}, XX Street", DateTime.Now.Millisecond),
                Tel = "15968131264"
            };
            _pagableHashStore.UpdateOrInsertAsync(customer, customer.Code + "", partKey).Wait();
        });
    }

    private void PageGetFirst(int count,string partKey)
    {
        var pageInfo = new PageInfo(count, 1);
        _pagableHashStore.PageAsync(pageInfo, partKey).Result
            .Foreach(i => i.Wait());
    }

    private void PageGetLast(int count, string partKey)
    {
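        // The total is hard-coded as 100000 here, while PerformanceTest fills 1000000
        // records, so this is the last page of the first 100000 records.
        var pageInfo = new PageInfo(count, (100000 - 1)/count + 1);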
        _pagableHashStore.PageAsync(pageInfo, partKey).Result
            .Foreach(i => i.Wait());
    }

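Paging results over 10M records (default ordering by insertion time, varying page sizes; times in milliseconds):

Page size 1: first page 1, last page 1
Page size 10: first page 0, last page 0
Page size 100: first page 2, last page 5
Page size 1000: first page 33, last page 35
Page size 10000: first page 214, last page 316
Page size 100000: first page 3251, last page 3163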

Takeaways

Opened my mind to new ideas
Started writing unit tests
Started keeping unit tests up to date

All of the source code:

http://pan.baidu.com/s/1c2LQjSG

posted @ 2016-03-28 17:26 LibraJM
