A Beginner's Tour of IL (1): Creating Objects - Zhenway

1. Ways to create an object

As the first post in this introductory series, let's start with the simplest thing: creating objects.

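First, recall how an object of some type is created in C#. The most basic way is the new keyword, for example new object(). Its drawback is that the type (object here) must be known at compile time.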

What if the type is unknown at compile time but known at run time?

The first thing that comes to mind is the Activator.CreateInstance method, for example:

static void Main(string[] args)
{
    Type type = typeof(object);
    Console.WriteLine(Create(type));
}

static object Create(Type type)
{
    return Activator.CreateInstance(type);
}

Or use the generic overload:

static void Main(string[] args)
{
    Console.WriteLine(Create<object>());
}

static T Create<T>()
{
    return Activator.CreateInstance<T>();
}

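Of course, type or T here must have a default (public parameterless) constructor; otherwise the call fails at run time with a MissingMethodException.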
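To make the run-time failure concrete, here is a minimal sketch. The NoDefaultCtor class and the FailsAtRuntime helper are hypothetical names introduced only for this illustration; they are not part of the original post.

```csharp
using System;

public static class ActivatorFailureDemo
{
    // Hypothetical type for illustration: its only constructor takes an argument.
    public sealed class NoDefaultCtor
    {
        public NoDefaultCtor(int value) { Value = value; }
        public int Value { get; private set; }
    }

    // Returns true when Activator.CreateInstance rejects the type at run time.
    public static bool FailsAtRuntime(Type type)
    {
        try
        {
            Activator.CreateInstance(type);
            return false;
        }
        catch (MissingMethodException)
        {
            // Thrown when no public parameterless constructor can be found.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FailsAtRuntime(typeof(object)));        // False
        Console.WriteLine(FailsAtRuntime(typeof(NoDefaultCtor))); // True
    }
}
```

Note that the code compiles without complaint; the problem only shows up when the call executes.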

The second way is the generic constraint new(), for example:

static void Main(string[] args)
{
    Console.WriteLine(Create<object>());
}

static T Create<T>() where T : new()
{
    return new T();
}

When T has no default constructor, this is reported as a compile-time error.

And there is a third way, reflection, for example:

static void Main(string[] args)
{
    Console.WriteLine(Create(typeof(object)));
}

static object Create(Type type)
{
    var ctor = type.GetConstructor(Type.EmptyTypes);
    return ctor.Invoke(new object[0]);
}

2. Performance comparison

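These three approaches perform differently. The third, reflection, is the slowest. With the generic new() constraint, when T is a reference type the compiler turns new T() into a call to Activator.CreateInstance<T>, while for a value type T it is extremely efficient. How the two Activator.CreateInstance overloads compare depends on the number of calls and on the cache hit rate of the non-generic version. With enough calls and a high hit rate, the non-generic version is the more efficient one: a cache miss is expensive for it, but a cache hit is comparatively cheap. The test code: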

Type t = typeof(object);
var sw = new Stopwatch();
int count = 1; // 10, 100, 1000
sw.Start();
for (int i = 0; i < count; i++)
    Activator.CreateInstance(t);
Console.WriteLine("Method:CreateInstance, Count:{0}, Tick:{1}",
    count.ToString(), sw.ElapsedTicks.ToString());
sw.Reset();
sw.Start();
for (int i = 0; i < count; i++)
    Activator.CreateInstance<object>();
Console.WriteLine("Method:CreateInstance<T>, Count:{0}, Tick:{1}",
    count.ToString(), sw.ElapsedTicks.ToString());

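Because the non-generic version caches, the program has to be restarted for each count so that earlier runs do not skew the numbers. The results: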

Method:CreateInstance, Count:1, Tick:1399
Method:CreateInstance<T>, Count:1, Tick:42

Method:CreateInstance, Count:10, Tick:3009
Method:CreateInstance<T>, Count:10, Tick:112

Method:CreateInstance, Count:100, Tick:3343
Method:CreateInstance<T>, Count:100, Tick:820

Method:CreateInstance, Count:1000, Tick:4092
Method:CreateInstance<T>, Count:1000, Tick:7989

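It is easy to see that the cost of the generic version is quite stable: apart from the first call, each call costs roughly 8 to 10 ticks. The cost of the non-generic version varies a lot. Of the cost of 100 calls, the first 10 calls account for 90% (in fact it is the first 2). After that the cost drops sharply, but it is still high compared with a direct new object().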
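One way to see the "first call versus the rest" split directly is to time the first call separately. The following is only a sketch under stated assumptions: FirstCallTiming and Measure are hypothetical names, Stopwatch.Restart requires .NET 4.0 or later, and wrapping each call in a delegate adds a small overhead that the original test loop does not have, so absolute numbers will differ from those above.

```csharp
using System;
using System.Diagnostics;

public static class FirstCallTiming
{
    // Returns { ticks of the first call, total ticks of the remaining count - 1 calls }.
    public static long[] Measure(Action create, int count)
    {
        var sw = Stopwatch.StartNew();
        create();                       // first call: pays any one-time setup cost
        long first = sw.ElapsedTicks;

        sw.Restart();
        for (int i = 1; i < count; i++) // remaining calls: steady-state cost
            create();
        long rest = sw.ElapsedTicks;

        return new[] { first, rest };
    }

    public static void Main()
    {
        Type t = typeof(object);
        long[] nonGeneric = Measure(() => Activator.CreateInstance(t), 100);
        long[] generic = Measure(() => Activator.CreateInstance<object>(), 100);

        Console.WriteLine("CreateInstance:    first={0}, next 99={1}", nonGeneric[0], nonGeneric[1]);
        Console.WriteLine("CreateInstance<T>: first={0}, next 99={1}", generic[0], generic[1]);
    }
}
```

As with the original test, the process should be restarted between runs if the first-call figure is what you want to observe.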

3. A first look at IL

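Back to the topic, IL. Could Emit improve performance further? First we need to write an object-creating method with Emit. But what if you don't know how to write one?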

Time to bring in the tools. First the C# compiler: write a method such as:

static object MyCreateInstance()
{
    return new object();
}

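Then compile it (all builds in this post use Release mode) to get a dll or exe file. Next comes Reflector: open the file, find the class and the method, and set the decompilation language to IL. This gives the following: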

.method private hidebysig static object MyCreateInstance() cil managed
{
    .maxstack 8
    L_0000: newobj instance void [mscorlib]System.Object::.ctor()
    L_0005: ret 
}

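Still not sure where to start? Reflector has downloadable add-ins, for example ReflectionEmitLanguage. Once the add-in is loaded, Reflector offers one more language choice, Reflection.Emit. Select it and you will see: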

public MethodBuilder BuildMethodMyCreateInstance(TypeBuilder type)
{
    // Declaring method builder
    // Method attributes
    System.Reflection.MethodAttributes methodAttributes =
          System.Reflection.MethodAttributes.Private
        | System.Reflection.MethodAttributes.HideBySig
        | System.Reflection.MethodAttributes.Static;
    MethodBuilder method = type.DefineMethod("MyCreateInstance", methodAttributes);
    // Preparing Reflection instances
    ConstructorInfo ctor1 = typeof(Object).GetConstructor(
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
        null,
        new Type[]{
    },
        null
        );
    // Setting return type
    method.SetReturnType(typeof(Object));
    // Adding parameters
    ILGenerator gen = method.GetILGenerator();
    // Writing body
    gen.Emit(OpCodes.Newobj, ctor1);
    gen.Emit(OpCodes.Ret);
    // finished
    return method;
}

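However, this code builds the method on a TypeBuilder, that is, as part of generating a whole type. All we need is a single method, so let's adapt it a little: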

static Func<object> BuildMethodMyCreateInstance()
{
    DynamicMethod dm = new DynamicMethod(string.Empty, typeof(object), Type.EmptyTypes);
    var gen = dm.GetILGenerator();
    gen.Emit(OpCodes.Newobj, typeof(object).GetConstructor(Type.EmptyTypes));
    gen.Emit(OpCodes.Ret);
    return (Func<object>)dm.CreateDelegate(typeof(Func<object>));
}

This wraps the body of the MyCreateInstance method above in a Func<object>. Let's check that it works:

Func<object> func = BuildMethodMyCreateInstance();
Console.WriteLine(func());

The output:

System.Object

It works nicely. But we don't want a method that can only create object; we need one that accepts a type and creates an instance of that type. So a further improvement is needed:

static Func<object> BuildMethodMyCreateInstance(Type type)
{
    DynamicMethod dm = new DynamicMethod(string.Empty, typeof(object), Type.EmptyTypes);
    var gen = dm.GetILGenerator();
    gen.Emit(OpCodes.Newobj, type.GetConstructor(Type.EmptyTypes));
    gen.Emit(OpCodes.Ret);
    return (Func<object>)dm.CreateDelegate(typeof(Func<object>));
}

One parameter added and one small change made. Let's test it:

Func<object> func = BuildMethodMyCreateInstance(typeof(object));
Console.WriteLine(func());

Output:

System.Object

Good. Next, let's test it further.

4. Performance comparison (2)

Modify the test to include Emit and direct early binding:

Type t = typeof(object);
var sw = new Stopwatch();
int count = 1; // 10, 100, 1000
sw.Reset();
sw.Start();
for (int i = 0; i < count; i++)
    Activator.CreateInstance(t);
Console.WriteLine("Method:CreateInstance, Count:{0}, Tick:{1}",
    count.ToString(), sw.ElapsedTicks.ToString());
sw.Reset();
sw.Start();
for (int i = 0; i < count; i++)
    Activator.CreateInstance<object>();
Console.WriteLine("Method:CreateInstance<T>, Count:{0}, Tick:{1}",
    count.ToString(), sw.ElapsedTicks.ToString());
sw.Reset();
sw.Start();
Func<object> func = BuildMethodMyCreateInstance(t);
for (int i = 0; i < count; i++)
    func();
Console.WriteLine("Method:MyCreateInstance, Count:{0}, Tick:{1}",
    count.ToString(), sw.ElapsedTicks.ToString());
sw.Reset();
sw.Start();
for (int i = 0; i < count; i++)
    new object();
Console.WriteLine("Method:new object(), Count:{0}, Tick:{1}",
    count.ToString(), sw.ElapsedTicks.ToString());

The results:

Method:CreateInstance, Count:1, Tick:1449
Method:CreateInstance<T>, Count:1, Tick:56
Method:MyCreateInstance, Count:1, Tick:11288
Method:new object(), Count:1, Tick:10

Wow. For a single call, you are better off sticking with CreateInstance<T> (or the generic new() constraint).

Method:CreateInstance, Count:10, Tick:3046
Method:CreateInstance<T>, Count:10, Tick:113
Method:MyCreateInstance, Count:10, Tick:10802
Method:new object(), Count:10, Tick:32

With 10 calls the Emit result is about the same (the drop in the Emit time is measurement noise).

Method:CreateInstance, Count:100, Tick:3211
Method:CreateInstance<T>, Count:100, Tick:811
Method:MyCreateInstance, Count:100, Tick:9442
Method:new object(), Count:100, Tick:13

With 100 calls the Emit result is still stable, with essentially no growth.

Method:CreateInstance, Count:1000, Tick:3964
Method:CreateInstance<T>, Count:1000, Tick:8031
Method:MyCreateInstance, Count:1000, Tick:10982
Method:new object(), Count:1000, Tick:90

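With 1000 calls the Emit result remains stable (the noise is larger than the actual cost of the calls), but it still cannot beat CreateInstance. So let's go to extra rounds.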

Method:CreateInstance, Count:10000, Tick:13396
Method:CreateInstance<T>, Count:10000, Tick:81018
Method:MyCreateInstance, Count:10000, Tick:10706
Method:new object(), Count:10000, Tick:564

At 10000 calls Emit finally overtakes CreateInstance.

Method:CreateInstance, Count:100000, Tick:102373
Method:CreateInstance<T>, Count:100000, Tick:768419
Method:MyCreateInstance, Count:100000, Tick:15689
Method:new object(), Count:100000, Tick:4778

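At 100000 calls the advantage of Emit is fully visible: it is more than 6 times as fast as CreateInstance and nearly 50 times as fast as CreateInstance<T>. Moreover, once the one-time cost of emitting the method is subtracted, it comes very close to a direct new object(): 15689 minus roughly 10000 ticks of one-time cost leaves a bit over 5000 ticks, close to the 4778 ticks that 100000 direct new object() calls took.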
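The arithmetic behind those figures can be checked with a few lines. This is just a sketch using the tick counts reported above for Count:100000; the EmitRatios class and the Ratio helper are hypothetical names, and the 10000-tick setup cost is the approximate value quoted in the text, not a measured constant.

```csharp
using System;
using System.Globalization;

public static class EmitRatios
{
    // How many times slower the first measurement is than the second.
    public static double Ratio(long slower, long faster)
    {
        return (double)slower / faster;
    }

    public static void Main()
    {
        const long createInstance = 102373;        // Activator.CreateInstance(Type)
        const long createInstanceGeneric = 768419; // Activator.CreateInstance<T>()
        const long emit = 15689;                   // emitted delegate, including setup
        const long direct = 4778;                  // new object()
        const long emitSetup = 10000;              // approximate one-time cost quoted in the text

        var inv = CultureInfo.InvariantCulture;
        Console.WriteLine(Ratio(createInstance, emit).ToString("F1", inv));        // 6.5
        Console.WriteLine(Ratio(createInstanceGeneric, emit).ToString("F1", inv)); // 49.0
        Console.WriteLine("{0} vs {1}", emit - emitSetup, direct);                 // 5689 vs 4778
    }
}
```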

So if you need to create a large number of objects of some type, prefer Emit or early binding; if you only create a few, prefer CreateInstance<T> or early binding.
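Since the emitted delegate only pays off when its one-time cost is spread over many calls, one option is to build it once per type and reuse it. The following is a sketch of that idea, not something the original post does: CreatorCache, GetCreator and Create are hypothetical names, ConcurrentDictionary requires .NET 4.0 or later, and this version only handles types with a public parameterless constructor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;
using System.Reflection.Emit;
using System.Text;

public static class CreatorCache
{
    // One emitted delegate per type, built on first request.
    private static readonly ConcurrentDictionary<Type, Func<object>> Cache =
        new ConcurrentDictionary<Type, Func<object>>();

    // Emits the same IL as the post's BuildMethodMyCreateInstance(Type): newobj + ret.
    private static Func<object> Build(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes);
        if (ctor == null)
            throw new ArgumentException("No public parameterless constructor: " + type, "type");

        var dm = new DynamicMethod(string.Empty, typeof(object), Type.EmptyTypes);
        ILGenerator gen = dm.GetILGenerator();
        gen.Emit(OpCodes.Newobj, ctor);
        gen.Emit(OpCodes.Ret);
        return (Func<object>)dm.CreateDelegate(typeof(Func<object>));
    }

    // Returns the cached delegate, emitting it only the first time a type is seen.
    public static Func<object> GetCreator(Type type)
    {
        return Cache.GetOrAdd(type, Build);
    }

    public static object Create(Type type)
    {
        return GetCreator(type)();
    }

    public static void Main()
    {
        Func<object> a = GetCreator(typeof(StringBuilder));
        Func<object> b = GetCreator(typeof(StringBuilder));
        Console.WriteLine(ReferenceEquals(a, b));                   // True
        Console.WriteLine(Create(typeof(StringBuilder)).GetType()); // System.Text.StringBuilder
    }
}
```

With a cache like this, only the first request for a given type pays the emit cost; later requests pay a dictionary lookup plus the delegate call.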

5. Going further

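But is it really good enough? Remember the preface saying that IL needs special handling for value types? Look at what this test case produces when a value type is passed in: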

Unhandled Exception: System.ArgumentNullException: Value cannot be null.
Parameter name: con
   at System.Reflection.Emit.DynamicILGenerator.Emit(OpCode opcode, ConstructorInfo con)

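It failed because GetConstructor returned null instead of a ConstructorInfo instance. Why? Because value types usually have no default constructor (when this was written, almost no high-level language allowed writing a default constructor for a value type, although IL does not forbid it).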
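The missing constructor is easy to observe through reflection. A minimal sketch, with ValueTypeCtorDemo and HasDefaultCtor as hypothetical names used only here:

```csharp
using System;

public static class ValueTypeCtorDemo
{
    // True when reflection can find a public parameterless constructor.
    public static bool HasDefaultCtor(Type type)
    {
        return type.GetConstructor(Type.EmptyTypes) != null;
    }

    public static void Main()
    {
        Console.WriteLine(HasDefaultCtor(typeof(object))); // True
        Console.WriteLine(HasDefaultCtor(typeof(int)));    // False

        // Activator.CreateInstance still works for int: it returns a boxed zero.
        Console.WriteLine(Activator.CreateInstance(typeof(int))); // 0
    }
}
```

So Activator.CreateInstance can create an int even though reflection finds no constructor to invoke, which is why the Emit version needs a different instruction for value types.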

Then what about the everyday:

static void X()
{
    int x = new int();
    Console.WriteLine(x);
}

How does that work? Let's look at it in Reflector again:

.method private hidebysig static void X() cil managed
{
    .maxstack 1
    .locals init (
        [0] int32 x)
    L_0000: ldc.i4.0 
    L_0001: stloc.0 
    L_0002: ldloc.0 
    L_0003: call void [mscorlib]System.Console::WriteLine(int32)
    L_0008: ret 
}

The C# compiler directly translated

int x = new int();

into

int x = 0;

What about other value types? For example:

static void X()
{
    int? x = null;
    Console.WriteLine(x);
}

The C# compiler translates this to:

.method private hidebysig static void X() cil managed
{
    .maxstack 1
    .locals init (
        [0] valuetype [mscorlib]System.Nullable`1<int32> x)
    L_0000: ldloca.s x
    L_0002: initobj [mscorlib]System.Nullable`1<int32>
    L_0008: ldloc.0 
    L_0009: box [mscorlib]System.Nullable`1<int32>
    L_000e: call void [mscorlib]System.Console::WriteLine(object)
    L_0013: ret 
}

As you can see, to handle int? x = null; the C# compiler used:

ldloca.s x
initobj [mscorlib]System.Nullable`1<int32>

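These are two instructions. ldloca.s is the short form of ldloca, which can be read as Load Local variable Address, that is, it loads the address of the variable x. Then comes the initobj instruction, which is why high-level .NET languages have no need to define a default constructor for value types: initobj is the instruction the CLR provides specifically for initializing a value type.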

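So the earlier method can be improved to support value types as well. The same approach as before (compile a sample and inspect it in Reflector) helps here too, and gives the improved version below: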

static Func<object> BuildMethodMyCreateInstance(Type type)
{
    DynamicMethod dm = new DynamicMethod(string.Empty, typeof(object), Type.EmptyTypes);
    var gen = dm.GetILGenerator();
    if (type.IsValueType)
    {
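        // Pass the LocalBuilder rather than the int literal 0, so the one-byte
        // operand of ldloca.s is encoded correctly.
        LocalBuilder local = gen.DeclareLocal(type);
        gen.Emit(OpCodes.Ldloca_S, local);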
        gen.Emit(OpCodes.Initobj, type);
        gen.Emit(OpCodes.Ldloc_0);
        gen.Emit(OpCodes.Box, type);
    }
    else
    {
        gen.Emit(OpCodes.Newobj, type.GetConstructor(Type.EmptyTypes));
    }
    gen.Emit(OpCodes.Ret);
    return (Func<object>)dm.CreateDelegate(typeof(Func<object>));
}

Now this method can handle both value types and reference types.
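A quick way to try this out is to run the method against a few types. The sketch below repeats the method under the hypothetical name BuildCreator (inside a hypothetical CreateAnyDemo class) so that it is self-contained; it passes the LocalBuilder to Emit for ldloca.s, which is an adjustment made here rather than something dictated by the decompiled output.

```csharp
using System;
using System.Reflection.Emit;
using System.Text;

public static class CreateAnyDemo
{
    public static Func<object> BuildCreator(Type type)
    {
        var dm = new DynamicMethod(string.Empty, typeof(object), Type.EmptyTypes);
        ILGenerator gen = dm.GetILGenerator();
        if (type.IsValueType)
        {
            // Value type: zero-initialize a local with initobj, then box it.
            LocalBuilder local = gen.DeclareLocal(type);
            gen.Emit(OpCodes.Ldloca_S, local);
            gen.Emit(OpCodes.Initobj, type);
            gen.Emit(OpCodes.Ldloc_0);
            gen.Emit(OpCodes.Box, type);
        }
        else
        {
            // Reference type: call the public parameterless constructor.
            gen.Emit(OpCodes.Newobj, type.GetConstructor(Type.EmptyTypes));
        }
        gen.Emit(OpCodes.Ret);
        return (Func<object>)dm.CreateDelegate(typeof(Func<object>));
    }

    public static void Main()
    {
        Console.WriteLine(BuildCreator(typeof(int))());                     // 0
        Console.WriteLine(BuildCreator(typeof(Guid))());                    // 00000000-0000-0000-0000-000000000000
        Console.WriteLine(BuildCreator(typeof(StringBuilder))().GetType()); // System.Text.StringBuilder
        Console.WriteLine(BuildCreator(typeof(int?))() == null);            // True
    }
}
```

The last line reflects how boxing treats Nullable<T>: boxing an int? that has no value produces a null reference rather than a boxed struct.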

posted on 2010-02-27 14:36 Zhenway