AOT Talk Series (Part 4): How a C# Program Is Compiled to Native Code

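I: Background

1. The story

Most people know that .NET Native AOT means an AOT compiler turns C# code directly into machine code, and it is common to compare this with the C/C++ build process, since both are static compilation and essentially similar. In this post we use a few tools to look at the AOT compilation process from a high level.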

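II: The C/C++ compilation process

Anyone who has compiled C code with gcc knows that -E, -S and -c each stop the build after a given stage, while a run without any of them goes all the way through linking, with -o naming the output file. The stages are:

Preprocessing (-E): expands #define macros and pulls in #include files, producing plain C source.
Compilation (-S): translates the C source into assembly.
Assembly (-c): translates the assembly into machine code in an object file.
Linking (no stage flag, -o names the output): links libc and the system libraries to produce the executable.

As a diagram:

Different trades may be worlds apart, but the underlying principles carry over. With this background, what follows is simply a matter of matching each AOT step to its C counterpart.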

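III: The AOT compilation process

In .NET the AOT compiler is ilc.exe. It is written in C# and is versioned together with .NET; on my machine, for example, it lives at C:\Users\Administrator\.nuget\packages\runtime.win-x64.microsoft.dotnet.ilcompiler\8.0.8\tools\ilc.exe .

The corresponding source code is under D:\sources\runtime\src\coreclr\tools\aot in my local checkout.

One more thing to note: ilc.exe takes MSIL as its input, not C# source. Where does the MSIL come from? dotnet publish first invokes Roslyn to produce it. As a diagram:

Next comes the ilc stage proper.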

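1. Where is the preprocessing stage?

This stage corresponds to AOT's dependency graph construction and optimization. The C# side is more involved and covers more ground, for example:

Building the dependency graph.
Second-pass processing of the IL for P/Invoke, COM and delegates.
Generating GetHashCode and Equals for value types.
Limited reflection support, backed by some emitted metadata.
Tree shaking.

For all the inputs that go into building the dependency graph, see Example_21_2.ilc.rsp in the obj\Debug\net8.0\win-x64\native folder.

If you are interested, the code in this library and the DependencyAnalyzer class in particular are worth studying. An excerpt: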


    /// <summary>
    /// Implement a dependency analysis framework. This works much like a Garbage Collector's mark algorithm
    /// in that it finds a set of nodes from an initial root set.
    ///
    /// However, in contrast to a typical GC in addition to simple edges from a node, there may also
    /// be conditional edges where a node has a dependency if some other specific node exists in the
    /// graph, and dynamic edges in which a node has a dependency if some other node exists in the graph,
    /// but what that other node might be is not known until it may exist in the graph.
    ///
    /// This analyzer also attempts to maintain a serialized state of why nodes are in the graph
    /// with strings describing the reason a given node was added to the graph. The degree of logging
    /// is configurable via the MarkStrategy
    ///
    /// </summary>
    public sealed class DependencyAnalyzer<MarkStrategy, DependencyContextType> : DependencyAnalyzerBase<DependencyContextType> where MarkStrategy : struct, IDependencyAnalysisMarkStrategy<DependencyContextType>
    {
        private MarkStrategy _marker = new MarkStrategy();
        private IComparer<DependencyNodeCore<DependencyContextType>> _resultSorter;
        private RandomInsertStack<DependencyNodeCore<DependencyContextType>> _markStack;
        private List<DependencyNodeCore<DependencyContextType>> _rootNodes = new List<DependencyNodeCore<DependencyContextType>>();
    }

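The official comment is quite interesting: the analyzer works like a GC mark algorithm, and judging by the _markStack field it is a stack-driven, depth-first style traversal.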
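To make the GC-mark analogy concrete, here is a minimal sketch of a stack-driven mark pass with static and conditional edges. It is not ilc's implementation: Node, MarkSketch, Mark and Demo are hypothetical names made up for illustration, and the dynamic edges and reason logging described in the comment are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified node type; ilc itself uses DependencyNodeCore<T>.
public sealed class Node
{
    public string Name { get; }
    // Static edges: always followed once this node is marked.
    public List<Node> Dependencies { get; } = new();
    // Conditional edges: Target is needed only if Trigger is also in the graph.
    public List<(Node Trigger, Node Target)> ConditionalDependencies { get; } = new();
    public Node(string name) => Name = name;
}

public static class MarkSketch
{
    // Returns every node reachable from the roots, honoring conditional edges.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var marked = new HashSet<Node>();
        var stack = new Stack<Node>();
        // Targets waiting for a trigger that has not been marked yet.
        var pending = new Dictionary<Node, List<Node>>();

        void Visit(Node n) { if (marked.Add(n)) stack.Push(n); }

        foreach (var root in roots) Visit(root);

        while (stack.Count > 0)
        {
            var node = stack.Pop();

            // This node is now in the graph: release anything waiting on it.
            if (pending.Remove(node, out var waiting))
                foreach (var target in waiting) Visit(target);

            foreach (var dep in node.Dependencies) Visit(dep);

            foreach (var (trigger, target) in node.ConditionalDependencies)
            {
                if (marked.Contains(trigger))
                {
                    Visit(target);
                }
                else
                {
                    if (!pending.TryGetValue(trigger, out var list))
                        pending[trigger] = list = new List<Node>();
                    list.Add(target);
                }
            }
        }
        return marked;
    }
}

public static class Demo
{
    public static void Main()
    {
        var main = new Node("Main");
        var helper = new Node("Helper");
        var myStruct = new Node("MyStruct");
        var structEquals = new Node("MyStruct.Equals");
        var unused = new Node("Unused");

        main.Dependencies.Add(helper);
        helper.Dependencies.Add(myStruct);
        // MyStruct.Equals is kept only if MyStruct itself ends up in the graph.
        main.ConditionalDependencies.Add((myStruct, structEquals));
        unused.Dependencies.Add(structEquals);

        foreach (var n in MarkSketch.Mark(new[] { main }))
            Console.WriteLine(n.Name);
    }
}
```

Running Demo prints the four nodes reachable from Main, in no particular order, and never prints Unused. In this sketch that omission is the whole of tree shaking: whatever the mark pass does not reach is simply not compiled.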

If you are curious what the dependency tree ends up looking like, add an <IlcGenerateMapFile>true</IlcGenerateMapFile> node to the csproj and run dotnet publish. This generates an Example_21_2.map.xml file, and opening it shows the type and method nodes.
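A minimal csproj for this might look like the fragment below. This is a sketch under assumptions: the OutputType and TargetFramework values are guessed from the net8.0 paths mentioned earlier, and PublishAot is the standard property for enabling Native AOT publishing. Only the IlcGenerateMapFile line comes from the text above.

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- Assumed project settings; adjust to match your own project. -->
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <!-- Enables Native AOT, so dotnet publish runs ilc. -->
    <PublishAot>true</PublishAot>
    <!-- Asks ilc to write the map file of the final dependency graph. -->
    <IlcGenerateMapFile>true</IlcGenerateMapFile>
  </PropertyGroup>
</Project>
```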