Why is 0x00400000 the default base address for an executable?
The default base address for a DLL is 0x10000000, but the default base address for an EXE is 0x00400000. Why that particular value for EXEs? What’s so special about 4 megabytes?
It has to do with the amount of address space mapped by a single page directory entry on an x86 and a design decision made in 1987.
The only technical requirement for the base address of an EXE is that it be a multiple of 64KB. But some choices for base address are better than others.
The goal in choosing a base address is to minimize the likelihood that modules will have to be relocated. This means not colliding with things already in the address space (which will force you to relocate) as well as not colliding with things that may arrive in the address space later (forcing them to relocate). For an executable, the "not colliding with things that may arrive later" part means avoiding the region of the address space that tends to fill with DLLs. Since the operating system itself puts DLLs at high addresses and the default base address for non-operating system DLLs is 0x10000000, this means that the base address for the executable should be somewhere below 0x10000000, and the lower you go, the more room you have before you start colliding with DLLs. But how low can you go?
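To make the two constraints concrete, here is a minimal sketch in Python. The function names, module sizes, and the sample layout are all made up for illustration; the only facts taken from the article are the 64KB alignment rule and the two default base addresses.

```python
# Illustrative sketch only: names, sizes, and layout below are hypothetical.
ALLOCATION_GRANULARITY = 64 * 1024  # base addresses must be a multiple of 64KB


def is_valid_base(base):
    """True if the address satisfies the 64KB alignment requirement."""
    return base % ALLOCATION_GRANULARITY == 0


def overlaps(base_a, size_a, base_b, size_b):
    """True if the half-open ranges [base, base + size) intersect."""
    return base_a < base_b + size_b and base_b < base_a + size_a


def needs_relocation(base, size, occupied):
    """True if a module at [base, base + size) collides with any occupied range."""
    return any(overlaps(base, size, b, s) for b, s in occupied)


# Hypothetical address space: one 2MB DLL sitting at the default DLL base.
occupied = [(0x10000000, 0x00200000)]

# A 1MB EXE at the default EXE base stays clear of the DLL region.
exe_at_default = needs_relocation(0x00400000, 0x00100000, occupied)

# The same EXE based inside the DLL region collides, so something must move.
exe_in_dll_region = needs_relocation(0x10100000, 0x00100000, occupied)
```

In this made-up layout, `exe_at_default` is `False` and `exe_in_dll_region` is `True`, which is the "lower is better" argument in miniature.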
The first part means that you also want to avoid the things that are already there. Windows NT didn’t have a lot of stuff at low addresses. The only thing that was already there was a PAGE_NOACCESS page mapped at zero in order to catch null pointer accesses. Therefore, on Windows NT, you could base your executable at 0x00010000, and many applications did just that.
But on Windows 95, there was a lot of stuff already there. The Windows 95 virtual machine manager permanently maps the first 64KB of physical memory to the first 64KB of virtual memory in order to avoid a CPU erratum. (Windows 95 had to work around a lot of CPU bugs and firmware bugs.) Furthermore, the entire first megabyte of virtual address space is mapped to the logical address space of the active virtual machine. (Nitpickers: actually a little more than a megabyte.) This mapping behavior is required by the x86 processor’s virtual-8086 mode.
Windows 95, like its predecessor Windows 3.1, runs Windows in a special virtual machine (known as the System VM), and for compatibility it still routes all sorts of things through 16-bit code just to make sure the decoy quacks the right way. Therefore, even when the CPU is running a Windows application (as opposed to an MS-DOS-based application), it still keeps the virtual machine mapping active so it doesn’t have to do page remapping (and the expensive TLB flush that comes with it) every time it needs to go to the MS-DOS compatibility layer.
Okay, so the first megabyte of address space is already off the table. What about the other three megabytes?
Now we come back to that little hint at the top of the article.
In order to make context switching fast, the Windows 3.1 virtual machine manager “rounds up” the per-VM context to 4MB. It does this so that a memory context switch can be performed by simply updating a single 32-bit value in the page directory. (Nitpickers: You also have to mark instance data pages, but that’s just flipping a dozen or so bits.) This rounding causes us to lose three megabytes of address space, but given that there was four gigabytes of address space, a loss of less than one tenth of one percent was deemed a fair trade-off for the significant performance improvement. (Especially since no applications at the time came anywhere near beginning to scratch the surface of this limit. Your entire computer had only 2MB of RAM in the first place!)
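The arithmetic behind that paragraph can be checked with a short Python sketch. It assumes classic 32-bit x86 paging with 4KB pages, where each page directory entry points to a page table of 1024 entries; the constant and function names are my own.

```python
# Sketch assuming classic 32-bit x86 paging (4KB pages, two-level tables).
PAGE_SIZE = 4 * 1024            # bytes covered by one page table entry
ENTRIES_PER_PAGE_TABLE = 1024   # page table entries behind one page directory entry
ADDRESS_SPACE = 2 ** 32         # 4GB of virtual address space
MEGABYTE = 1024 * 1024

# Address space mapped by a single page directory entry.
BYTES_PER_PDE = PAGE_SIZE * ENTRIES_PER_PAGE_TABLE


def page_directory_index(address):
    """The top 10 bits of a 32-bit linear address select the page directory entry."""
    return address >> 22


# The VM needs about 1MB; rounding up to one full directory entry wastes the rest.
lost_bytes = BYTES_PER_PDE - 1 * MEGABYTE
lost_percent = 100.0 * lost_bytes / ADDRESS_SPACE
```

Here `BYTES_PER_PDE` is exactly 4MB, `lost_bytes` is 3MB, and `lost_percent` comes to about 0.073, consistent with "less than one tenth of one percent". Every address below 0x00400000 falls under directory entry 0, and 0x00400000 is the first address under entry 1, which is why swapping that one entry swaps the whole per-VM region.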
This memory map was carried forward into Windows 95, with some tweaks to handle separate address spaces for 32-bit Windows applications. Therefore, the lowest address at which an executable could be loaded on Windows 95 was 4MB, which is 0x00400000.
Geek trivia: To prevent Win32 applications from accessing the MS-DOS compatibility area, the flat data selector was actually an expand-down selector which stopped at the 4MB boundary. (Similarly, a null pointer in a 16-bit Windows application would result in an access violation because the null selector is invalid. It would not have accessed the interrupt vector table.)
The linker chooses a default base address for executables of 0x00400000 so that the resulting binary can load without relocation on both Windows NT and Windows 95. Nobody really cares much about targeting Windows 95 any more, so in principle, the linker folks could choose a different default base address now. But there’s no real incentive for doing it aside from making diagrams look prettier, especially since ASLR makes the whole issue moot anyway. And besides, if they changed it, then people would be asking, “How come some executables have a base address of 0x00400000 and some executables have a base address of 0x00010000?”
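If you want to see which base address the linker recorded for a given binary, the value is stored in the `ImageBase` field of the PE optional header. The following Python sketch reads it from raw bytes using the field offsets from the PE format. The function names are my own, and `make_fake_pe32` builds a stripped-down header purely for demonstration; it is not a loadable executable.

```python
import struct


def read_image_base(data):
    """Return the preferred base address recorded in a PE image's headers."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:    # PE32: 32-bit ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:  # PE32+: 64-bit ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    return image_base


def make_fake_pe32(image_base):
    """Hypothetical helper: just enough PE32 header bytes for read_image_base."""
    pe_offset = 0x40
    buf = bytearray(pe_offset + 4 + 20 + 32)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<H", buf, pe_offset + 24, 0x10B)
    struct.pack_into("<I", buf, pe_offset + 24 + 28, image_base)
    return bytes(buf)


recorded_base = read_image_base(make_fake_pe32(0x00400000))
```

On a real file you would pass in the bytes read from disk, for example `read_image_base(open(path, "rb").read())`. Keep in mind that the field only records the preferred base; with ASLR in play, the address at which the image actually loads can differ.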
TL;DR: To make context switching fast.