
Introduction

2011-11-03 11:50  iRead

Why This Book Was Written

  To tell the truth, I don't think I had much choice in this matter. Let me explain. With Microsoft .NET technology taking the world by storm, with more and more information professionals getting involved, large numbers of books covering various aspects of this technology have started to arrive—and none too soon. Alas, virtually all of these books are dedicated to .NET-based programming in high-level languages and rapid application development (RAD) environments. No doubt this is extremely important, and I am sure all these books will have to be reprinted to satisfy the demand. But what about the plumbing?

  The .NET universe, like other information technology universes, resembles a great pyramid turned upside down and standing on its tip. The tip on which the .NET pyramid stands is the common language runtime. The runtime converts the intermediate language (IL) binary code into platform-specific (native) machine code and executes it. Resting on top of the runtime are the .NET Framework class library, the compilers, and environments such as Microsoft Visual Studio .NET. And above them begin the layers of application development, from instrumental to end-user-oriented. The pyramid quickly grows higher and wider.

  This book is not exactly about the common language runtime—even though it's only the tip of the .NET pyramid, the runtime is too vast a topic to be described in detail in any book of reasonable (say, luggable) size. Rather, this book focuses on the next best thing: the .NET IL Assembler. IL assembly language (ILAsm) is a low-level language, specifically designed to describe every functional feature of the common language runtime. If the runtime can do it, ILAsm must be able to express it.

  Unlike high-level languages, and like other assembly languages, ILAsm is platform-driven rather than concept-driven. An assembly language usually is an exact linguistic mapping of the underlying platform, which in this case is the common language runtime. It is, in fact, so exact a mapping that this language is used for describing aspects of the runtime in the ECMA standardization documents regarding the .NET common language infrastructure. (ILAsm itself, as a part of the common language infrastructure, is a subject of this standardization effort as well.) As a result of the close mapping, it is impossible to describe an assembly language without going into significant detail about the underlying platform. So, to a great extent, this book is about the common language runtime after all.

  IL assembly language is very popular among .NET developers. No, I am not claiming that all .NET developers prefer to program in ILAsm rather than in Microsoft Managed C++, Microsoft Visual C# .NET, or Microsoft Visual Basic .NET. But all .NET developers use the IL Disassembler (ILDASM) now and then, and many use it on a regular basis. A cyan thunderbolt—the ILDASM icon (a silent praise for David Drake)—glows on the computer screens of .NET developers regardless of their language preferences and problem areas. And ILDASM text output is…? Yes, ILAsm source code.

  Virtually all books on .NET-based programming that are devoted to high-level programming languages, such as Visual C# .NET or Visual Basic .NET, or to techniques such as ADO.NET, at some point mention the IL Disassembler as a tool of choice to analyze the innards of a .NET IL executable. But these volumes stop short of explaining what the disassembly text means and how to interpret it. This is an understandable choice, given the topics of these books; the detailed description of metadata structuring and IL assembly language represents a separate issue.

  Now perhaps you see what I mean when I say I had no choice but to write this book. Someone had to, and because I had been given the responsibility of designing and developing IL Assembler and ILDASM, it was my obligation to see it through all the way.

History of ILAsm, Part I

  The first versions of IL Assembler and ILDASM (under the names Asm and Dasm, respectively) were developed in early 1998 by Jonathan Forbes. The current language is very different from this original one, the only distinct common feature being the leading dots in the directive keywords. The assembler and disassembler were built as purely internal tools facilitating the ongoing development of the common language runtime and were used rather extensively inside the runtime development team.

  When Jonathan went to work on Microsoft Messenger at the beginning of 1999, the assembler and disassembler fell into the lap of Larry Sullivan, head of a development group with the colorful name CROEDT (Common Runtime Odds and Ends Development Team). In April of that year, I joined the team, and Larry passed the assembler and disassembler to me. When an alpha version of the common language runtime was presented at a Technical Preview in May 1999, Asm and especially Dasm attracted significant attention, and I was told to rework the tools and bring them up to production level. So I did, with great help from Larry, Vance Morrison, and Jim Miller. Because the tools were still considered internal, we (Larry, Vance, Jim, and I) could afford to radically redesign the language, not to mention the implementation of the tools.

  A major breakthrough occurred in the second half of 1999, when IL Assembler input and ILDASM output were synchronized enough to achieve limited round-tripping. Round-tripping means that you can take a managed (IL) executable compiled from a particular language, disassemble it, add or change some ILAsm code, and reassemble it back into a modified executable. The round-tripping technique opened new avenues, and shortly thereafter it began to be used in certain production processes both inside Microsoft and by its partners.

  At about the same time, third-party .NET-oriented compilers that used ILAsm as a base language started to appear. The best-known is probably Fujitsu's COBOL.NET, which made quite a splash at the Professional Developers Conference in July 2000, where the first pre-beta version of the common language runtime, along with the .NET Framework class library, compilers, and tools, was released to the developer community.

  Since the release of the beta 1 version in late 2000, IL Assembler and ILDASM have been fully functional in the sense that they reflect all the features of metadata and IL, support complete round-tripping, and maintain synchronization of their changes with the changes in the runtime itself.

Who Should Read This Book

  This book targets all the .NET-oriented developers who, because they work at a sufficiently advanced level, care about what their programs compile into or who are willing to analyze the end results of their programming. Here these readers will find the information necessary to interpret disassembly texts and metadata structure summaries, allowing them to develop more efficient programming techniques.

  Because this analysis of disassemblies and metadata structuring is crucial in assessing the correctness and efficiency of any .NET-oriented compiler, this book should also prove especially useful for compiler developers who are targeting .NET. A narrower but growing group of readers who will find the book extremely helpful includes developers who use IL assembly language directly: for example, compiler developers targeting ILAsm as an intermediate step, developers contemplating multilanguage projects, and developers willing to exploit the capabilities of the common language runtime that are inaccessible through the high-level languages.

  Finally, this book can be valuable in all phases of software development, from conceptual design to implementation and maintenance.

Organization of This Book

  I begin in Part I, “Quick Start,” with a quick overview of ILAsm and common language runtime features, based on a simple sample program. This overview is in no way complete; rather, it is intended to convey a general impression about the runtime and ILAsm as a language.

  The following parts discuss features of the runtime and corresponding ILAsm constructs in a detailed, bottom-up manner. Part II, “Underlying Structures,” describes the structure of a managed executable file and general metadata organization. Part III, “Fundamental Components,” is dedicated to the components that constitute a necessary base of any application: assemblies, modules, classes, methods, fields, and related topics. Part IV, “Inside the Execution Engine,” brings you, yes, inside the execution engine, describing the execution of IL instructions and managed exception handling. Part V, “Special Components,” discusses metadata representation and usage of the additional components: events, properties, and custom and security attributes. And Part VI, “Interoperation,” describes the interoperation between managed and unmanaged code and discusses practical applications of IL Assembler and ILDASM to multilanguage projects.

  The book's five appendixes contain references concerning ILAsm grammar, metadata organization, and the IL instruction set and tool features, including IL Assembler, ILDASM, and the offline metadata validation tool.
