Repost: The author of Thinking in Java: C++ isn't garbage, Java is just arrogant

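Bruce Eckel, author of Thinking in C++ and Thinking in Java, has long been in the "pro-C++, anti-Java" camp. He has said more than once how carefully considered the additions to C++ were, and how Java kept piling on odd features. Bruce believes it is very helpful to understand why language features exist, and he calls this "language archaeology".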

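[51CTO Selected Foreign Press] This article is excerpted from a blog post by Bruce Eckel, author of Thinking in C++ and Thinking in Java. In it he looks back on the C++ standards committee meetings, which he first attended at the invitation of Bjarne Stroustrup, the designer of C++ (often called the father of C++). Excerpts of his reflections follow: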

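What I found at the C++ committee meetings was a gathering of the smartest people in the C++ community, all in one place and willing to answer my questions. I quickly realized that this was far better than anything I could get from a graduate program. And once you factor in the opportunity cost of graduate school, it was a much better deal financially as well.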

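I was hooked, and kept attending for about 8 years. The committee carried on after I left; the standard was not yet finished, but by then Java had appeared and some other members were drifting away too. (This is the trouble with being a technology-stimulus junkie: I do dig deep into a language, but I am always looking for more productive tools, so promising language features distract me easily.)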

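Each time we met, I brought a list of the thorniest C++ questions I had accumulated since the previous meeting, and I usually had them all clarified within a few days. That, along with early exposure to upcoming features, was the most valuable short-term benefit of being on the committee.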

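In the long run, watching language features being fitted into the C++ puzzle was a deep education. It is easy now to second-guess and say that C++ is awful and badly designed, and many people say so without any understanding of the constraints it was designed under. Stroustrup's constraint (51CTO editor's note: this is the same Stroustrup who invited the author to the meetings, the designer of the C++ language) was that a C program should compile under C++ with trivial changes or, preferably, none at all. Whether or not that was entirely justified, it gave C programmers an easy migration path. It was also a major limitation, and it accounts for virtually every difficult feature that people complain about. Because those features are hard to understand, many people conclude that C++ was badly designed, which is far from the truth.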

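Java fed this perception with its cavalier attitude toward language design. I have written about this in Thinking in Java and in many blog posts, so my longtime followers know that Java rubbed me the wrong way from the start, because of the dismissive attitude of Gosling (the father of Java) and the other language designers toward C++. To be honest, my first "encounter" with Gosling left a bad impression. It was many years earlier, when I first started using UNIX at one of the first companies I worked for (Fluke, which makes electronic test equipment; I did embedded systems programming there). One of the software engineers was coaching me and taught me to use emacs. But the only tool available in the company at that time was the commercial (Unipress) version of Gosling Emacs. If you did something wrong, the program would insult you, call you a turkey, and fill the screen with garbage. This was in a commercial product that our company had paid a considerable sum for. Needless to say, as soon as Gnu emacs became stable the company switched to it. (I have met Richard Stallman. He is crazy, sure, but he is also extremely smart: he knows that when you are in trouble you need help, not insults.) (51CTO editor's note: Richard Stallman is the developer of Gnu emacs and a well-known American hacker. In 2005 he was a guest at Sina, where he talked at length with Hong Feng about training in the hacker way.)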

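I do not know how much this formative experience with Gosling shaped my later view of his work, but the attitude toward C++, "we looked at it, it sucked, so we decided to come up with a language of our own", certainly did not help. That was especially true once I began picking Java apart while writing Thinking in Java and found, time after time, language features and libraries whose design decisions were hasty. Indeed, most of them had to be repaired, sometimes only after programmers had put up with them for years. On many occasions Gosling admitted that they had to cut corners and rush it out, or the internet revolution would have passed them by.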

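I find it very helpful to understand why language features exist. If a college professor simply hands them to you on a platter, you tend to build a mythology around the language and tell yourself: "There must be some really important reason for this feature, one that only the smart people who created the language can understand. I can't, so I'll just take it on faith." At some point, this faith-based acceptance of language features becomes a liability; it keeps you from analyzing and understanding what is going on. In my keynote (Bruce will be giving a keynote in the coming days), I will look at a number of features and examine how they are implemented in different languages, and why.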

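Here is an example: object creation. In C, once you declare a variable the compiler creates stack space for it (uninitialized, so it contains garbage unless you initialize it). But if you want to do this dynamically, you must use the standard library functions malloc() and free(), and carefully perform all the initialization and cleanup by hand. If you forget, you get memory leaks and similar disasters, which happened often.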

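On dynamic object creation: broadly speaking, a program's memory is divided into three areas: static storage, the stack, and the heap. Static storage mainly holds global and static variables; the stack holds the variables, addresses, and similar data associated with function calls; the heap holds dynamically created variables. In C the heap is the storage allocated and released by the malloc() and free() functions; in C++ it is the storage that the new and delete operators work on. (Source: the technical blog of 树洞 on 51CTO)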

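Because malloc() and free() were "only" library functions, and confusing and scary ones at that, basic programming classes often did not teach them as they should have. When programmers needed to allocate a lot of memory, instead of learning to use these functions they would often (no joke) just allocate huge global arrays, far larger than they ever expected to need. The program seemed to work, and surely nobody would ever exceed those bounds. So when it did happen, years later, the program broke and some poor soul had to dive in and work out where the error was.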

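Stroustrup decided that dynamic allocation needed to be simpler and safer: it had to be brought into the core of the language rather than relegated to library functions. It also had to work together with initialization and cleanup, provided by constructors and destructors respectively, so that every object gets the same guarantees.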

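The problem was the same millstone that weighed on every C++ decision: backward compatibility with C. Ideally, stack allocation of objects could simply have been dropped. But C compatibility required stack allocation, so heap objects had to be distinguished from stack objects. To solve this, C++ borrowed the new keyword from Smalltalk. To create a stack object you simply declare it, as in Cat x; or, with arguments, Cat x("mittens");. To create a heap object you use new, as in new Cat; or new Cat("mittens");. Given the constraints, this is an elegant and consistent solution.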

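Then came Java, after deciding that everything in C++ was badly done and overly complex. The irony is that Java did decide to throw away stack allocation (pointedly ignoring the debacle of primitive types, which I have discussed elsewhere). Since all objects are allocated on the heap, there is no need to distinguish stack allocation from heap allocation. They could easily have written Cat x = Cat() or Cat x = Cat("mittens"). Better still, they could have added type inference to remove the repetition (but that, like other features such as closures, would have taken "too long", so we are stuck with the mediocre version of Java instead; type inference has been discussed, but I would bet it will not happen, and it should not, given the problems that adding new features to Java causes).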

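Guido Van Rossum (the creator of Python) took a minimalist approach; the often-criticized use of whitespace shows how clean he wanted the language to be. Since the new keyword was not needed, he left it out, so you write x = Cat("mittens"). Ruby could have done the same, but one of Ruby's main constraints is to follow Smalltalk as closely as possible, so in Ruby you write x = Cat.new("mittens"). Java, however, made a point of disparaging the C++ way of doing things, which makes its use of the new keyword a mystery. My guess, after studying the decisions made elsewhere in the language, is that it simply never occurred to them that they could do without it.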

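So that is what I mean by language archaeology. I hope people will come away with a better perspective on language design, and a more critical thought process when learning a programming language.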

 

 

Appendix:

English original:

 

Why? Language Archaeology ... and Metaprogramming
by Bruce Eckel
June 16, 2009
Summary
I showed up at the organizational meeting for the ANSI/ISO C++ standards committee because Bjarne Stroustrup asked me to. I knew him from my early C++ work and from conferences, and I suspect he considered me a friendly influence.


I had no intention of committing to the time, money and travel necessary to meet three times a year, for a week each time, but the organizational meeting was in Washington DC and I was living in Pennsylvania at the time so I thought "what the heck, I'll drive a few hours and check it out."

Note: My keynote at the upcoming EuroPython conference, June 30-July 2 in Birmingham, UK will be on language archaeology. I'll also be giving a talk on Metaprogramming, sitting on a panel, and generally participating for the whole 3 days. If you're there, feel free to draft me for an open-spaces discussion, or start a hallway conversation -- I'm going there because I want to talk about Python.

You can find out more about the conference, and register, here. There are also pre-conference tutorials beginning on June 28th.

Language Archaeology

What I found at the C++ committee meeting were most of the smartest people in the C++ community, gathered together in one place, available to answer my questions. I quickly realized this was far better than I could ever find in any graduate school program. And if you factor in the opportunity costs of graduate school, a far better deal financially as well.

I was hooked, and kept attending for about 8 years. The committee continued after I wandered away; the standard hadn't been completed yet but Java had appeared by that time and some others were also drifting off (It's the problem with being a technological stimulus junkie -- I do delve deep, but I'm always looking for more productivity leverage so it's not too hard to distract me with promising language features).

Each time we met, I would show up with a list of the thorniest C++ questions that had accumulated for me in the interim. I would usually have them all clarified within a couple of days. That, and being exposed to upcoming features early, was the most valuable short-term benefit of being on the committee.

In the long term, watching the addition of language features into the C++ puzzle was deep learning. It's easy now to Monday-morning quarterback and say that C++ sucks and was badly designed, and many people do so without any understanding of the constraints under which it was designed. Whether or not it was entirely legitimate, Stroustrup's constraint was that a C program should compile with either trivial or (preferably) no changes under C++. This provided an easy evolution path for C programmers, but it was a big limitation and accounts for virtually every difficult feature that people complain about. But because those features are hard to understand, many people jump to the conclusion that C++ was badly designed, which is far from the truth.

Java fed this perception with its cavalier attitude about language design. I've written about this in Thinking in Java and in many weblogs, so longtime followers already know that Java tweaked me the wrong way from the start, because of the dismissive attitude of Gosling and the language designers. To be honest, my first "encounter" with Gosling left a bad taste in my mouth -- it was many years before, when I first began using Unix at one of the first companies I worked for (Fluke, which makes electronic test equipment; I was doing embedded systems programming). One of the other software engineers was coaching me and had guided me towards emacs. But the only tool available in the company at that time was the commercial (Unipress) version of Gosling Emacs. And if you did something wrong, the program would insult you by calling you a turkey and filling the screen with garbage. This, in a commercial product for which the company had paid fair amounts of cash. Needless to say, as soon as Gnu emacs became stable the company switched over to that (I've met Richard Stallman. He's crazy, sure. But he's wicked smart, and smart enough to know that when you are in trouble, you need help, lots of it, and not insults).

I have no idea how much this formative experience with Gosling influenced my later feelings about his work, but the fact that the attitude about C++ was "we looked at it and it sucked so we decided to whip out a language of our own" didn't help. Especially when I began to tease it apart in the process of writing Thinking in Java and discovered, time after time, features and libraries where the decisions were slapdash -- indeed, most of these had to be repaired, sometimes after years of programmer suffering. And on numerous occasions Gosling admitted that they had to cut corners to hurry and get it out or else the internet revolution would have passed them by.

So the reason I'm giving this keynote is that I find it very helpful to understand why language features exist. If they're just handed to you on a platter by a college professor, you tend to develop a mythology around the language and to say "there's some really important reason for this language feature that the smart people who created the language understand and that I don't, so I'll just take it on faith." And at some point, faith-based acceptance of language features is a liability; it prevents you from being able to analyze and understand what's going on. In this keynote, I look at a number of features and examine how they are implemented in different languages, and why.

Here's an example: object creation. In C, you declare variables and the compiler creates stack space for you (uninitialized, containing garbage unless you initialize it). But if you want to do it dynamically, you must use the malloc() and free() standard library functions, and carefully perform all the initialization and cleanup by hand. If you forget, you have memory leaks and similar disasters, which happened frequently.

Because malloc() and free() were "only" library functions, and confusing and scary at that, they often didn't get taught in basic programming classes like they should have. And when programmers needed to allocate lots of memory, instead of learning about and dealing with these functions they would often (I kid you not) just allocate huge arrays of global variables, more than they ever thought they'd need. The program seemed to work, and no one would ever exceed those bounds anyway -- so when it did happen, years later, the program would break and some poor sod would have to go in and puzzle it out.

Stroustrup decided that dynamic allocation needed to be easier and safer -- it needed to be brought into the core of the language and not relegated to library functions. And it needed to be coupled with the same guaranteed initialization and cleanup that constructors and destructors provide for all objects.

The problem was the same millstone that dogged all C++ decisions: backward compatibility with C. Ideally, stack allocation of objects could simply have been discarded. But C compatibility required stack allocation, so there needed to be some way to distinguish heap objects from stack objects. To solve this problem, the new keyword was appropriated from Smalltalk. To create a stack object, you simply declare it, as in Cat x; or, with arguments, Cat x("mittens");. To create a heap object, you use new, as in new Cat; or new Cat("mittens");. Given the constraints, this is an elegant and consistent solution.

Enter Java, after deciding that everything C++ is badly done and overly complex. The irony here is that Java could and did make the decision to throw away stack allocation (pointedly ignoring the debacle of primitives, which I've addressed elsewhere). And since all objects are allocated on the heap, there's no need to distinguish between stack and heap allocation. They could easily have said Cat x = Cat() or Cat x = Cat("mittens"). Or even better, incorporated type inference to eliminate the repetition (but that -- and other features like closures -- would have taken "too long" so we are stuck with the mediocre version of Java instead; type inference has been discussed but I will lay odds it won't happen. And shouldn't, given the problems in adding new features to Java).

Guido Van Rossum (creator of Python) took a minimalist approach -- the oft-lambasted use of whitespace is an example of how clean he wanted the language. Since the new keyword wasn't necessary, he left it out, so you say x = Cat("mittens"). Ruby could have also used this approach, but one of Ruby's main constraints is that it follows Smalltalk as much as possible, so in Ruby you say x = Cat.new("mittens") (here's a nice introduction to Ruby). But Java made a point of dissing the C++ way of doing things, so the inclusion of the new keyword is a mystery. My guess, after studying decisions made in the rest of the language, is that it just never occurred to them that they could get away without it.
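As a small illustration of the Python form mentioned above, here is a minimal sketch. The Cat class and its name attribute are made up for this example; only the x = Cat("mittens") call comes from the text. The second half spells out, roughly, the two steps that the call performs, which is one way to see why no new keyword is needed.

```python
class Cat:
    """Hypothetical class used only for this illustration."""

    def __init__(self, name="unnamed"):
        self.name = name


# No `new` keyword: calling the class itself creates and initializes the object.
x = Cat("mittens")

# Roughly what the call above does, spelled out in two steps.
y = Cat.__new__(Cat)     # allocate an uninitialized instance
y.__init__("mittens")    # then run the initializer on it

print(x.name, y.name)
```

The two-step form is shown only to make the mechanism visible; ordinary code would always use the single call.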

So that's what I mean about language archaeology. I have a list of other features for similar analysis during the keynote. I hope that people will come away with a better perspective on language design, and a more critical thought process when they are learning programming languages.

Metaprogramming

Ordinary programs manipulate data. Most of the time, ordinary programming is all you need. But sometimes you find yourself writing the same kind of code over and over, thus violating the most fundamental principle in programming: DRY ("Don't Repeat Yourself"). And yet, there doesn't seem to be any way in your programming language to fix the problem directly, so you are left duplicating code, knowing that your project continues to scatter this duplicated code throughout, and that if you need to change anything you're going to have to find every piece of duplicated code and fix it and test it over again. And on top of that, it's just plain inelegant.

This is where metaprogramming comes in. Metaprograms manipulate programs. Metaprogramming is code that modifies other code. So when you find yourself duplicating code, you can write a metaprogram and apply that instead, and you're back to following DRY.
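To make the idea concrete, here is a minimal Python sketch of code that writes other code. The helper make_positive_property and the Rectangle class are hypothetical names invented for this illustration, not something from the talk. Instead of hand-writing a nearly identical getter and setter for every attribute, one function generates them.

```python
def make_positive_property(name):
    """Build a property that rejects non-positive values (hypothetical helper)."""
    private = "_" + name

    def getter(self):
        return getattr(self, private)

    def setter(self, value):
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value!r}")
        setattr(self, private, value)

    return property(getter, setter)


class Rectangle:
    # Each line below stands in for a hand-written getter/setter pair.
    width = make_positive_property("width")
    height = make_positive_property("height")

    def __init__(self, width, height):
        self.width = width      # goes through the generated setter
        self.height = height

    def area(self):
        return self.width * self.height


r = Rectangle(2, 3)
print(r.area())
```

If the validation rule changes, it changes in one place, which is the DRY benefit the paragraph above describes.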

Because metaprogramming is so clever and feels so powerful, it's easy to look for applications everywhere. It's worth repeating that most of the time, you don't need it. But when you do, it's amazingly useful.

This is why it keeps springing up, even in languages that really fight against code modification. In C++, which has no run-time model (everything compiles to raw code, another legacy of C compatibility), template metaprogramming appeared. Because it's such a struggle, template metaprogramming was very complex and almost no one figured out how to do it.

Java does have a runtime model and even a way to perform dynamic modifications on code, but the static type checking of the language is so onerous that raw metaprogramming was almost as hard to do as in C++ (I show these techniques in Thinking in Java). Clearly the need kept reasserting itself, because first we saw Aspect-Oriented Programming (AOP), which turned out to be a bit too complex for most programmers. Somehow, (and I'm kind of amazed that such a useful feature got through so late in Java's evolution), annotations got added, and these are really a metaprogramming construct -- annotations can even be used to modify code during compilation, much like C++ template metaprogramming (this form is more complex, but you don't need it as often). So far, annotations seem to be fairly successful, and have effectively replaced AOP.

Ruby takes a different approach, appropriate to Ruby's philosophy. Like Smalltalk, everything in Ruby is fungible (this is the very thing that makes dynamic languages so scary to folks who are used to static languages). I have no depth in Ruby metaprogramming, but as I understand it, it is just part of the language, with no extra constructs required. In combination with the optional-parameter syntax, it makes Ruby very attractive for creating Domain Specific Languages (DSLs), an example of which is Ruby on Rails.

Python also felt the pressure to support metaprogramming, which began to appear in 2.x versions of the language in the form of metaclasses. Metaclasses turned out to be (like AOP in Java) too much of a mental shift to allow easy use by mainstream Python programmers, so in the most recent versions of Python, decorators were added. These are similar in syntax to Java annotations, but more powerful in use, primarily because Python is as fungible as Ruby. What's interesting is that the intermediate step of actually applying the decorator turns out not to be an intrusion (as it might initially appear) but rather a beneficial annotation that makes the code more understandable. Indeed, one of the main problems with metaclasses (other than the complexity) was that you couldn't easily see by looking at the code what it does; you had to know that the magic had been folded in because of the metaclass. Whereas, with decorators, it is clear that metaprogramming actions are being applied to a function or class.
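A minimal sketch of both kinds of decorator may help here. The names traced, plugin, PLUGINS, add and Cat are all hypothetical and chosen only for this illustration. The point to notice is that the @ line sits directly above the definition it changes, so a reader can see that something extra is being applied.

```python
import functools


def traced(func):
    """Hypothetical function decorator: record each call's arguments and result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append((args, kwargs, result))
        return result

    wrapper.calls = []
    return wrapper


PLUGINS = {}


def plugin(cls):
    """Hypothetical class decorator: register the class under its own name."""
    PLUGINS[cls.__name__] = cls
    return cls


@traced
def add(a, b):
    return a + b


@plugin
class Cat:
    pass


add(1, 2)
print(add.calls)
print(sorted(PLUGINS))
```

The same effects could be achieved without the @ syntax by writing add = traced(add) after the definition; the decorator form simply puts that step where it is easiest to see.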

Although we haven't seen an explosion in the creation of DSLs in Python as we have in Ruby (it's certainly possible to create DSLs in Python), decorators provide an interesting alternative: instead of creating an entirely new language syntax, you can create a decorated version of Python. Although this might not be tightly targeted to domain users as DSLs are, it has the benefit of being more easily understandable to someone who is already familiar with Python (rather than learning a new syntax for each DSL).

Python decorators (both function and class decorators) have almost completely replaced the need for metaclasses, but there is still one obscure situation where metaclasses are still necessary, and this is when you must perform special activities as part of object creation, before initialization occurs. Thankfully, this situation tends to be a rather small edge case, so most people can get by using decorators for metaprogramming.
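The text does not say which situation the author has in mind, so the following is only one possible reading, sketched in Python 3 syntax. The metaclass Interned and the class Cat are hypothetical. A metaclass can override __call__, which runs when the class is called and therefore before __new__ and __init__; here that hook is used to hand back an existing instance so that the initializer never runs a second time.

```python
class Interned(type):
    """Hypothetical metaclass: reuse one instance per distinct constructor arguments."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._instances = {}   # one cache per class that uses this metaclass

    def __call__(cls, *args):
        # Runs when the class is called, before __new__ and __init__.
        # Arguments must be hashable because they are used as the cache key.
        if args not in cls._instances:
            cls._instances[args] = super().__call__(*args)
        return cls._instances[args]


class Cat(metaclass=Interned):
    init_count = 0

    def __init__(self, name):
        Cat.init_count += 1
        self.name = name


a = Cat("mittens")
b = Cat("mittens")   # returned from the cache; __init__ is not run again
c = Cat("tom")
print(a is b, a is c, Cat.init_count)
```

A class decorator could approximate this by replacing Cat with a factory function, but Cat would then no longer be a class usable with isinstance or subclassing, which is the kind of trade-off that can keep a metaclass in play.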

In the EuroPython presentation, I will be introducing metaprogramming with decorators, as well as demonstrating the special case where metaclasses are still necessary.


 

posted @ 2012-08-01 14:33  Mr.Rico