Some Useful New Features in C++0x
This is the first of a three-part series on what's new and important in the C++0x standard. This first article covers features that programmers are apt to use directly; later articles will cover some of the ways in which C++0x simplifies programming for library users and authors.
From our experience, we have developed firm opinions about which aspects of programming languages are important to programmers and which ones aren't. As a result, we are going to treat these articles as an opportunity to point out a handful of C++0x features that we think are particularly useful, and to talk about the contexts in which we think those features are best utilized.
Enhanced type inference, simpler iteration, and new container constructors
Let the Compiler Figure Out Types
A battle has been raging for at least 50 years between languages such as FORTRAN and Algol, which require programmers to say what types they intend their variables to have, and those such as Lisp, which take the view that assigning a value to a variable gives that variable the same type as the value. In C and C++ (until now), the programmer must state the type of every variable:
int answer = 42.1;   // answer is an int, so 42.1 is converted to 42
whereas in Python (for example), a variable has the type of whatever value it was given most recently:
answer = 42      # answer is now an int
answer = 42.1    # answer is now a float
In recent decades, a third treatment has crept into some languages: If the compiler can figure out during compilation what type an expression has, why not make it possible to say that a variable has that same type, whatever the type might be? This notion is subtly different from how Lisp and Python do things, because types are figured out during compilation rather than during execution. As a result, there's no way to change the type of a variable after creating it, and the compiler can still generate efficient machine code by using its knowledge of variables' types.
C++ first adopted this idea as part of instantiating template functions:
template <typename T>
void swap(T& a, T& b)
{
    T temp = a;
    a = b;
    b = temp;
}
If we have two variables x and y, and we call swap(x, y), the compiler figures out for us the type to associate with T. As a result, we don't need to know the type of x and y in order to call swap(x, y).
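As a small sketch of this deduction at work, the same template can be called with two different argument types. The name swap_values is ours (chosen only to avoid confusion with std::swap), and swap_demo is a hypothetical helper that exists just to exercise it.

```cpp
#include <cassert>
#include <string>

// Local copy of the article's template. The name swap_values is ours,
// chosen so the call cannot be confused with std::swap.
template <typename T>
void swap_values(T& a, T& b)
{
    T temp = a;
    a = b;
    b = temp;
}

// Swaps an int pair and a string pair; returns true if the string swap
// took effect (the int swap is checked with an assertion).
bool swap_demo()
{
    int x = 1, y = 2;
    swap_values(x, y);                    // T is deduced as int
    assert(x == 2 && y == 1);

    std::string s = "left", t = "right";
    swap_values(s, t);                    // T is deduced as std::string

    return s == "right" && t == "left";
}
```

Nowhere in swap_demo do we name T; the compiler works it out separately for each call.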
In the 1998 and 2003 C++ standards, this kind of type inference was limited to templates. C++0x lets us ask for the same kind of type inference without having to write a template function. We do so by writing auto as part of a variable's type when we define it. For example:
auto answer = 42;   // answer is an int because 42 is an int
says that we want the variable answer to take its type from its initializer. Because each variable is initialized only once, there is still no possibility of changing a variable's type after we have created it.
Why is this feature important? We think the main reason is that it lets us avoid having to figure out the sometimes complicated types that come from using libraries. For example, we can write statements such as:
auto today = get_date();
without having to look in the documentation for get_date to figure out what type it returns.
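The following sketch shows auto picking up its type from several initializers. The Date type and this definition of get_date are hypothetical stand-ins, since we have not said what the real function returns; the static_assert lines merely confirm at compile time which types were deduced.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a library date type and for get_date().
struct Date { int year; int month; int day; };

Date get_date()
{
    Date d = {2011, 3, 25};   // arbitrary fixed value for this sketch
    return d;
}

// Returns the year reported by get_date(), using auto for each local.
int year_today()
{
    auto answer = 42;          // int, because 42 is an int
    auto ratio = 42.1;         // double, because 42.1 is a double
    auto today = get_date();   // Date, whatever get_date() returns

    // Compile-time checks that the deduced types are what we claim.
    static_assert(std::is_same<decltype(answer), int>::value, "answer should be int");
    static_assert(std::is_same<decltype(ratio), double>::value, "ratio should be double");
    static_assert(std::is_same<decltype(today), Date>::value, "today should be Date");

    assert(answer == 42 && ratio > 42.0);
    return today.year;
}
```

If get_date were later changed to return a different type, the definition of today would follow along without being edited.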
As a more complicated example, consider the code we have to write in order to look at the elements of a data structure. Suppose we have a variable named word_counts that keeps track of how many times each distinct word appears in a body of text:
map<string, int> word_counts;
We might process this container's elements by writing something like:
for (map<string, int>::const_iterator iter = word_counts.begin();
     iter != word_counts.end(); ++iter) {
    process(iter->first, iter->second);
}
Here, we're assuming that we've already written a function named process that does whatever we intend with each word (iter->first) and its associated counter (iter->second).
We can use auto to avoid having to write the type of iter explicitly:
for (auto iter = word_counts.begin(); iter != word_counts.end(); ++iter) {
    process(iter->first, iter->second);
}
There is one fine point worth making here: word_counts.begin() yields an iterator that is capable of changing word_counts. In the earlier version, we assigned that iterator to a const_iterator, thereby removing that permission. C++0x containers offer cbegin and cend members that yield const_iterators, so we should really rewrite our example this way:
for (auto iter = word_counts.cbegin(); iter != word_counts.cend(); ++iter) {
    process(iter->first, iter->second);
}
More generally, C++0x programmers should use cbegin and cend whenever they intend to iterate through a container without changing its contents.
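To make the difference between begin and cbegin concrete, here is a minimal sketch. WordCounts and total_count are names we introduce for illustration; the map is deliberately passed by non-const reference to show that cbegin still hands back a const_iterator.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

typedef std::map<std::string, int> WordCounts;

// Adds up every counter. The map is passed by non-const reference on
// purpose, to show that cbegin() still yields a const_iterator.
int total_count(WordCounts& word_counts)
{
    static_assert(std::is_same<decltype(word_counts.begin()),
                               WordCounts::iterator>::value,
                  "begin() on a non-const map yields iterator");
    static_assert(std::is_same<decltype(word_counts.cbegin()),
                               WordCounts::const_iterator>::value,
                  "cbegin() always yields const_iterator");

    int total = 0;
    for (auto iter = word_counts.cbegin(); iter != word_counts.cend(); ++iter) {
        // iter->second = 0;       // would not compile: iter is a const_iterator
        assert(iter->second >= 0); // counters are never negative
        total += iter->second;
    }
    return total;
}
```

With begin and end in place of cbegin and cend, the commented-out assignment would compile, which is exactly the permission we wanted to give up.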
We think that programmers will be able to use auto to simplify nearly every program they write.
Streamlining Iteration
We can make the previous loop even easier to write. The idiom of using a container's begin and end members to obtain iterators to use as starting and ending values for iteration is so common that C++0x has defined a shorter way to do so, called a range for statement:
for (const auto& word: word_counts) { process(word.first, word.second); }
Using a colon in a for statement asks the compiler to iterate through a sequence (word_counts in this example) using begin and end to find the sequence's bounds. The effect is similar to:
auto end_iter = word_counts.end();
for (auto iter = word_counts.begin(); iter != end_iter; ++iter) {
    const auto& word = *iter;
    process(word.first, word.second);
}
In this code, iter and end_iter represent hidden variables that the compiler generates. In effect, this statement uses each element of the sequence in turn to initialize the variable (word in this example) on the left of the colon. Because we have stated that the type of the variable is const auto&, the "initialization" involves binding word to each element, with the promise that we will not use word to change the element's value.
If we omit the & from the variable that we define in a range for statement, it will copy each element of the sequence into our variable. If we omit the const but retain the &, we can use the variable to change the container elements. For example, we can zero all of the counters in word_counts by writing:
for (auto& word: word_counts) word.second = 0;
Our experience is that programmers who use containers almost always wind up wanting to access their containers' elements sequentially at some point in their programs. This new form of for statement makes this kind of sequential access much easier.
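The three ways of declaring the loop variable (const auto&, auto, and auto&) can be compared side by side in a short sketch. The function names below are hypothetical helpers of ours, not part of any library.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, int> WordCounts;

// Builds a word_counts map from a list of words.
WordCounts count_words(const std::vector<std::string>& words)
{
    WordCounts word_counts;
    for (const auto& word : words)   // bind to each element, read-only
        ++word_counts[word];
    return word_counts;
}

// Omitting & copies each element, so the map itself is left unchanged.
void reset_copies(WordCounts& word_counts)
{
    for (auto entry : word_counts)
        entry.second = 0;            // modifies the copy only
}

// Keeping & (without const) lets the loop modify the elements in place.
void reset_counts(WordCounts& word_counts)
{
    for (auto& entry : word_counts)
        entry.second = 0;
}

// Sums the counters without modifying them.
int total_count(const WordCounts& word_counts)
{
    int total = 0;
    for (const auto& entry : word_counts) {
        assert(entry.second >= 0);   // counters are never negative
        total += entry.second;
    }
    return total;
}
```

Calling reset_copies leaves every counter as it was, whereas reset_counts really does zero them; the only difference between the two loops is the &.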
Extracting Type Information
The auto specifier is useful for initializing variables when we define them, but we sometimes define variables without initializing them. For example, a variable might be part of a data structure:
struct When {
    ??? date;   // What types do we put here?
    ??? time;
};
In this case, we can't use auto to say that date and time have the same types as whatever get_date and get_time return, because we don't have initializer expressions to supply types for date and time. Instead, we can write:
struct When {
    decltype(get_date()) date;
    decltype(get_time()) time;
};
The idea is that if e is an expression, decltype(e) is the type that e would have if it were evaluated, even though the expression is not actually evaluated. As a result, the data members date and time get the types of get_date() and get_time(), respectively, and the author of When is spared having to look up those types' names.
As with auto, we expect that decltype will be widely used to make everyday programs easier to write.
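Here is one way the When structure might look in a complete program. Date, Time, and these definitions of get_date and get_time are hypothetical, because we have not said what the real functions return; current_moment is likewise a helper we made up for the sketch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical library types and functions, invented for this sketch.
struct Date { int year; int month; int day; };
struct Time { int hour; int minute; };

Date get_date() { Date d = {2011, 3, 25}; return d; }
Time get_time() { Time t = {9, 30}; return t; }

struct When {
    decltype(get_date()) date;   // Date; get_date() is not called here
    decltype(get_time()) time;   // Time; get_time() is not called here
};

// Fills a When from the two library calls.
When current_moment()
{
    When w;
    w.date = get_date();
    w.time = get_time();

    // Compile-time checks that the members have the expected types.
    static_assert(std::is_same<decltype(w.date), Date>::value, "date should be Date");
    static_assert(std::is_same<decltype(w.time), Time>::value, "time should be Time");

    assert(w.time.hour >= 0 && w.time.hour < 24);
    return w;
}
```

The definition of When never mentions Date or Time by name, so it keeps compiling if the library renames or replaces those types.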
Initializing Variables' Contents
The usual way to construct an object in C++ is to supply arguments to its constructor. Sometimes these arguments are data for the object to store:
string hello("Hello, world!"); // hello contains Hello, world!
and sometimes the object uses the constructor arguments in ways other than storing them directly:
vector<string> strings(42); // strings contains 42 empty strings
A common early mistake among C++ programmers was to supply an argument that a constructor interpreted differently from what was intended:
double average(const vector<int>&);
… average(42) …   // Takes the average of a 42-element vector of zeroes!
Such mistakes came about because there was no convenient way to initialize a container from a series of element values:
vector<int> primes;
primes.push_back(2);   // Put the first few prime numbers into the vector
primes.push_back(3);
primes.push_back(5);
A programmer who wished to define a vector that started with three specific values didn't really have an easier way to do so.
C++98 let us avoid mistakes such as our call to average above by introducing explicit constructors. An explicit constructor can be called only as part of explicitly constructing an object of a known type. The C++98 standard vector constructor that expects a size is explicit, so average(42) does not compile. A programmer who really wants to construct a 42-element vector of zeroes can still do so by calling average(vector<int>(42)).
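The effect of the explicit size constructor can be checked at compile time with type traits. In this sketch, the body of average is our own guess at a reasonable definition (the text above only declares it), and average_of_zeroes is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// A possible definition of average(); assumed for this sketch.
double average(const std::vector<int>& values)
{
    assert(!values.empty());   // the average of no values is undefined
    long sum = 0;
    for (std::size_t i = 0; i != values.size(); ++i)
        sum += values[i];
    return static_cast<double>(sum) / values.size();
}

// The size constructor is explicit, so an int never converts silently...
static_assert(!std::is_convertible<int, std::vector<int> >::value,
              "int must not convert implicitly to vector<int>");
// ...but it can still be used when the vector type is spelled out.
static_assert(std::is_constructible<std::vector<int>, int>::value,
              "vector<int> must be constructible from a size");

// Averages a 42-element vector of zeroes, constructed explicitly.
double average_of_zeroes()
{
    return average(std::vector<int>(42));
}
```

Writing average(42) in place of the call inside average_of_zeroes would be rejected by the compiler, which is the protection that explicit provides.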
C++0x makes it easier to initialize a container's elements by letting us use curly braces whenever we initialize an object. The values in the curly braces can be arguments to an ordinary constructor. More usefully for this example, classes can also define constructors that take a sequence of values. All the values must have the same type, but the number of values we can supply is not fixed. Classes such as vector can use this new kind of constructor to let us supply a sequence of element values. For example:
vector<int> primes{2, 3, 5};   // three elements, values 2, 3, 5
vector<int> small{1000};       // one element, value 1000
vector<int> large(1000);       // 1,000 elements, all with value 0
Like vector, each of the C++0x library containers defines a constructor that lets us provide a sequence of initial element values.
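User-defined classes can offer the same kind of constructor by accepting a std::initializer_list. The IntBag class below is a hypothetical container of ours, written only to show the two kinds of constructor next to each other; it is a sketch, not a model of how the library containers are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical container, used only to show the two kinds of constructor.
class IntBag {
public:
    // Ordinary constructor: n elements, all zero.
    explicit IntBag(std::size_t n) : items_(n) {}

    // Sequence constructor: one element per value in the braces.
    IntBag(std::initializer_list<int> values) : items_(values) {}

    std::size_t size() const { return items_.size(); }

    // Returns the element at position i, which must be in range.
    int at(std::size_t i) const
    {
        assert(i < items_.size());
        return items_[i];
    }

private:
    std::vector<int> items_;
};
```

With this class, IntBag primes{2, 3, 5} holds three elements, IntBag{1000} holds the single value 1000, and IntBag(1000) holds 1,000 zeroes, mirroring the vector examples above.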
This description only scratches the surface: The rules for initialization and constructors have always been both important and complicated. Although we think that the effort to make initialization more uniform will make life easier for users of class libraries, it will sometimes do so at the expense of increasing the burden on library designers. This tradeoff is probably for the good, because there are many more library users than there are designers. (At least, we hope there are.)
By letting programmers use curly braces to initialize class objects, C++0x makes library containers look even more like the built-in types. In doing so, C++0x also makes it possible for programmers to say whether a constructor argument controls a container parameter such as its size, or whether the argument is intended to be stored in the container.
Summary
The C++03 standard was not quite 800 pages long; the C++0x standard will be more than 1,300 pages. It would be absurd for us to try to discuss every new idea in C++0x, and even more absurd for us to expect you to read such a discussion. However, most C++ programmers will not care about many of the additions to the C++0x standard. Accordingly, this article has talked about four C++0x features that we think you'll care about: auto, range for, decltype, and brace initialization.
Our next article discusses how C++0x makes its standard libraries easier and more efficient to use. Our final article will discuss how C++0x makes life easier for library authors as well.