C++ Knowledge series 2
A programming language always evolves along with its compilers
The semantics of constructors
- One of the most often heard complaints about C++ is that the compiler does things behind the programmer’s back. We should therefore understand what the underlying compiler actually does to our programs.
- The keyword explicit, in fact, was introduced into the language in order to give the programmer a method by which to suppress application of a single argument constructor as a conversion operator. Although it is easy (from a distance, anyway) to be amused by tales of the Schwarz Error, conversion operators in practice are difficult to use in a predictable, well-behaved manner.
- The problem, however, is more in the nature of the compiler’s taking your intentions far too literally than of its actually doing something behind your back, although it is often difficult to convince a programmer bitten by a Schwarz Error of this.
- Behind-the-back activities are much more likely to occur in support of memberwise initialization or in the application of what is referred to as the named return value (NRV) optimization. In this chapter, I look at compiler “meddlings” in terms of object construction and the impact that has on the form and performance of our programs.
- The initialization of class member variables in C++ differs from Java and C#. In Java and C#, when a class is loaded or an object is created with a new expression, all variables are first allocated on the heap and set to their default values; the variable initializers then run, and only then is the constructor called. As a result, a base class constructor can reach a variable of the derived class through a method overridden in the derived class.
- In C#, when a constructor has no constructor initializer or a constructor initializer of the form base(...), the constructor implicitly performs the initializations specified by the variable initializers of the instance fields declared in the class. This corresponds to a sequence of assignments that are executed immediately upon entry to the constructor and before the implicit invocation of the direct base class constructor. The variable initializers are executed in the textual order they appear in the class declaration.
- Note that variable initializers are transformed into assignment statements, and that these assignment statements are executed before the invocation of the base class constructor. This ordering ensures that all instance fields are initialized by their variable initializers before any statements that have access to the instance are executed.
Default Constructor Construction
- The C++ Annotated Reference Manual (ARM) tells us that "default constructors…are generated (by the compiler) where needed…." The crucial word here is needed, needed by whom and to do what?
- The distinction is that between the needs of the program and the needs of the implementation. A program's need for a default constructor is the responsibility of the programmer; the compiler synthesizes one only to meet the needs of the implementation.
- Global objects are guaranteed to have their associated memory "zeroed out" at program start-up, allocated in program’s data segment like static variables. Local objects allocated on the program stack and heap objects allocated on the free-store do not have their associated memory zeroed out; rather, the memory retains the arbitrary bit pattern of its previous use.
- The Standard states: If there is no user-declared constructor for class X, a default constructor is implicitly declared…. A constructor is trivial if it is an implicitly declared default constructor….
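The storage guarantee for global objects described above can be sketched as follows. This is a minimal illustration only; the names Foo, global_foo and global_is_zeroed are assumptions made for the example and do not come from the source.

```cpp
#include <cassert>

// Hypothetical class with no user-declared constructor. Its implicit
// default constructor is trivial and initializes nothing.
struct Foo {
    int  val;
    Foo* next;
};

// Static storage duration: the object's memory is zero-initialized
// at program start-up, before any other initialization.
Foo global_foo;

// Returns true when the global's members hold the zeroed-out values.
bool global_is_zeroed() {
    return global_foo.val == 0 && global_foo.next == nullptr;
}

// A local such as `Foo local_foo;` would NOT be zeroed. Reading
// local_foo.val before assigning to it is undefined behavior, so this
// sketch deliberately never does that.
```

If a local or heap object needs known initial values, that is a need of the program, and the class should say so with its own constructor.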
Member Class Object with Default Constructor:
- If a class without any constructors contains a member object of a class with a default constructor, the implicit default constructor of the class is nontrivial and the compiler needs to synthesize a default constructor for the containing class. This synthesis, however, takes place only if the constructor actually needs to be invoked.
- In practice, the problem of several translation units each needing the same synthesized function is solved by having the synthesized default constructor, copy constructor, destructor, and/or copy assignment operator defined as inline. In the implementation model described here, an inline function has static linkage and is therefore not visible outside the file within which it is synthesized. If the function is too complex to be inlined by the implementation, an explicit non-inline static instance is synthesized.
- To simplify our discussion, these examples ignore the insertion of the implicit this pointer.
- The synthesized default constructor only invokes the member object’s default constructor; it is not responsible for initializing other members, such as integers or pointers.
- Again, note that the synthesized default constructor meets only the needs of the implementation, not the needs of the program.
- If the user declares any constructor: the language requires that the member constructors be invoked in the order of member declaration within the class. This is accomplished by the compiler, which inserts code within each constructor, invoking the associated default constructors for each member in the order of member declaration. This code is inserted just prior to the explicitly supplied user code.
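The member-object rules in this section can be observed with a small logging sketch. The class names (Dopey, Sneezy, ImplicitHolder, ExplicitHolder) and the ctor_log helper are illustrative assumptions, not code from the source.

```cpp
#include <cassert>
#include <string>

// Hypothetical log recording which constructors ran, in order.
std::string& ctor_log() {
    static std::string log;
    return log;
}

struct Dopey  { Dopey()  { ctor_log() += "Dopey ";  } };
struct Sneezy { Sneezy() { ctor_log() += "Sneezy "; } };

// No constructor declared: the compiler synthesizes a nontrivial default
// constructor that invokes Dopey() and then Sneezy(). `count` is untouched.
struct ImplicitHolder {
    Dopey  dopey;
    Sneezy sneezy;
    int    count;
};

// User-declared constructor: the compiler inserts the member default
// constructor calls, in declaration order, ahead of the user code.
struct ExplicitHolder {
    Dopey  dopey;
    Sneezy sneezy;
    int    count;
    ExplicitHolder() { ctor_log() += "body"; count = 0; }
};

std::string order_for_implicit() {
    ctor_log().clear();
    ImplicitHolder obj;
    (void)obj;              // silence unused-variable warnings
    return ctor_log();
}

std::string order_for_explicit() {
    ctor_log().clear();
    ExplicitHolder obj;
    (void)obj;
    return ctor_log();
}
```

In both cases the log begins with the member constructors in declaration order; only the user-written constructor sets count, which is the "need of the program" the text refers to.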
Base Class with Default Constructor:
- Similarly, if a class without any constructors is derived from a base class containing a default constructor, the default constructor for the derived class is considered nontrivial and so needs to be synthesized. The synthesized default constructor of the derived class invokes the default constructor of each of its immediate base classes in the order of their declaration. To a subsequently derived class, the synthesized constructor appears no different than that of an explicitly provided default constructor.
If the user declares any constructor:
- The compiler augments each constructor with the code necessary to invoke all required default constructors. However, it does not synthesize a default constructor because of the presence of the other user-supplied constructors. If member class objects with default constructors are also present, these default constructors are also invoked after the invocation of all base class constructors.
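As a rough sketch of this augmentation, the following records the order in which the base class constructor, the member constructor, and the user code run. All names (TraceBase, TraceMember, TraceDerived, trace) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical trace of constructor activity.
std::string& trace() {
    static std::string t;
    return t;
}

struct TraceBase   { TraceBase()   { trace() += "Base ";   } };
struct TraceMember { TraceMember() { trace() += "Member "; } };

struct TraceDerived : TraceBase {
    TraceMember member;
    int         value;
    // Only this constructor is declared, so no default constructor is
    // synthesized. The compiler augments it with calls to TraceBase()
    // and then TraceMember() before the body runs.
    explicit TraceDerived(int v) : value(v) { trace() += "body"; }
};

// The presence of a user-supplied constructor suppresses the implicit one.
static_assert(!std::is_default_constructible<TraceDerived>::value,
              "no default constructor is synthesized");

std::string construct_derived() {
    trace().clear();
    TraceDerived d(7);
    (void)d;
    return trace();
}
```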
Class with a Virtual Function:
- There are two additional cases in which a synthesized default constructor is needed:
- the class either declares (or inherits) a virtual function,
- the class is derived from an inheritance chain in which one or more base classes are virtual.
- In both cases, in the absence of any declared constructors, implementation bookkeeping necessitates the synthesis of a default constructor.
- The following two class "augmentations" occur during compilation:
- A virtual function table (referred to as the class vtbl in the original cfront implementation) is generated and populated with the addresses of the active virtual functions for that class;
- Within each class object, an additional pointer member (the vptr) is synthesized to hold the address of the associated class vtbl.
- In many implementations, one slot of the vtbl (commonly the first) is associated with the class's type_info object in support of RTTI.
- In classes that do not declare any constructors, the compiler synthesizes a default constructor in order to correctly initialize the vptr of each class object.
Class with a Virtual Base Class:
- Virtual base class implementations vary widely across compilers. However, what is common to each implementation is the need to make the virtual base class location within each derived class object available at runtime.
- In the original cfront implementation, for example, this is accomplished by inserting a pointer to each of the virtual base classes within the derived class object. All reference and pointer access of a virtual base class is achieved through the associated pointer.
- There are four characteristics of a class under which the compiler needs to synthesize a default constructor for classes that declare no constructor at all. The Standard refers to these as implicit, nontrivial default constructors. The synthesized constructor fulfills only an implementation need. It does this by invoking member object or base class default constructors or initializing the virtual function or virtual base class mechanism for each object. Classes that do not exhibit these characteristics and that declare no constructor at all are said to have implicit, trivial default constructors. In practice, these trivial default constructors are not synthesized.
- Within the synthesized default constructor, only the base class sub-objects and member class objects are initialized. All other non-static data members, such as integers, pointers to integers, arrays of integers, and so on, are not initialized. These initializations are needs of the program, not of the implementation. If there is a program need for a default constructor, such as initializing a pointer to 0, it is the programmer's responsibility to provide it in the course of the class implementation.
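The four characteristics summarized above can be checked with the standard type traits. This is a sketch under the assumption that "nontrivial" in the text corresponds to what std::is_trivially_default_constructible reports; every class name here is made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>

// Declares no constructor and has none of the four characteristics:
// its implicit default constructor is trivial and is not synthesized.
struct TrivialData { int count; int* ptr; };

struct HasDefaultCtor { HasDefaultCtor() {} };

struct NeedsMemberCtor { HasDefaultCtor member; int count; };       // 1. member object with a default constructor
struct NeedsBaseCtor : HasDefaultCtor { int count; };               // 2. base class with a default constructor
struct NeedsVptr { virtual void f() {} int count; };                // 3. virtual function
struct VirtualBaseSketch { int shared; };
struct NeedsVirtualBase : virtual VirtualBaseSketch { int count; }; // 4. virtual base class

// True when T has a default constructor that is not trivial.
template <class T>
constexpr bool has_nontrivial_default_ctor() {
    return std::is_default_constructible<T>::value &&
           !std::is_trivially_default_constructible<T>::value;
}
```

Note that even in the four nontrivial cases the int member count is left uninitialized by the synthesized constructor.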
Copy Constructor Construction
- There are three program instances in which a class object is initialized with another object of its class: 1. explicit initialization of one class object with another; 2. an object is passed as an argument to a function; 3. a function returns a class object. Say the class designer explicitly defines a copy constructor (a constructor requiring a single argument of its class type); in most cases that constructor is then invoked in each of these instances.
- If the class does not provide an explicit copy constructor, each class object initialized with another object of its class is initialized by what is called default memberwise initialization. Default memberwise initialization copies the value of each built-in or derived data member (such as a pointer or an array) from the one class object to another. A member class object, however, is not copied; rather, memberwise initialization is recursively applied.
- In practice, a good compiler can generate bitwise copies for most class objects since they have bitwise copy semantics….
- Default constructors and copy constructors…are generated (by the compiler) where needed.
Bitwise Copy Semantics:
- The aliasing problem with regard to a pointer member such as str (after a bitwise copy, both objects point at the same character array) can be solved only by overriding default memberwise initialization with an explicit copy constructor implemented by the designer of the class (or by disallowing copying altogether). This, however, is independent of whether a copy constructor is synthesized by the compiler.
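The aliasing problem can be sketched as follows. ShallowWord and DeepWord are hypothetical classes loosely modeled on a word type with a count and a char* member named str; the deep-copying constructor is one possible fix, not necessarily the one the source has in mind.

```cpp
#include <cassert>
#include <cstring>

// Relies on default memberwise (here: bitwise) copy.
struct ShallowWord {
    int   cnt;
    char* str;
};

// Overrides memberwise initialization with an explicit copy constructor.
struct DeepWord {
    int   cnt;
    char* str;

    explicit DeepWord(const char* s)
        : cnt(0), str(new char[std::strlen(s) + 1]) {
        std::strcpy(str, s);
    }
    DeepWord(const DeepWord& other)
        : cnt(other.cnt), str(new char[std::strlen(other.str) + 1]) {
        std::strcpy(str, other.str);   // each object owns its own array
    }
    DeepWord& operator=(const DeepWord&) = delete;  // keep the sketch short
    ~DeepWord() { delete[] str; }
};

// After a memberwise copy, both objects point at the same buffer.
bool shallow_copy_aliases() {
    char buffer[] = "noun";
    ShallowWord a = { 1, buffer };
    ShallowWord b = a;
    return a.str == b.str;
}

// With the explicit copy constructor, the pointers differ.
bool deep_copy_aliases() {
    DeepWord a("verb");
    DeepWord b = a;
    return a.str == b.str;
}
```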
Bitwise Copy Semantics Not Exhibited:
- When are bitwise copy semantics not exhibited by a class? There are four instances:
- when the class contains a member object of a class for which a copy constructor exists;
- when the class is derived from a base class for which a copy constructor exists;
- when the class declares one or more virtual functions;
- when the class is derived from an inheritance chain in which one or more base classes are virtual.
Resetting the Virtual Table Pointer:
- Recall that two program “augmentations” occur during compilation whenever a class declares one or more virtual functions.
- a virtual function table that contains the address of each active virtual function associated with that class (the vtbl) is generated.
- a pointer to the virtual function table (the vptr) is inserted within each class object. Note that the vptr is set by code the compiler inserts; it is not copied from the other object, because the source object's vptr may refer to the vtbl of an unknown derived class.
- Obviously, things would go terribly wrong if the compiler either failed to initialize or incorrectly initialized the vptr of each new class object. Hence, once the compiler introduces a vptr into a class, the affected class no longer exhibits bitwise semantics. Rather, the implementation now needs to synthesize a copy constructor in order to properly initialize the vptr.
- The copying of an object's vptr value, however, ceases to be safe when an object of a base class is initialized with an object of a class derived from it.
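The case of a base class object initialized with a derived class object can be sketched like this. ZooAnimalSketch and BearSketch are illustrative names; the point is only that the copy's vptr ends up referring to the base class vtbl.

```cpp
#include <cassert>
#include <string>

struct ZooAnimalSketch {
    virtual ~ZooAnimalSketch() {}
    virtual std::string draw() const { return "ZooAnimal"; }
    int loc = 0;
};

struct BearSketch : ZooAnimalSketch {
    std::string draw() const override { return "Bear"; }
    int den = 0;
};

// Only the base subobject of yogi is copied. The synthesized copy
// constructor sets franny's vptr to ZooAnimalSketch's vtbl instead of
// copying yogi's vptr, so the base version of draw() is called.
std::string draw_sliced_copy() {
    BearSketch yogi;
    ZooAnimalSketch franny = yogi;
    return franny.draw();
}

// A reference does not create a new object, so dispatch still reaches
// the derived class.
std::string draw_through_reference() {
    BearSketch yogi;
    const ZooAnimalSketch& ref = yogi;
    return ref.draw();
}
```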
Handling the Virtual Base Class Subobject: The presence of a virtual base class also requires special handling.
- The initialization of one class object with another in which there is a virtual base class sub-object also invalidates bitwise copy semantics.
- Each implementation's support of virtual inheritance involves the need to make each virtual base class subobject's location within the derived class object available at runtime. Maintaining the integrity of this location is the compiler's responsibility. Bitwise copy semantics could result in a corruption of this location, so the compiler must intercede with its own synthesized copy constructor.
- We have looked at the four conditions under which bitwise copy semantics do not hold for a class and the default copy constructor, if undeclared, is considered nontrivial. Under these conditions, the compiler, in the absence of a declared copy constructor, must synthesize a copy constructor in order to correctly implement the initialization of one class object with another.
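The four conditions can be probed with std::is_trivially_copy_constructible, assuming that trait is an acceptable stand-in for "exhibits bitwise copy semantics". All class names below are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Bitwise copy semantics hold: only built-in members, nothing virtual.
struct CopyPlain { int cnt; char* str; };

struct CopyHasCtor {
    CopyHasCtor() {}
    CopyHasCtor(const CopyHasCtor&) {}   // explicit copy constructor
};

struct CopyWithMember { CopyHasCtor member; };          // 1. member with a copy constructor
struct CopyWithBase : CopyHasCtor {};                   // 2. base with a copy constructor
struct CopyWithVirtualFn { virtual void f() {} };       // 3. virtual function
struct CopyVBase { int shared; };
struct CopyWithVirtualBase : virtual CopyVBase {};      // 4. virtual base class

// True when the compiler may copy T bit for bit.
template <class T>
constexpr bool has_bitwise_copy_semantics() {
    return std::is_trivially_copy_constructible<T>::value;
}
```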
Program Transformation Semantics
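Explicit initialization: given an object x0 of class X, the definitions X x1(x0); X x2 = x0; X x3 = X(x0); each initialize a new object with x0. In the transformed program each is handled in two steps: the object is defined without initialization (no default constructor is run), and then the copy constructor is invoked on it.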
- The required program transformation is two-fold: each definition is rewritten with the initialization stripped out; an invocation of the class copy constructor is inserted.
- Standard C++ states that passing a class object as an argument to a function, or returning it as that function’s return value, is equivalent to the following form of initialization: X x1 = x0;
- One implementation strategy is to introduce a temporary object, initialize it with a call of the copy constructor, and then pass that temporary object to the function.
Return Value Initialization: Stroustrup's solution in cfront is a two-fold transformation:
- Add an additional argument of type reference to the class object. This argument will hold the copy-constructed "return value."
- Insert an invocation of the copy constructor prior to the return statement to initialize the added argument with the value of the object being returned. What about the actual return value, then? A final transformation rewrites the function to have it not return a value.
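- For example, return X( y, z ); is transformed to __result.X::X( y, z ); return;
- A related compiler optimization, which replaces a named local return value with the __result argument, is sometimes referred to as the Named Return Value (NRV) optimization.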
- The presence of the copy constructor "turns on" the NRV optimization within the C++ compiler. (The optimization is not performed by a separate optimizer. In this case, the optimizer's effect on the performance is negligible.)
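The return value transformation can be approximated by hand. The notation __result.X::X( y, z ) is compiler-internal and cannot be written in a program, so this sketch assumes placement new as the closest legal equivalent; PairSketch, make_by_value, make_transformed and ResultSlot are hypothetical names.

```cpp
#include <cassert>
#include <new>

struct PairSketch {
    int y;
    int z;
    PairSketch(int a, int b) : y(a), z(b) {}
};

// The function as the programmer writes it.
PairSketch make_by_value(int y, int z) {
    return PairSketch(y, z);
}

// Hand-written approximation of the transformed function: an extra
// argument addresses the caller's storage, the object is constructed
// there directly, and the function no longer returns a value.
void make_transformed(PairSketch* result, int y, int z) {
    new (result) PairSketch(y, z);
    return;
}

// Uninitialized storage for one PairSketch, standing in for the
// caller-side object the compiler would pass to the function.
union ResultSlot {
    PairSketch value;
    ResultSlot() {}     // leaves `value` unconstructed
    ~ResultSlot() {}
};

bool transformed_matches_original() {
    ResultSlot slot;
    make_transformed(&slot.value, 3, 4);
    PairSketch expected = make_by_value(3, 4);
    bool same = slot.value.y == expected.y && slot.value.z == expected.z;
    slot.value.~PairSketch();   // destroy what placement new created
    return same;
}
```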
- Although the NRV optimization provides significant performance improvement, there are several criticisms of this approach.
- One is that because the optimization is done silently by the compiler, whether it was actually performed is not always clear (particularly since few compilers document the extent of its implementation or whether it is implemented at all).
- A second is that as the function becomes more complicated, the optimization becomes more difficult to apply. In cfront, for example, the optimization is applied only if all the named return statements occur at the top level of the function.
- The default copy constructor is considered trivial. There are no member or base class objects with a copy constructor that need to be invoked. Nor is there a virtual base class or virtual function associated with the class. So, by default, a memberwise initialization of one Point3d class object with another results in a bitwise copy. This is efficient.
- Use of both memcpy() and memset(), however, works only if the classes do not contain any compiler-generated internal members. If the Point3d class declares one or more virtual functions or contains a virtual base class, use of either of these functions will result in overwriting the values the compiler set for these members.
Note: the vptr is reset to the address of the class’s own vtbl; it is not copied from the other object.
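One way to guard the use of memcpy() and memset() is to check the class with std::is_trivially_copyable first. This is a sketch, not the source's code: Point3dPlain and Point3dVirtual are stand-ins for the Point3d class mentioned above, and the zero check assumes the usual floating-point representation in which all-bits-zero is 0.0f.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// No compiler-generated internal members: raw memory functions are safe.
struct Point3dPlain { float x, y, z; };

// The virtual function adds a vptr that memset/memcpy would overwrite.
struct Point3dVirtual {
    virtual float norm1() const { return x + y + z; }
    float x, y, z;
};

template <class T>
constexpr bool safe_for_raw_memory() {
    return std::is_trivially_copyable<T>::value;
}

Point3dPlain copy_with_memcpy(const Point3dPlain& src) {
    static_assert(safe_for_raw_memory<Point3dPlain>(),
                  "memcpy requires a trivially copyable type");
    Point3dPlain dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}

bool raw_memory_roundtrip_ok() {
    Point3dPlain src = { 1.0f, 2.0f, 3.0f };
    Point3dPlain dst = copy_with_memcpy(src);
    bool copied = dst.x == 1.0f && dst.y == 2.0f && dst.z == 3.0f;

    std::memset(&dst, 0, sizeof dst);   // assumes all-bits-zero is 0.0f
    bool zeroed = dst.x == 0.0f && dst.y == 0.0f && dst.z == 0.0f;
    return copied && zeroed;
}
```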
- Application of the copy constructor requires the compiler to more or less transform portions of your program. In particular, consider a function that returns a class object by value for a class in which a copy constructor is either explicitly defined or synthesized. The result is profound program transformations both in the definition and use of the function. Also, the compiler optimizes away the copy constructor invocation where possible, replacing the NRV with an additional first argument within which the value is stored directly. Programmers who understand these transformations and the likely conditions for copy constructor optimization can better control the runtime performance of their programs.
Member Initialization List
- When you write a constructor, you have the option of initializing class members either through the member initialization list or within the body of the constructor.
- You must use the member initialization list in the following cases in order for your program to compile:
- when initializing a reference member;
- when initializing a const member;
- when invoking a base or member class constructor with a set of arguments;
- when initializing a data member of a user-defined class that has no default constructor (if it has one, the list is optional but more efficient).
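- If we instead assign to a data member of class type inside the constructor body, the member is first default constructed, and the assignment then results in the creation and the destruction of a temporary object (a temporary String in the book's example). Initializing the member in the member initialization list avoids both the extra default construction and the temporary.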
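The cost difference between assignment in the body and the member initialization list can be counted with an instrumented class. CountingString is a hypothetical stand-in for a String class, and the counters exist only for this illustration.

```cpp
#include <cassert>

// Tallies of the operations performed on CountingString objects.
struct OpCounts {
    int default_ctor = 0;
    int converting_ctor = 0;
    int copy_assign = 0;
    int dtor = 0;
};

OpCounts& op_counts() {
    static OpCounts counts;
    return counts;
}

struct CountingString {
    CountingString()            { ++op_counts().default_ctor; }
    CountingString(const char*) { ++op_counts().converting_ctor; }
    CountingString& operator=(const CountingString&) {
        ++op_counts().copy_assign;
        return *this;
    }
    ~CountingString()           { ++op_counts().dtor; }
};

// Assignment in the body: name is default constructed, then a temporary
// is built from "x", assigned to name, and destroyed.
struct WordAssigned {
    CountingString name;
    int cnt;
    WordAssigned() { name = "x"; cnt = 0; }
};

// Member initialization list: name is constructed once, directly.
struct WordInitialized {
    CountingString name;
    int cnt;
    WordInitialized() : name("x"), cnt(0) {}
};

OpCounts count_ops_assigned() {
    op_counts() = OpCounts();
    { WordAssigned w; (void)w; }
    return op_counts();
}

OpCounts count_ops_initialized() {
    op_counts() = OpCounts();
    { WordInitialized w; (void)w; }
    return op_counts();
}
```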
- The compiler iterates over the initialization list, inserting the initializations in the proper order within the constructor prior to any explicit user code.
- The order in which the list entries are set down is determined by the declaration order of the members within the class declaration, not the order within the initialization list.
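- This apparent anomaly between initialization order and order within the initialization list can lead to a nasty pitfall: if a member declared earlier is initialized in the list from a member declared later, it reads that later member before it has been initialized.
- A common workaround is to leave the dependent member out of the list and assign it in the constructor body. One might worry whether the list initializations are inserted before or after that explicit assignment. If they were inserted after it, the code would fail badly. The code is correct, however, because the initialization list entries are placed before explicit user code.
- Declaration order therefore governs only the entries of the initialization list itself. Because those entries are placed before explicit user code, the constructor body can safely use a member initialized in the list to set another member, even one declared before it.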
- I reiterate my advice to initialize one member with another inside the constructor body, not in the member initialization list.
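A minimal sketch of the pitfall and of the recommended form follows. OrderSketch is a hypothetical class; the pitfall variant is shown only in a comment because reading the uninitialized member would be undefined behavior.

```cpp
#include <cassert>

class OrderSketch {
    int i;   // declared first, so initialized first
    int j;   // declared second
public:
    // Pitfall form (deliberately not compiled):
    //     OrderSketch(int val) : j(val), i(j) {}
    // Despite the list order, i(j) runs before j(val), so i would be
    // set from an uninitialized j.

    // Recommended form: j is initialized in the list, which is placed
    // before the user code, so the body can safely copy it into i.
    explicit OrderSketch(int val) : j(val) { i = j; }

    int first()  const { return i; }
    int second() const { return j; }
};
```

When both members can be computed from the constructor argument, writing i(val), j(val) in declaration order avoids the dependency altogether.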
- Invoking a member function of the class within the member initialization list to initialize a member is valid (apart from the issue of whether the members it accesses have been initialized). This is because the this pointer associated with the object being constructed is well formed by then, and the expansion simply becomes an ordinary call through this, placed ahead of the explicit user code.
- In summary, the compiler iterates over and possibly reorders the initialization list to reflect the declaration order of the members. It inserts the code within the body of the constructor prior to any explicit user code.
From: <<Inside the C++ Object Model>>