Understanding Smart Pointers

2011-08-15 23:10  Daniel Zheng

C++ programmers do not necessarily need to use plain pointer types when managing memory on the heap (or the free store); they can make use of a smarter option.

What Are Smart Pointers?

Simply put, a smart pointer in C++ is a class with overloaded operators that behaves like a conventional pointer, yet supplies additional value by ensuring proper and timely destruction of dynamically allocated data and/or by implementing a well-defined object life-cycle management strategy.

What Is the Problem with Using Conventional (Raw) Pointers?

Unlike many other modern programming languages, C++ gives the programmer full flexibility in memory allocation, deallocation, and management. Unfortunately, this flexibility is a double-edged sword. On the one hand it makes C++ a powerful language, but on the other hand it allows the programmer to create memory-related problems, such as memory leaks, when dynamically allocated objects are not correctly released.

For example:

CData * pData = mObject.GetData();
pData->Display();

In the preceding lines of code, there is no obvious way to tell whether the memory pointed to by pData:

  • Was allocated on the heap, and therefore eventually needs to be deallocated
  • Is the responsibility of the caller to deallocate
  • Will automatically be destroyed by the object's destructor

Although such ambiguities can be partially resolved by inserting comments and enforcing coding practices, these mechanisms are much too loose to reliably prevent the errors caused by misuse of dynamically allocated data and pointers.

How Do Smart Pointers Help?

Given the problems with conventional pointers and conventional memory management techniques, it should be noted that the C++ programmer is not forced to use them when he needs to manage data on the heap/free store. The programmer can choose a smarter way to allocate and manage dynamic data by adopting smart pointers in his programs:

smart_pointer<CData> spData = mObject.GetData();
spData->Display();
(*spData).Display();

Thus, smart pointers behave like conventional pointers but supply useful features via their overloaded operators and destructors to ensure that dynamically allocated data is destroyed in a timely manner.

How Are Smart Pointers Implemented?

This question can for the moment be simplified to "How did the smart pointer spData function like a conventional pointer?" The answer is this: smart pointer classes overload operator * (dereferencing operator) and operator -> (member selection operator) so that you, the programmer, can use them like conventional pointers.

Additionally, to allow you to manage a type of your choice on the heap, almost all good smart pointer classes are template classes that contain a generic implementation of their functionality. Being templates, they are versatile and can be instantiated to manage an object of a type of your choice.

A sample implementation of a simple smart pointer class looks like this:

template <typename T>
class smart_pointer
{
private:
    T* m_pRawPointer;
public:
    smart_pointer (T* pData) : m_pRawPointer (pData) {} // constructor
    ~smart_pointer () {delete m_pRawPointer;} // destructor: releases the managed object

    // copy constructor
    smart_pointer (const smart_pointer & anotherSP);
    // assignment operator
    smart_pointer& operator= (const smart_pointer& anotherSP);

    T& operator* () const // dereferencing operator
    {
        return *(m_pRawPointer);
    }

    T* operator-> () const // member selection operator
    {
        return m_pRawPointer;
    }
};

For example, if you have a class CDog, you would be able to use the smart pointer on an object of type CDog like this:

smart_pointer <CDog> pSmartDog (new CDog);
pSmartDog->Bark ();
int nAge = (*pSmartDog).GetAge ();

The implementation that makes a smart pointer really “smart” is the implementation of the copy constructor, the assignment operator, and the destructor. They determine the behavior of the smart pointer object when it is passed across functions, assigned, or goes out of scope (that is, gets destructed like any other class-object). So, before looking at a complete smart pointer implementation, you should understand some smart pointer types.

Types of Smart Pointers

The management of the memory resource is what sets smart pointer classes apart. Smart pointers decide what they do with the resource when they are copied and assigned to. The simplest implementations often result in performance issues, whereas the fastest ones might not suit all applications. In the end, it is for the programmer to understand how a smart pointer functions before he decides to use it in his application.

Classification of smart pointers is actually a classification of their memory resource management strategies. These are

  • Deep Copy
  • Copy on Write (COW)
  • Reference counted
  • Reference linked
  • Destructive copy

Let's take a brief look at each of these strategies before studying the smart pointer supplied by the C++ standard library, std::auto_ptr.

Deep Copy

In a smart pointer that implements deep copy, every smart pointer instance holds a complete copy of the object that is being managed. Whenever the smart pointer is copied, the object it points to is also copied (thus, deep copy). When the smart pointer goes out of scope, it releases the memory it points to (via the destructor).

Although the deep-copy-based smart pointer does not seem to offer any advantage over passing objects by value, its benefit becomes apparent in the treatment of polymorphic objects, as seen in the following, where it can avoid slicing:

// Example of Slicing When Passing Polymorphic Objects by Value
// CAnimal is a base class for CDog and CCat.
void MakeAnimalTalk (CAnimal mAnimal) // note parameter type
{
  mAnimal.Talk (); // virtual function
}

// ... Some function
CCat mCat;
MakeAnimalTalk (mCat);
// Slicing: only the CAnimal part of mCat is sent to MakeAnimalTalk
CDog mDog;
MakeAnimalTalk (mDog); // Slicing again

Slicing issues are resolved when the programmer chooses a deep-copy smart pointer:

template <typename T>
class deepcopy_smart_pointer
{
private:
  T* m_pObject;
public:
//... other functions

// copy constructor of the deepcopy pointer
deepcopy_smart_pointer (const deepcopy_smart_pointer& source)
{
  // Use a virtual clone function defined in the derived class
  // to get a complete copy of the object
  m_pObject = source->Clone ();
}
};

void MakeAnimalTalk (deepcopy_smart_pointer<CAnimal> mAnimal)
{
  mAnimal->Talk (); // reaches the managed object through operator->
}

As you can see, deepcopy_smart_pointer implements a copy constructor that allows a deep copy of the polymorphic object via a Clone function that the object needs to implement. For the sake of simplicity, it is taken for granted in this example that the virtual function implemented by the base class CAnimal is called Clone. Typically, smart pointers that implement deep-copy models will have this function supplied as either a template parameter or a function object.

A sample usage of the deepcopy_smart_pointer is as follows:

deepcopy_smart_pointer <CAnimal> pDog (new CDog());
MakeAnimalTalk (pDog);
// No slicing issues as pDog is deep-copied
deepcopy_smart_pointer <CAnimal> pCat (new CCat());
MakeAnimalTalk (pCat);
// No slicing

Thus, when the smart pointer itself is passed as a pointer to base class type CAnimal, the deep copy implemented in the smart pointer’s copy constructor kicks in to ensure that the object being passed is not sliced, even though syntactically only the base part of it is required by the destination function MakeAnimalTalk().

The disadvantage of the deep-copy-based mechanism is performance. This might not be a factor for some applications, but for many others it might inhibit the programmer from using a smart pointer for his application altogether, and lead him to simply pass a base type pointer (conventional pointer, CAnimal*) to functions such as MakeAnimalTalk(). Other pointer types try to address this performance issue in various ways.

Copy on Write Mechanism

Copy on Write (COW as it is popularly called) attempts to optimize the performance of deep-copy smart pointers by sharing pointers until the first attempt at writing to the object is made. On the first attempt at invoking a non-const function, a COW pointer typically creates a copy of the object on which the non-const function is invoked, whereas other instances of the pointer continue sharing the source object.

COW has its fair share of fans. For those who swear by COW, implementing operators * and -> in both their const and non-const versions is key to the functionality of the COW pointer. The non-const versions create a copy.

The point is that when you choose a pointer implementation that follows the COW philosophy, be sure that you understand the implementation details before you proceed to use such an implementation. Otherwise, you might land in situations where you have a copy too few or a copy too many.

Reference Counted Smart Pointers

Reference counting in general is a mechanism that keeps a count of the number of users of an object. When the count reduces to zero, the object is released. So, reference counting makes a very good mechanism for sharing objects without having to copy them. If you have ever worked with a Microsoft technology called COM, the concept of reference counting would have definitely crossed your path on at least one occasion.

Such smart pointers, when copied, need to have the reference count of the object in question incremented; there are at least two popular ways to keep this count:

  • Reference count maintained in the object
  • Reference count maintained by the pointer class in a shared object

The reference-counting mechanism hence makes it pertinent that the programmer works only with the smart pointers when using the object. Keeping both a smart pointer that manages the object and a raw pointer that points to it is a bad idea, because the smart pointer will (smartly) release the object when the count maintained by it goes down to zero, but the raw pointer will continue pointing to memory that no longer belongs to your application.

Similarly, reference counting has a problem of its own: two objects that hold reference-counted pointers to each other will never be released, because their cyclic dependency holds each reference count at a minimum of 1.

Reference-Linked Smart Pointers

Reference-linked smart pointers are ones that don’t proactively count the number of references using the object; rather, they just need to know when the number comes down to zero so that the object can be released.

They are called reference-linked because their implementation is based on a doubly linked list. When a new smart pointer is created by copying an existing one, it is appended to the list. When a smart pointer goes out of scope or is otherwise destroyed, the destructor removes it from this list, and the object is released once the list is empty. Reference linking also suffers from the cyclic dependency problem described for reference-counted pointers.

Destructive Copy

Destructive copy is a mechanism where a smart pointer, when copied, transfers complete ownership of the object being handled to the destination, and resets itself.

destructive_copy_smartptr <CSomeClass> pSmartPtr (new CSomeClass ());
SomeFunc (pSmartPtr);
// Ownership transferred to SomeFunc
// Don’t use pSmartPtr in the caller any more!

Although this mechanism is obviously not intuitive to use, the advantage supplied by destructive copy smart pointers is that they ensure that at any point in time, only one active pointer points to an object. So, they make good mechanisms for returning pointers from functions, and are of use in scenarios where you can use their “destructive” properties to your advantage.

std::auto_ptr is by far the most popular (or most notorious, depending on how you look at it) pointer that follows the principles of destructive copy. The disadvantage of using such a pointer is highlighted by the preceding code snippet. It demonstrates that such a smart pointer is useless once it has been passed to a function or copied into another. The implementation of destructive copy pointers deviates from standard, recommended C++ programming techniques, as seen below:

template <typename T>
class destructivecopy_pointer
{
private:
    T* m_pObject;
public:
    // other members, constructors, destructors, operators* and ->, etc...

    // copy constructor (note the non-const source)
    destructivecopy_pointer(destructivecopy_pointer& source)
    {
        // Take ownership on copy
        m_pObject = source.m_pObject;

        // destroy source
        source.m_pObject = 0;
    }

    // assignment operator (note the non-const right-hand side)
    destructivecopy_pointer& operator= (destructivecopy_pointer& rhs)
    {
        if (m_pObject != rhs.m_pObject)
        {
            delete m_pObject;
            m_pObject = rhs.m_pObject;
            rhs.m_pObject = 0;
        }
        return *this;
    }
};
 

The fact that such smart pointers destroy the source also makes them unsuitable for use in STL containers, such as the std::vector, or any other dynamic collection class that you might use. These containers need to copy your content internally and end up invalidating the pointers.

So, for more than one reason, a lot of programmers avoid destructive copy smart pointers like the plague. However, one of the most popular smart pointer implementations, std::auto_ptr, is of this type, and the fact that it is a part of the standard library makes it important that you at least understand how it works.

Using the std::auto_ptr

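The auto_ptr is a destructive-copy-based smart pointer that transfers the ownership of the object on copy, and releases the object it owns when it goes out of scope. Note that std::auto_ptr was deprecated in C++11 and removed in C++17, so compiling code that uses it may require an older language standard setting.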

To use std::auto_ptr, you should first include the header:

#include <memory>

To study the effects of using the std::auto_ptr, let’s create a sample class CSomeClass that does little other than indicate its lifetime by printing some lines in its constructor and destructor:

// LISTING 26.4 - Using the std::auto_ptr
#include <memory>
#include <iostream>

using namespace std;
class CSomeClass
{
public:
  // Constructor and destructor announce the object's lifetime
  CSomeClass() {cout << "CSomeClass: Constructed!" << std::endl;}
  ~CSomeClass() {cout << "CSomeClass: Destructed!" << std::endl;}
  void SaySomething () {cout << "CSomeClass: Hello!" << std::endl;}
};

void UsePointer (auto_ptr <CSomeClass> spObj);

int main ()
{
  using namespace std;
  cout << "main() started" << endl;

  auto_ptr <CSomeClass> spObject (new CSomeClass ());

  cout << "main: Calling UsePointer()" << endl;

  // Call a function, transfer ownership
  UsePointer (spObject);

  cout << "main: UsePointer() returned, back in main()" << endl;

  // spObject->SaySomething (); // invalid pointer!

  cout << "main() ends" << endl;

  return 0;
}

void UsePointer (auto_ptr <CSomeClass> spObj)
{
  cout << "UsePointer: started, will use input pointer now" << endl;

  // Use the input pointer
  spObj->SaySomething ();

  cout << "UsePointer: will return now" << endl;
}

Output:

main() started
CSomeClass: Constructed!
main: Calling UsePointer()
UsePointer: started, will use input pointer now
CSomeClass: Hello!
UsePointer: will return now
CSomeClass: Destructed!
main: UsePointer() returned, back in main()
main() ends

Popular Smart Pointer Libraries

It’s pretty apparent that the version of the smart pointer shipped with the C++ standard library is not going to meet every programmer’s requirements. This is precisely why there are many smart pointer libraries out there.

Boost (www.boost.org) supplies you with some well-tested and well-documented smart pointer classes, among many other useful utility classes. You will find further information on Boost smart pointers and their downloads at http://www.boost.org/libs/smart_ptr/smart_ptr.htm