使用现代C++如何避免bugs（下）

About virtual functions

Virtual functions hinder a potential problem: the thing is that it's very simple to make an error in signature of the derived class and as result not to override a function, but to declare a new one. Let's take a look at this situation in the following example:
class Base {
virtual void Foo(int x);
}
class Derived : public class Base {
void Foo(int x, int a = 1);
}

The method Derived::Foo isn't possible to call by the pointer/reference to Base. But this is a simple example, and you may say that nobody makes such mistakes. Usually people make mistakes in the following way:

Note: This code is taken from MongoDB.
class DBClientBase : .... {
public:
virtual auto_ptr<DBClientCursor> query(
    const string &ns,
    Query query,
    int nToReturn = 0
    int nToSkip = 0,
    const BSONObj *fieldsToReturn = 0,
    int queryOptions = 0,
    int batchSize = 0 );
};
class DBDirectClient : public DBClientBase {
public:
virtual auto_ptr<DBClientCursor> query(
    const string &ns,
    Query query,
    int nToReturn = 0,
    int nToSkip = 0,
    const BSONObj *fieldsToReturn = 0,
    int queryOptions = 0);
};

PVS-Studio warning: V762 Consider inspecting virtual function arguments. See seventh argument of function 'query' in derived class 'DBDirectClient', and base class 'DBClientBase'. dbdirectclient.cpp 61

There are a lot of arguments and there is no last argument in the function of heir-class. These are different, unconnected functions. Quite often such an error occurs with arguments that have a default value.

In the next fragment the situation is a bit more tricky. This code will work if it's compiled as 32-bit code, but will not work in the 64-bit version. Originally, in the base class, the parameter was of DWORD type, but then it was corrected toDWORD_PTR. At the same time it wasn't changed in the inherited classes. Long live the sleepless night, debugging, and coffee!
class CWnd : public CCmdTarget {
....
virtual void WinHelp(DWORD_PTR dwData, UINT nCmd = HELP_CONTEXT);
....
};
class CFrameWnd : public CWnd { .... };
class CFrameWndEx : public CFrameWnd {
....
virtual void WinHelp(DWORD dwData, UINT nCmd = HELP_CONTEXT);
....
};

You can make a mistake in the signature in more extravagant ways. You can forget const of the function, or an argument. You can forget that the function in a base class is not virtual. You can confuse a signed/unsigned type.

In C++ several keywords were added that can regulate overriding of virtual functions. Override will be of great help. This code simply won't compile.

class DBDirectClient : public DBClientBase {
public:
virtual auto_ptr<DBClientCursor> query(
    const string &ns,
    Query query,
    int nToReturn = 0,
    int nToSkip = 0,
    const BSONObj *fieldsToReturn = 0,
    int queryOptions = 0) override;
};

NULL vs nullptr

Using NULL to indicate a null pointer leads to a number of unexpected situations. The thing is that NULL is a normal macro that expands in 0 which has int type: That's why it's not hard to understand why the second function is chosen in this example:

void Foo(int x, int y, const char *name);
void Foo(int x, int y, int ResourceID);
Foo(1, 2, NULL);

Although the reason is clear, it's very illogical. This is why there is a need in nullptr that has its own type nullptr_t. This is why we cannot use NULL (and more so 0) in modern C++.

Another example: NULL can be used to compare with other integer types. Let's suppose that there is some WinAPIfunction that returns HRESULT. This type is not related to a pointer in any way, so its comparison with NULL is meaningless. And nullptr underlines this by issuing a compilation error, at the same time NULL works:

if (WinApiFoo(a, b) != NULL) // That's bad
if (WinApiFoo(a, b) != nullptr) // Hooray,
// a compilation error

va_arg

There are cases where it's necessary to pass an undefined amount of arguments. A typical example - the function of a formatted input/ouput. Yes, it can be written in such a way that a variable number of arguments will not be needed, but I see no reason to abandon this syntax because it is much more convenient and easier to read. What do old C++ standards offer? They suggest using va_list. What problems have we got with that? It's not that easy to pass an argument of the wrong type to such an argument. Or not to pass the argument whatsoever. Let's have a closer look at the fragments.
typedef std::wstring string16;
const base::string16& relaunch_flags() const;

int RelaunchChrome(const DelegateExecuteOperation& operation)
{
AtlTrace("Relaunching [%ls] with flags [%s]\n",
operation.mutex().c_str(),
operation.relaunch_flags());
....
}

Note: This code is taken from Chromium.

PVS-Studio warning: V510 The 'AtlTrace' function is not expected to receive class-type variable as third actual argument. delegate_execute.cc 96

The programmer wanted to print the std::wstring string, but forgot to call the method c_str(). So the type wstring will be interpreted in the function as const wchar_t* . Of course, this won't do any good.

cairo_status_t
_cairo_win32_print_gdi_error (const char *context)
{
....
fwprintf (stderr, L"%s: %S", context,
(wchar_t *)lpMsgBuf);
....
}

Note: This code is taken from Cairo.

PVS-Studio warning: V576 Incorrect format. Consider checking the third actual argument of the 'fwprintf' function. The pointer to string of wchar_t type symbols is expected. cairo-win32-surface.c 130

In this fragment, the programmer confused the specifiers of the string format. The thing is that in Visual C++ wchar_t*, and %S - char*, are waiting for wprintf %s. It's interesting, that these errors are in strings that are meant for the error output or debug information - surely these are rare cases, that's why they were skipped.
static void GetNameForFile(
const char* baseFileName,
const uint32 fileIdx,
char outputName[512] )
{
assert(baseFileName != NULL);
sprintf( outputName, "%s_%d", baseFileName, fileIdx );
}

Note: This code is taken from the CryEngine 3 SDK.

PVS-Studio warning: V576 Incorrect format. Consider checking the fourth actual argument of the 'sprintf' function. The SIGNED integer type argument is expected. igame.h 66

The integer types are also very easy to confuse. Especially when their size is platform-dependent. However, here it's much simpler: the signed and unsigned types were confused. Large numbers will be printed as negative ones.
ReadAndDumpLargeSttb(cb,err)
int     cb;
int     err;
{
....
printf("\n - %d strings were read, "
         "%d were expected (decimal numbers) -\n");
....
}

Note: This code is taken from Word for Windows 1.1a.

PVS-Studio warning: V576 Incorrect format. A different number of actual arguments is expected while calling 'printf' function. Expected: 3. Present: 1. dini.c 498

Example found under one of the archaeological researches. This string presupposes three arguments, but they aren't written. Perhaps the programmer intended to print data on the stack, but we can't make assumptions of what is laying there. Certainly, we need to pass these arguments explicitly.
BOOL CALLBACK EnumPickIconResourceProc(
HMODULE hModule, LPCWSTR lpszType,
LPWSTR lpszName, LONG_PTR lParam)
{
....
swprintf(szName, L"%u", lpszName);
....
}

Note: This code is taken from ReactOS.

PVS-Studio warning: V576 Incorrect format. Consider checking the third actual argument of the 'swprintf' function. To print the value of pointer the '%p' should be used. dialogs.cpp 66

An example of a 64-bit error. The size of the pointer depends on the architecture, and using %u for it is a bad idea. What shall we use instead? The analyzer gives us a hint that the correct answer is %p. It's great if the pointer is printed for debugging. It would be much more interesting if later there is an attempt to read it from the buffer and use it.

What can be wrong with functions with a variable number of arguments? Almost everything! You cannot check the type of the argument, or the number of arguments. Step left, step right up-undefined behavior.

It is great that there are more reliable alternatives. Firstly, there are variadic templates. With their help, we get all the information about passed types during compilation, and can use it as we want. As an example let's use that very printf, but, a more secure one:
void printf(const char* s) {
std::cout << s;
}
template<typename T, typename... Args>
void printf(const char* s, T value, Args... args) {
while (s && *s) {
    if (*s=='%' && *++s!='%') {
      std::cout << value;
      return printf(++s, args...);
    }
    std::cout << *s++;
}
}

Of course this is just an example: in practice its use is pointless. But in the case of variadic templates, you are limited only by your imagination, not by the language features.

One more construction that can be used as an option to pass a variable number of arguments - std::initializer_list. It does not allow you to pass arguments of different types. But if this is enough, you can use it:
void Foo(std::initializer_list<int> a);
Foo({1, 2, 3, 4, 5});

It's also very convenient to traverse it, as we can use begin, end and the range for.

Narrowing

Narrowing casts caused a lot of headache in the programmers' life. Especially when migration to the 64-bit architecture became even more necessary. It's very good if there are only correct types in your code. But it's not all that positive: quite often programmers use various dirty hacks, and some extravagant ways of storing pointers. It took a lot of coffee to find all such fragments:
char* ptr = ...;
int n = (int)ptr;
....
ptr = (char*) n;

But let's leave the topic of 64-bit errors for a while. Here's a simpler example: there are two integer values and the programmer wants to find their ratio. It is done this way:

virtual int GetMappingWidth( ) = 0;
virtual int GetMappingHeight( ) = 0;

void CDetailObjectSystem::LevelInitPreEntity()
{
....
float flRatio = pMat->GetMappingWidth() /
pMat->GetMappingHeight();
....
}

Note: This code is taken from the Source Engine SDK.

PVS-Studio warning: V636 The expression was implicitly cast from 'int' type to 'float' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double)(X) / Y;. Client (HL2) detailobjectsystem.cpp 1480

Unfortunately, it's not possible to protect yourself from such errors - there will always be one more way to cast one type to another implicitly. But the good news is that the new method of initialization in C++11 has one nice feature: it prohibits narrowing casts. In this code, the error will occur at the compilation stage and it can be easily corrected.

float flRatio { pMat->GetMappingWidth() /
pMat->GetMappingHeight() };

No news is good news

There are a great number of ways to make an error in the management of resources and memory. Convenience when working, is an important requirement for the modern language. Modern C++ is not far behind, and offers a number of tools for automatic control of resources. Although such errors are at the heartland of dynamic analysis, some issues can be revealed with the help of static analysis. Here are some ofloat flRatio { pMat->GetMappingWidth() /
pMat->GetMappingHeight() };float flRatio { pMat->GetMappingWidth() /
pMat->GetMappingHeight() };f them:

void AccessibleContainsAccessible(....)
{
auto_ptr<VARIANT> child_array(
new VARIANT[child_count]);
...
}

Note: This code is taken from Chromium.

PVS-Studio warning: V554 Incorrect use of auto_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. interactive_ui_tests accessibility_win_browsertest.cc 171

Of course, the idea of smart pointers is not new: for example, there was a class std::auto_ptr. I am talking about it using the past tense, because it was declared as deprecated in C++11 and removed in C++17. In this fragment the error was caused by the incorrectly used auto_ptr, the class doesn't have specialization for the arrays, and a result, the standard delete will be called instead of a delete[]. unique_ptr replaced auto_ptr, and it has specialization for the arrays and the ability to pass a deleter functor that will be called instead of delete, and a complete support of move semantics. It may seem that nothing can go wrong here.
void text_editor::_m_draw_string(....) const
{
....
std::unique_ptr<unsigned> pxbuf_ptr(
new unsigned[len]);
....
}

Note: This code is taken from nana.

PVS-Studio warning: V554 Incorrect use of unique_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. text_editor.cpp 3137

It turns out that you can make exactly the same error. Yes, it would be enough to write unique_ptr<unsigned[]> and it will disappear, but nevertheless, the code gets compiled in this form too. Thus, it is also possible to make an error in this way, and as practice shows, if it's possible, then people do it. This code fragment is proof of that. That's why, using unique_ptr with arrays, be extremely careful: it's much easier to shoot yourself in the foot than it seems. Perhaps it would be better to use std::vector as the Modern C++ prescribes?

Let's take a look at another type of accident.
template<class TOpenGLStage>
static FString GetShaderStageSource(TOpenGLStage* Shader)
{
....
ANSICHAR* Code = new ANSICHAR[Len + 1];
glGetShaderSource(Shaders[i], Len + 1, &Len, Code);
Source += Code;
delete Code;
....
}

Note: This code is taken from Unreal Engine 4.

PVS-Studio warning: V611 The memory was allocated using 'new T[]' operator but was released using the 'delete' operator. Consider inspecting this code. It's probably better to use 'delete [] Code;'. openglshaders.cpp 1790

The same mistake can be easily made without smart pointers: the memory allocated with new[] is freed via delete.
bool CxImage::LayerCreate(int32_t position)
{
....
CxImage** ptmp = new CxImage*[info.nNumLayers + 1];
....
free(ptmp);
....
}

Note: This code is taken from CxImage.

PVS-Studio warning: V611 The memory was allocated using 'new' operator but was released using the 'free' function. Consider inspecting operation logics behind the 'ptmp' variable. ximalyr.cpp 50

In this fragment malloc/free and new/delete got mixed up. This can happen during refactoring: there were functions from C that needed to be replaced, and as a result, we have UB.
int settings_proc_language_packs(....)
{
....
if(mem_files) {
mem_files = 0;
sys_mem_free(mem_files);
}
....
}

Note: This code is taken from the Fennec Media.

PVS-Studio warning: V575 The null pointer is passed into 'free' function. Inspect the first argument. settings interface.c 3096

This is a more amusing example. There is a practice when a pointer is zeroed after it is freed. Sometimes, programmers even write special macros for that. On the one hand, it's a great technique: you can protect yourself from another memory release. But here, the expression order was confused, and thus, free gets a null pointer (which didn't escape the analyzer's attention).

ETOOLS_API int __stdcall ogg_enc(....) {
format = open_audio_file(in, &enc_opts);
if (!format) {
    fclose(in);
    return 0;
};
out = fopen(out_fn, "wb");
if (out == NULL) {
    fclose(out);
    return 0;
}

But this problem does not only relate to memory management, but also to resource management. For example, you forget to close the file, as in the fragment above. And in both cases, the keyword-RAII. This same concept is behind smart pointers. In combination with move-semantics, RAII helps to avoid a lot of bugs related to memory leaks. And code written in this style allows the identification of resource ownership more visually.

As a small example, I'll provide the wrapper over FILE, which is using the abilities of unique_ptr:

auto deleter = [](FILE* f) {fclose(f);};
std::unique_ptr<FILE, decltype(deleter)> p(fopen("1.txt", "w"),
deleter);

Although, you may probably want a more functional wrapper to work with the files (with a more readable syntax). It's time to remember that in C++17, an API will be added to work with file systems — std::filesystem. But if you are not satisfied with this decision, and you want to use fread/fwrite instead of i/o-streams, you can get some inspiration from unique_ptr, and write your own File, which will be optimized for your personal needs, convenient, readable, and safe.

What is the result?

Modern C++ provides a lot of tools that help you write code more securely. A lot of constructions for compile-time evaluations and checks have appeared. You can switch to a more convenient memory and resources management model.

But there is no technique or programming paradigm that can fully protect you from errors. Together with the functionalities, C++ also obtains new bugs, which will be peculiar only to it. This is why we cannot solely rely on one method: we should always use the combination of code-review, quality code, and decent tools; which can help save your time and energy drinks, both of which can be used in a better way.

Speaking about tools, I suggest trying PVS-Studio: we have recently started working on a Linux version of it, you can see it in action: it supports any build system and allows you to check your project just by building it. For the Windows developers we have a handy plugin for Visual Studio which you can try as a trial version.

posted @ 2020-06-27 16:52 吴建明wujianming 阅读(248) 评论(0) 编辑收藏举报

刷新页面返回顶部

吴建明

使用现代C++如何避免bugs（下）

公告