Mixing Multiple Visual Studio Versions in a Program is Evil

Many times I have seen programmers complain about strange behavior during testing that they cannot explain. Their program crashes randomly, or for no apparent reason. They scratch their heads trying to debug their code. Everything seems fine: inputs are OK, memory allocations seem fine, and the code does not seem to contain any bugs, yet it fails spectacularly.

I have seen countless hours lost trying to get to the bottom of such failures, only to find out that the cause of the unexplained behavior was that the program used DLLs compiled against multiple Visual Studio versions. For example: a program using `LibA.dll` that is compiled against Visual Studio 2005 and `LibB.dll` that is compiled against Visual Studio 2008.

Very often, in various programming forums, I see questions about mixing multiple Visual Studio versions in the program. I also see responses such as – “this should not be a problem” or “I do this all the time without any issues”.

Let’s explore in detail the issues surrounding the use of multiple Visual Studio versions in one program. We will also see what can be done to avoid potential issues if multiple Visual Studio versions can’t be avoided. I have not explored this topic in detail for .NET languages (i.e. managed code), so I will limit this discussion to code developed in native C/C++ only. In particular, we will explore this topic in the following contexts:

Dynamic Memory Allocations.
Global and Static Data Variables.
Function Calls taking/Returning C/C++ Objects.
Template Classes, In-Line & Template Functions.
File I/O.
Environment Variables.

Visual Studio Product Names and Versions

Visual Studio C/C++ run time libraries contain Visual Studio internal version numbers. Mapping between Visual Studio product names and corresponding internal version numbers is as follows:

Product Name	Version Number
Visual Studio .NET	7.0
Visual Studio .NET 2003	7.1
Visual Studio 2005	8.0
Visual Studio 2008	9.0
Visual Studio 2010	10.0
Visual Studio 2012	11.0

Visual Studio Runtime Environment Explained

Visual Studio C/C++ run time environment (CRT environment) consists of the following dynamic libraries:

MSVCR*.DLL – This DLL contains C run-time routines such as `printf`, `scanf`, `fgetc`, `ceil`, `getenv`, `putenv`, `isdigit`, etc. When C code is linked dynamically to CRT, Visual Studio will link it against the appropriate version of `MSVCR*.DLL`.
MSVCP*.DLL – This DLL contains C++ run-time routines, classes and templates, such as standard template classes (`basic_string`, `vector`, `set`, `iostream`, etc.), the C++ implementation of the math library, etc. When C/C++ code is linked dynamically to CRT, Visual Studio will link it against the appropriate version of `MSVCR*.DLL` and `MSVCP*.DLL`.
MSVCM*.DLL – This DLL contains C/C++ run-time routines used for mixed mode (managed and native) programming.

There is also `LIBCMT.LIB`, which is used if C code is linked statically to the CRT environment.

Visual Studio run-time environment libraries are named after the internal version number. For example:

For Visual Studio .NET 2003, these libraries are: `MSVCR71.DLL`, `MSVCP71.DLL`, and `MSVCM71.DLL`.
For Visual Studio 2005, these libraries are: `MSVCR80.DLL`, `MSVCP80.DLL`, and `MSVCM80.DLL`.
For Visual Studio 2008, these libraries are: `MSVCR90.DLL`, `MSVCP90.DLL`, and `MSVCM90.DLL`.
For Visual Studio 2010, these libraries are: `MSVCR100.DLL`, `MSVCP100.DLL`, and `MSVCM100.DLL`.
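The naming rule in the list above can be written down as a tiny helper. This is only a sketch: `CrtDllName` is a hypothetical function made up for illustration, and it covers only the Visual Studio versions listed in this article.

```cpp
#include <string>

// Builds a CRT DLL file name from a prefix ("MSVCR", "MSVCP" or "MSVCM")
// and an internal Visual Studio version such as 7.1 or 10.0.
// Hypothetical helper, for illustration only.
std::string CrtDllName(const std::string& prefix, int major, int minor) {
    // The dot of the version number is simply dropped: 7.1 -> "71", 10.0 -> "100".
    return prefix + std::to_string(major) + std::to_string(minor) + ".DLL";
}
```

For example, `CrtDllName("MSVCR", 9, 0)` yields the Visual Studio 2008 C run-time name from the list above.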

Basics of how Libraries are Loaded at Run-Time

Without going too much into technical detail, I will try to explain how dynamic and static libraries are loaded into the program memory space at the time of execution. This explanation may not be technically accurate, but it provides enough explanation for basic understanding.

Dynamic Link Libraries (DLLs)

DLLs are designed to be shared across all processes that use them, unlike a static library (explained later), which is private (not shared) to the process that uses it. In general, Dynamic Link Libraries can contain code, data, and resources in any combination.

Code segment is for the executable code and functions that the program calls. The code segment is shared across multiple processes that use the DLL. The code segment occupies a single place in the physical memory and is read-only.

Data segment is internal data that the DLL uses – such as static variables, internal data storage, etc. This data represents the state of the DLL. The data segment is generally private to each process that uses the DLL. Shared data segments are possible for inter-process communication, but that discussion is not relevant for the current topic. The data segment is read-write for the process that owns the data segment. Although the data segment is private to the process, it is still managed by the DLL. So in essence data segment is private per process, per DLL.

Static Libraries

Static libraries are designed to become part of the executable during linking. Unlike DLLs, static libraries are not shared. At link time, all required code segment and data segment elements from the static library are included in the executable. That code and data are therefore loaded in the same physical memory location as the executable’s own code segment and data segment.

The most important thing to remember is that the data segment is private to the process whereas code segment is shared (in the case of DLL) or private (in the case of static library).

Passing CRT Object across CRT Boundaries

For this discussion please refer to the CRT Environments below (Figure 1). This figure shows how the CRT environment is loaded depending on whether a library was linked dynamically or statically. Let’s see some scenarios in which a program might use multiple CRT instances.

Figure 1: CRT Environment

Libraries from Group-1, namely `LibA.dll` (which contains the function `FuncA`) and `LibB.dll` (which contains the function `FuncB`), are dynamically linked against Visual Studio 2005. As a result these DLLs will share a common version 8.0 CRT instance.

Similarly, libraries from Group-2, namely `LibC.dll` (which contains the function `FuncC`) and `LibD.dll` (which contains the function `FuncD`), are also dynamically linked but against Visual Studio 2008. As a result these DLLs will share a common version 9.0 CRT instance.

However, the library from Group-3, namely `LibE.dll` (which contains the function `FuncE`), is statically linked against Visual Studio 2005. As a result it contains its own version 8.0 CRT instance. This CRT instance will not be shared with any other library or program.

Similarly, the library from Group-4, namely `LibF.dll` (which contains the function `FuncF`), is also statically linked but against Visual Studio 2008. As a result it contains its own version 9.0 CRT instance. This CRT instance will not be shared with any other library or program.

As long as a program links against libraries from only one of these four groups, your program will have only one CRT instance. However, if your program links against libraries from more than one group, then your program will have multiple CRT instances.

There is an additional layer of complexity depending upon how the program itself is linked (dynamically or statically) to CRT environment. If the program itself is linked dynamically then it will share the appropriate CRT instance with other programs/DLLs. On the other hand if the program is linked statically then it too will have its own CRT instance that will not be shared with any other program/DLL. But for the sake of simplicity, let’s focus this discussion on only DLLs and their CRT instances.

CRT objects – such as file handles, environment variables, allocated memory, etc. are the data objects created and managed by a CRT instance.

Are multiple CRT instances in the program memory space good or bad? In general, passing CRT objects across the boundary of the CRT instance that created and manages them is a bad practice and should be avoided. Let’s examine some cases and see why. For the rest of the discussion, please refer to the CRT Environments above (Figure 1).

Dynamically Allocated Memory

Dynamic memory is allocated when you call `new` (C++) or `malloc` (C/C++). Dynamic memory is allocated on the heap, and each CRT instance manages its own heap. Local variables, function parameters, and return values, on the other hand, are created on the stack. The program, not the CRT instance, manages the stack.

When `FuncA` inside `LibA` allocates dynamic memory, it calls `new` or `malloc` from its associated CRT instance. This dynamic memory is allocated on the heap and is managed by that CRT instance. Keep in mind that sometimes dynamic memory is allocated in the background. For example, the CRT function `strdup` will allocate memory, copy the input string into it, and return the allocated memory. Let’s explore various scenarios and see what could happen if this dynamically allocated memory is passed as a function parameter to functions inside other libraries.

`FuncA` Calls	Called Function Performs	Notes
`FuncB`	Memory Free, Memory Reallocation, Memory Access	Function call does not cross a CRT boundary. `FuncB` will access the same CRT instance as `FuncA` for performing memory operations. Freeing or reallocating memory will be performed on the same heap that `FuncA` used to allocate it.
`FuncC`, `FuncD`, `FuncF`	Memory Free, Memory Reallocation	Function call crosses a CRT boundary. The CRT instance allocating the memory is not the same as the one attempting to free it. Furthermore, they are of different versions. The function call will typically result in a memory access error or program crash because the heap where the memory was allocated is not the one from which it is being freed. The CRT instance of Group-2 or Group-4 does not have any record of this memory allocation.
`FuncC`, `FuncD`, `FuncF`	Memory Access	Function call crosses a CRT boundary. Accessing the allocated memory may not result in an error. However, interpreting the memory content may result in exceptions, data corruption, a memory access error, or a program crash. See the subsection on “Passing C/C++ Objects” later in the blog.
`FuncE`	Memory Free, Memory Reallocation	Function call crosses a CRT boundary. The CRT instance allocating the memory is not the same as the one attempting to free it. It does not matter that the CRT instance versions are the same. The function call will typically result in an error or program crash.
`FuncE`	Memory Access	Function call crosses a CRT boundary. Accessing the allocated memory may not result in an error. This case is different from `FuncC` accessing the memory, because here the versions of the CRT instances are the same. However, the CRT instance of `FuncE` is statically linked and hence could be at a lower build level than the CRT instance of `FuncA`. This difference in build level may result in exceptions, data corruption, a memory access error, or a program crash while interpreting the memory content. See the subsection on “Passing C/C++ Objects” later in the blog.
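The "no record of this memory allocation" point from the table can be illustrated with a toy model. The sketch below is not how the Microsoft CRT is implemented. `ToyCrtHeap` is a hypothetical class that only mimics the bookkeeping idea: each instance knows about the blocks it handed out and nothing else.

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>

// Toy stand-in for one CRT instance: it remembers which blocks it handed out.
class ToyCrtHeap {
public:
    void* Allocate(std::size_t bytes) {
        void* block = std::malloc(bytes);
        if (block != nullptr) owned_.insert(block);
        return block;
    }

    // Returns false when the block was not allocated by this instance.
    // A real CRT generally gives no such friendly answer; the usual outcome
    // is heap corruption or a crash.
    bool Free(void* block) {
        if (owned_.erase(block) == 0) return false;
        std::free(block);
        return true;
    }

private:
    std::set<void*> owned_;
};

// Mirrors the FuncA -> FuncC row: allocate in one instance, free in another.
bool CrossBoundaryFreeIsRejected() {
    ToyCrtHeap crt80;  // stands in for the Group-1 (version 8.0) CRT instance
    ToyCrtHeap crt90;  // stands in for the Group-2 (version 9.0) CRT instance
    void* block = crt80.Allocate(32);
    bool rejected = !crt90.Free(block);  // wrong owner: no record of the block
    crt80.Free(block);                   // the owning instance releases it
    return rejected;
}
```

The same model covers the `FuncE` rows: two `ToyCrtHeap` objects are separate owners even when they are built from identical code, just as two version 8.0 CRT instances are.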

Here are some suggestions to avoid this potential error:

Make sure that the executable and all dependent libraries are linked dynamically and built with the same Visual Studio version. This will ensure that at run time your program will use one and only one CRT instance. Any library can perform `malloc` and `free`, but all those libraries will access the same CRT instance to do those memory operations.
Another potential solution is to statically link the program to the CRT environment. This solution may not work if your program depends on any third-party DLLs.
If multiple CRT instances can’t be avoided, then the library responsible for allocating memory must provide a means of freeing the allocated memory too. This is exactly what Teamcenter ITK does. ITK provides `MEM_alloc` and `MEM_free` functions. Now, regardless of which library allocates the memory and which library frees it, the actual allocation and freeing happen inside a single CRT instance, as long as memory is allocated using `MEM_alloc` and freed using `MEM_free`.
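The third suggestion can be sketched as a pair of functions that live in the same library. The names `LibA_Alloc`, `LibA_Free`, and `LibA_CopyString` are hypothetical and are not the ITK API. The export decoration a real DLL would need (for example `__declspec(dllexport)`) is left out so that the sketch compiles anywhere.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// --- Everything below would live inside LibA.dll (names are hypothetical) ---

// Allocation and release both run inside LibA, so both use LibA's CRT heap.
extern "C" void* LibA_Alloc(std::size_t bytes) { return std::malloc(bytes); }
extern "C" void LibA_Free(void* block) { std::free(block); }

// Returns a copy of text. The caller must release it with LibA_Free,
// never with its own free() or delete.
extern "C" char* LibA_CopyString(const char* text) {
    std::size_t bytes = std::strlen(text) + 1;
    char* copy = static_cast<char*>(LibA_Alloc(bytes));
    if (copy != nullptr) std::memcpy(copy, text, bytes);
    return copy;
}
```

A caller in any other library would write `char* s = LibA_CopyString("abc");` and later `LibA_Free(s);`. The pointer crosses the CRT boundary, but the heap operations do not.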

Global and Static Data

Static data is persistent but its scope is limited to certain block of code, a function, or a file. Global data, on the other hand, has global scope. Any function from any DLL can access global data if the variable is exported. If it is not exported then any function from the same DLL which declares the global data can access it.

One well-known CRT function using static data is `strtok`, which finds the next token in a string. The first call to `strtok` requires the input string to be tokenized. Subsequent calls, however, require `NULL` as the input. This function uses static data to store its state, and that’s how it knows how to return the next token when the input is `NULL`. Any non-`NULL` input resets the internal state of the function. If the program has multiple CRT instances then each instance has its own `strtok` function and each `strtok` function has its own internal state.

Imagine what would happen if the first call to `strtok` happens inside `FuncA`, the second call to `strtok` (with `NULL` input) happens inside `FuncB`, and the third call to `strtok` (with `NULL` input) happens inside `FuncC`. The first and second calls would work fine because both calls to `strtok` access the same CRT instance. The third call, on the other hand, would return the wrong token if there were any prior calls to `strtok` from that CRT instance. Or worse yet, the program would hit a memory access error or crash if it was the first ever call to `strtok` from that CRT instance.

Here are some suggestions to avoid this potential error, if program has multiple CRT instances:

Check the Microsoft documentation to see if the function has any specific warning. Also see if there is a “safe” replacement version available. Starting with Visual Studio 2005, Microsoft provided replacements for many legacy CRT functions that had the potential for buffer overrun errors. Refer to the Visual Studio 2005 documentation regarding Security Enhancements in the CRT and Deprecated CRT Functions. For example, the “safe” replacement version of `strtok` is `strtok_s`, which uses a context parameter to store the state of the function rather than relying on internal static data.
Even if there is a safe replacement version available, it is still a bad practice to have multiple CRT instances in your program.
Another potential solution is to provide a set of functions inside a single DLL to do all static and global data manipulation and use those functions instead of direct CRT calls. This solution may not work if your program depends on any third-party DLLs.
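Here is a minimal sketch of tokenizing with an explicit context pointer, so that no state is kept inside any CRT instance. `SplitTokens` and the `NEXT_TOKEN` macro are hypothetical names. The macro picks the three-argument Microsoft `strtok_s` under Visual C++ and falls back to POSIX `strtok_r`, which takes the same arguments, elsewhere.

```cpp
#include <cstring>
#include <string>
#include <vector>

#ifdef _MSC_VER
#define NEXT_TOKEN strtok_s   // Microsoft CRT: strtok_s(str, delimiters, &context)
#else
#define NEXT_TOKEN strtok_r   // POSIX equivalent with the same parameter order
#endif

// Splits text on the given delimiters. The tokenizer state lives in the local
// variable 'context' instead of CRT-internal static data.
std::vector<std::string> SplitTokens(std::string text, const char* delimiters) {
    std::vector<std::string> tokens;
    char* context = nullptr;
    for (char* token = NEXT_TOKEN(&text[0], delimiters, &context);
         token != nullptr;
         token = NEXT_TOKEN(nullptr, delimiters, &context)) {
        tokens.emplace_back(token);
    }
    return tokens;
}
```

Because the whole loop runs inside one function, every call goes to the same CRT instance. The context pointer itself should still not be handed to a function in another CRT instance.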

Passing C/C++ Objects (Non-Primitive Data Objects)

`int`, `float`, `double`, `char`, and `bool` (along with arrays of those) are all primitive data types. Their definition is not subject to the CRT version. It is safe to pass objects of these data types from one function to another, even across a CRT boundary. But nowadays programs rarely depend on these data types alone. Programs use derived data types such as structures and classes. The following discussion refers to such derived data types.

C and C++ objects, such as class instances and structure variables, can be passed from one function to another by value or by reference. When passed by value, data is duplicated on the stack prior to executing a function call. Similarly, when returning by value, data is duplicated on the stack prior to returning. Data is copied either by performing a bitwise copy in the case of C or by calling a copy constructor in the case of C++. Let’s explore the pitfalls of multiple CRT instances in this context.

When `FuncA` calls `FuncB` and passes CRT objects by value, everything works fine because both functions use the same CRT instance. But when `FuncA` calls `FuncC` and passes CRT objects by value, results are not guaranteed, because the passed CRT object may be interpreted differently in a different CRT instance. Let’s say that `FuncA` creates a C++ class instance of `std::vector<int>` and passes it by value to `FuncC`. The CRT instance of `LibA` performs the copy of that object on the stack using its own (Visual Studio 2005) interpretation of the `std::vector` implementation. When the program enters `FuncC`, the CRT instance of `LibC` interprets the object from the stack using its own (Visual Studio 2008) `std::vector` implementation. It is very likely that the `std::vector` implementation of Visual Studio 2008 is much different from that of Visual Studio 2005. A similar situation would arise when returning CRT objects by value. Also keep in mind that most C++ class instances use dynamically allocated memory, which might be freed or reallocated in the background. Refer to the subsection on “In-Line Functions, Template Classes, and Template Functions” later in the blog.

Well, then the next question is: is it safer to pass CRT objects by reference across a CRT boundary? And the answer is: absolutely not. It does not matter how you pass an object. It is the object interpretation, data alignment, and order of data elements that cause this issue. In my opinion, passing CRT objects by reference across a CRT boundary is even more error prone than passing them by value. With pass-by-value one could, at least, employ serialization/de-serialization techniques to safeguard CRT objects.

The same holds true in the case of simple C structures too. Although it is unlikely that, for example, the order of members in `struct tm` or `struct FILE` would change from one version of Visual Studio to another, one can’t rely on such assumptions.
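To make the "order of data elements" problem concrete, here is a small sketch with two made-up revisions of one structure. `BufferInfoV1` and `BufferInfoV2` are hypothetical and do not correspond to any real CRT type. The caller writes the bytes using one definition, and the callee reads the same bytes using the other.

```cpp
#include <cstdint>
#include <cstring>

// Two hypothetical revisions of the "same" structure; only member order differs.
struct BufferInfoV1 { std::uint32_t size; std::uint32_t capacity; };
struct BufferInfoV2 { std::uint32_t capacity; std::uint32_t size; };

// Reads bytes written by V1 code through the V2 definition, the way a callee
// built against a different header revision would see them.
BufferInfoV2 ReadAsV2(const BufferInfoV1& written) {
    static_assert(sizeof(BufferInfoV1) == sizeof(BufferInfoV2),
                  "same size, different meaning");
    BufferInfoV2 seen;
    std::memcpy(&seen, &written, sizeof(seen));
    return seen;
}
```

If the caller stores `size = 3` and `capacity = 8`, the callee sees `size == 8` and `capacity == 3`. Nothing crashes at the call itself; the damage shows up later, when the callee trusts the wrong size.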

Here are some suggestions to avoid this potential error, if program has multiple CRT instances:

Rather than relying on C++ copy constructors or C bitwise copy, write your own serialization and de-serialization routines and use them consistently. Again, this solution may not work if your program depends on any third-party DLLs. In that case you may have to develop your own wrapper functions for serialization and de-serialization routines in multiple Visual Studio versions.
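One simple form of such serialization is to let only primitive data cross the boundary: a pointer to the elements plus a count. The sketch below assumes a hypothetical export `LibC_SumValues` in `LibC.dll` and a hypothetical caller `FuncA_SumThroughBoundary`. Neither `std::vector` object is ever seen by the other side.

```cpp
#include <cstddef>
#include <vector>

// --- Would live inside LibC.dll (hypothetical export) ---
// Only primitive types cross the boundary: a pointer to ints and a count.
extern "C" long LibC_SumValues(const int* values, std::size_t count) {
    // The callee rebuilds its own container using its own std::vector
    // implementation and its own heap.
    std::vector<int> local(values, values + count);
    long total = 0;
    for (int v : local) total += v;
    return total;
}

// --- Caller side, e.g. FuncA in LibA.dll (hypothetical) ---
long FuncA_SumThroughBoundary(const std::vector<int>& values) {
    // Flatten the container to (pointer, count) before crossing the boundary.
    return LibC_SumValues(values.data(), values.size());
}
```

This matches the "Memory Access" rows of the earlier table: the callee only reads the caller's memory, and it reads it as plain `int` values whose layout does not depend on the CRT version.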

In-Line Functions, Template Classes, and Template Functions

In-line functions are not compiled as separate functions; instead, each call is replaced by the function’s definition at the point of use, much like C/C++ macros. In practice it is a bit more complicated than that. The Visual C++ compiler has its own logic to decide which functions are in-lined and which are not. Using the `inline` keyword is merely a suggestion to the compiler that the function is a candidate for in-lining. The compiler, at its own discretion, may still decide to compile the function as a regular function rather than in-lining it.

On the other hand, the actual class definition of template classes and actual function definitions of template functions are always created at the time of use.

This situation adds another layer to the whole issue of multiple CRT instances. The actual class definition or function definition or in-line function may differ from one CRT version to another. This would result in interpretation errors of CRT objects, especially C++ class instances.

Here are some suggestions to avoid this potential error, if program has multiple CRT instances:

Rather than relying on C++ copy constructors or C bitwise copy, write your own serialization and de-serialization routines and use them consistently. Again, this solution may not work if your program depends on any third-party DLLs. In that case you may have to develop your own wrapper functions for serialization and de-serialization routines in multiple Visual Studio versions.

File I/O and Operations

Just as mentioned in discussion about C++ class objects and template classes, file I/O is also subject to potential errors when performed from functions across CRT boundaries.

In C, file I/O is performed by various CRT functions using the `FILE` structure. This in and of itself may not cause an error. However, using `fopen` in one CRT instance and `fclose` or `freopen` in another would almost certainly cause a memory access error or crash the program.

In C++, file I/O is performed using the `iostream` library of template classes. These classes and methods are subject to the same limitations discussed before in the subsection “In-Line Functions, Template Classes, and Template Functions”.

Here are some suggestions to avoid this potential error, if a program has multiple CRT instances:

Short of making sure that the program does not have multiple CRT instances, there is not much you can do about this.

Environment Variables

Even reading and writing environment variables across multiple CRT instances can lead to a memory access error or a program crash.

An environment variable value is read using the `getenv` CRT function, and it is updated or created using the `putenv` CRT function. Just like heap management, each CRT instance manages its own environment variables. Each CRT instance is initialized when the program starts. At the time of initialization, the CRT instance copies the current environment space into its own buffers. Multiple CRT instances mean multiple such environment space buffers. And guess what… they don’t talk to each other. Let’s explore this in detail.

Let’s say that `FuncA` calls `putenv` to set the `MY_ENV` environment variable. Subsequently, `FuncB` reads that same environment variable using `getenv`. This all works fine because both these calls, `putenv` and `getenv`, are executed in the same CRT instance. If `FuncC`, `FuncD`, `FuncE`, or `FuncF` tries to read that same environment variable using `getenv`, it may not find it defined, or it may not have the correct value, in their respective environment variable space.

In this case, however, a memory access error or program crash may not be the direct result of calls to `putenv` or `getenv`. It would rather be due to the expectation that `getenv` would not return `NULL`, and hence the result of using the value returned by `getenv` without checking for `NULL`.

Here are some suggestions to avoid this potential error, if program has multiple CRT instances:

Just like the suggestion for dynamic memory allocation, provide your own wrapper implementation of `putenv` and `getenv`. Use these wrapper functions consistently. This is exactly what Teamcenter ITK does too. ITK provides its own `ss_putenv` and `ss_getenv` wrapper functions.
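A pair of such wrappers could look roughly like this. `Env_Set` and `Env_Get` are hypothetical names, not the ITK functions. The sketch assumes both would be exported from one DLL that every module calls, so that every read and write goes through the same CRT instance. It uses `_putenv_s` on Windows and `setenv` elsewhere.

```cpp
#include <cstdlib>
#include <string>

// --- Would live in one DLL that every module calls (names are hypothetical) ---

// Sets or overwrites an environment variable. Returns true on success.
bool Env_Set(const char* name, const char* value) {
#ifdef _WIN32
    return _putenv_s(name, value) == 0;   // Microsoft CRT
#else
    return setenv(name, value, 1) == 0;   // POSIX, 1 = overwrite existing value
#endif
}

// Returns the variable's value, or fallback when it is not defined.
// The NULL check guards against the crash scenario described above.
std::string Env_Get(const char* name, const std::string& fallback = "") {
    const char* value = std::getenv(name);  // may be NULL
    return value != nullptr ? std::string(value) : fallback;
}
```

Returning `std::string` is convenient for the sketch, but across a real CRT boundary the wrapper should return the value through a caller-supplied `char` buffer instead, for the reasons given in the section on passing C/C++ objects.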

Resources

Just like dynamic memory, CRT resources are managed by a CRT instance. Hence, CRT resources that are created or allocated in one CRT instance and passed over a CRT boundary are problematic. If you must do this, then make sure your own library provides wrapper functions that fully implement the management of the resource, including allocating, destroying, reallocating, etc.
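As an illustration of such a complete set of wrappers, here is a sketch that hides a `FILE` pointer behind a handle. All names (`LibA_Log`, `LibA_LogOpen`, `LibA_LogWrite`, `LibA_LogClose`) are hypothetical. The assumption is that all three functions are built into the same library, so the file is opened, written, and closed by one CRT instance.

```cpp
#include <cstdio>

// Handle type. In a real header only a forward declaration would be published,
// so callers never touch the FILE structure themselves.
struct LibA_Log { std::FILE* file; };

// --- All three functions would live inside LibA.dll (hypothetical names) ---

// Opens (or truncates) the log file. Returns nullptr on failure.
LibA_Log* LibA_LogOpen(const char* path) {
    std::FILE* file = std::fopen(path, "w");
    if (file == nullptr) return nullptr;
    return new LibA_Log{file};
}

// Writes one line. Returns false on a null handle or a write error.
bool LibA_LogWrite(LibA_Log* log, const char* line) {
    return log != nullptr && std::fprintf(log->file, "%s\n", line) >= 0;
}

// Closes the file and releases the handle, both inside the owning library.
void LibA_LogClose(LibA_Log* log) {
    if (log == nullptr) return;
    std::fclose(log->file);
    delete log;
}
```

Callers in other libraries pass the `LibA_Log*` around as an opaque token and never call `fclose` or `delete` on it themselves.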

Final Thoughts

As I have explained in this blog, using multiple CRT instances in a program is absolutely a bad idea. It is extremely problematic. Not to mention extremely hard to debug.

I tried to give the best explanation and possible workarounds in case you must have multiple CRT instances in your program space. There is, however, no guarantee that those workarounds will work in every situation. Especially if you have third-party libraries, you can’t control how those libraries will handle the problematic areas. You certainly can’t force them to use your wrapper functions. You could, however, develop a set of those wrapper functions for each version of CRT instance in your environment and use them as a go-between interface.

Here are my recommendations:

Avoid using multiple CRT instances in your program space at all costs.
Avoid using multiple CRT instances in your program space at all costs.
Did I mention, avoid using multiple CRT instances in your program space at all costs.

Next Steps

I think I have beaten this subject to death. I don’t think I need to provide any more examples to convince you not to let your program use multiple CRT instances. If you are convinced, I have done my job and I would certainly appreciate your comments/feedback. If you are still not convinced, I am out of examples and can only say that you are on your own.

Please feel free to comment on this topic. We also encourage you to explore our website to learn about products and services we offer. If you have any questions or wish to discuss this topic or any other topic with us in private, please contact us.

posted @ 2021-08-18 08:10 soso101
