
CString and BSTR

Posted on 2007-05-12 13:05 by 叶落无痕

CString to BSTR

When programming with ActiveX, you will sometimes need a value represented as type BSTR. A BSTR is a counted string: a wide-character (Unicode) string on Intel platforms whose length is stored along with the data, so it can contain embedded NUL characters.
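As a rough picture of what "counted" means, the sketch below models the documented BSTR layout in portable C++: a 4-byte prefix holding the byte count, then the 16-bit characters, then a terminating NUL that is not included in the count. ModelBstr and its members are hypothetical names used for illustration only; real code obtains a BSTR from SysAllocString or AllocSysString and reads its length with SysStringLen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a BSTR allocation (illustration only).
struct ModelBstr {
    std::vector<unsigned char> block;  // [4-byte byte count][characters][2-byte NUL]

    explicit ModelBstr(const std::u16string& text) {
        const std::uint32_t byteCount =
            static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
        // Zero-filled, so the trailing NUL is already in place.
        block.assign(sizeof(byteCount) + byteCount + sizeof(char16_t), 0);
        std::memcpy(block.data(), &byteCount, sizeof(byteCount));
        std::memcpy(block.data() + sizeof(byteCount), text.data(), byteCount);
    }

    // Counterpart of SysStringLen: read the prefix, never scan for NUL.
    std::size_t length() const {
        std::uint32_t byteCount = 0;
        std::memcpy(&byteCount, block.data(), sizeof(byteCount));
        return byteCount / sizeof(char16_t);
    }

    // Character at index i, copied out to avoid alignment assumptions.
    char16_t at(std::size_t i) const {
        assert(i <= length());  // index length() is the terminating NUL
        char16_t c = 0;
        std::memcpy(&c,
                    block.data() + sizeof(std::uint32_t) + i * sizeof(char16_t),
                    sizeof(c));
        return c;
    }
};
```

Because the length comes from the prefix, a NUL in the middle of the text is just another character as far as the model (and a real BSTR) is concerned.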

You can convert a CString to a BSTR by calling the CString method AllocSysString:

CString s;
s = ... ; // whatever
BSTR b = s.AllocSysString();

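The pointer b points to a newly allocated BSTR that is a copy of the CString, including the terminal NUL character. It may now be passed to whatever interface you are calling that requires a BSTR. Who frees it depends on how it is passed: under the usual COM conventions, a BSTR passed as an input ([in]) argument remains the caller's to free after the call returns, while a BSTR handed back through an output parameter is freed by whoever receives it. When it is your job to dispose of a BSTR, you must use the call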

::SysFreeString(b);

to free the string.

The story is that the decision of how to represent strings sent to ActiveX controls resulted in some serious turf wars within Microsoft. The Visual Basic people won, and the string type BSTR (acronym for "Basic String") was the result.

BSTR to CString

 Since a BSTR is a counted Unicode string, you can use standard conversions to make an 8-bit CString. Actually, this is built-in; there are special constructors for converting ANSI strings to Unicode and vice-versa. You can also get BSTRs as results in a VARIANT type, which is a type returned by various COM and Automation calls.

For example, in an ANSI application, the code

BSTR b;
b = ...; // whatever
CString s(b == NULL ? L"" : b);

works just fine for a single-string BSTR, because there is a special constructor that takes an LPCWSTR (which is what a BSTR is) and converts it to an ANSI string. The explicit test is required because a BSTR can be NULL, and the constructors Don't Play Well with NULL inputs (thanks to Brian Ross for pointing this out!). This works only for a BSTR that contains a single string terminated with a NUL; you have to do more work to convert strings that contain multiple NUL characters. Note that embedded NUL characters generally don't work well in CStrings and should be avoided.

Remember, according to the rules of C/C++, if you have an LPWSTR it will match a parameter type of LPCWSTR (it doesn't work the other way!).
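The rule can be checked at compile time. The sketch below spells the two Windows typedefs out as plain C++ types (WidePtr and ConstWidePtr are stand-in aliases so that the example does not need windows.h), and countChars is a hypothetical function with an LPCWSTR-style parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

using WidePtr = wchar_t*;             // what LPWSTR names
using ConstWidePtr = const wchar_t*;  // what LPCWSTR names

// Adding const is implicit; removing it is not.
static_assert(std::is_convertible<WidePtr, ConstWidePtr>::value,
              "LPWSTR converts to LPCWSTR");
static_assert(!std::is_convertible<ConstWidePtr, WidePtr>::value,
              "LPCWSTR does not convert to LPWSTR");

// Hypothetical function taking a read-only wide string.
std::size_t countChars(ConstWidePtr s) {
    std::size_t n = 0;
    while (s[n] != L'\0') ++n;
    return n;
}
```

Passing a writable pointer to countChars compiles without a cast; going the other way would need an explicit const_cast.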

In UNICODE mode, this is just the constructor

CString::CString(LPCTSTR);

As indicated above, in ANSI mode there is a special constructor

CString::CString(LPCWSTR);

which calls an internal function to convert the Unicode string to an ANSI string. (In Unicode mode there is a corresponding constructor that takes an LPCSTR, a pointer to an 8-bit ANSI string, and widens it to a Unicode string!) Again, note the limitation imposed by the need to test for a BSTR value which is NULL.

There is an additional problem, as pointed out above: BSTRs can contain embedded NUL characters, but the CString constructors that take a plain pointer stop at the first NUL they find. This means that a CString will compute the wrong length for a string which contains embedded NUL characters. You need to handle this yourself. If you look at the constructors in strcore.cpp, you will see that they all do an lstrlen or equivalent to compute the length.
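The effect is easy to reproduce without MFC. The sketch below uses std::wstring as a stand-in, which is an assumption made so the example stays portable: constructing from a bare pointer scans for the first NUL, as an lstrlen-based constructor does, while constructing from a pointer plus an explicit length keeps everything. In real code the explicit length would come from SysStringLen(b).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copy by scanning for the first NUL, as a pointer-only constructor must.
std::wstring copyScanned(const wchar_t* p) {
    return std::wstring(p);
}

// Copy exactly `count` characters, embedded NULs included.
std::wstring copyCounted(const wchar_t* p, std::size_t count) {
    return std::wstring(p, count);
}
```

Given the seven characters of L"one\0two", copyScanned keeps only the three before the NUL, while copyCounted with a count of 7 keeps all of them. CString has a pointer-plus-length constructor as well; whether it accepts a wide pointer directly depends on the build mode and MFC version, so check that before relying on it.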

Note that the conversion from Unicode to ANSI uses the ::WideCharToMultiByte conversion with specific arguments that you may not like. If you want a different conversion than the default, you have to write your own.

If you are compiling as UNICODE, then it is a simple assignment:

CString convert(BSTR b)
   {
    if(b == NULL)
        return CString(_T(""));
    CString s(b); // in UNICODE mode
    return s;
   }

If you are in ANSI mode, you need to convert the string in a more complex fashion. This will accomplish it. Note that this code uses the same argument values to ::WideCharToMultiByte that the implicit constructor for CString uses, so you would use this technique only if you wanted to change these parameters to do the conversion in some other fashion, for example, specifying a different default character, a different set of flags, etc.

CString convert(BSTR b)
   {
    CString s;
    if(b == NULL)
       return s; // empty for NULL BSTR
#ifdef UNICODE
    s = b;
#else
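    // Ask how many bytes the converted string needs, including the NUL;
    // on a multibyte code page one wide character can become two bytes.
    int n = ::WideCharToMultiByte(CP_ACP, 0, b, -1, NULL, 0, NULL, NULL);
    if(n <= 0)
       return s; // conversion failed; return an empty string
    LPSTR p = s.GetBuffer(n);
    ::WideCharToMultiByte(CP_ACP,            // ANSI Code Page
                          0,                 // no flags
                          b,                 // source widechar string
                          -1,                // assume NUL-terminated
                          p,                 // target buffer
                          n,                 // target buffer length in bytes
                          NULL,              // use system default char
                          NULL);             // don't care if default used
    s.ReleaseBuffer();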
#endif
    return s;
   }

Note that I do not worry about what happens if the BSTR contains Unicode characters that do not map to the 8-bit character set, because I specify NULL as the last two parameters. This is the sort of thing you might want to change.
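To make the trade-off concrete, here is a portable model of what those last two parameters control. It is not the Windows API: narrowToLatin1 is a hypothetical function that narrows 16-bit characters to Latin-1, substitutes a caller-chosen default for anything above 0xFF, and reports whether it had to. That mirrors the roles of lpDefaultChar and lpUsedDefaultChar in ::WideCharToMultiByte.

```cpp
#include <cassert>
#include <string>

// Hypothetical model, not a Windows call: narrow UTF-16 code units to Latin-1.
std::string narrowToLatin1(const std::u16string& wide,
                           char defaultChar,
                           bool* usedDefault) {
    std::string narrow;
    narrow.reserve(wide.size());
    bool substituted = false;
    for (char16_t c : wide) {
        if (c <= 0xFF) {
            narrow.push_back(static_cast<char>(static_cast<unsigned char>(c)));
        } else {
            narrow.push_back(defaultChar);  // character has no 8-bit form
            substituted = true;
        }
    }
    if (usedDefault != nullptr)  // a null pointer means "don't care"
        *usedDefault = substituted;
    return narrow;
}
```

In the real call, passing the address of a BOOL as the final argument instead of NULL would give convert() the same information, so it could reject or flag a lossy conversion rather than silently accepting the substituted characters.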