Working with Textual Data: Be Prepared for Unexpected Problems

BSTR & Co.

Here, I will discuss a slightly different area where UNICODE also matters: the BSTR data type. As you can see in the SDK headers, BSTR is defined as

typedef WCHAR OLECHAR;
typedef OLECHAR *BSTR;

It is a wide-character string. In fact, you can store any binary data in it because a BSTR keeps its buffer length in a prefix located immediately before the data the pointer refers to, so a null terminator is not needed to find the end of the data. For string data, a BSTR is simply a UNICODE string.
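The length-prefixed layout can be modeled in portable C++ to show why embedded nulls are harmless. This is only a sketch under stated assumptions: DemoChar, DemoBstr, demo_alloc, demo_len, demo_scan_to_null, and demo_free are hypothetical names invented for illustration, and the block layout (a 4-byte byte count, then the characters, then a terminating null) imitates a BSTR rather than using one. On a real device, you would use BSTR with SysAllocStringLen, SysStringLen, and SysFreeString instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins for illustration only (not the Windows types).
typedef char16_t DemoChar;   // plays the role of the 16-bit WCHAR/OLECHAR
typedef DemoChar* DemoBstr;  // plays the role of BSTR

// Allocate [4-byte byte count][len characters][terminating null] and
// return a pointer to the first character, the way a BSTR pointer works.
DemoBstr demo_alloc(const DemoChar* src, std::uint32_t len) {
    assert(len <= (UINT32_MAX - 8u) / sizeof(DemoChar));  // no overflow
    std::uint32_t bytes = len * static_cast<std::uint32_t>(sizeof(DemoChar));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + bytes + sizeof(DemoChar)));
    if (!block) return nullptr;
    std::memcpy(block, &bytes, sizeof bytes);             // length prefix
    unsigned char* data = block + sizeof(std::uint32_t);
    if (src) std::memcpy(data, src, bytes);               // may contain nulls
    else     std::memset(data, 0, bytes);
    std::memset(data + bytes, 0, sizeof(DemoChar));       // trailing null
    return reinterpret_cast<DemoBstr>(data);
}

// Read the character count back from the prefix stored before the data.
std::uint32_t demo_len(DemoBstr s) {
    if (!s) return 0;
    std::uint32_t bytes = 0;
    std::memcpy(&bytes,
                reinterpret_cast<unsigned char*>(s) - sizeof bytes,
                sizeof bytes);
    return bytes / static_cast<std::uint32_t>(sizeof(DemoChar));
}

// What a null-terminated scan (wcslen-style) would report instead.
std::uint32_t demo_scan_to_null(DemoBstr s) {
    std::uint32_t n = 0;
    while (s && s[n] != 0) ++n;
    return n;
}

// Free the whole block, starting at the prefix, not at the data pointer.
void demo_free(DemoBstr s) {
    if (s) std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

For a five-character buffer such as { 'a', 'b', 0, 'c', 'd' }, demo_len reports 5 because it reads the prefix, while demo_scan_to_null stops at 2. The same gap appears on Windows between SysStringLen and wcslen, which is worth remembering whenever a BSTR carries binary data.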

Where do you use BSTR on Windows Mobile? Well, MS XML, other COM-related areas, and database data are just a few examples. To help you work with BSTR, two helper classes are available: _bstr_t, one of the compiler's COM support classes, and CComBSTR from the ATL library. Look into comutil.h and the ATL headers (atlbase.h, atlcom.h) to get more details about these classes and a couple of conversion functions. CComBSTR and _bstr_t provide similar but slightly different interfaces, so which one to use depends on your task scope and convenience. You will also see some minor differences between the MFC 3.0 and MFC 8.0 implementations.

Conclusion

This article has hopefully answered most of your initial questions about UNICODE and ASCII/UTF-8 text processing. Whether the text lives in COM, plain files, databases, or XML, you can now conquer them all. It may be a bit more complicated sometimes, but in most cases everything is quite straightforward once you understand the basics. That's it!

About the Author

Alex Gusev started to play with mainframes at the end of the 1980s, using Pascal and REXX, but soon switched to C/C++ and Java on different platforms. When mobile PDAs seriously raised their heads in the IT market, Alex did it too. After working almost a decade for an international retail software company as a team leader of the Windows Mobile R&D department, he has decided to dive into Symbian OS™ Core development.