Jan 01, 2006

Strings in C++...

A few weeks ago, Danny Kalev wrote an interesting article about Visual C++ 8.0 hijacking the C++ standard. The gist of the article is that Visual Studio proclaims a few string manipulation functions as deprecated when the official standard does not. There are two things that I could not help but wonder about while reading it.

I find it interesting that there is an entire controversy over the various string manipulation functions. If you look at a string as an array of characters, the problem these functions are trying to solve is already solved in newer programming languages such as Java and C#. In those languages, the question of whether one array is shorter or longer than another effectively boils down to calling array.length. This not only removes the need to pass the length as an argument, it also prevents you from lying about how long your strings actually are. Everyone who has been programming for any length of time knows that the size of an array will change at one point or another, and unfortunately, no matter how careful you are, at least one place in the program will make some assumption about how large or small that array is.
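
To make the point concrete in C++ terms, here is a minimal sketch. The function names copyWithLength and copyDeduced are hypothetical and not taken from any library. The first function trusts whatever length the caller passes in; the second lets the compiler deduce the array size from the type, which is roughly the guarantee that array.length provides in Java and C#.

```cpp
#include <cstddef>
#include <cstring>

// The caller supplies the destination size; nothing stops that number
// from being stale or simply wrong.
// Returns the number of characters copied, excluding the terminator.
std::size_t copyWithLength(char* dst, std::size_t dstSize, const char* src) {
    if (dstSize == 0) return 0;
    std::size_t n = std::strlen(src);
    if (n >= dstSize) n = dstSize - 1;   // truncate so the terminator fits
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

// The size of a real array is part of its type, so the compiler fills in N
// and the caller has no chance to pass the wrong value.
template <std::size_t N>
std::size_t copyDeduced(char (&dst)[N], const char* src) {
    return copyWithLength(dst, N, src);
}
```

Given char name[8], a call such as copyDeduced(name, input) keeps working if the array is later resized, whereas every copyWithLength(name, 8, input) call has to be found and updated by hand.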

This, however, is unlikely to be added to C++: the existing approach of raw arrays and pointers is quite powerful and useful in many cases, not to mention that such a change would be quite painful to make at this point. But to quote the Spider-Man movie, “with great power comes great responsibility,” which in this case means the responsibility to know and honor the bounds of an array.

Regardless of that, the other thing is that this entire class of error can be prevented without changing the language, simply by using the std::string class, which manages the memory and, in particular, the length of the string for you. This is not to say that the class is without faults, or at least perceived faults. For one, there is no way to directly append an integer to the string, which generally leads to code like this:
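
A minimal sketch of the usual workaround, assuming a detour through std::ostringstream. The helper name appendInt is hypothetical; myString and value follow the names used in the text.

```cpp
#include <sstream>
#include <string>

// Appends the decimal form of value to myString by formatting it
// into a temporary stream first.
void appendInt(std::string& myString, int value) {
    std::ostringstream out;
    out << value;
    myString += out.str();
}
```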

This is unfortunate. Java and C# have both addressed this issue, by providing automatic conversion from objects to Strings (reducing the above to myString += value;) and by offering formatters directly on the string object.

In addition to the perceived missing functionality (most developers can quickly name at least one more function that they think should have been included in std::string), one major concern with the string class is its dynamic memory allocation. Interestingly enough, however, the introduction to Effective C++ offers some excellent advice: “As a programmer, you should use the standard string type whenever you need a string object. ... As for raw char*-based strings, you shouldn't use those antique throw-backs unless you have a very good reason.  Well-implemented string types can now be superior to char*s in virtually every way--including efficiency.” And this quote is from the Second Edition (1998). Of course, there are cases where you may not want the std::string class, but in those cases you should use a class that abstracts away the optimizations you require.

In addition, Item 13 of Effective STL recommends preferring std::string and std::vector to dynamically allocated arrays, pointing out that they hide many of the intricacies of dealing with pointers. It further mentions that some implementations of std::string employ reference counting with copy-on-write semantics, which can limit the number of allocations for certain usage patterns, a topic described further in Item 15.
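
As a rough illustration of that advice, here is a sketch in which a std::vector<char> takes the place of a buffer allocated with new char[n]. Both fillBuffer, a stand-in for a C-style API that writes into a caller-supplied buffer, and readText are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C-style API: writes at most size - 1 characters plus a terminator.
void fillBuffer(char* buffer, std::size_t size) {
    const char text[] = "hello";
    if (size == 0) return;
    std::size_t n = sizeof(text) - 1;
    if (n > size - 1) n = size - 1;
    std::memcpy(buffer, text, n);
    buffer[n] = '\0';
}

std::string readText(std::size_t capacity) {
    if (capacity == 0) return std::string();
    std::vector<char> buffer(capacity);      // replaces: char* buffer = new char[capacity];
    fillBuffer(&buffer[0], buffer.size());   // the vector knows its own size
    return std::string(&buffer[0]);          // no delete[] needed, even if an exception is thrown
}
```

The vector frees its storage when it goes out of scope, so there is no cleanup path to forget and no separate length variable to keep in sync.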

A couple of years ago, Andrei Alexandrescu wrote an article in CUJ's Expert Forum describing his experience with std::string. He was asked to improve the performance of an application, and it was believed that the issue was with memory allocations. A little profiling led him to believe that strings were part of the issue, so rather than rip them out, he decided to supply policies to the string classes that described how they were being used. This significantly improved the application's performance within a couple of weeks of work.

All of this is to say that the std::string class leads to code that is more readable and performs nicely, and should you encounter issues on your platform, you could potentially upgrade your STL implementation or roll your own string classes.

While declaring functions deprecated when the standard does not is a bit extreme, I think it brings awareness to the seriousness of this issue, and should developers pay attention to the warnings, it could yield more secure and stable software.
