
June 1, 2006

C++ and finally

Almost two months ago, Danny Kalev wrote a piece about adding finally to C++. Unlike Danny, however, I think it would be a great addition to C++.

Danny has a good point that part of the reason finally exists in garbage-collected languages is that they lack a destruction mechanism that allows for cleanup. For example, a common use of finally in Java is with files, sockets, and other such resources, to ensure that the resource is released (since the finalizer is not guaranteed to be run). In C#, this can be done via the using keyword, and hopefully a future release of Java will provide a similar mechanism with the same behaviour.
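
To make that concrete, here is a minimal sketch of the Java idiom in question (readFirstByte is a hypothetical helper, and I use an in-memory stream rather than a real file so the sketch is self-contained):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FinallyExample {
    // Reads the first byte of a stream, guaranteeing the stream is
    // closed on both the normal and the exceptional path.
    static int readFirstByte(InputStream in) throws IOException {
        try {
            return in.read();
        } finally {
            in.close();  // runs whether or not read() threw
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] { 42 });
        System.out.println(readFirstByte(in));  // prints 42
    }
}
```

Without the finally (or a using-style construct), every early return and every exception path would need its own close() call.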

But I think that there is a certain amount of value to adding such a feature to the language. For example, if you wanted to have a code segment display a log message at the end of a method, you would need to do something like:

  class Trace {
  public:
    Trace() {
      // ... maybe display something or grab a statistic
    }
    ~Trace() {
      // ... log something or accumulate
    }
  };

  void myFunc() {
    try {
      Trace t;
      // ... do something
    } catch ( ... ) {
      // ... handle it
    }
  }

For a general-purpose tracing class, this would be, by far, the best approach. But for a specific operation done in one specific method, I think the code above is not as readable as the same code written using finally, even when the statements in the finally block explicitly call a function.
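
For contrast, here is roughly how the same intent reads in a language that already has finally (a Java sketch; the appended strings stand in for whatever the Trace constructor and destructor would do):

```java
public class TraceDemo {
    // The same tracing intent expressed with finally instead of a
    // constructor/destructor pair.
    static String myFunc() {
        StringBuilder log = new StringBuilder();
        log.append("start;");        // what the Trace constructor did
        try {
            log.append("work;");     // ... do something (may throw)
        } finally {
            log.append("end;");      // what the Trace destructor did; runs even on a throw
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(myFunc());  // start;work;end;
    }
}
```

The cleanup reads in-line, in the order it executes, without a helper class.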

Said another way, the for keyword is equally redundant, since the 1998 C++ ISO Standard states in section 6.5.3 that the statement for ( for-init-statement ; condition ; expression ) statement is equivalent to:

  {
    for-init-statement
    while ( condition ) {
      statement
      expression ;
    }
  }

But while the latter is equivalent to the former, there is a certain readability value that comes with the for statement that makes it a welcome keyword.

I feel that the addition of a finally keyword would have the same benefits. It is easily implemented in terms of existing infrastructure, much as the for example above is. It adds readability, particularly in cases where writing a class would take many more lines than the line or two inside a finally block. Lastly, I think there is a huge benefit for people coming to C++ who are accustomed to other languages; familiarity with this feature would let such developers concentrate on other, more important C++ features instead of trying to figure out how to work around the perceived lack of this one. I think these benefits outweigh the cons of adding a new keyword, and I welcome the change.

March 22, 2006

Adapting for Concurrency...

Herb Sutter gave a talk on concurrency at PARC earlier this week, and the Audio, Video, and Slides are available.

Whereas even a few years ago multiple-processor (and multiple-core) machines were only available to niche markets, the talk highlights the fact that very soon multiple-core machines will be everywhere. This is very exciting; however, we must change how we develop our applications to take advantage of it. From this perspective, it is not simply a matter of using the existing tools we have, such as locks and threads, but also of expanding our tools and languages to better support concurrency.

Herb gives the analogy that this is similar to the early 1990’s when Object Orientation was new. While you could write an object-oriented application in C (for example, consider the FILE structure and methods), it is far easier to write such applications in languages that have native support for objects. Herb states that while we currently can write multithreaded software in our existing tools, writing correct multithreaded software is hard, pointing out that some of the class libraries and even some of his own examples have been incorrect.

During the discussion, Herb mentioned something that I was unaware of. A while ago, I discussed double-checked locking and how it was broken, and later Scott Meyers and Andrei Alexandrescu's C++ and the Perils of Double-Checked Locking. One part that I had not noticed in Scott and Andrei's paper is that, thanks to the reworked memory models, the double-checked locking pattern once again works in Java 1.5/5.0 (JSR 133) and .NET 2.0 (CLI 2.0 Section 12.6). The solution is to use the keyword volatile, as follows:

  private static volatile Singleton instance = null;

  public static Singleton get() {
    if ( instance == null ) {
      synchronized ( Singleton.class ) {
        if ( instance == null ) {  // re-check: another thread may have initialized it
          instance = new Singleton();
        }
      }
    }
    return instance;
  }

Again, this only works in Java 1.5 and .NET 2.0 because of the changes in the memory model. There is some discussion that the constraints volatile imposes may make this no faster than plain synchronization, but there is a solution that uses memory barriers instead of volatile that could get around this in .NET. Of course, some of this feels like premature optimization, as synchronization and volatile are exactly the things compiler vendors are likely to be working on improving. I guess it's like the free lunch being over.

And speaking of which, if you haven't really thought about having your desktop application take advantage of a 32-core machine, this is a good discussion to get you thinking about that, and why.

January 18, 2006

java.util.concurrent Memory Leaks...

About two months ago, one of the Java servers my group maintains ran out of memory, and since then, whenever the service was up for more than a week, we would again get the dreaded java.lang.OutOfMemoryError. Each time this happened, we took a histogram of the memory usage (jmap -histo pid). After gathering a few of these and comparing the results to what we expected, two classes seemed to be using a suspiciously large amount of memory, java.util.concurrent.LinkedBlockingQueue$Node and java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, so I started looking at how they were used.

The java.util.concurrent.LinkedBlockingQueue was used in several places in the program; however, its usage was fairly straightforward. Besides usually being the class using the most memory, what made it suspect was that its usage had been added just a week before the problem first occurred. Furthermore, in all of the various tests that I ran, the number of Nodes would increase significantly, but when the full garbage collector ran (either automatically or when forced), the number of nodes would drop back to the number of entries I expected. This made me suspect that perhaps the garbage collector was not running when the OutOfMemoryError was thrown, or that perhaps there was not enough memory at that point even to run the garbage collector, but alas, I was quickly brought back to my senses.

While investigating the above, I googled the class, which brought up some interesting results. On the Concurrency Interest mailing list, an entry from last year suggests a memory leak in LinkedBlockingQueue, and it was confirmed by Doug Lea. I re-wrote the program (in the method originallyReportedLeak) suggested in the message to confirm that it was an actual issue, and sure enough, it is a current problem. Running the program with the JMX arguments, you can attach jconsole and watch the memory grow, and no amount of forcing the garbage collector to run will bring it down.
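
I won't reproduce the original program here, but a minimal sketch of the usage pattern it exercised (timedOutPolls is my name, not one from the mailing list) is repeatedly timing out on a timed poll of an empty queue:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueLeakSketch {
    // Nothing is ever offered to the queue, so every timed poll times
    // out. On affected (pre-Mustang) JVMs, driving a loop like this
    // hard caused internal wait nodes to accumulate and the heap to
    // grow with no way to collect it; on fixed JVMs it stays flat.
    static int timedOutPolls(int iterations) throws InterruptedException {
        LinkedBlockingQueue<Object> queue = new LinkedBlockingQueue<Object>();
        int timeouts = 0;
        for (int i = 0; i < iterations; i++) {
            if (queue.poll(1, TimeUnit.MILLISECONDS) == null) {
                timeouts++;  // poll returned null: the timeout triggered
            }
        }
        return timeouts;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(timedOutPolls(100));  // prints 100
    }
}
```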

But of course, that was not my problem. The leak in the program above has to do with a timeout occurring. In all but one of the cases, we were not using the methods with a timeout, and in that one case the timeout value was MAX_INT; because of the way we use this particular service, there is no way it could have timed out and leaked. And as I mentioned above, forcing a garbage collection always brought the memory back to normal, so alas, back to the drawing board.

The other class using an outrageous amount of memory, AbstractQueuedSynchronizer's Node inner class, was not as easy to trace to its point of use. Using Borland OptimizeIt, however, I found that the class was used by java.util.concurrent.CountDownLatch, where the code was calling await() with a one-second timeout, and in this case the timeout was always triggered. Sound familiar? I wrote a program (in the method anotherSimilarLeak) that demonstrates this precise memory leak, and sure enough, it is the same underlying problem as in the LinkedBlockingQueue above.
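
The pattern boiled down to something like the following sketch (timedOutAwaits is a hypothetical name): a latch that is never counted down, so every timed await() times out, mirroring the one-second-timeout usage described above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchLeakSketch {
    // The latch count never reaches zero, so every timed await()
    // returns false after its timeout. On affected JVMs, each
    // timed-out wait left an AbstractQueuedSynchronizer$Node behind.
    static int timedOutAwaits(int iterations) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);  // never counted down
        int timeouts = 0;
        for (int i = 0; i < iterations; i++) {
            if (!latch.await(1, TimeUnit.MILLISECONDS)) {
                timeouts++;  // await returned false: the timeout triggered
            }
        }
        return timeouts;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(timedOutAwaits(100));  // prints 100
    }
}
```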

In this particular program, we decided that the latch was not necessary for this particular method, and that replacing it with a sleep was all that was needed. This bug is fixed in Mustang, the next version of Java, so hopefully your project can wait until it is fixed. Otherwise, you may unfortunately need to roll your own timeout functionality... just make sure to properly document such hacks so that you can remove them in the near future!

January 1, 2006

Strings in C++...

A few weeks ago, Danny Kalev had an interesting article about Visual C++ 8.0 hijacking the C++ standard. The gist of the article is that Visual Studio proclaims a few string manipulation methods as deprecated when the official standard does not. There are two things I could not help wondering about while reading, though.

I find it interesting to have this entire controversy over the various string manipulation functions. If you look at strings from the perspective of an array of characters, the problem they are trying to solve is already solved in newer programming languages, such as Java and C#. Determining whether one array is shorter or longer than another in these languages effectively boils down to calling array.length. This not only removes the requirement of passing the length as a separate argument, it also prevents you from lying about how long your strings actually are. Everyone who has been programming for any period of time knows that the length of that array will change at one point or another, and unfortunately there will be at least one place in the program that makes some assumption about how large or small the array is, no matter how careful you actually are.
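
A tiny Java sketch makes the point concrete: the length travels with the array itself, so there is no separate size argument to get out of sync.

```java
public class LengthDemo {
    // The length is a property of the array, so callers never pass
    // (and can never misstate) a separate size argument.
    static boolean isShorter(int[] a, int[] b) {
        return a.length < b.length;
    }

    public static void main(String[] args) {
        System.out.println(isShorter(new int[]{1, 2}, new int[]{1, 2, 3}));  // true
    }
}
```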

This, however, is unlikely to be added to C++, as the raw-array technique is quite powerful and useful in many cases, not to mention that it may be quite painful to add at this point. But to quote the Spider-Man movie, "with great power comes great responsibility," and in this case, that is the responsibility to know and honor the bounds of an array.

Regardless of that, the other thing that prevents this entire class of error, without changing the language, is simply using the std::string class, as it manages the memory, and in particular the length, of the string for you. That is not to say that the class is without faults, or at least perceived faults. For one, there is no way to directly append an integer to the string, which generally leads to code like this:

  std::string myString = ...;
  int value = ...;
  char str[25];
  snprintf( str, sizeof( str ), "%d", value );
  myString += str;

This is unfortunate. Java and C# have both addressed this issue, both by having automatic conversion from objects to strings (making the above simply myString += value;) and by having formatters directly in the string class.

In addition to the perceived missing functionality (most developers will quickly tell you at least one more function they think should have been included in std::string), one major concern with the string class is its dynamic memory allocations. Interestingly enough, the introduction to Effective C++ offers some excellent advice: "As a programmer, you should use the standard string type whenever you need a string object. ... As for raw char*-based strings, you shouldn't use those antique throw-backs unless you have a very good reason. Well-implemented string types can now be superior to char*s in virtually every way--including efficiency." And this excellent quote is from the Second Edition (1998). Of course, there are reasons why you may not want the std::string class, but in those cases you should use a class that abstracts out the optimizations you require.

In addition to this, Item 13 of Effective STL suggests preferring std::string and std::vector to dynamically allocated arrays, since they hide many of the intricacies of dealing with pointers, and further mentions how some implementations of std::string employ reference counting and copy-on-write semantics that can limit the number of allocations for certain usage patterns, as described further in Item 15.

A couple of years ago, Andrei Alexandrescu wrote an article in CUJ's Expert Forum describing his experience with std::string. He was asked to improve the performance of an application, and it was believed that the issue was with memory allocations. A little profiling led him to believe that strings were part of the problem, so rather than rip them out, he decided to provide some policies to the string classes reflecting how they were being used. This significantly improved the application's performance within a couple of weeks of work.

All of this is to say that the std::string class leads to code that is more readable and performs nicely, and should you encounter issues on your platform, you can potentially upgrade your STL implementation or roll your own string class.

While declaring functions deprecated that are not deprecated in the standard is a bit extreme, I think it brings awareness to the seriousness of this issue, and should developers pay attention to the warnings, it could yield more secure and stable software.

December 21, 2005

Of Hacks and Keyboards...

I am not sure if I missed something, but I am completely surprised by this conversation about being notified when a tab changes in a TabControl. I am surprised that no one mentioned the TabIndexChanged event or, if you are working on a platform that does not support that event (such as Windows CE), the SelectedIndexChanged event.

But what is even more surprising to me is that, not only did someone suggest setting up a mouse event handler to be notified when the user selects the tab, the original author had actually tried it! Fortunately, it seems that this does not work.

From a usability perspective, what about the keyboard? While most people use a mouse for selecting tabs, some people use the keyboard, and depending on what happens when the event occurs, the user experience of the keyboard user would be different than that of the mouse user. Debugging this over e-mail could be rather amusing, to say the least.

When creating hacks, you have to consider the cases in which the hack would not work, and in a user interface, the inability to use the keyboard is definitely something to avoid, especially when a little further research would have provided the desired and correct behaviour with minimal effort, as in the case above. Unfortunately, sometimes that is not possible, and in those cases the caveats must be minimized, well understood, and documented. Based on the discussion, I do not believe that was done for this particular issue.