Adding Stack Traces to C++ Exceptions...
One of the features that I love
about exceptions in Java, Python, and other languages is the stack
trace that they provide, detailing precisely where the exception was
raised. This stack trace is invaluable when placed in log files or when
you do not have a debugger available, since you can now know how your
application got to the exception, which should give you some great
context for starting to approach the issue.
In this week's InformIT
C++ Guide, Danny mentioned C++
exception-handling tricks for Linux
from IBM. One of the tricks mentioned in the article is how to get a
stack trace where the exception is thrown; the technique used
is well documented in the GNU
C Library Documentation.
The example code in the article demonstrates the concept well; however, displaying the stack trace in the constructor is not ideal, since this information should be made available to the code that catches the exception. At that point, a decision can be made to log a summary of the exception, log the full exception, or handle it in some other fashion based on the context.
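To make the idea concrete, here is a minimal sketch of an exception that captures the trace in its constructor but leaves the decision about what to do with it to the catching code. It assumes the GNU C Library's backtrace() and backtrace_symbols() from <execinfo.h>; the class name TracedException and its methods are hypothetical and are not the code from the IBM article.

```cpp
#include <execinfo.h>   // backtrace(), backtrace_symbols(): GNU C Library only
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <exception>
#include <iostream>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical class: captures the stack at construction, prints only on request.
class TracedException : public std::exception {
public:
    explicit TracedException(const std::string& message) : message_(message) {
        void* frames[32];
        int count = backtrace(frames, 32);
        char** symbols = backtrace_symbols(frames, count);
        if (symbols != 0) {
            for (int i = 0; i < count; ++i) {
                trace_.push_back(symbols[i]);
            }
            std::free(symbols);  // a single free() releases the whole array
        }
    }

    const char* what() const noexcept override { return message_.c_str(); }

    // The catcher decides whether to log a summary or the full trace.
    const std::vector<std::string>& stackTrace() const { return trace_; }

    void printStackTrace(std::ostream& out) const {
        out << message_ << '\n';
        for (std::size_t i = 0; i < trace_.size(); ++i) {
            out << "  at " << trace_[i] << '\n';
        }
    }

private:
    std::string message_;
    std::vector<std::string> trace_;
};

// Usage: the catch site, not the constructor, chooses to print the trace.
void example() {
    try {
        throw TracedException("Something went dreadfully wrong.");
    } catch (const TracedException& e) {
        assert(!e.stackTrace().empty());
        e.printStackTrace(std::cerr);
    }
}
```

The frame text depends on the platform and on how the executable was linked, so the sketch makes no promise about what each line contains.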
When I went to run the first application with it, I was expecting a stack trace similar to Java's. The actual output, though, was:
- Something went dreadfully wrong.
- at ./TestException [0x804aaaa]
- at /lib/tls/libc.so.6(__libc_start_main+0xe0) [0x4012cb10]
- at ./TestException(__gxx_personality_v0+0x119) [0x8048f01]
Using GDB, you can run list *address (or use a similar tool, such as objdump) on that same executable to get the same information that a Java or Python stack trace gives you, albeit with more work on your part. Furthermore, this approach only works on Linux or HURD systems, since it relies on the GNU C Library; hopefully the other operating systems you support have some similar mechanism.
In general, however, the most
important piece of information in the stack trace is the first entry, viz.
where the exception was thrown, and in many cases, you do not need to
dig any further than it. Because of this, we can use the standard
C/C++ predefined macros __FILE__ and __LINE__
and have them passed into our constructor. By using a quick
and dirty macro,
we can then do something like this:
- MY_THROW( "Something went dreadfully wrong." );
Using this mechanism, the exception can now provide us with the following information:
- Something went dreadfully wrong.
- at TestException.cpp:13
- at /lib/tls/libc.so.6(__libc_start_main+0xe0) [0x4012cb10]
- at ./TestException(__gxx_personality_v0+0x119) [0x8048f01]
And now I have all the stack
information if I need it, but I also have the filename and line number
where the exception is raised. Furthermore, the filename and line
number are available for all standard C or C++ compilers, and thus
provide a portable mechanism to provide at least the most important
information.
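A minimal sketch of such a macro follows. Only the name MY_THROW comes from the text above; the class LocatedException and this particular macro definition are hypothetical illustrations, not the contents of ExceptionWithStack.h.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical exception that records where it was thrown.
class LocatedException : public std::exception {
public:
    LocatedException(const std::string& message, const char* file, int line)
        : file_(file), line_(line) {
        std::ostringstream text;
        text << message << "\n  at " << file << ":" << line;
        text_ = text.str();
    }

    const char* what() const noexcept override { return text_.c_str(); }
    const std::string& file() const { return file_; }
    int line() const { return line_; }

private:
    std::string text_;
    std::string file_;
    int line_;
};

// __FILE__ and __LINE__ expand at the point where the macro is used,
// so they identify the throw site rather than this definition.
#define MY_THROW(message) \
    throw LocatedException((message), __FILE__, __LINE__)

// Usage: the catcher can read the location without any platform support.
void example() {
    try {
        MY_THROW("Something went dreadfully wrong.");
    } catch (const LocatedException& e) {
        assert(e.file() == __FILE__);
    }
}
```

A macro is needed here because a plain function would report its own file and line instead of the caller's.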
As such, I leave you with my
refactoring of this exception class: ExceptionWithStack.h
and ExceptionWithStack.cpp,
and an example program, TestException.cpp.
I have only provided a printStackTrace()
method similar to Java's;
a real environment would require other methods.
My class derives from std::exception
simply so that the existing code where I plan to use this class can
be updated to this mechanism. The
code compiles on both g++ and
Visual C++ 2003. This is a great extension; it is just too bad that it
is not part of every exception.
The Proposed New Purpose for the C++ auto Keyword...
One of the little-known
keywords of C++ (or even C, for that matter) is the auto
keyword, and however unknown it is, it describes the default behaviour for
local variables. The keyword is one of the storage class specifiers,
alongside extern,
static, and register,
and means that the variable has automatic storage, which in practice
means it is allocated on the stack. In my
more than ten years of developing in C and C++, I have seen this
keyword used probably twice; neither use was required by any means,
and both caused more confusion than if the keyword had been left out.
According to the
InformIT C++ Guide article auto
for the People,
Bjarne Stroustrup and Jaakko Järvi have a proposal for C++0x
to reuse this keyword and to add a new one, decltype.
In their proposal, the auto
keyword allows the type of a variable to be automatically discovered.
For example, consider the following segment:
- std::vector<MyClass> v;
- ...
- for ( auto it = v.begin(), end = v.end(); it != end; ++ it ) {
- ...
The attentive reader will
notice that in line 3, the type of the iterator, it,
is not declared; rather, it is deduced to be what we would write today: std::vector<MyClass>::iterator.
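For comparison, here are the two spellings side by side. This is a sketch that assumes a compiler implementing the proposed deduction (the form that C++11 later standardized); MyClass and both function names are placeholders invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MyClass { int value; };  // placeholder element type

// Today's spelling: the iterator type is written out in full.
std::size_t countExplicit(std::vector<MyClass>& v) {
    std::size_t n = 0;
    for (std::vector<MyClass>::iterator it = v.begin(), end = v.end();
         it != end; ++it) {
        ++n;
    }
    return n;
}

// Proposed spelling: the compiler deduces the same iterator type.
std::size_t countDeduced(std::vector<MyClass>& v) {
    std::size_t n = 0;
    for (auto it = v.begin(), end = v.end(); it != end; ++it) {
        ++n;
    }
    return n;
}

void example() {
    std::vector<MyClass> v(3);
    assert(countExplicit(v) == countDeduced(v));
}
```

Both loops compile to the same thing; the only difference is what the reader can see in the source.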
Danny’s spin on this
new purpose for auto
appears focused on the number of characters to type (and while the
comments on Java may be true, I do not really feel it adds to the
article at all), and this is really where my concern is for this
feature. I think that code that exploits this behaviour will have
maintainability issues. For example, consider the following segment:
- auto v = myFunction();
- for ( auto it = v.begin(), end = v.end(); it != end; ++ it ) {
- ...
Looking at this code, I assume
that myFunction
returns some type of container that has an iterator, but this is simply
an assumption based on how the variable v
is used. During the maintenance of code, this return value could
change, and this is where I am sure that the errors generated by the
compiler can get a little odd, especially if myFunction()
is actually in a library, and the developer porting it to the latest
version is not the developer who has changed this behaviour.
Today if this problem occurred,
the error would occur right when you call myFunction(),
since there is a type mismatch, but with this new behaviour, the error
would probably occur somewhere in the for
loop where v
is actually used.
Perhaps this example is a bit
too simple since we are dealing with the standard library.
But consider
the situation where the type is not a vector. How do you know what
methods to call on this class? Is it even a class? As Code
Reading reminds us, we definitely read code
more times than we write it; is this typing convenience really worth
this ambiguity? I do not think so. There is only one place
where I can see myself using it, and I will get to that in a moment.
The other keyword, decltype,
yields the type of the expression in question, similar to the
non-standard typeof
operator provided by some compilers. For example, decltype(
it ) would yield std::vector<MyClass>::iterator,
and unfortunately this is as detailed as Danny really gets on this
topic.
For developers who have not
delved into the world of generic programming, I do not see these
features as being useful at all. In the context of generic
programming though, both of these keywords will make generic
programming easier and far more useful. How many times have you had to
add more parameters to your generic algorithm because it used another
algorithm whose return type differs? This is a perfect application for
the auto
keyword. The decltype keyword has similar benefits, allowing the
return type of a generic algorithm to depend on some external
method, for example.
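As a sketch of that generic-programming use, the function below lets its return type follow whatever the underlying operation produces. It uses the trailing-return syntax that C++11 eventually adopted, which may differ from the proposal's own examples; the name multiply is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// The return type is whatever a * b yields for the given A and B,
// so no extra template parameter is needed to name it.
template <typename A, typename B>
auto multiply(const A& a, const B& b) -> decltype(a * b) {
    auto product = a * b;  // deduced intermediate, no spelled-out type
    return product;
}

// int * double is deduced as double without the caller saying so.
static_assert(std::is_same<decltype(multiply(2, 1.5)), double>::value,
              "return type follows the underlying operation");

void example() {
    assert(multiply(2, 1.5) == 3.0);
    assert(multiply(3, 4) == 12);
}
```

Without decltype, the usual workaround is a third template parameter or a traits class that the caller has to get right.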
Of course, this is still a proposal and is not guaranteed to be in the C++0x standard. Nevertheless, I see it as a really practical feature for generic programming. But please stay away from using it to save typing!
The Multiple Personalities of the Singleton
The Singleton design pattern (GoF) is by far one of the most popular and most used design patterns, but it is mostly associated with only one of its many uses, viz. the pattern's intent: providing a single instance of a class from one global access point (and, of course, the double-checked locking pattern, or a safe adaptation thereof, for multithreaded code). In a recent discussion on a Patterns mailing list, the topic of destroying a Singleton object came up, and it was interesting to see the various answers offered during the thread. This question is more involved than it first appears, and the problem is less about the mechanism of destroying the object than about the type of Singleton that we are talking about.
Unfortunately, this problem is not really discussed in the literature. The only place I am aware of that has a good modern treatment of the issue is Modern C++ Design, in which Andrei says, “From a client’s standpoint, the Singleton object owns itself. There is no special client step for creating the singleton. Consequently, the Singleton object is responsible for creating and destroying itself. Managing a singleton’s lifetime causes the most implementation headaches.”
One of the approaches recommended in the aforementioned thread was to simply do nothing. Bearing in mind that in languages such as C++, Singletons are generally created on the heap, some may be concerned about a memory leak. But as noted in Effective C++ Item 10, there is no memory leak in this approach: to have a memory leak, you must lose a reference to the memory, which does not happen here, and on most modern operating systems, memory used by an application is returned to the operating system when the application terminates.
Instead, this approach has a worse problem, in that there is probably a resource leak. For example, a printer queue Singleton will likely have some communication link with a printer or with other distributed printer queues or clients. By exiting without properly closing those connections, it is possible to have global leaks throughout your network. You will likely have to develop software to guard against such problems in the unlikely case that a hardware or software failure occurs, but such problems should be exceptional and not the norm. In the cut-and-dried case of opening an application to display the contents of a print queue, you do not want such problems on your network.
If you are using a language like C++, one technique recommended in More Effective C++ Item 26 is what Andrei refers to as the Meyers Singleton, whereby you let the language do the construction only when it is required and let the language do the destruction at the end of the application. An example of this technique follows:
Singleton&
Singleton::getInstance() {
    static Singleton obj;
    return obj;
}
As explained in detail in the
two aforementioned resources, the objects are destroyed in a LIFO
manner, which is arranged through the little-known function atexit().
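The lazy construction can be observed with a small counter. The following is a sketch only: the constructions() counter is a hypothetical addition made for the demonstration and is not part of the technique itself.

```cpp
#include <cassert>

class Singleton {
public:
    static Singleton& getInstance() {
        static Singleton obj;  // constructed on the first call only
        return obj;
    }

    // Hypothetical instrumentation, present only for this demonstration.
    static int constructions() { return constructions_; }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() { ++constructions_; }
    ~Singleton() {}  // runs automatically when the program exits

    static int constructions_;
};

int Singleton::constructions_ = 0;

void example() {
    Singleton& a = Singleton::getInstance();
    Singleton& b = Singleton::getInstance();
    assert(&a == &b);                         // always the same object
    assert(Singleton::constructions() == 1);  // built exactly once
}
```

Nothing is constructed until the first call to getInstance(), and the private destructor still runs at exit because the local static is declared inside a member function.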
There are three unmistakable
issues with this approach. First, dependencies on other Meyers
Singletons can become problematic, as Andrei describes, since the order
of construction and destruction can lead to undefined behaviour. The
example that Andrei provides is the one where the Logger is only
initialized when there is an error, and it emits an error on the
Display. For the sake of this example, the Keyboard is instantiated
first and then the Display, and during instantiation, the Display
experiences an error, resulting in the Logger being instantiated. When
you exit the application, these Singleton objects are destructed in the
opposite order that they were constructed, thus the Logger is
destructed first, then the Display, and finally the Keyboard, but for
our example, let's say that the Keyboard experiences an error on
destruction. What now? If the Keyboard accesses the Logger, it now
uses a Dead Reference, a reference to an object that has already been
destroyed, which leads to undefined
behaviour. As there is no way in C++ to define these types of
dependencies amongst static variables, the solution that Andrei puts
forth is to track the destruction of the object via a boolean variable,
so that the failure can at least be detected, but it is still not
ideal. An alternative that Andrei describes in Modern C++ Design is
what he refers to as the Phoenix Singleton: when it detects
that it has died, the object is recreated using placement new and
destroyed again by registering with atexit()
(with a few library implementation caveats here, since the behaviour
is
undefined for this case).
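A minimal sketch of the boolean-flag idea follows. It is written in the spirit of the technique rather than copied from Modern C++ Design, and the names Logger, instance(), and isDestroyed() are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

class Logger {
public:
    static Logger& instance() {
        if (destroyed_) {
            // The object has already been destructed at exit: report it
            // instead of silently handing out a dead reference.
            throw std::runtime_error("Dead reference to Logger detected");
        }
        static Logger obj;
        return obj;
    }

    static bool isDestroyed() { return destroyed_; }

    Logger(const Logger&) = delete;
    Logger& operator=(const Logger&) = delete;

private:
    Logger() {}
    ~Logger() { destroyed_ = true; }  // remember that we have died

    // A plain bool needs no destructor, so it stays readable
    // even after the Logger object itself is gone.
    static bool destroyed_;
};

bool Logger::destroyed_ = false;

void example() {
    Logger& log = Logger::instance();
    assert(&log == &Logger::instance());
    assert(!Logger::isDestroyed());
}
```

This only detects the problem; a Phoenix Singleton would go one step further and rebuild the object at the point where this sketch throws.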
The second is that the type of Singleton must be known a priori; for example, say that your Singleton represents access to a configuration file. While today you may only be interested in a file on disk, it is possible that in the future you will have requirements that the data be stored in a database, in another format (such as XML), or remotely using some middleware (such as CORBA). The above approach would require you to make this decision at design time, not runtime.
The other issue is with the lifetime. While we have delayed the construction of the object, the object's destruction is still at the end of the application. Depending on the genre of Singleton, this could be a desired attribute, but it is important to distinguish between the application’s lifetime and the Singleton’s.
In my last sentence, I may have
struck a nerve with some, but it is important to note that the GOF
book
provides several consequences to choosing a Singleton design pattern.
Most people are really comfortable with the first one, which is to
provide access to a sole, unique instance. This means that during the
lifetime of an application, any call to getInstance()
returns the same object, at the discretion of the Singleton
object. In other words, the Singleton can decide how and when
clients will use the object.
The second consequence is that it can clean up the global namespace; in particular you do not have global data or methods, and instead just have a simple interface to acquire an instance to the class. This may not be as important today as it was originally, with concepts like packages and better compiler support for namespaces (The C++ ISO Standards committee voted namespaces into the language in July 1993, per The Design and Evolution of C++).
The third consequence is that the Singleton pattern lends itself nicely to refinement of operations and representations. In other words, the client using a Singleton class is interested in performing a function; they are not particularly interested in how the class is implemented. The configuration file example I mentioned previously fits this nicely. The user is not particularly interested in whether it is a file, a database, or something accessed via middleware. The important aspect is that it provides the functionality that the client uses and supports the contract set forth by the Singleton.
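One way to picture this refinement is a Singleton whose concrete representation is chosen at runtime behind an unchanged access point. The sketch below is an illustration under assumptions: the class names, the selectBackend() hook, and the "file"/"database" strings are all hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Configuration {
public:
    virtual ~Configuration() {}
    virtual std::string source() const = 0;

    // Hypothetical hook: takes effect only before the first getInstance().
    static void selectBackend(const std::string& name) { backend() = name; }

    static Configuration& getInstance();

private:
    static std::string& backend() {
        static std::string name = "file";  // default representation
        return name;
    }
};

class FileConfiguration : public Configuration {
public:
    std::string source() const override { return "file"; }
};

class DatabaseConfiguration : public Configuration {
public:
    std::string source() const override { return "database"; }
};

Configuration& Configuration::getInstance() {
    // The concrete type is decided once, at first use, not at design time.
    static std::unique_ptr<Configuration> instance(
        backend() == "database"
            ? static_cast<Configuration*>(new DatabaseConfiguration)
            : static_cast<Configuration*>(new FileConfiguration));
    return *instance;
}

void example() {
    Configuration::selectBackend("database");
    Configuration& config = Configuration::getInstance();
    assert(config.source() == "database");
    assert(&config == &Configuration::getInstance());
}
```

Clients only ever see Configuration::getInstance(), so swapping the representation does not touch their code.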
The fourth consequence is actually one that surprises many people, in that it states that the Singleton object can control the number of objects that are on a particular system. This can be used for many effects, such as returning a unique instance for each Thread in the system or something more like load balancing.
The fifth consequence essentially highlights the differences between the Singleton design pattern and the Monostate design pattern, in that the Singleton pattern provides more flexibility in implementation than the Monostate pattern.
Now that we have gone through the five consequences, it should be clear that the Singleton design pattern’s purpose is not to provide access to one object, but rather to control access to objects, centralizing this control, such that you can easily modify the implementation’s policy for accessing objects in a single place, the Singleton itself. For example, the Pool pattern from POSA3 could easily be a specialization of the Singleton pattern.
Obviously, calling delete on the object returned by
Singleton::getInstance() is
not the solution; in addition to the implementation of the Singleton
object changing, of equal concern is the fact that the memory
may not have been allocated by new.
It may have been allocated by a placement new or a memory manager,
in which case deleting the object via delete
would be inappropriate.
John Vlissides, one of the GoF, wrote about some of this in his Pattern Hatching and, in particular, To Kill a Singleton. This is an excellent read on the topic, although it is partly dependent on C++. Nevertheless, the first technique that John discusses resembles the Meyers Singleton. In it, he recommends letting the language reclaim the memory by creating a class that holds a Singleton and is responsible for destroying it. Of course, just as with the Meyers Singleton, this scheme does not work when there are Singleton dependencies. Furthermore, it only works when the lifetime of the Singleton is the same as the lifetime of the application, which John indicates is normally the case. In the Odds and Ends section of the column, John notes that the static system employed in these schemes is not thread-safe; since C++ does not yet have knowledge of threads, multiple threads can access the same static initializer at the same time, and no locking scheme can prevent this from occurring (see also the double-checked locking pattern).
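The holder-class idea can be sketched as follows. This is an illustration in the spirit of that technique rather than the code from the column; the names SingletonDestroyer, setDoomed(), and PrintQueue are hypothetical, and like the original scheme it is not thread-safe.

```cpp
#include <cassert>

// Holds a pointer to a Singleton and deletes it when the holder itself
// is destructed at the end of the application.
template <typename T>
class SingletonDestroyer {
public:
    // constexpr, so a static holder is initialized before any code runs.
    constexpr SingletonDestroyer() : doomed_(nullptr) {}
    ~SingletonDestroyer() { delete doomed_; }

    void setDoomed(T* instance) { doomed_ = instance; }

    SingletonDestroyer(const SingletonDestroyer&) = delete;
    SingletonDestroyer& operator=(const SingletonDestroyer&) = delete;

private:
    T* doomed_;
};

class PrintQueue {
public:
    static PrintQueue& instance() {
        if (instance_ == nullptr) {
            instance_ = new PrintQueue;       // created on the heap
            destroyer_.setDoomed(instance_);  // hand over for cleanup
        }
        return *instance_;
    }

private:
    friend class SingletonDestroyer<PrintQueue>;  // may call the destructor
    PrintQueue() {}
    ~PrintQueue() {}  // the place to close connections to printers

    static PrintQueue* instance_;
    static SingletonDestroyer<PrintQueue> destroyer_;
};

PrintQueue* PrintQueue::instance_ = nullptr;
SingletonDestroyer<PrintQueue> PrintQueue::destroyer_;

void example() {
    PrintQueue& queue = PrintQueue::instance();
    assert(&queue == &PrintQueue::instance());
}
```

Clients never call delete themselves; the static destroyer_ member does it when the program shuts down, which is exactly why the scheme ties the Singleton's lifetime to the application's.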
In response to the dependency
issue, John simply points out that an atexit()
scheme (similar to Andrei's) may be the only way around this,
although the difference between Andrei's scheme and
John's is that John recommends a single object that
takes care of deleting all the Singleton objects in an application and
maintains the dependency information as
required. David L. Levine, Christopher D. Gill, and Doug
Schmidt formalized
this into the Object
Manager Pattern, which
aims to resolve many of the issues
raised above, not to mention the issue with
certain embedded systems and static initialization.
So what is the proper answer to how to destroy a Singleton? The answer really varies based on your use and system architecture. In a large application, the Object Manager should be used or at least considered. For smaller applications, this can depend on what the Singleton really is, and how important it is for it to be destroyed. But there is definitely more to it than meets the eye.
Choosing the Price...
Joel on Software has a new piece entitled Camels and Rubber Duckies, which talks about selecting the price of software. Joel presents the entire science behind the price, and then tears it all apart.
He presents some interesting ideas in the discussion. When you say that your software is worth a particular value, your customers will immediately compare you to your competition. The UML modeler prices that I previously discussed are an amazing example of this. You have products that range from $100 USD all the way up to $5000 USD, and I completely agree with Joel when he says that choosing a cheaper product sometimes makes you think that you are getting a lesser product. Maybe you are. Maybe you aren't. The real difference is that the organizations that are selling it at $5000 have customers that are willing to purchase it at that price, but obviously students and individuals cannot afford these prices. Unfortunately, you cannot figure out how much people are willing to pay until you get it out there.
Although Joel does not present any real solid information, it is a great read, especially if you are looking at setting the price for something.
Static Generic Methods in Java...
I have been applying Java generics to my new code base, and thanks to them, I have already caught a few errors that I had not yet noticed, where the wrong type was being inserted into a container. Of course, such errors would have been caught by proper unit tests, but by using generics, I knew about the problem immediately.
Earlier today, however, I was trying to sort a List, and applying my knowledge of C++ templates was not as intuitive as I thought it would be, but this all just turned out to be my own mistake. The issue is that I had created a type, Type, that implemented the Comparable interface, but I had not made use of the generic type here, as follows:
class Type implements Comparable {
    public Type() { /* ... */ }
    public int compareTo( Object rhs ) {
        if ( rhs instanceof Type ) {
            Type r = (Type)rhs;
            // Do the comparison
        } else {
            // Don't know what to do with this; make something up
        }
    }
}
The problem that I encountered though is when I went to sort a collection of this type. I had created a list of the type, so I was looking at using java.util.Collections's sort() method, so I pecked in the following code:
java.util.List<Type> list = new java.util.LinkedList<Type>();
// populate list
java.util.Collections.sort( list );
And by so doing, I received the ever-popular unchecked method invocation warning. My instinct was that it needed a hint about being Comparable, and I was surprised that the syntax was not:
java.util.Collections<Type>.sort( list );
But rather:
java.util.Collections.<Type>sort( list );
This looks very awkward, to say the least, never mind that I was expecting it to infer this knowledge from what I had told it. But either way, it was still giving me that warning, and it was only when I started looking at the signature a bit closer that I realized my mistake.
The final version of the code is not unlike the original one after all, only much cleaner:
class Type implements Comparable<Type> {
    public Type() { /* ... */ }
    public int compareTo( Type rhs ) {
        // Do the comparison
    }
}
java.util.List<Type> list = new java.util.LinkedList<Type>();
// populate list
java.util.Collections.sort( list );
If you look at the actual changes, the code is considerably cleaner. The ugly case of figuring out what to do if compareTo is called on a type that you do not know anything about, for example, is completely removed, since it can no longer be called this way.
This is exactly what I was thinking about when I was writing Java Generics: Better Code or Worse?, but I had not really come across such a solid example as this one. Most of the people who preach Java generics do so by showing you how you no longer need to cast types from Object in containers and the safety that this affords you. And while this is an extremely visible portion of generics, generic interfaces such as Comparable allow you to write very clean code, as illustrated above, letting you focus on the types you care about and not the odd-ball cases.