Aug 11, 2004

Over-Engineering or Abusing C++?

In this week's InformIT C++ newsletter, Danny Kalev introduced his Over Engineering article by describing over-engineering as a disease that harms software development, testing, and documentation. The article purports to provide some keys to identify and curb this ailment, and so I eagerly read it.

When I think of over-engineering, I immediately think of developers adding features that are not required. I touched on this earlier this week in my posts on Leveraging Frameworks and The End of Catch-All User, in relation to developers not really knowing what the end-user wants and trying to please every possible end-user. As I mentioned in the former entry, this dates back to the very beginning of our industry.

As such, when I started reading this article, I was rather surprised, because that is not what it is about. The article is really about language features that Danny Kalev does not like or feels are abused, and in many of those cases, I would not call this over-engineering. For example, his first complaint is about exceptions, because he states that they lead to code that looks like this:

try {
  url = new java.net.URL( "https://www.eyt.ca" );
} catch ( java.net.MalformedURLException murl ) {
  // Empty: the exception is silently ignored
}

In this snippet of code, the exception raised by java.net.URL is ignored, and Danny claims that this is a very common “idiom” in “every Java-based application.” Every is a little too general: in my coding standard the above code is unacceptable, and it will not appear in my projects unless there is a comment indicating why we are ignoring the exception (that is, in the above case, if that URL is invalid, we have a lot of issues) or, at the very least, the exception is logged. But these two cases are extremely rare. IntelliJ IDEA's code analysis feature, for example, states that an empty catch block is occasionally intentional, but warns that such instances are hard to debug. I prefer to either rethrow the exception or map it onto an exception that already exists, using the J2SE 1.4 feature of Throwable.initCause(), so that the history is completely available.
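To make the mapping approach a little more concrete, here is a minimal sketch of what I have in mind. The names ConfigurationException and UrlLoader are made up for this illustration; only java.net.URL, MalformedURLException, and Throwable.initCause() come from the JDK.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical application-level exception; the name is an assumption
// made for this sketch, not something from a real library.
class ConfigurationException extends Exception {
    ConfigurationException(String message) {
        super(message);
    }
}

// Hypothetical helper class, also invented for illustration.
class UrlLoader {
    // Maps the low-level exception onto an application exception and
    // attaches the original as the cause, so the history is not lost.
    static URL parse(String spec) throws ConfigurationException {
        try {
            return new URL(spec);
        } catch (MalformedURLException murl) {
            ConfigurationException mapped =
                new ConfigurationException("Invalid URL: " + spec);
            mapped.initCause(murl); // keep the original exception as the cause
            throw mapped;
        }
    }
}
```

A caller that catches ConfigurationException can still reach the original MalformedURLException through getCause(), and a printed stack trace shows both.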

I do not see exceptions as “lip service,” as Danny says. Suppose that java.net.URL's constructor did not throw an exception; how then would you know that there was a problem? Well, then you would have to do something like the following:

java.net.URL url = new java.net.URL( "https://www.eyt.ca" );
// operable() is hypothetical; java.net.URL has no such method
if ( !url.operable() ) {
  // Do something smart here
}

Or, if you prefer to write the above in the same style as the empty-catch:

java.net.URL url = new java.net.URL( "https://www.eyt.ca" );

There is plenty of code out there (and not just in C++) that looks exactly like the line above and does not check for any type of error. Raising exceptions, by contrast, means that you are forced to write code that handles the problems. You may think this is a bad thing, but how many times have you written code that does not check for errors, that works perfectly on your system because of your configuration, and that blows up once it is installed at a customer's site? Debugging these problems is extremely difficult, and so you either check for errors or you risk your application blowing up at random places. It's your choice.

The C++ exception specification is a mis-feature, in my opinion, only because it was poorly specified. As I mentioned in my first exposure to C#, the problem with C++ exception specifications is that the compiler does not really enforce them. If a function declares that it will only throw exceptions of type std::exception, and it throws an exception of type std::string (yuck), the language states that the program should terminate, but the compiler has no way of catching this at compile time unless exception specifications are used globally, and therefore this becomes impossible to ensure. This is why exception specifications are not used. It would be difficult to change this stance to be as strict as Java and maintain full backwards compatibility.

I personally like Java's approach. Continuing with the java.net.URL example, the documentation clearly indicates that only a java.net.MalformedURLException can be raised. As such, if I write some code today that uses this class, I know exactly which types of exceptions are raised, and I write error handling code to address them. Suppose that when J2SE 5.0 (or 1.5, or Tiger) comes out, the constructor that I am using now validates the host portion of an HTTP-based URL and throws an UnknownHostException. When I attempt to compile my code against the new version, the compiler will tell me that I have a new error to handle.
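As a rough sketch of that compile-time safeguard, consider the following. The class CheckedExceptionDemo and its two methods are hypothetical names invented for this example; the point is only how the throws clause interacts with the caller.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical class, used only to illustrate compiler-enforced handling.
class CheckedExceptionDemo {
    // The throws clause is part of the method's contract, and the
    // compiler checks it at every call site.
    static String hostOf(String spec) throws MalformedURLException {
        return new URL(spec).getHost();
    }

    static String hostOrDefault(String spec, String fallback) {
        try {
            return hostOf(spec);
        } catch (MalformedURLException murl) {
            // Deliberate: a malformed URL simply yields the fallback value.
            // If hostOf() later declared another checked exception that is
            // not a MalformedURLException, this method would stop compiling
            // until a handler or a throws clause was added for it.
            return fallback;
        }
    }
}
```

Nothing comparable happens in C++ or C#: the equivalent caller would keep compiling, and the new failure would only show up at run time.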

My example of java.net.URL is really a poor choice, because I cannot imagine this type of change in the Java core (if anything, java.net.URL would be deprecated in favour of something else), but for some libraries and especially applications, these types of safeguards are really important, and returning to C#, I really felt that this was a missed opportunity. In either case, if I upgrade a library and one of its methods now raises a new exception type in C++ or C#, it had better be mentioned in the release notes, and you had better be reading those release notes, because your compiler will not give you warnings or errors about this.

The example of the C++ Standard Library not throwing exceptions for each possible problem is a design issue, as is the C++ Standard Library's insistence on a stack having a pop() method that does not return the removed element (which is actually done for exception safety). As for embedded environments, I do not feel that this has any bearing on exception handling itself, since such environments usually leave out other features as well, and you must generally adapt your use of the language to those features that are best supported and most efficient in your target environment.

Danny's next example is on templates, and here I see why he is placing this under an over-engineering topic. After reading some extreme-template articles and books, such as those from Andrei Alexandrescu, you start to feel that you must create a template for every single function, taking a trait for every single possible thing that a user will ever want to do with your template. While this goes a little in the direction of over-designing classes, templates are a special case.

My own take on templates is the same as the rule Martin Fowler gives in Refactoring (or its web site): the first time you write it, the second time you copy it, and the third time you refactor it. There is no reason to write a template if you are simply going to use it once. Depending on how much code is involved, it is sometimes acceptable to simply copy the code the second time, though if it is large or complex, it may warrant refactoring right away; by the third time, at the least, you should be thinking of templatizing it. The point in all of this is simple: for most developers out there, perfecting non-templated code is much easier. Once you have a perfect algorithm, it is much easier to translate it into a template. Once you have a few instances of the algorithm, it is much easier to generalize it, since you can see the different types and cases it is actually used with. But even when refactoring into a template, it is important to keep in mind that you should only implement what you will use now. You can refactor the code later if you need to.
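The same progression can be sketched with Java generics standing in for C++ templates. The class Largest and both methods are invented for this illustration: the first method is what you would write for the one type you need, and the second is the generalization you might only extract once the same loop has shown up for a third type.

```java
import java.util.List;

// Hypothetical class; the names are assumptions made for this sketch.
class Largest {
    // First use: written directly for the only type that was needed.
    // Assumes a non-empty list.
    static int largestInt(List<Integer> values) {
        int best = values.get(0);
        for (int v : values) {
            if (v > best) {
                best = v;
            }
        }
        return best;
    }

    // Third use: the already-debugged loop, generalized over any type
    // that can be compared with itself. Also assumes a non-empty list.
    static <T extends Comparable<T>> T largest(List<T> values) {
        T best = values.get(0);
        for (T v : values) {
            if (v.compareTo(best) > 0) {
                best = v;
            }
        }
        return best;
    }
}
```

Note that the generic version does nothing more than the concrete one did; there are no extra policy parameters or hooks for uses that nobody has asked for yet.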

This problem, however, is not specific to generic programming, but generic programming seems to stick out here, firstly because many developers are not comfortable with templates and secondly because not all compilers treat templates correctly, which can make them harder to port (although most modern, standard-conforming compilers are fine in this area). These two issues make them stick out. Classes and functions are likewise occasionally over-engineered. The idea of creating classes iteratively is not globally accepted, and instead, developers try to define every single method that a user of that class could possibly use instead of focusing on a particular feature; you can always add more functionality later, or even refactor the design completely. The fear is, however, that the time to do this will never come, and so we end up in this endless loop of not doing anything. In this regard, I find that having an extremely good idea of what your users need is most beneficial, both in identifying the problem and in rectifying it.

Danny's final point is about overloading operators. This is a C++ language feature that I seldom use, and as such, there have been very few times where I felt that it was lacking in Java. The times that I have felt its absence were when dealing with numerical classes such as BigInteger, as a = b + c is easier to understand than a = b.add( c ), and the difference only grows with longer expressions.
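To show what the BigInteger case looks like in practice, here is a small sketch. The class BigIntegerSum and its method are made up for this example; BigInteger itself is immutable, so add() and multiply() return new values rather than changing the object they are called on.

```java
import java.math.BigInteger;

// Hypothetical class, used only to illustrate method-call arithmetic.
class BigIntegerSum {
    // With operator overloading this would read: (b + c) * d
    static BigInteger sumTimes(BigInteger b, BigInteger c, BigInteger d) {
        return b.add(c).multiply(d);
    }
}
```

Even with only three operands, the reader has to reconstruct the precedence from the order of the method calls instead of reading it off the expression.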

As such, I tend to agree with Danny that operator overloading generally makes sense for numerical classes and generally does not for others. If i is an iterator, is ++i clearer than i.next()? This is arguable. His example of the function object using operator() is great; your class should declare members that make sense and are readable. If you need a function object, consider using an Adapter instead.
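One way to read the Adapter suggestion, sketched in Java rather than C++, is shown below. OverdueChecker and OverduePredicate are hypothetical names invented for this illustration; java.util.function.Predicate stands in for whatever callable interface the generic code expects.

```java
import java.util.function.Predicate;

// Hypothetical domain class with a readable, named method.
class OverdueChecker {
    private final int limitInDays;

    OverdueChecker(int limitInDays) {
        this.limitInDays = limitInDays;
    }

    boolean isOverdue(int daysOutstanding) {
        return daysOutstanding > limitInDays;
    }
}

// Adapter: exposes the checker through the callable interface that
// generic code expects, without changing OverdueChecker itself.
class OverduePredicate implements Predicate<Integer> {
    private final OverdueChecker checker;

    OverduePredicate(OverdueChecker checker) {
        this.checker = checker;
    }

    @Override
    public boolean test(Integer daysOutstanding) {
        return checker.isOverdue(daysOutstanding);
    }
}
```

Code that works with the class directly reads checker.isOverdue( days ), and only the code that genuinely needs a function object goes through the adapter.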

In closing, I do not feel that this article really dealt with over-engineering; rather, it focuses on some language features that are abused or misunderstood. The only one that comes close to over-engineering is the example on generic programming, but even here, I feel that Danny missed the point that this happens in more places than just generics. Over-engineering is adding features that no one (or few) will use, and the only way to avoid that is to know your users, whoever they may be.
