September 8, 2004

Completely Different Worlds...

Have Integrated Development Environments changed the way that software is built? For many developers, they have. Yes, some still prefer vi (or have slightly modernized with vim), xemacs, or their favourite editor, but to me there is no arguing that many people use and swear by IDEs. Has it always been like this? No.

In The Big Bang Theory of IDEs, Caspar Boekhoudt does a wonderful job of describing how IDEs have advanced throughout the years, since their invention in 1964, but he also points out a shortcoming of IDEs. While they do pack more bang for the buck, they do not necessarily make developers' lives easier or more productive, or make our products more solid. He argues that IDEs are presently packed with interfaces to external tools and make everything look easy to use, but in reality, you still need to know how to use those tools.

Even with Caspar's arguments, IDEs have still improved development to some extent for many developers. In many cases, this is habit and preference, and so long as a developer is able to produce, the issue of what they use to produce is not as important (so long as it is legal).

It is interesting to see the parallels between IDEs and modeling, especially in light of Microsoft-based Modeling, scheduled for Visual Studio 2005. As Matthew Schmidt mentions in a recent JavaLobby newsletter, like it or not, some developers like models. In his case, he is referring to JDocs.com, a repository of JavaDocs for various APIs that, amongst other things, features a wiki-like comment system, allowing developers to share their experience with the API.

As I have mentioned many times before, a huge problem with models is keeping them in sync with the source code. A recent survey that I took part in asked whether modeling was used and, if so, how bugs in the system were fixed: were the bugs fixed in the model and then exported into the code, fixed in the code and imported into the model, or was the model only the starting point? This kind of question really highlights the problem with current-generation modeling tools: they are separate from the source code.

This separation of models and source code, coupled with the painful experience of reverse engineering source code, has created a stigma against models among many developers. A portion of these developers believe that models are only useful for design or communication, as Craig Larman suggests, and this vision is really based on the lack of tool integration. Some modeling vendors have attempted to integrate portions of their tools into IDEs, and in most of the cases that I have seen, the result was still separate tools bundled in a single user interface, much as Caspar describes above. Such tools have not really kept in mind the goals of a software developer.

Unfortunately, many developers are so put off by this experience that they cannot see the actual benefit of models. They state that models are great for designing a feature but, once it is designed, they ask, what is the use of the model? Artifacts such as models are great elements for the Project Workbook, as prescribed by Frederick Brooks' classic The Mythical Man-Month (or a plog, in modern terms).

The problem with the assumption that the model is only good for designing software is that it assumes that the software never evolves in any way. For most projects, this is simply not the case. While it is true that many developers can remember the basic structure of something they developed some time ago, it still takes a little time to get back into it and to understand exactly what the proper fix is. A model is an excellent way to quickly see the classes involved, the interaction of classes, the states of a class, software deployment, etc.

But this last paragraph assumes that the author of the package is also the maintainer. This is not always a valid assumption, and when it is not, the new developer must quickly understand your design. Again, a model is ideal for this purpose; as the saying goes, a picture is worth a thousand words. Without a model, developers will tend to think that segments of code are wrong or inefficient and require rewriting. This can also be true with a model: as Craig states, the ability to create a model should not be confused with the ability to create solid software, and Caspar also states that experience is required to use these tools effectively.

Of course, the present problem is that the tools do not make it intuitive to keep the source code and models synchronized. Microsoft Visual Studio 2005 aims to keep the model and source code seamlessly synchronized, and there are ample other vendors out there also working towards this.

When programming in assembly became too complex, languages were created to minimize the complexity, and this has repeated a number of times. We are again at a point where the complexity of software is far too great for humans to completely understand inside out, and a higher level of abstraction is required. Models are a great way to handle complexity, since they can be partitioned to just the amount of detail you need. They are an excellent way to dig down into a project, and when a product can keep the code and model synchronized, modeling will not seem so separate and disconnected.

September 2, 2004

Abusing Inheritance...

Object Oriented Software Development is a completely different way of looking at software development, and developers coming from other paradigms are generally introduced to many new concepts at once, which take time to tell apart. Many such developers will start by building applications around a monolithic class, a few highly coupled classes, or what I call “Object-Oriented Spaghetti,” which has a single class's functionality distributed over multiple classes (as an editor in an issue of Dr. Dobb's once said, “You can write Fortran in any language”). In all of these cases, people are not realizing the potential of object-oriented design. This problem is also quite common when RAD tools are used and the MVC Pattern is not.

One example of a beginner's mistake is the misuse of inheritance, such as in the following example:

ValidatedList derived from Vector, but not overriding all entry points

The ValidatedList class is intended to perform some data integrity or validation checks when entries are added to and removed from the list. In the above diagram, the ValidatedList class is derived from Vector; however, the Vector class has several ways of adding and removing elements, whereas the derived class only defines two methods. In other words, the ValidatedList class only protects two methods, but all four are available to the user of the class.

A naive solution would be to simply reimplement the class by providing the four signatures, but what if the Vector interface changes? The same problem could occur. The above example was actually taken from C++, where the Vector class is really the std::vector template class, which is even worse, since its methods are not marked virtual, the keyword that indicates a method can be overridden (in Java, instance methods can be overridden unless the private, static, or final keywords are used in the method signature).
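The same hole can be sketched in Java, where java.util.Vector also offers several ways to add an element. This is only an illustration under assumptions: the class name LeakyValidatedList and the "no negative numbers" rule are hypothetical and do not come from the original diagram.

```java
import java.util.Vector;

// Hypothetical illustration: a list that is supposed to reject negative numbers.
class LeakyValidatedList extends Vector<Integer> {
    // Only this one entry point is protected by the validation rule.
    @Override
    public synchronized boolean add(Integer value) {
        if (value == null || value < 0) {
            throw new IllegalArgumentException("negative values are not allowed");
        }
        return super.add(value);
    }
}

class LeakyValidatedListDemo {
    public static void main(String[] args) {
        LeakyValidatedList list = new LeakyValidatedList();
        list.add(1);                  // goes through the validation
        list.addElement(-5);          // inherited from Vector, skips the validation
        list.insertElementAt(-7, 0);  // another inherited, unprotected entry point
        System.out.println("contains -5: " + list.contains(-5));
    }
}
```

Every inherited method that can modify the list would have to be overridden to close the hole, and any method added to Vector later would reopen it.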

But let us think about what we are really doing here. The key is to think about inheritance as an IS-A relationship, implying that the subclass is a type of the parent class and that the subclass specializes the parent class in some way. In other words, a client of a class can be handed a subclass instead without any code modifications, as shown below, because the interface (and the contract signified by that interface) is what the client uses.

A client uses a Parent class directly, not the Subclass.

In this particular example, it may seem to some that the relationship is warranted, but if you look a little closer, the validation services that the class provides are not specific to the Vector container and could work equally well with any container. Put that way, this should be implemented via containment, as demonstrated below. In this case, the container can be changed in future releases, completely protecting the client of the class from interface changes.

ValidatedList is implemented in terms of a Vector
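A minimal Java sketch of the containment approach might look like the following. The class name ContainedValidatedList, the validation rule, and the choice of ArrayList as the backing container are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: validation is implemented in terms of a container,
// so only the methods declared here are visible to clients.
class ContainedValidatedList {
    // The backing container is an implementation detail and could be swapped
    // for another List implementation without affecting clients.
    private final List<Integer> items = new ArrayList<>();

    public void add(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not allowed");
        }
        items.add(value);
    }

    public boolean remove(int value) {
        // Integer.valueOf selects remove(Object) rather than remove(int index).
        return items.remove(Integer.valueOf(value));
    }

    public boolean contains(int value) {
        return items.contains(value);
    }

    public int size() {
        return items.size();
    }
}
```

Because the class no longer inherits from the container, there is no unvalidated method left for a client to call by accident.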

This example was rather simple, but it is frequently an issue. Another related issue is with multiple inheritance. While it is generally considered poor design to use multiple inheritance, it can be of great value, such as in the Adapter Pattern [GOF]. With this in mind, let us consider the following:

LockedList which derives from Locked and List

Looking at this quickly, some may identify it as the Adapter Pattern; however, the Adapter Pattern is meant to adapt one class's interface to another's, and the two are usually related. A List and a Lock are not at all related.

Generally people who design these types of classes are thinking that a LockedList is a List that is locked, but this is not what the above communicates. The real meaning of the class must then be that it is a thread-safe list. A thread-safe list, however, is a List, not a Lock.

If we look further into this example, note that the class LockedList does not override any methods from either class; in other words, the class cannot guarantee thread-safety itself, and the user of the LockedList class is forced to guarantee this property. While this design decision may allow your users to lock the list for multiple operations, it forces generic algorithms to be surrounded by lock() and unlock() calls that probably cover a larger scope than required, and users must remember to always call these methods. Even this is a small problem in the larger view of things. If the program deadlocks, the problem may appear to be a LockedList problem, and debugging this code will be difficult at best, especially since unit tests will not be able to reproduce a client's problems.

One solution to this problem is to override all the methods of the List interface and to hold the Lock through containment instead of inheriting from it, as demonstrated here:

LockedList derived from List and contains Lock, with Lock hierarchy.

The advantage of breaking the inheritance becomes a little clearer in this diagram, because we can now separate the Lock class out into subclasses. The classes presented are based on operating system implementations; however, this could also include a distributed lock manager, a file-based locking scheme, and many other locking schemes.
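As a rough Java sketch of this structure, the list below derives from a list class and contains a lock. SimpleList and OverridingLockedList are hypothetical names, and Java's java.util.concurrent.locks.Lock interface stands in for the Lock hierarchy in the diagram; any Lock implementation could be passed in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical base class standing in for the article's List.
class SimpleList {
    private final List<String> items = new ArrayList<>();

    public void add(String item) { items.add(item); }
    public String get(int index) { return items.get(index); }
    public int size() { return items.size(); }
}

// IS-A SimpleList, HAS-A Lock: every inherited method is overridden by hand.
class OverridingLockedList extends SimpleList {
    private final Lock lock;

    OverridingLockedList(Lock lock) { this.lock = lock; }

    @Override
    public void add(String item) {
        lock.lock();
        try { super.add(item); } finally { lock.unlock(); }
    }

    @Override
    public String get(int index) {
        lock.lock();
        try { return super.get(index); } finally { lock.unlock(); }
    }

    @Override
    public int size() {
        lock.lock();
        try { return super.size(); } finally { lock.unlock(); }
    }
}

class OverridingLockedListDemo {
    public static void main(String[] args) {
        // The client only sees a SimpleList; the locking strategy is injected.
        SimpleList list = new OverridingLockedList(new ReentrantLock());
        list.add("first");
        System.out.println("size: " + list.size());
    }
}
```

The try/finally blocks play the role that scoped locking plays in C++: the lock is released even if the wrapped call throws.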

At first, this may seem to solve all the problems. A user of the List class can seamlessly update their code to use the LockedList class and now their class is thread-safe. Or is it?

Like the ValidatedList example above, this class has a problem: methods can be added to the List class and overlooked in the LockedList class. Such a problem can lie dormant in a system for some time before it is discovered. While the discovery time could be shortened with proper unit testing, many times this will cause intermittent bugs, exposed by unrelated changes or only when porting to a new platform or compiler. How could we guarantee thread-safety?

There are several ways that one can approach this. One approach is to add protected lock() and unlock() methods to the List class that do nothing by default and can be overridden to provide the locking mechanism. This implementation forces the List to be thread-aware without incurring the hit of locking. For this implementation, the LockedList would only override these methods and forward them to its lock attribute, as follows:

List with lock()/unlock() methods, with a derived LockedList class.
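One way to sketch the hook approach in Java is shown below. HookedList and HookLockedList are hypothetical names, and marking the public methods final is an extra assumption of this sketch, made so that subclasses can only change the locking and not the list operations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical thread-aware list: every operation calls the hooks,
// which do nothing unless a subclass overrides them.
class HookedList {
    private final List<String> items = new ArrayList<>();

    protected void lock() { }    // no-op by default
    protected void unlock() { }  // no-op by default

    public final void add(String item) {
        lock();
        try { items.add(item); } finally { unlock(); }
    }

    public final int size() {
        lock();
        try { return items.size(); } finally { unlock(); }
    }
}

// The derived class overrides only the hooks and forwards them to its lock.
class HookLockedList extends HookedList {
    private final Lock lockImpl = new ReentrantLock();

    @Override
    protected void lock() { lockImpl.lock(); }

    @Override
    protected void unlock() { lockImpl.unlock(); }
}

class HookedListDemo {
    public static void main(String[] args) {
        HookedList list = new HookLockedList();
        list.add("first");
        System.out.println("size: " + list.size());
    }
}
```

A method added to HookedList later is protected automatically, as long as its author remembers to call the hooks inside the base class.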

Thinking along these lines, you could also move the Lock reference up to the parent class, create a NullLock class, and remove the LockedList completely, as demonstrated here:

List with Lock classes.
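A possible Java rendering of this structure follows. The ListLock interface and the NullLock, MutexLock, and GuardedList classes are hypothetical; a small custom interface is assumed here so that the do-nothing lock only has two methods to implement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical minimal lock abstraction for this sketch.
interface ListLock {
    void lock();
    void unlock();
}

// Null Object: satisfies the interface but performs no locking.
final class NullLock implements ListLock {
    @Override public void lock() { }
    @Override public void unlock() { }
}

// A real lock, backed here by a ReentrantLock.
final class MutexLock implements ListLock {
    private final ReentrantLock mutex = new ReentrantLock();
    @Override public void lock() { mutex.lock(); }
    @Override public void unlock() { mutex.unlock(); }
}

// The list owns a lock reference; no LockedList subclass is needed.
class GuardedList {
    private final List<String> items = new ArrayList<>();
    private final ListLock lock;

    GuardedList() { this(new NullLock()); }       // unlocked by default
    GuardedList(ListLock lock) { this.lock = lock; }

    public void add(String item) {
        lock.lock();
        try { items.add(item); } finally { lock.unlock(); }
    }

    public int size() {
        lock.lock();
        try { return items.size(); } finally { lock.unlock(); }
    }
}

class GuardedListDemo {
    public static void main(String[] args) {
        GuardedList plain = new GuardedList();                 // uses NullLock
        GuardedList shared = new GuardedList(new MutexLock()); // thread-safe
        plain.add("a");
        shared.add("b");
        System.out.println(plain.size() + " " + shared.size());
    }
}
```

Even with NullLock, each operation still pays for two calls through the interface, which is the small cost for lists that never need a lock.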

While this is a workable class structure, there is still a minimal performance impact for classes that do not use locks. More flexibility could be obtained by refactoring the above class design into the following:

Abstract List, with the List and LockedList subclasses, the latter using both an AbstractList and a lock.

In this design, the LockedList becomes a proper Adapter class. Each method, therefore, would be implemented by locking the class, calling the adapted list's method, and unlocking the class (a C++-based implementation would need to use the Scoped Locking Pattern [POSA2]). This is similar to the approach Java uses for java.util.List through the Collections.synchronizedList wrapper, although the iterators are inconsistent with this abstraction.
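The lock, call, unlock sequence could be sketched in Java as follows. BasicList, ArrayBasicList, and LockedBasicList are hypothetical names standing in for the abstract list, the plain list, and the LockedList of the diagram; java.util.concurrent.locks.Lock is assumed as the lock type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical abstract list shared by the plain and the locked variants.
interface BasicList {
    void add(String item);
    String get(int index);
    int size();
}

// Plain implementation with no locking cost at all.
final class ArrayBasicList implements BasicList {
    private final List<String> items = new ArrayList<>();
    @Override public void add(String item) { items.add(item); }
    @Override public String get(int index) { return items.get(index); }
    @Override public int size() { return items.size(); }
}

// Wrapper: lock, call the adapted list, unlock.
final class LockedBasicList implements BasicList {
    private final BasicList adapted;
    private final Lock lock;

    LockedBasicList(BasicList adapted, Lock lock) {
        this.adapted = adapted;
        this.lock = lock;
    }

    @Override public void add(String item) {
        lock.lock();
        try { adapted.add(item); } finally { lock.unlock(); }
    }

    @Override public String get(int index) {
        lock.lock();
        try { return adapted.get(index); } finally { lock.unlock(); }
    }

    @Override public int size() {
        lock.lock();
        try { return adapted.size(); } finally { lock.unlock(); }
    }
}

class LockedBasicListDemo {
    public static void main(String[] args) throws InterruptedException {
        BasicList shared = new LockedBasicList(new ArrayBasicList(), new ReentrantLock());
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                shared.add("item");
            }
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("size: " + shared.size());
    }
}
```

Because the wrapper implements the interface rather than extending a concrete list, a method added to BasicList breaks compilation of LockedBasicList until it is handled, instead of silently slipping through unlocked.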

The decision between the two approaches really depends on your exact requirements. The former design would be optimal in cases where you rarely need to specialize the List class, whereas the latter allows for greater flexibility in specializing lists.

Using inheritance is great; however, it must be used only in contexts that truly illustrate the IS-A relationship. To illustrate the difference, Peter Coad and Mark Mayfield recommend in Java Design that inheritance should be used when the subclass is a special kind of the parent class and not a role played by the class, when this relationship always exists for the lifetime of the subclass (for example, the subclass cannot be transformed into another type), and when the subclass extends the behaviour of the parent class instead of simply using or nullifying it.

Containment, on the other hand, allows a class's implementation to change. In the above example of the LockedList deriving from the Lock class, changing the Lock class could be difficult, as some users of the class may prefer one type of lock over another; this flexibility could not be realized directly. On the other hand, by delegating the lock behaviour to a member, the class became more flexible and adaptable to the changing needs of its users.

This is not to say that inheritance is bad. The point of this is to only use inheritance when inheritance is warranted, and to carefully choose when you derive classes; changing this later may not be an easy task, but with containment, it is very easy.

August 31, 2004

Joel on Estimation...

Meryl Evans has posted an article by Joel on Software that discusses Estimation. The posting is a little strange in the sense that it is not really clear that it is Joel who is writing (Joel's web site does not even mention it), especially since it is focused on web site development. Either way, the concepts are easily applicable to other areas.

There is really nothing new or rocket-science in the discussion; Joel mentions that the way to succeed is to break your estimates into tasks that take between 4 and 16 hours, similar to what Delphi Sessions recommend. Based on this, Joel recommends that you deliver prioritized functionality little by little to the end user, allowing them to refine their requirements. His reason for doing so is also very common: customers do not always know what they want, so once they see it in action, they will probably change their thinking.

Nothing really new, but a good review if you have not seen it in a while.

More Java Tools...

Over at the InformIT Java Blog, Steven Haines discusses some New Java 5.0/1.5 Tools, as discussed here (Scheduled to ship September 30).

While many of these are experimental and Sun warns us that they may not be available in future releases, these types of tools will make debugging Java applications much easier; some of them resemble the tools already available under Linux and other Unices. Some of these tools already come in some form with IDEs, but this will definitely make them more accessible when an IDE is unavailable.

I welcome them, even if many of them are not yet available under Windows and are experimental. This supports the idea that Java is, once again, a growing language.

August 28, 2004

The Scope of Class Data Members...

Herb Sutter and Jim Hyslop's Conversations column from the June issue of C++ Users Journal (or #48) talks about the scope of data members. While it is widely accepted that you should not place your data members in the public scope, “Getting Abstractions” reminds us about the evils of protected data members.

According to the article, The Design and Evolution of C++ (a book I do not yet own) discusses the complete history behind protected data, and also the reason why the person who recommended it now recommends against it. Since the entire reason for not having public data members is to provide an abstraction in the event that the member is refactored (renamed or removed), this naturally extends to protected data members, especially for class hierarchies that are public. Herb also recommends that all virtual functions be private, which works in a similar fashion.

Many RAD tools (such as Borland's VCL) make it look like there are public data members, but in reality, these are properties, and properties still call functions. When a function is called, other operations can be performed. In the example within the article, when Bob changed the font, he had to explicitly call the repaint method; with a function, these two operations could be hidden from the user, and changes to the internal structure of the class could likewise be hidden. As there are some camps against getters and setters, the article also mentions that getters and setters should be preferred to public data members, but not when other abstractions would be better suited. In other words, you do not want to expose your current implementation of the class; instead, you want to expose characteristics of that class.
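The font example can be sketched in Java roughly as follows. FontLabel and its members are hypothetical names invented for this illustration (they are not taken from the article or from the VCL), and the repaint is reduced to a counter so that the effect is observable.

```java
// Hypothetical widget: the font is private, and the setter hides the
// follow-up repaint that a public data member would leave to the caller.
class FontLabel {
    private String fontName = "Serif";
    private int repaintCount = 0;

    public String getFontName() {
        return fontName;
    }

    public void setFontName(String fontName) {
        this.fontName = fontName;
        repaint(); // the caller no longer has to remember this step
    }

    public int getRepaintCount() {
        return repaintCount;
    }

    // Stand-in for real drawing code; it only counts how often it ran.
    private void repaint() {
        repaintCount++;
    }
}

class FontLabelDemo {
    public static void main(String[] args) {
        FontLabel label = new FontLabel();
        label.setFontName("SansSerif");
        System.out.println(label.getFontName() + " repaints: " + label.getRepaintCount());
    }
}
```

If the font were later stored differently inside the class, only setFontName and getFontName would need to change, and callers would be unaffected.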

Overall, it is a good reminder of data scopes within a class.