Programming Challenges

Challenge #5: Love your abstractions


(First posted February 6, 2000)

The word "abstraction" is a mildly stuffy one, one that CS professors may seem to overuse, but it turns out to be a really important concept.

Whenever we think about modules or interfaces, we ought to be thinking about abstraction, too. A module provides some services to its clients, and there's an interface which describes how a client accesses those services. The purpose of the interface is not just to make the module's functions convenient to access. Another purpose of the interface is to hide the internal details of the module.

It's easy to overlook the information-hiding aspect of an interface. If I'm working on a program I've written all by myself, I've got nothing to hide from myself. If I'm using a module which someone else has written, I may be curious about the internal details of that module, and in fact my curiosity may even be piqued the more the module's author makes it hard for me to learn about those details. But it's all too easy to let our knowledge of a module's internals influence the way our code makes use of that module, and once we start doing that, we can very easily demolish -- utterly demolish -- the most important advantages which modular programming was supposed to bring us.

A huge virtue of a properly-specified modular interface is that it is supposed to be possible to rip out the module's implementation and replace it with a partly or completely different one, without any change to the calling code. Not only does the calling code not need to be rewritten or modified in any way, the authors of the calling code don't even need to know that the module has been changed. Depending on the build methodologies and run-time environment in use, the calling code may not even need to be recompiled, and maybe not even relinked, either.
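To make this concrete, here is a minimal sketch in C. The names (intset, intset_new, intset_add, and so on) are made up for illustration and don't come from any real library. Clients see only the declarations in the top half; the struct layout and the function bodies in the bottom half could be replaced wholesale, say by a sorted array or a hash table, and no calling code would have to change.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- interface: all that clients see (would live in intset.h) ---- */
typedef struct intset intset;            /* opaque: layout is not published */
intset *intset_new(void);                /* returns NULL on failure */
int     intset_add(intset *s, int v);    /* returns 0 on success, -1 on failure */
int     intset_contains(const intset *s, int v);   /* returns 1 or 0 */
void    intset_free(intset *s);

/* ---- implementation: today, an unsorted array (would live in intset.c) ---- */
struct intset { int *items; size_t n, cap; };

intset *intset_new(void)
{
    intset *s = malloc(sizeof *s);
    if (s != NULL) {
        s->items = NULL;
        s->n = s->cap = 0;
    }
    return s;
}

int intset_contains(const intset *s, int v)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->n; i++)
        if (s->items[i] == v)
            return 1;
    return 0;
}

int intset_add(intset *s, int v)
{
    assert(s != NULL);
    if (intset_contains(s, v))
        return 0;                        /* already present: nothing to do */
    if (s->n == s->cap) {                /* grow the array when it is full */
        size_t newcap = s->cap ? 2 * s->cap : 8;
        int *p = realloc(s->items, newcap * sizeof *p);
        if (p == NULL)
            return -1;
        s->items = p;
        s->cap = newcap;
    }
    s->items[s->n++] = v;
    return 0;
}

void intset_free(intset *s)
{
    if (s != NULL) {
        free(s->items);
        free(s);
    }
}
```

A client that sticks to the four functions never learns, and never needs to learn, that there's an array in there at all.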

Everyone has heard that modular interfaces make this allegedly-transparent replacement of a module's implementation possible. But it's important to realize that this is not some abstract or theoretical virtue. It is something that does happen, all the time (in properly-designed systems, at least). When it's for some reason impossible to apply this strategy -- when a proposed rewrite of a module would require large-scale rewriting of client code -- it is not "just one of those things" or "the cost of doing business"; it represents a tragic failure of one of the fundamental techniques that is supposed to make civilized software development possible.

When you discover that some functionality you need is provided for you by a module, but with the "interesting" stuff hidden behind some narrow interface, such that all you get to do is call the function and have the task done for you, without getting to specify and tweak all sorts of low-level details, don't be frustrated or jealous about the control that seems to have been taken from you. Instead, celebrate the fact that those gory low-level details are being taken care of for you. Moreover, celebrate the fact that if the module's author ever figures out a better way of performing that task, you'll be able to take advantage of the improved implementation without lifting a finger. This is a great source of freedom.

I'm not saying you should be completely ignorant about a functional module's implementation. You should have an idea of how much work you're asking it to do based on the size or complexity of the input you give it, and you should perhaps know what algorithm it's using, so that if there are calling patterns which would seriously abuse it (and which would therefore degrade the performance of your system) you'll know to avoid those calling patterns. But what you don't want to know about (or, if you accidentally learn, what you'll want to pretend you don't know) is any quirk of today's implementation, or any undocumented, unsupported special feature. Once you know about one, you'll be inexorably compelled to imagine some use for it, until you find one and decide your program has to actually make use of it. At that point you'll have locked yourself in to this particular implementation, such that when it changes in ways that shouldn't have mattered to its published interface specification, your code will mysteriously break. (Programmers are drawn to undocumented features like moths to a candle. Don't get burned.)
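A familiar C example of such a quirk: the C standard leaves the relative order of equal elements after qsort unspecified, but a given library's qsort may happen to preserve their input order today. Code that quietly counts on that is locked in to one implementation. The sketch below (struct rec, cmp_rec, and sort_records are hypothetical names of my own) relies only on documented behavior, by making the original position an explicit tiebreaker in the comparison function.

```c
#include <assert.h>
#include <stdlib.h>

struct rec {
    int  key;
    char tag;   /* stands in for the rest of the record */
    int  seq;   /* position in the original input, filled in by sort_records */
};

/* Compare by key, then by original position, so the result never depends
   on how a particular qsort happens to order elements with equal keys. */
static int cmp_rec(const void *a, const void *b)
{
    const struct rec *x = a, *y = b;
    if (x->key != y->key)
        return (x->key > y->key) - (x->key < y->key);
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Sort n records by key, keeping records with equal keys in input order. */
void sort_records(struct rec *r, size_t n)
{
    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        r[i].seq = (int)i;
    qsort(r, n, sizeof r[0], cmp_rec);
}
```

It costs one extra field and a few extra lines, and in exchange the code keeps working no matter which sorting algorithm sits behind qsort.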

Here's a specific challenge: when you do find that you need to depend on some undocumented feature of a module you're using, some feature which that module arguably ought to be providing to clients like yours, get that aspect documented in the module's interface specification. That way, if the module's implementation is ever changed, its maintainers will be sure to continue to support that feature (or if they forget to, it will be their fault when they change it and your application breaks, not yours).

You're also allowed to complain, and ask that an interface be revised, if the supported interface is simply too cumbersome for a client application like yours to use. Library modules are supposed to make your job easier, not harder. You don't need any disincentives against using the module (that is, you don't need any more temptations to go it alone and roll your own instead), and programmers less responsible than you certainly don't need such disincentives. Help the person who defined the interface evolve it toward one which clients like yours will want to use.

When a large change is needed to an existing system, programmers are sometimes too quick to assume that the old system is completely inadequate, that it has reached the end of its useful life and must be abandoned and rewritten from scratch. Resist this temptation for as long as you can! (Sometimes, it's true, large systems really do need to be abandoned and rewritten from scratch, but doing so is always extremely miserable and time-consuming and presents whole different sets of problems, so I won't talk about that here.) Instead, Let It Be A Challenge To You: figure out how to make the large change to a small number of modules, leaving the bulk of the system (that ought to be unaffected by the change) unaffected by the change. In a well-designed system, this will be possible. (In a poorly-designed system, on the other hand, for example one that has widely and gratuitously violated the information-hiding aspects of its allegedly modular internal interfaces, such that it has lots of insidious extra coupling between modules, the sort of surgical redesign I'm advocating probably won't be possible.)

Don't let anyone tell you that you're being lazy by trying to find a way to make the large change with less work, or that the result will be a hack job. Taking a large system and making a large change to it in a small, appropriate, and restricted way, by hiding the change behind existing interfaces, is no kludge: it is software engineering at its finest.

Another challenge: convince the naysayers that your proposed limited-scope rewrite is workable. Be confident that the guarantees surrounding your interfaces (your existing modules' proper implementation, and your clients' proper use of them) will allow the contemplated sweeping rewrite to be as (relatively) painless as it ought to be.

It should be obvious, but another threat to resist, when you're in the throes of a major rewrite, is the urge to rewrite all the interfaces, too. Don't tinker with the interfaces! You've got a lot invested in those interfaces, probably much more than you realize. Not only was there the time you spent defining and documenting them, and the time you (or someone) spent implementing the code behind them, but there's a much larger body of code (with an unknowable size and scope, if the interfaces have been around for a while) which depends on those interfaces. If you're stuck with the necessity of a major rewrite, you do not want to make the project any bigger than it already is. Don't force the rewriting of a bunch of other pieces of client code by yanking their time-honored interfaces out from under them.
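As a rough sketch of what leaving an interface alone can look like in C, suppose a logging module has always offered a single function, log_msg, and a rewrite introduces severity levels internally. Everything here is hypothetical (log_msg, log_write, log_last, and the message format are invented for the example). The point is only that the old entry point keeps its name, its argument, and its return convention, and simply forwards to the new internals.

```c
#include <assert.h>
#include <stdio.h>

/* New internals introduced by the rewrite (hypothetical). */
enum log_level { LOG_INFO, LOG_ERROR };

static char log_last[128];   /* most recent formatted line, kept for inspection */

static int log_write(enum log_level lvl, const char *msg)
{
    assert(msg != NULL);
    int n = snprintf(log_last, sizeof log_last, "[%s] %s",
                     lvl == LOG_ERROR ? "error" : "info", msg);
    /* Treat a formatting error or a truncated line as failure. */
    return (n < 0 || (size_t)n >= sizeof log_last) ? -1 : 0;
}

/* The long-standing interface, unchanged: same name, same argument,
   same return convention (0 on success, -1 on failure). */
int log_msg(const char *msg)
{
    return log_write(LOG_INFO, msg);
}
```

Existing callers of log_msg compile and behave as before; new code can be given access to the richer internals later, through an addition to the interface rather than a replacement of it.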

Most of the Programming Challenges in this series are relatively language-independent, and where they're not, they tend to focus on C programming (since it's my forte). But I have to inject a specific note here about C++: everything I've said (and more) is equally true of C++, if not more so. If you're maintaining a large system in C++, you've got a big (often a huge) investment in your class definitions. Make sure your code lives long enough to see the payoff from that investment. Don't blow it by constantly rewriting the interfaces (otherwise the language and the OOP discipline will only make your job harder, not easier).


This page by Steve Summit