Software archaeology and technical debt


In Vernor Vinge’s 1999 novel A Deepness in the Sky the protagonist learns what it means to become a programmer–archaeologist:

Programming went back to the beginning of time… Behind all the top-level interfaces was layer under layer of support. Some of that software had been designed for wildly different situations. Every so often, the inconsistencies caused fatal accidents. Despite the romance of spaceflight, the most common accidents were simply caused by ancient, misused programs finally getting their revenge.

“We should rewrite it all,” said Pham.

“It’s been tried. But even the top levels of fleet system code are enormous. You and a thousand of your friends would have to work for a century or so to reproduce it. And guess what—even if you did, by the time you finished, you’d have your own set of inconsistencies. And you still wouldn’t be consistent with all the applications that might be needed now and then. The word for all this is ‘mature programming environment’.”

I often think of this passage when taking on a new project, and usually I conclude that Vinge was optimistic. It doesn’t take hundreds or thousands of years to reach a ‘mature programming environment’ in which complexity is overwhelming and change is expensive and error-prone: it’s quite possible to get there in only a few years. We’re all programmer–archaeologists now.

The observation that change leads to complexity that increases the cost of future development must have been made many times, but it was put into the form of a law by Meir M. Lehman in 1980.1 In a 1996 retrospective on his “laws of software evolution”, Lehman wrote:

II—Increasing complexity: as a program is evolved its complexity increases unless work is done to maintain or reduce it. This law may be an analogue of the second law of thermodynamics or an instance of it. It results from the imposition of change upon change upon change as the system is adapted to a changing operational environment. As the need for adaptation arises and changes are successively implemented, interactions and dependencies between the system elements increase in an unstructured pattern and lead to an increase in system entropy. If the growth in complexity is not constrained, the progressive effort needed to maintain the system satisfactory becomes increasingly difficult. If anti-regressive effort is invested to combat the growth in complexity, less effort is available for system growth. Given that resources are always limited the rate of system growth declines as the system ages whichever strategy is followed. In practice the balance between progressive and anti-regressive activity is determined by feedback.2

Ward Cunningham compared the accrual of complexity and deferral of maintenance in a software project to going into debt:

Another, more serious pitfall is the failure to consolidate. Although immature code may work fine and be completely acceptable to the customer, excess quantities will make a program unmasterable, leading to extreme specialization of programmers and finally an inflexible product. Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. Objects make the cost of this transaction tolerable. The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.3

This metaphor of technical debt cuts both ways. On the one hand maintaining a code base (or a balance sheet) that’s heavily in debt is costly. But on the other hand developing a new program (or a business) using only cashflow is very slow. Going into debt is a way of quickly getting something up and running, and at every stage the cost of paying down the debt has to be balanced against the loss of opportunities that could have been picked up if you had taken on more debt and added some new features. It’s never an easy choice: pay off all your debt and you could end up with a gold-plated late entrant to a crowded market. But let the debt accrue without bound and eventually your product will become unmaintainable and some quicker-moving newcomer will eat your lunch.

When I started working on p4.el (an interface to the Perforce software version management system from the Emacs text editor) the first thing I had to do was to pay off some of the accumulated technical debt. This is not to criticize the developers: it’s not at all surprising that a program that was originally written in 1996, which has had several maintainers in a decade of development, and which was providing an interface between two moving targets,4 should owe something to the maintenance bank. And the code was pretty good: the fact that it’s continued to be usable in the face of change for eight years with little or no maintenance is evidence of the quality of the original work on it. Here are some miscellaneous things that struck me in the course of paying down some of this debt:

  1. I dug up a couple of interesting nuggets of Perforce history amid the potsherds: the p4 get and p4 refresh commands. I had never heard of them, and a current Perforce server denies all knowledge:

    $ p4 help get
    No help for get.
    $ p4 help refresh
    No help for refresh.

    These are not mentioned in the release notes, which go back to release 98.2, so the commands must have been deprecated in 1998 or earlier. And yet there they are, still runnable:

    $ p4 get
    //depot/foo.c#1 - added as /Users/gdr/foo.c
    $ p4 refresh foo.c
    //depot/foo.c#1 - refreshing /Users/gdr/foo.c

    After some digging about in the Perforce FTP archives I found the release notes for 98.1:

    Minor New Functionality/Enhancements in 97.3 (server)
    	Get Renamed Sync -- #3788 **
    	    'p4 get' has been renamed, once again, to 'p4 sync', in order
    	    to lessen user confusion.  'p4 get' will continue to work for
    	    the foreseeable future.
    	Sync -f Supercedes Refresh -- #4403, #4346 **
    	    'p4 refresh' has been superceded by 'p4 sync -f', which
    	    updates files regardless of whether they are up to date.
    	    This does not affect files that are open.  'p4 refresh xxx'
    	    is now just an alias for 'p4 sync -f xxx#have', to forcibly
    	    refresh the currently held revision of a file.

    There also used to be diffhave (equivalent to modern diff -f #have), diffhead (equivalent to modern diff -f #head), and need (equivalent to modern sync -n), but these are no longer runnable. This makes me wonder whether the get and refresh commands continue to be supported precisely because they were used by p4.el.

  2. A lot of us suck at writing change comments (and I definitely include myself among this group). It’s easy to explain what you did, but in fact that’s not usually necessary—the ‘what’ can be determined by looking at the diff—and what future developers will want to know is why you did it. Today the tools for browsing revision histories are pretty good (and they are only going to get better), and we make much more use of this information than we used to. When I find a feature that I don’t understand in a codebase, one of my first avenues of investigation is to run p4-annotate (or vc-annotate) and see when the feature was introduced or changed, in case that gives some clues as to why. An example of the kind of checkin comment that one hopes to find is this one by Peter Österlund:

    (p4-exec-p4): Set correct current directory when executing p4
    commands from a dired buffer. Previously it didn't work if you
    had inserted subdirectories in the dired buffer with the 'i'

    This is perfect: it explains the use case that prompted the change (a use case which might not have been at all obvious if you don’t use the dired-maybe-insert-subdir command yourself). But unfortunately it’s much more common to find a change like this one:

    p4-exec-p4: Remove no-login argument.

    But why remove the argument? I guess it was no longer needed, because the automatic login feature had been removed. But why remove the automatic login feature? It benefited users who needed to log in: so did it somehow inconvenience users who didn’t? I e-mailed the author to ask what was going on in that change but, naturally, after four years he no longer remembered why he did it.5 So I hereby resolve to try to do better with my own checkin comments, so that in four years’ time future developers don’t have to ask me to explain why I made some change!

  3. I said above that our history-browsing tools are pretty good. But one way in which they still suck is how they deal with reordering of source code. If you make a change swapping the order of two blocks of code, then p4 annotate doesn’t tell you anything about those blocks except that they were last touched by the change in which you swapped them. The original changes that introduced the individual lines in those blocks become invisible, or at least hard to see. (You can get at them by re-running p4 annotate FILESPEC@N, where N is the changelist before the one in which you swapped the blocks, but this is hardly convenient.) A consequence of this deficit is that developers become reluctant to make organizational improvements within a code base, because “that will break the diff”. In the case of p4.el, I was conscious from the start that the disorganization of the source was contributing to the difficulty of making changes, but I was reluctant to carry out the necessary reorganization because it would make it harder to follow the history. And yet the information is there and ought to be recoverable: when a change moves lines to a different part of the file, an intelligent diff tool should be able to follow them there.6

  1.  Meir M. Lehman (1980). “Programs, Life Cycles, and Laws of Software Evolution”. Proceedings of the IEEE 68:9 pp. 1060–1076.

  2.  Meir M. Lehman (1996). “Laws of software evolution revisited”. Software Process Technology. Lecture Notes in Computer Science 1149 pp. 108–124.

  3.  Ward Cunningham (1992). “The WyCash Portfolio Management System”. In Addendum to the proceedings on Object oriented programming systems, languages, and applications (OOPSLA) 1992 pp. 29–30.

  4.  Perforce had 28 releases with “major new functionality” between 1998 and 2013 and Emacs had 5 major releases and 14 minor releases in that period.

  5.  I really don’t want to pick on Fujii Hironori here: most of his changes to the codebase were big improvements. But this particular change came to my attention because among the problems that motivated me to take on the project was the fact that automatic login was broken. Perforce has a ticketing system: if you run the command p4 login then the client writes a ticket to ~/.p4tickets that grants you access to the server for a limited amount of time (the duration is configurable by the Perforce server administrator). If you run a command when you’re not logged in, the command fails with the error message

    Perforce password (P4PASSWD) invalid or unset.

    Obviously it would make things much nicer if p4.el could recognize when you were logged out, prompt you for the password, log you in, and then re-run the command that failed. And indeed the code when I acquired it promised this feature in a comment:

    ;; If executing the p4 command fails with a "password invalid" error
    ;; and no-login is false, p4-login will be called to let the user
    ;; login. The failed command will then be retried.

    But this didn’t work because the feature had been removed in the change whose checkin comment I quoted above.
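    The intended behaviour is simple enough to sketch. Here it is as a shell wrapper rather than Emacs Lisp (the function name p4_run is invented for illustration; the real p4.el implementation works differently):

```shell
# Sketch of the login-and-retry behaviour described above: run a p4
# command; if it fails with the "password invalid" error, log in and
# retry it once. The function name p4_run is invented for illustration.
p4_run() {
    out=$(p4 "$@" 2>&1) && { printf '%s\n' "$out"; return 0; }
    case $out in
        *'Perforce password (P4PASSWD) invalid or unset.'*)
            # Not logged in: obtain a ticket, then retry the failed command.
            p4 login && p4 "$@" ;;
        *)
            # Some other failure: report it unchanged.
            printf '%s\n' "$out" >&2
            return 1 ;;
    esac
}
```

    (In real use p4 login prompts interactively for the password; an editor integration would read it with something like read-passwd instead.)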

  6.  git blame has the -M option7:

    Detect moved or copied lines within a file. When a commit moves or copies a block of lines (e.g. the original file has A and then B, and the commit changes it to B and then A), the traditional blame algorithm notices only half of the movement and typically blames the lines that were moved up (i.e. B) to the parent and assigns blame to the lines that were moved down (i.e. A) to the child commit. With this option, both groups of lines are blamed on the parent by running extra passes of inspection.

    But the GitHub blame view doesn’t support it, and there’s no equivalent, as far as I am aware, in Perforce.

  7.  See John Firebaugh’s article ‘Code Archaeology in Git’ for a discussion of this and other archaeological features.