Plan to throw one away


This is a software development story from early in my career, some time in the late 1990s. The company in question went bust many years ago, so it’s probably safe to tell the tale now.1

When I had interviewed for the job, the interviewer had explained that I was needed to work on their JavaScript engine. The company was a web development tools startup, and they had spotted that it would be a good idea to have JavaScript code running on the server as well as on the client, so that you could share your business logic and validation code between the client-side and server-side parts of your application. There are now lots of products in this space, but back then it was probably too soon for the idea to be a success.

Anyway, the interviewer told me that the JavaScript engine was nearly done, but there were one or two features that were missing, and it was running a bit slowly, so I might have to do a bit of optimization. Sure, I said, that sounded like it would be fun. Oh, and it had to be done for the beta release in two weeks’ time, so I would have to get my finger out.

So, on my first day in the office I needed to make a plan for what was left to be done. Exactly which features were missing? Was there a list? No, there was no list. Had any testing been done (with only two weeks to go, surely there should have been some)? No, there had been no testing. Could I talk to the programmer who had written the engine? No, he was the CTO of the company, and he was very very busy preparing for the release. So there was nothing to be done except check out the source code for the engine from CVS, compile it, and sit down with the first edition of David Flanagan’s excellent JavaScript: The Definitive Guide and work through the features. Sure enough, “one or two” things were missing. Global objects and functions like Math and parseInt weren’t implemented. There didn’t seem to be any string methods. Arrays were missing. Object prototypes didn’t work. At least the core syntax was working, but it did seem very slow. Very slow. I mean, interpreted code can easily be a hundred times slower than compiled code, but it shouldn’t take several seconds to evaluate something like for (i=0; i<1000; ++i);.

Clearly the most important thing was to figure out what was going on inside the interpreter. So I started to read the code. And all became clear: the interpreter kept no representation of the parsed code at all. No parse tree, no byte code, nothing. It worked directly from the source text, parsing as it evaluated.

The failure to store a representation of the parsed code was disastrous because it meant that in order to evaluate an expression or statement for a second time, the engine had to parse it a second time. And if a loop executed a thousand times, then the engine had to parse the loop body (and termination condition) a thousand times. No wonder it was slow!
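To see why this is fatal, here is a toy sketch in Python (nothing to do with the original engine, which was not written in Python): the first function re-parses the expression text on every iteration, which is essentially what the old engine did, while the second parses once and reuses the compiled form.

```python
import time

SRC = "x * x + 1"   # stands in for a loop body

def run_reparsing(n):
    # What the old engine did: parse the body text on every iteration.
    total = 0
    for x in range(n):
        total += eval(SRC)          # re-parses and compiles SRC each time
    return total

def run_compiled(n):
    # Parse once, keep the compiled form, evaluate it repeatedly.
    code = compile(SRC, "<body>", "eval")
    total = 0
    for x in range(n):
        total += eval(code)         # no parsing inside the loop
    return total

n = 100_000
t0 = time.perf_counter()
a = run_reparsing(n)
t1 = time.perf_counter()
b = run_compiled(n)
t2 = time.perf_counter()
assert a == b
print(f"re-parsing: {t1 - t0:.3f}s  pre-compiled: {t2 - t1:.3f}s")
```

The two functions compute the same result; only the pre-compiled one keeps the parser out of the loop.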

But how was I going to fix this disaster of an engine in the two weeks I had available? Normally under this kind of deadline it’s imperative that you work with what you’ve got, because if you try to rewrite from scratch then there’s a risk that you’ll end up with nothing. But sometimes the code is so bad that there’s nothing for it but to throw everything away and start again. Even if it’s the CTO who wrote it.2

Now, if there was one thing that I had been taught in my computer science degree, it was how to quickly implement a programming language. You take two very powerful tools:

  - a lexical analyser generator, such as Lex, and
  - a parser generator, such as Yacc,

and use these to build your parser. Then you use the syntax-directed translation technique to turn the output of the parser directly into stack-oriented byte code without having to represent the parse tree as a data structure. (This works neatly for context-free language syntaxes because each expression in the parse tree corresponds to pushing a value onto the stack in the byte code.) Finally, you need an engine to evaluate the byte code, but you can get away with only a handful of byte code instructions, and so the actual engine can be very small.
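The technique can be sketched in miniature (in Python rather than the engine's implementation language, and for arithmetic expressions only): the parser below emits stack byte code directly from its recursive-descent routines, never building a tree, and a tiny stack machine evaluates the result.

```python
import re

def tokenize(src):
    return re.findall(r"\d+|[-+*/()]", src)

def parse(tokens):
    """Translate an expression straight into stack byte code."""
    code = []           # list of (opcode, argument) instructions
    pos = [0]

    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None

    def next_tok():
        tok = tokens[pos[0]]; pos[0] += 1; return tok

    def primary():
        tok = next_tok()
        if tok == "(":
            expr()
            assert next_tok() == ")"
        else:
            code.append(("PUSH", int(tok)))   # a value: push it

    def term():
        primary()
        while peek() in ("*", "/"):
            op = next_tok(); primary()
            code.append(("MUL" if op == "*" else "DIV", None))

    def expr():
        term()
        while peek() in ("+", "-"):
            op = next_tok(); term()
            code.append(("ADD" if op == "+" else "SUB", None))

    expr()
    return code

def run(code):
    # The "very small engine": a stack machine with a handful of opcodes.
    # (Integer division here; a real JavaScript engine needs float division.)
    stack = []
    ops = {"ADD": lambda a, b: a + b, "SUB": lambda a, b: a - b,
           "MUL": lambda a, b: a * b, "DIV": lambda a, b: a // b}
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            b = stack.pop(); a = stack.pop()
            stack.append(ops[op](a, b))
    return stack.pop()

print(run(parse(tokenize("2 + 3 * (4 - 1)"))))   # prints 11
```

Each parsing routine emits the code for its operands first and then the operator, so the byte code comes out in exactly the order the stack machine needs.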

I didn’t tell anyone what I was doing, because I suspected that if I did, then someone would try to stop me. Luckily everyone was so busy in the run-up to the release that they had no time to ask how their new hire was coming along with a last-minute rewrite of one of the critical components of the product, and in about a week I had got my new interpreter roughly up to par with the interpreter I was throwing away (up to par as far as features were concerned—it was of course thousands of times faster). Then it was a case of implementing as many of the JavaScript types, objects, functions and methods as I could in the remaining time. I found the first edition of the ECMAScript standard enormously useful at this point. It seems to have been written as a guide for language implementers, setting out features in a logical order, starting with the object representation. The semantics of operations are defined in terms of “internal properties” on objects like “[[Get]]” and “[[Put]]”, which of course as an implementer you turn directly into Get and Put methods on the classes you’re using for your object representations. Here’s a typical section from the standard:

[[Get]](P)

When the [[Get]] method of O is called with property name P, the following steps are taken:

  1. If O doesn’t have a property with name P, go to step 4.
  2. Get the value of the property.
  3. Return Result(2).
  4. If the [[Prototype]] of O is null, return undefined.
  5. Call the [[Get]] method of [[Prototype]] with property name P.
  6. Return Result(5).

which you’ll see is trivial to turn into an implementation. I’ve heard complaints about this standard from people who tried to learn how to program in JavaScript by reading it. Sorry, but it wasn’t written for you, it was written for me! You should have bought David Flanagan’s book instead.
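The six steps above transcribe almost mechanically into code. A Python sketch (JSObject, UNDEFINED and the property layout are my illustrative choices here, not the engine's actual classes):

```python
UNDEFINED = object()   # stands in for JavaScript's undefined value

class JSObject:
    def __init__(self, prototype=None):
        self.properties = {}
        self.prototype = prototype      # the [[Prototype]] internal property

    def get(self, name):                # the [[Get]] internal method
        # Steps 1-3: if O has a property with name P, return its value.
        if name in self.properties:
            return self.properties[name]
        # Step 4: if [[Prototype]] is null, return undefined.
        if self.prototype is None:
            return UNDEFINED
        # Steps 5-6: call [[Get]] on the prototype with the same name.
        return self.prototype.get(name)

# Usage: property lookup walks the prototype chain.
proto = JSObject()
proto.properties["answer"] = 42
obj = JSObject(prototype=proto)
assert obj.get("answer") == 42           # found on the prototype
assert obj.get("missing") is UNDEFINED   # chain exhausted
```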

In the course of implementing the language semantics, I realized that the original implementation (the one that I had thrown away) was hopelessly wrong in many cases. It had obviously been reverse-engineered from Flanagan’s guide rather than implemented directly from the standard, and consequently many details, like step (4) in the quote above, were missing or broken. Just finding and fixing these errors in the semantics would have taken longer than writing my implementation from the standard did.

Anyway, by the time the two weeks were up and the beta release was made, I had produced a working JavaScript interpreter that included the core language semantics and a few of the more important global objects, and which was not much slower than the JavaScript implementation in Netscape Navigator. Mission accomplished! One thing that I had not had time to do was to write a garbage collector. This meant that any program using the engine would keep allocating until it eventually exited with an “out of memory” error.3 But I reasoned that this was a beta release, and most likely people would be running small and short-lived programs, and that given the limited time available, it was more important to add features than to improve the quality of the implementation. I decided that I wouldn’t even start work on the garbage collector until someone actually ran out of memory and complained about it to me.4

  1.  The title comes from the first edition of Fred Brooks’ The Mythical Man-Month:

    In most projects, the first system built is barely usable. It may be too slow, too big, awkward to use, or all three. There is no alternative but to start again, smarting but smarter, and build a redesigned version in which these problems are solved […]

    The management question, therefore, is not whether to build a pilot system and throw it away. You will do that. The only question is whether to plan in advance to build a throwaway, or to promise to deliver the throwaway to customers. Seen this way, the answer is much clearer. Delivering that throwaway to customers buys time, but it does so only at the cost of agony for the user, distraction for the builders while they do the redesign, and a bad reputation for the product that the best redesign will find hard to live down.

    Hence plan to throw one away; you will, anyhow.

  2.  I remain, to this day, baffled as to how it came to be written like this. Parsing the source code again each time around a loop!

  3.  This was in the days of 32-bit systems with small swap files, so you could actually run out of memory. Today such a program would simply allocate more and more swap, getting slower and slower but not actually quite running out.

  4.  It was two weeks before this happened. When I did come to write a garbage collector I used the mark-and-sweep algorithm. But something puzzled me, and I couldn’t find an answer in any of the textbooks I looked at: how was I supposed to schedule the collections? In a classic description of a garbage collector, you wait until memory runs out and then you collect the world. But this leads to bad behaviour on modern systems, because of swapping, and because other processes need memory too. You need to schedule collections well before memory is full. But when exactly? I still don’t know of a comprehensive solution to this problem.
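
The mark-and-sweep collector described in the last footnote, together with the usual crude answer to the scheduling question (collect when the heap grows past a threshold, then raise the threshold), can be sketched like this; the classes and numbers are illustrative, not the engine's:

```python
class Heap:
    def __init__(self):
        self.objects = []       # every allocated cell
        self.roots = []         # stack/globals that keep cells alive
        self.threshold = 8      # collect when the heap grows past this

    def alloc(self, refs=()):
        if len(self.objects) >= self.threshold:
            self.collect()
            # Grow the threshold so we don't collect on every allocation.
            # When to collect is exactly the open question in the footnote.
            self.threshold = max(8, 2 * len(self.objects))
        cell = {"marked": False, "refs": list(refs)}
        self.objects.append(cell)
        return cell

    def collect(self):
        # Mark: everything reachable from the roots.
        stack = list(self.roots)
        while stack:
            cell = stack.pop()
            if not cell["marked"]:
                cell["marked"] = True
                stack.extend(cell["refs"])
        # Sweep: discard the unmarked, clear marks on the survivors.
        live = [c for c in self.objects if c["marked"]]
        for c in live:
            c["marked"] = False
        self.objects = live

heap = Heap()
root = heap.alloc()
heap.roots.append(root)
for _ in range(20):
    heap.alloc()            # garbage: unreachable from the roots
heap.collect()
assert heap.objects == [root]
```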