Big-O proofs


This tweet from Emily Kager appeared in my timeline a couple of weeks ago:

My sister (freshman in college) texted me for Big-O proof help for her Algorithms class saying she’s scared CS isn’t for her because she doesn’t love this class. RT if you don’t think about Big-O proofs at your CS job.

I have complex feelings about this. On the one hand, I sympathize with Ms Kager’s sister—it’s horrible to be struggling in class—and it is true that most programmers never do a big-O proof again after they graduate, and many never think about asymptotic complexity at all. But on the other hand, someone who can get to grips with big-O proofs will be a better programmer for it. (I’ll try to explain why below.)

This is all going to be unhelpful for Ms Kager’s sister: no-one ever loved a class just because someone told them it was useful. But I do have some advice, which is that it is never too late to learn something. The reason someone is struggling with analysis of algorithms is probably that they are missing some crucial prerequisite: maybe it’s discrete mathematics, or maybe it’s a lack of hands-on experience with algorithms, or maybe something else. But it’s always possible to come back to the subject later in your career: when something about it grabs your interest, the textbooks will still be there.

So, of course I have never had to write down a big-O proof at work, at least not in the kind of formal style that I might have used in my undergraduate algorithms homework. Nonetheless, I think about the asymptotic runtime of the code I’m working on all the time, which means doing informal analysis of algorithms in my head, and the reason that I can do this kind of analysis quickly and reliably is that I practised writing all those formal proofs. I can be confident that my informal approach gets the right answer because I know that I could turn my quick rules of thumb into detailed proofs if I ever needed to convince a beginner or a skeptic. Perhaps it would take me a while, or I’d have to refer to a textbook for the difficult cases, but I’d get there in the end.

Expertise in some area always comes with a penumbra, that is, a wider area in which you know a lot (but not as much as a real expert) and a still wider area in which you know a little (and could pick up more if you needed to). An expert in everyday analysis of algorithms almost certainly has some ability at formal proofs too.

And every once in a while, you get lucky and happen upon a problem where the issue is more complex than the usual logarithmic vs linear, or linear vs quadratic, and then you have some trickier analysis to do, and are glad that you have the skills to do it.

We automate tasks using computer programs in order to save people time and money and effort. Or to give people capabilities that would otherwise require too much of these resources. So the need to check that you are using resources efficiently is ever-present in software development. Normally this does not require particularly sophisticated mathematical techniques: the vast majority of tasks fall into one of three groups:

  1. If it’s a search task (find an item matching some criteria in a collection), then it ought to run in time that’s logarithmic in the size of the collection, that is, in \(O(\log n)\) time.

  2. If it’s a data-processing task (read some data and compute some results), then it ought to run in time that’s linear in the size of the data, that is, in \(O(n)\) time.

  3. If it’s a sorting task (read some data and collate it somehow), then it ought to run in \(O(n \log n)\) time.
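As a minimal sketch of the first group (the function name and sample data are my own illustration), Python’s standard `bisect` module gives logarithmic search, provided the collection is kept sorted:

```python
import bisect

def contains_sorted(items, target):
    """Membership test on a sorted list in O(log n) time via binary search."""
    i = bisect.bisect_left(items, target)
    return i < len(items) and items[i] == target

data = sorted([7, 3, 9, 1, 4])   # one-off O(n log n) preparation
assert contains_sorted(data, 4)
assert not contains_sorted(data, 5)
```

The one-off cost of sorting is repaid as soon as you search the collection more than a handful of times.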

In each case you’re looking for particular kinds of failure. If it’s a search task, are you accidentally looking at each item in the collection, making the runtime \(\Omega(n)\)? If it’s a data-processing task, are you accidentally doing something that takes \(\Omega(n)\) time for each item, making the overall runtime \(\Omega(n^2)\)? These complexity problems are ubiquitous. In particular, programs whose runtime is \(\Omega(n^2)\) when it should be \(O(n\log n)\) or \(O(n)\) (that is, they are “accidentally quadratic”) are common enough to sustain a blog. Here are a few examples that I’ve encountered at Code Review:

  1. A solution to the Josephus problem that’s quadratic because it deletes each victim from a list.
  2. A memory allocation algorithm that’s quadratic because it searches the free list to find the location of the freed block.
  3. A counting algorithm that’s quadratic because it searches a list by examining all the items.
  4. A breadth-first search algorithm that’s quadratic because it implements the queue of search nodes using a list.
  5. A common ancestor algorithm that’s quadratic because it visits the whole subtree rooted at a node before deciding which way to descend.
  6. A multiple string search algorithm that’s quadratic because it searches separately for each string.
  7. A minimum spanning tree algorithm that’s quadratic because it looks at all the edges on the boundary of the growing tree to find the one with the smallest cost.
  8. A binary search algorithm that’s quadratic because it computes the wrong location of the midpoint of the list.
  9. A merge algorithm that’s quadratic because for each emitted item it looks at the heads of all the inputs to find the minimum.
  10. A local maximization algorithm that’s quadratic because on each iteration it finds the local items by slicing a list.
  11. A random sample algorithm that’s quadratic because it deletes each sample from a list.
  12. An Eulerian cycle algorithm that’s quadratic because it searches the whole cycle each time it needs to find a visited node which still has an unused edge.
  13. A test of whether one list is a rotated version of another that’s quadratic because it compares one list with all the rotated versions of the other.
  14. A construction of inverted indexes that’s quadratic because it finds the index of each item using a linear search.
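To make the pattern concrete, here is a minimal sketch of the fourth failure mode, a breadth-first search that uses a list as its queue. This is my own illustration, not the code from the linked review: `list.pop(0)` shifts every remaining element, so each dequeue costs \(O(n)\), whereas `collections.deque` dequeues in \(O(1)\):

```python
from collections import deque

def bfs_order_quadratic(graph, start):
    # list.pop(0) shifts every remaining element: O(n) per dequeue,
    # so the traversal is quadratic in the number of nodes.
    queue, seen, order = [start], {start}, []
    while queue:
        node = queue.pop(0)          # the accidental O(n) step
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order

def bfs_order_linear(graph, start):
    # deque.popleft() is O(1), restoring the expected
    # O(nodes + edges) running time.
    queue, seen, order = deque([start]), {start}, []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order
```

The two functions visit nodes in exactly the same order; only the cost of the dequeue operation differs, and that difference is invisible until the graph gets large.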

The idea of giving all these links is not to shame the programmers who accidentally wrote quadratic programs, but to try to give some idea of how common this pitfall is, and how many different kinds of algorithm might be affected. Modern programming languages like Python make it possible to write complex operations tersely, but the flipside of this is that you have to pay attention to the performance of the operations you are using. It’s easy to write `if item in collection:` and fail to notice that `collection` is a list, so that the test takes time proportional to the length of the list.
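A minimal illustration of that last point (the numbers here are arbitrary): a single conversion to a set turns each subsequent membership test from a linear scan into a constant-time hash lookup.

```python
collection = list(range(100_000))

# `in` on a list scans elements one by one: worst case O(n) per test.
assert 99_999 in collection

# Converting once to a set costs one O(n) pass, after which each
# membership test is an O(1) hash lookup on average.
lookup = set(collection)
assert 99_999 in lookup
assert 100_000 not in lookup
```

Done inside a loop, the difference is exactly the accidental quadratic: \(n\) linear scans instead of one linear conversion followed by \(n\) constant-time lookups.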