Encapsulation is not always right


This antipattern has caught me out a couple of times recently, so I thought that it might help exorcise it if I wrote up my mistake.

Object orientation is a great technique for implementing video games. A game world consists of many objects, with a mixture of shared properties (position, rotation, velocity, etc.) and unique properties (hit points, bounciness, capacity, AI state, goal square, etc.). Object orientation captures these very well via a hierarchy of classes and mixins: so for example every object that can collide might inherit from the Collider class.

It’s easy to get seduced into the fallacy that encapsulation applies to all object behaviours: in other words, that each object is responsible for handling its own behaviour, and the implementation details should be hidden. But when objects interact with each other, the outcome of events can depend on the properties of several objects. For example, when two objects collide the result depends on the properties of both objects. Consider collisions in a game with bullets, people, and tanks:

You can coerce this kind of table of interactions into the straightjacket of single-dispatch method calls, but the results are pretty ugly however you do it. (It’s no coincidence that the main motivating example in Wikipedia’s multiple dispatch article is collision resolution.) But there are more subtle examples where the naïve approach goes wrong. Here follows an account of a mistake I made.

In the iPhone/iPad game Floe you tilt an ice floe to make coloured blocks slide:

This is implemented using a fairly lightweight version of the model–view architecture. In the view, the blocks are animated so that they appear to move smoothly and continuously, as shown in the screenshots above. In the model, each coloured block always occupies a set of squares in the grid, and when it slides it moves exactly one square at a time, as shown in the figure below.

When the player tilts the ice floe, the rule of the puzzle is that all blocks slide as far as they can go. It’s not supposed to be a game of quick reactions, so there is no possibility for the player to, say, reverse the tilt and cause a block to stop halfway through its slide. So it was natural to implement the block motion in an object-oriented fashion, putting a Move method into the Block class and calling it each frame from the block’s Update method.

The first problem with this approach is that because blocks can obstruct each other, the result depends on the order in which the blocks are updated. Consider the situation in the diagram below. The blue block cannot slide rightwards until the red block slides to make way. If the blue block is updated first, then it fails to move, while if the red block is updated first, then it makes a space for the blue block to move into.

In practice this isn’t all that significant: the blue block will move on the next frame, and the only observable effect will be that a small gap will sometimes open up between blocks that ought to move together. I worked around this problem by repeatedly attempting to move the blocks each frame, following this O(n2) algorithm:

  1. Mark all blocks as not moved
  2. Set m = false.
  3. For each block B that is marked as not moved:
    1. Try to move B.
    2. If the move was successful, mark B as moved and set m = true.
  4. If m is true, go back to step 2.

However, the second problem is that some kinds of motion are not possible at all. Consider this situation:

The red block cannot move because the pink block is in the way. Similarly the pink block cannot move because the red block is in the way. For a while I just avoided this problem by only designing levels where this kind of mutual obstruction could not occur. But eventually I felt embarrassed enough to have a proper think about the reason why this wasn’t working. And the culprit was obvious in retrospect: it was the idea that motion should be encapsulated in object methods. With this assumption dropped, it was easy to spot the O(n) algorithm:

  1. Mark every block as movable.
  2. Let Q be a queue of grid squares. Initialize it to contain all immovable squares (that is, the squares around the edges of the board).
  3. While Q is non-empty:
    1. Pop grid square a from Q.
    2. Let b be the grid square that is blocked by a, if any. (If the tilt is to the right, then b is the square to the left of a, and so on.)
    3. If there is a block B occupying square b, and if B is marked movable, then:
      1. Mark B as immovable.
      2. For every grid square c occupied by B, push c onto Q.
  4. Move all blocks marked as movable.

And with this algorithm for movement, it was possible to make levels like this:

Level 48 of Floe has two blocks that interlock so that neither can move unless the other moves.

So the moral of the story is: encapsulation is sometimes a bad idea, and motion planning and collision are areas where this is particularly likely to be the case. Richard Gabriel sums up the problem:

[Encapsulation] fails when there are global properties that need to be maintained by a group of encapsulations.

and quotes William Cook:

Encapsulation: the problem is that encapsulation is fantastic in places where it is needed, but it is terrible when applied in places where it isn’t needed. Since OOP enforced encapsulation no matter what, you are stuck. For example, there are many properties of objects that are non-local, for example, any kind of global consistency. What tends to happen in OOP is that every object has to encode its view of the global consistency condition, and do its part to help maintain the right global properties. This can be fun if you really need the encapsulation, to allow alternative implementations. But if you don’t need it, you end up writing lots of very tricky code in multiple places that basically does the same thing. Everything seems encapsulated, but is in fact completely interdependent.