Relational macros


1. Introduction

Relational macros are a C programming technique for embedding relations (tables of data) into programs in a way that is easy to check, safe to update, and requires no tools other than the C preprocessor. The technique is also known as “x-macros”, but I think “relational macros” is more informative.

2. The problem

First, a motivating example. Imagine we’re writing some bitmap-handling code, and we’re going to be handling several pixel formats. So let’s have an enumeration in a header:

enum {
    PIXEL_FORMAT_8888, /* 8 bits RGBA. */
    PIXEL_FORMAT_8880, /* 8 bits RGB. */
    PIXEL_FORMAT_5551, /* 5 bits RGB; 1 bit A. */
    PIXEL_FORMAT_I8,   /* 8 bits of intensity (equal RGB). */
    PIXEL_FORMAT_LIMIT
};

(It’s nice to have a LIMIT for each enumeration so you can straightforwardly check whether a value is in the enumeration, iterate over values in the enumeration and so on.) Now in some other piece of code we want to know how many bits per pixel there are for each format, so let’s have a table of data:

static int pixel_format_bits[] = {
    32,
    24,
    16,
     8,
};

There are two important problems with this. First, it’s hard to check that this is correct: we’ve got to compare text in two files. Second, if we change the order of values in the enumeration, or delete or add values, we break the bits-per-pixel table and there’s nothing to tell us that we’ve gone wrong. So let’s be a lot more cautious about checking that we’ve got it right, and change that table:

static struct {
    int format;
    int bits;
} pixel_format_bits[] = {
    {PIXEL_FORMAT_8888, 32},
    {PIXEL_FORMAT_8880, 24},
    {PIXEL_FORMAT_5551, 16},
    {PIXEL_FORMAT_I8,    8},
};

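/* Check that pixel_format_bits has exactly one entry per format and
   that the entries are in enumeration order. Requires <assert.h>. */
static void
pixel_format_check(void) {
#ifndef NDEBUG
    int i;
    assert((sizeof pixel_format_bits) / (sizeof pixel_format_bits[0])
           == PIXEL_FORMAT_LIMIT);
    for (i = 0; i < PIXEL_FORMAT_LIMIT; i++) {
        assert(pixel_format_bits[i].format == i);
    }
#endif
}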

This works, but it’s already looking a bit verbose for what ought to be a simple data structure. And if somewhere else we need another piece of information about each pixel format (for example, we need its name so that on the command line we can indicate which format of output we want), then we need to do the same thing again:

static struct {
    int format;
    const char *name;
} pixel_format_name[] = {
    {PIXEL_FORMAT_8888, "rgba"     },
    {PIXEL_FORMAT_8880, "rgb"      },
    {PIXEL_FORMAT_5551, "reduced"  },
    {PIXEL_FORMAT_I8,   "greyscale"},
};

And of course we need code to check this table too. You can see that this could get rather tedious. What we have ended up doing here is taking a single relation (pixel format, bits per pixel, format name) and splitting it up into several smaller relations in different places in the code. Not only is it still hard to check that the relation is correct, it’s still hard to modify it. (At least we do get told if we made a mistake, but we don’t find out until we’ve compiled and run the program.)

3. Solution

We can get something that’s easier to check and maintain if we keep the relation in one place. So let’s do that:

#define PIXEL_FORMAT_RELATION(X) \
    X(PIXEL_FORMAT_8888, 32, "rgba"     ) /* 8 bits RGBA. */ \
    X(PIXEL_FORMAT_8880, 24, "rgb"      ) /* 8 bits RGB; no A. */ \
    X(PIXEL_FORMAT_5551, 16, "reduced"  ) /* 5 bits RGB; 1 bit A. */ \
    X(PIXEL_FORMAT_I8,    8, "greyscale") /* 8 bits of intensity (equal RGB). */

This macro gives us the whole relation, and by passing in different things for the parameter X, we can capture different parts of the relation in different contexts.

To declare the enumeration:

enum {
#define X(ENUM, BITS, NAME) ENUM,
    PIXEL_FORMAT_RELATION(X)
#undef X
    PIXEL_FORMAT_LIMIT
};

To define the table of bits per pixel:

static int pixel_format_bits[] = {
#define X(ENUM, BITS, NAME) BITS,
    PIXEL_FORMAT_RELATION(X)
#undef X
};

And to define the table of names:

static const char *pixel_format_name[] = {
#define X(ENUM, BITS, NAME) NAME,
    PIXEL_FORMAT_RELATION(X)
#undef X
};

With this technique, there’s no need for code to check that the tables are consistent with the enumeration, and we can add, delete, and rearrange rows in PIXEL_FORMAT_RELATION without needing to change anything else in the code.
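To see the pieces working together, here is a minimal sketch that puts the relation and the three expansions into one translation unit and adds two small helpers on top. The helper names pixel_format_from_name and pixel_format_bits_for are hypothetical, introduced only for illustration, and the const qualifiers on the tables are my addition.

```c
#include <assert.h>
#include <string.h>

/* The whole relation, in one place. */
#define PIXEL_FORMAT_RELATION(X) \
    X(PIXEL_FORMAT_8888, 32, "rgba"     ) \
    X(PIXEL_FORMAT_8880, 24, "rgb"      ) \
    X(PIXEL_FORMAT_5551, 16, "reduced"  ) \
    X(PIXEL_FORMAT_I8,    8, "greyscale")

/* First column: the enumeration. */
enum {
#define X(ENUM, BITS, NAME) ENUM,
    PIXEL_FORMAT_RELATION(X)
#undef X
    PIXEL_FORMAT_LIMIT
};

/* Second column: bits per pixel, indexed by format. */
static const int pixel_format_bits[] = {
#define X(ENUM, BITS, NAME) BITS,
    PIXEL_FORMAT_RELATION(X)
#undef X
};

/* Third column: format names, indexed by format. */
static const char *const pixel_format_name[] = {
#define X(ENUM, BITS, NAME) NAME,
    PIXEL_FORMAT_RELATION(X)
#undef X
};

/* Hypothetical helper: map a name to its format, or return
   PIXEL_FORMAT_LIMIT if the name is not in the relation. */
static int
pixel_format_from_name(const char *name) {
    int i;
    for (i = 0; i < PIXEL_FORMAT_LIMIT; i++) {
        if (strcmp(name, pixel_format_name[i]) == 0) {
            return i;
        }
    }
    return PIXEL_FORMAT_LIMIT;
}

/* Hypothetical helper: bits per pixel for a valid format. */
static int
pixel_format_bits_for(int format) {
    assert(format >= 0 && format < PIXEL_FORMAT_LIMIT);
    return pixel_format_bits[format];
}
```

With these definitions, pixel_format_from_name("reduced") yields PIXEL_FORMAT_5551 and pixel_format_bits_for(PIXEL_FORMAT_5551) yields 16. Adding a fifth row to PIXEL_FORMAT_RELATION would extend the enumeration and both tables together, with no change to either helper.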

4. Other techniques

  1. The relational header technique places each relation in its own file. The relation is #include’d in each context where it is needed. This was how I learned the technique in the first place. It has the disadvantage that you end up with many small files. Paul Hankin pointed out to me that you can put the relation in a macro instead.

  2. Use a macro language like m4. This complicates the build process.

  3. Load the data at runtime from a database of some kind (either a real relational database, or an ad hoc flat file database). This would mean that you couldn’t refer statically to the values in the enumeration from the code; you’d have to look them up at runtime.

5. Example

Here’s a portion of the definition of the fighter enumeration from the video game Zendoku. (Slightly simplified: the real relation also has parameters controlling the AI for each character.)

#define ENUM_GDS_FIGHTER(X) \
    /* Number                 name    male?  lucky    dojo    nemesis */ \
    X(GDS_FIGHTER_YATTA,      yatta,      1, SUMO,    HAND,   SAKURA  ) \
    X(GDS_FIGHTER_SONOKO,     sonoko,     0, HEART,   HAND,   SHINJI  ) \
    X(GDS_FIGHTER_KINGKAGE,   kingkage,   1, PANDA,   WEAPON, MUSASHI ) \
    X(GDS_FIGHTER_MUSASHI,    musashi,    1, SWORD,   WEAPON, KINGKAGE) \
    X(GDS_FIGHTER_SONNY,      sonny,      1, DRAGON,  WEAPON, SAKURA  ) \
    X(GDS_FIGHTER_SAKURA,     sakura,     0, FLOWER,  WEAPON, YATTA   ) \
    X(GDS_FIGHTER_AYUMI,      ayumi,      0, PARASOL, HAND,   SHINJI  ) \
    X(GDS_FIGHTER_SHINJI,     shinji,     1, PAGODA,  HAND,   AYUMI   ) \
    X(GDS_FIGHTER_HIROSHI,    hiroshi,    1, YINYANG, WEAPON, SHINJI  ) \
    X(GDS_FIGHTER_SANDO,      sando,      1, YINYANG, HAND,   YATTA   ) \
    X(GDS_FIGHTER_KAYERAH,    kayerah,    1, PANDA,   HAND,   KINGKAGE) \
    X(GDS_FIGHTER_SUWEDI,     suwedi,     1, HEART,   WEAPON, MUSASHI ) \
    X(GDS_FIGHTER_BOOMBOOM,   boomboom,   1, SUMO,    HAND,   SONNY   ) \
    X(GDS_FIGHTER_LORRAINE,   lorraine,   0, PARASOL, WEAPON, SONOKO  ) \
    X(GDS_FIGHTER_CHIPS,      chips,      1, FLOWER,  HAND,   SAKURA  ) \
    X(GDS_FIGHTER_JUNJIELONG, junjielong, 1, DRAGON,  WEAPON, AYUMI   ) \
    X(GDS_FIGHTER_KEKHMAAT,   kekhmaat,   1, PAGODA,  WEAPON, SHINJI  )

The three columns on the right hold enumerated values, but to keep the relation readable the common prefix of each enumeration has been omitted. Token pasting (##) is used to restore it, and the stringizing operator (#) turns the name column into a string:

typedef struct gds_fighter_desc_s {
    const char *name;
    pkg_fighter_t *pkgp;
    pkg_t (*pkg_get_desc)(void);
    pkg_quest_fighter_t *quest_pkgp;
    pkg_t (*pkg_get_quest_desc)(void);
    int male;
    GDS_SYMBOL_e lucky;
    GDS_DOJO_e dojo;
    GDS_FIGHTER_e nemesis;
} gds_fighter_desc_s, *gds_fighter_desc_t;

static gds_fighter_desc_s gds_fighter_desc[] = {
#define X(FIGHTER, NAME, MALE, LUCKY, DOJO, NEMESIS) \
    {                                                \
        #NAME,                                       \
        &pkg_ ## NAME,                               \
        pkg_ ## NAME ## _get_desc,                   \
        &pkg_quest_ ## NAME,                         \
        pkg_quest_ ## NAME ## _get_desc,             \
        MALE,                                        \
        GDS_SYMBOL_ ## LUCKY,                        \
        GDS_DOJO_ ## DOJO,                           \
        GDS_FIGHTER_ ## NEMESIS,                     \
    },
    ENUM_GDS_FIGHTER(X)
#undef X
};

The various pkg pointers are used to link to the packaged assets for each fighter when those assets are loaded.
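The Zendoku table depends on types and packaged assets that aren’t shown here, so it can’t be compiled on its own. As a rough illustration of the same stringizing and token-pasting tricks, here is a cut-down sketch with three fighters and three columns. Every identifier in it (ENUM_FIGHTER, FIGHTER_e, fighter_desc_s, fighter_desc, fighter_nemesis_name) is hypothetical and chosen for this example, not taken from the game’s code.

```c
#include <assert.h>

/* Cut-down, hypothetical relation. The nemesis column omits the
   FIGHTER_ prefix, as in the real relation. */
#define ENUM_FIGHTER(X) \
    /* Number          name    male?  nemesis */ \
    X(FIGHTER_YATTA,   yatta,  1,     SAKURA) \
    X(FIGHTER_SAKURA,  sakura, 0,     YATTA ) \
    X(FIGHTER_SHINJI,  shinji, 1,     SAKURA)

typedef enum {
#define X(FIGHTER, NAME, MALE, NEMESIS) FIGHTER,
    ENUM_FIGHTER(X)
#undef X
    FIGHTER_LIMIT
} FIGHTER_e;

typedef struct {
    const char *name;
    int male;
    FIGHTER_e nemesis;
} fighter_desc_s;

static const fighter_desc_s fighter_desc[] = {
#define X(FIGHTER, NAME, MALE, NEMESIS) \
    {                                   \
        #NAME,               /* stringize: yatta -> "yatta" */ \
        MALE,                           \
        FIGHTER_ ## NEMESIS, /* paste: SAKURA -> FIGHTER_SAKURA */ \
    },
    ENUM_FIGHTER(X)
#undef X
};

/* Hypothetical helper: the name of a fighter's nemesis. */
static const char *
fighter_nemesis_name(int fighter) {
    assert(fighter >= 0 && fighter < FIGHTER_LIMIT);
    return fighter_desc[fighter_desc[fighter].nemesis].name;
}
```

Because the nemesis column is pasted into an enumerator name, a misspelled nemesis produces an undeclared identifier and is rejected by the compiler rather than surfacing at runtime.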