The Trouble with GCC’s --short-enums Flag
GCC has a cool and relatively obscure optimization flag called short-enums. It is the kind of tweaky little optimization that I get excited about, but it has a dark side that bit me.
What to love about short-enums
Normally GCC stores an enum in an int-sized value, typically 32 bits. The default size of an enum can technically differ between platforms, but in practice it will be 32 bits most places.
short-enums instructs GCC to use the smallest storage it can. So if your enum has all its values within 8 bits, say from 0 to 255 (which will be essentially all enums for most code), it will be stored in a single byte. It is the equivalent of adding attribute “packed” to all enum declarations.
It is a tradeoff. Smaller storage will make your structures smaller. If you are trying to create tiny records for container classes, heaps and similar structures, it can be really appealing to cram your enums into 8-bits and fit the management of small data structures into 4 or 8 bytes.
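To make the size difference concrete, here is a minimal sketch. All the names are hypothetical, and it uses the per-enum packed attribute to stand in for what short-enums does globally, so both layouts can sit in one file. It assumes GCC or a compatible compiler such as clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enums for a small container record (default, int-sized). */
enum node_kind_wide  { WIDE_LEAF, WIDE_BRANCH, WIDE_ROOT };
enum node_state_wide { WIDE_FREE, WIDE_USED };

/* The same enums, stored the way short-enums would store them. */
enum __attribute__((__packed__)) node_kind_small  { SMALL_LEAF, SMALL_BRANCH, SMALL_ROOT };
enum __attribute__((__packed__)) node_state_small { SMALL_FREE, SMALL_USED };

/* Two int-sized enums plus a 16-bit index: padded well past 4 bytes. */
struct node_wide {
    enum node_kind_wide  kind;
    enum node_state_wide state;
    uint16_t             index;
};

/* One byte per enum plus the 16-bit index: 4 bytes in total. */
struct node_small {
    enum node_kind_small  kind;
    enum node_state_small state;
    uint16_t              index;
};

size_t node_wide_size(void)  { return sizeof(struct node_wide); }
size_t node_small_size(void) { return sizeof(struct node_small); }
```

On a typical target where int is 32 bits, the wide record should come out around three times the size of the small one, which adds up quickly if you keep millions of them.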
You balance this against the knowledge that on modern systems access to 8-bit values can be slower than access to full register size values (e.g. aligned 32 or 64 bit accesses on modern platforms are fastest). With pipelining, in practice the 8-bit accesses are usually just as fast as 32, and if you have a ton of little structures the optimization is often worth it.
The problem with short-enums
If you use a third party library that has not been compiled with short-enums, but you include its headers, and those headers include structures that have enums, you can end up referencing structure elements at the wrong offset. This is especially the case with C libraries, which usually expect you to reference structure members directly yourself (without accessors).
This leads to much merriment as you try to figure out why a library that mostly seems to function correctly keeps passing you garbled structures.
I got this problem when using GraphicsMagick, an image manipulation library. In their API, images are created into an Image struct:
Image* image = BlobToImage( imageInfo, srcImage.getBuffer(), srcImage.getCursor(), &exception );
/* columns and rows are unsigned long, so print them with %lu */
fprintf( stderr, "Image width: %lu, height: %lu\n", image->columns, image->rows );
It was surprising to me in this case that the image width and height had bad values, yet the structure works fine when passed to other GraphicsMagick functions.
If I look at the definition of Image the problem is revealed:
typedef struct _Image
{
  ClassType storage_class;    /* DirectClass (TrueColor) or PseudoClass (colormapped) */
  ColorspaceType colorspace;  /* Current image colorspace/model */
  ... etc ...
  unsigned long columns,      /* Number of image columns */
                rows;         /* Number of image rows */
  ... more members ...
} Image;
ClassType and ColorspaceType, among others, are enums. The library treats each one as if it has 32-bit storage, but my code was treating them as if they have 8 and 16-bit storage. Adding the “short-enums” flag subtly changes the meaning of code in included header files.
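The mismatch can be reproduced in a single file. The sketch below is an assumption-laden simplification: the enums and structs are hypothetical stand-ins, not GraphicsMagick's real definitions, and the packed attribute plays the role of a caller built with short-enums.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, NOT GraphicsMagick's real definitions. */
enum wide_class { WIDE_DIRECT, WIDE_PSEUDO };
enum wide_space { WIDE_RGB, WIDE_GRAY };
enum wide_comp  { WIDE_NONE, WIDE_ZIP };

/* The same enums as a caller built with short-enums would see them. */
enum __attribute__((__packed__)) short_class { SHORT_DIRECT, SHORT_PSEUDO };
enum __attribute__((__packed__)) short_space { SHORT_RGB, SHORT_GRAY };
enum __attribute__((__packed__)) short_comp  { SHORT_NONE, SHORT_ZIP };

/* Layout the prebuilt library was compiled with. */
struct image_library_view {
    enum wide_class storage_class;
    enum wide_space colorspace;
    enum wide_comp  compression;
    unsigned long   columns, rows;
};

/* Layout the caller derives from the very same header text. */
struct image_caller_view {
    enum short_class storage_class;
    enum short_space colorspace;
    enum short_comp  compression;
    unsigned long    columns, rows;
};

size_t library_columns_offset(void) { return offsetof(struct image_library_view, columns); }
size_t caller_columns_offset(void)  { return offsetof(struct image_caller_view, columns); }

/* Print where each side believes the columns member lives. */
void print_layout_mismatch(void)
{
    printf("library writes columns at offset %zu\n", library_columns_offset());
    printf("caller reads columns at offset %zu\n", caller_columns_offset());
}
```

The two offsets disagree, so the caller reads bytes that the library never meant as columns. Passing the pointer back into the library still works, because the library only ever uses its own layout.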
If your projects consist 100% of your own code, or you use third party libraries that you build yourself with the same flags, short-enums is generally safe. Even if you use pre-built third party libraries, but those libraries don’t expect you to compute your own offsets into their data structures (e.g. they have accessors to all their structures) you should not have a problem.
A Better Way to get the Same Effect
Don’t use --short-enums. It is like using a sawed-off shotgun, affecting more code than you intended.
If you want short enums, add:
__attribute__ ((__packed__))
to each enum that you think would benefit from a packed representation. Only in a few cases will it make a meaningful performance difference.
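As a rough sketch of what that looks like in practice (hypothetical enum names, GCC or clang assumed), one enum is packed while another is left alone, and a compile-time check guards the size you are relying on:

```c
#include <stddef.h>

/* Hypothetical enum that lives in many small records: pack this one. */
enum __attribute__((__packed__)) token_kind { TOKEN_WORD, TOKEN_NUMBER, TOKEN_PUNCT };

/* Hypothetical enum left at its default size, e.g. one shared with a library. */
enum log_level { LOG_DEBUG, LOG_INFO, LOG_ERROR };

/* Fail the build if the packed enum ever stops fitting in one byte (C11). */
_Static_assert(sizeof(enum token_kind) == 1, "token_kind should fit in one byte");

size_t token_kind_size(void) { return sizeof(enum token_kind); }
size_t log_level_size(void)  { return sizeof(enum log_level); }
```

Enums that come from third party headers are untouched, so their structures keep the layout the library was built with.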
You’re referring to the flag either as “short-enums” or “--short-enums”, but it probably should be “-fshort-enums” in both cases.
The typography is making the double dashes a little hard to read. Traditional GCC, like you find on most Linuxes, accepts flags like “--short-enums” and “--no-short-enums”, as well as the “f” style flags you’re talking about: “-fshort-enums” and “-fno-short-enums”. GCC versions based on clang, like you find on OS X, seem to only accept the “f” type flags for these options.
The base name of the flag is “short-enums”, and then there are ways to call it with and without “no” and with “--” or “-f”, so to me it makes sense to refer to the feature by its base name.
Packed produces bad layout if the structure has holes. The best of both worlds seems to be to do this selectively when packing is actually worth it:
enum myenum field:8;
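A slightly fuller sketch of the bit-field approach, with hypothetical names. Note the assumptions: enum-typed bit-fields are implementation-defined in standard C (GCC and clang accept them), and the exact layout depends on the platform ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum, left at its default size everywhere else. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

/* Only this struct squeezes the enum into 8 bits. */
struct pixel_tag {
    enum color color : 8;  /* implementation-defined bit-field type */
    uint8_t    flags;
    uint16_t   count;
};

/* The same record with a full-width enum member, for comparison. */
struct pixel_tag_plain {
    enum color color;
    uint8_t    flags;
    uint16_t   count;
};

/* Store a value in the 8-bit field and read it back. */
enum color pixel_tag_roundtrip(enum color c)
{
    struct pixel_tag t = {0};
    t.color = c;
    return t.color;
}

size_t pixel_tag_size(void)       { return sizeof(struct pixel_tag); }
size_t pixel_tag_plain_size(void) { return sizeof(struct pixel_tag_plain); }
```

The enum type itself is unchanged, so headers shared with a prebuilt library keep their layout. The cost is that you cannot take the address of the bit-field member.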
I think 2 things are mixed up here: data size and alignment. The packed attribute only affects alignment and I doubt whether every compiler will decide to make an enum 8-bits when it’s used. If an enum is 32-bit, the different alignment won’t make it any smaller. Alignment might affect the size of a struct that contains it, but that’s something different.
A very good point that using short-enum “subtly changes the meaning of code in included header files”
because of the potential difference in how storage space is recognized.
If space is a valuable resource to you (e.g. embedded systems), there is a
kind of trick you can do to avoid enums. There is definitely an argument to be made
here (enums allow better debugging, ease of understanding, etc.), but
if you find yourself in need of short-enums, you may consider
simply typedef-ing a smaller type (uint8_t) and #define-ing the options.
Again, this is just a trick to save some space, since the preprocessor will handle
all those #defines; it does make debugging a little bit uglier.
e.g.
#include <stdint.h>
typedef uint8_t ColorType;
/* ColorType */
#define RED 0
#define BLUE 1
#define GREEN 2
Cool idea. Then you never have to worry about the size of the storage used changing.
Replacing an enum with typedef and #define may work in some cases, but when referring to enum values through a namespace, that strategy unfortunately doesn’t work.