Saturday, March 10, 2012

Bug: C enum variables stored as unsigned ints

I read in K&R that enum values are basically int constants (like #defines in that way), and so enum variables are equivalent to ints. However, in C (not C++ though), you may assign any int value to an enum variable--even if that int value is not one of the listed values in the enum definition. You can do this without even raising a compiler warning.

In a program I was working on, I took advantage of that. I had an enum of values 0 through 7:

  enum direction { N, NE, E, SE, S, SW, W, NW};

In a particular function, I was scanning a map for a target in different directions and decided to return -1 if nothing interesting was found in any direction. However, this led to a strange bug.

The following program shows this bug clearly:

#include <stdio.h>

enum nums {zero, one, two, three};

int main(void) {  
  //using an enum as normal
  enum nums myNum = zero;
  printf("zero == %d\n", myNum);
  
  //assigning an int value to an enum
  myNum = -1;
  printf("-1 == %d\n", myNum);
  if (myNum >= 0) {
    printf("%d >= 0\n", myNum);
  } else {
    printf("%d < 0\n", myNum); 
  }
}

This program prints:

 zero == 0
 -1 == -1
 -1 >= 0

I'm using GCC, and the manual itself says: "By default, these values are of type signed int" and "Although such variables are considered to be of an enumeration type, you can assign them any value that you could assign to an int variable".

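However, further research shows that GCC will store an enum variable as an unsigned int if none of the values in your defined enum is negative. Assigning -1 then wraps around to the largest unsigned value, so the test myNum >= 0 is always true, even though printf with %d still shows the same bits as -1. For example, if I add neg = -1 as an extra value to my enum nums above, the output of the program changes to what I expect: -1 < 0.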

Apparently section 6.7.2.2 of the C99 standard (draft version) clarifies that this is allowed: the integer type used to store an enum is implementation-defined, as long as it can represent every value listed in the enum. An official version of the C90 standard is not freely available for comparison, and compiling with -std=c90 doesn't change GCC's behavior on this issue.