Signed and unsigned data types

The C++ language includes a number of primitive data types that can be used to represent characters and integer data.

Data Type    Size in bits
char         8
short        16
int          32
long long    64

By default short, int, and long long are signed types (whether plain char is signed depends on the compiler). In a signed type the first bit acts as a sign bit: if the number represented is negative this first bit will be 1, otherwise it will be 0.

In situations where we want to represent only non-negative integers, we can use the unsigned variant of each of these basic types. The unsigned variant does not have a sign bit, and makes all of the bits available to store numerical data.
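As a rough illustration of what the sign bit costs, the sketch below asks the standard library for the range of the signed and unsigned 8-bit character types. The constant names are made up for this example, and the values in the comments assume an 8-bit char and two's complement storage, which is what mainstream platforms use.

```cpp
#include <limits>

// Illustrative constants; the names are not part of the language.
// Assumes an 8-bit char and two's complement representation.
constexpr int kSignedCharMin   = std::numeric_limits<signed char>::min();
constexpr int kSignedCharMax   = std::numeric_limits<signed char>::max();
constexpr int kUnsignedCharMax = std::numeric_limits<unsigned char>::max();

// With a sign bit, 7 bits remain for the magnitude: -128 to 127.
// Without one, all 8 bits store data: 0 to 255.
```

The unsigned type covers roughly twice the positive range because the bit that would have been the sign bit now holds data.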

Using hexadecimal representations

In most applications in which we want to work with numerical data we will want to use decimal representations of numbers. C++ automatically converts decimal numbers into binary.

int x = 357; 
// Decimal 357 gets converted into binary and stored in x

In situations where we explicitly want to work with sequences of bits, we will find it useful to use a number system that makes it easier to write the bit sequences we want. C++ provides an alternative scheme for representing numerical quantities which uses a base 16, or hexadecimal, number system.

Here are the digits in the base 16 number system and their associated bit sequence equivalents.

Digit    Bit Equivalent
0        0000
1        0001
2        0010
3        0011
4        0100
5        0101
6        0110
7        0111
8        1000
9        1001
A        1010
B        1011
C        1100
D        1101
E        1110
F        1111

Because each hexadecimal digit corresponds to exactly four bits, and the sizes of all of the integer types are multiples of 8 bits, this number system gives us a convenient way to write sequences of bits. To specify the 8 bits of an unsigned char we only need to write two hexadecimal digits.

To write a hexadecimal number in C++ we preface the digits with 0x to indicate that the number is a hexadecimal number.
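A hexadecimal literal is just another spelling of the same number. As a small sketch (the variable names are illustrative), the decimal value 357 used earlier can also be written as 0x165, and both produce the same bits in memory:

```cpp
// 0x165 = 1*16*16 + 6*16 + 5 = 256 + 96 + 5 = 357
constexpr int decimalValue = 357;
constexpr int hexValue     = 0x165;

// true: both literals denote the same integer
constexpr bool sameValue = (decimalValue == hexValue);
```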

Here are some examples of bits of code that use hexadecimal numbers to store particular bit sequences in variables.

unsigned char zero = 0x00;    // 00000000
unsigned char one = 0x01;     // 00000001
unsigned char allOnes = 0xFF; // 11111111
unsigned char zebra = 0x55;   // 01010101
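One way to double-check bit patterns like the ones in these comments is to write them out directly. The sketch below assumes a compiler that supports C++14 or later, which adds binary literals with a 0b prefix; the names are illustrative only.

```cpp
// Binary literals (prefix 0b) require C++14 or later.
constexpr unsigned char zebraHex   = 0x55;
constexpr unsigned char zebraBin   = 0b01010101;
constexpr unsigned char allOnesHex = 0xFF;
constexpr unsigned char allOnesBin = 0b11111111;

// Each hex digit maps to one group of four bits:
// 0x55 -> 0101 0101, 0xFF -> 1111 1111
constexpr bool zebraMatches   = (zebraHex == zebraBin);
constexpr bool allOnesMatches = (allOnesHex == allOnesBin);
```

Hexadecimal remains the more compact notation: two digits per byte instead of eight.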

Bitwise operators

The C++ language does not include a data type that allows us to represent a single bit. Instead, we have to work with groups of bits. The smallest group of bits the language allows us to work with is the unsigned char, which is a group of 8 bits.

C++ does include operators that allow us to manipulate the bits in a number. These bit operators work on groups of bits in parallel.

The three bit operators are the bitwise and operator &, the bitwise or operator |, and the bitwise exclusive or operator ^. Here are tables showing how these operators act on pairs of bits.

&   0   1
0   0   0
1   0   1

|   0   1
0   0   1
1   1   1

^   0   1
0   0   1
1   1   0

When these operators are applied to pairs of integers, they act in parallel on all the bits that make up the integers. Here are some examples to demonstrate how this works.

unsigned char a = 0x9F;  // 10011111
unsigned char b = 0xCA;  // 11001010
unsigned char c = a & b; // 10001010
unsigned char d = a | b; // 11011111
unsigned char e = a ^ b; // 01010101
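The bit patterns shown in the comments can be checked column by column against the three tables. The sketch below does the same check in code, writing the two operands as binary literals (a C++14 feature, assumed available here); the names are illustrative.

```cpp
// Operands written bit by bit (binary literals need C++14 or later).
constexpr unsigned char patternA = 0b10011111;
constexpr unsigned char patternB = 0b11001010;

// Each result bit depends only on the two bits in the same column.
constexpr unsigned char andResult = patternA & patternB; // 10001010
constexpr unsigned char orResult  = patternA | patternB; // 11011111
constexpr unsigned char xorResult = patternA ^ patternB; // 01010101
```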

The bitwise operators are commonly used to perform bit manipulations on numbers. For example, to determine whether a given bit in a number is 1, we can AND the number with a specially constructed mask that has a 1 in that position:

unsigned char mask = 0x08;  // 00001000
unsigned char x = 85;
unsigned char bit = x & mask; 
// bit will be either 00000000 or 00001000 depending on what
// the bit in position 3 of x is (counting from 0 at the right).

unsigned char mask = 1; // 00000001
unsigned char x = 85;
unsigned char lastBit = x & mask; // Extract the last bit of x

unsigned char mask = 0x80;  // 10000000
unsigned char x = 85;
unsigned char firstBit = x & mask; // Extract the first bit of x
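Note that the result of a masking operation is either 0 or the mask value itself, not necessarily 1. A minimal sketch of a helper that turns the result into a yes/no answer follows; the function name isBitSet is hypothetical and chosen for this example.

```cpp
// Hypothetical helper: reports whether x has a 1 in the position
// selected by a single-bit mask.
constexpr bool isBitSet(unsigned char x, unsigned char mask) {
    // x & mask is either 0 or the mask value (e.g. 00001000),
    // so compare against 0 rather than against 1.
    return (x & mask) != 0;
}
```

For x = 85 (01010101), isBitSet(x, 0x01) is true, while isBitSet(x, 0x08) and isBitSet(x, 0x80) are both false.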

We can use the bitwise or operator to set a particular bit in a number to 1.

unsigned char mask = 1; // 00000001
unsigned char x = 86;
x = x | mask; // Set the last bit of x to 1
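The same idea works for any mask, including masks with several 1 bits. As a sketch, with setBits being a hypothetical name used only here:

```cpp
// Hypothetical helper: returns x with every bit selected by mask set to 1.
// Bits where mask has a 0 are left unchanged.
constexpr unsigned char setBits(unsigned char x, unsigned char mask) {
    return static_cast<unsigned char>(x | mask);
}
```

With x = 86 (01010110), setBits(x, 1) gives 87 (01010111). Applying it a second time changes nothing, since the bit is already 1.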

We can use a combination of the and operator and an inverse mask to set any bit in a number to 0.

unsigned char mask = 0x7F; // 01111111
unsigned char x = 85;
x = x & mask; // Set the first bit of x to 0
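Writing the inverse mask by hand is error-prone. One common alternative, not covered in the text above, is the bitwise complement operator ~, which flips every bit of its operand. The sketch below uses it to build the inverse mask from an ordinary one; clearBits is a hypothetical name.

```cpp
// Hypothetical helper: returns x with every bit selected by mask set to 0.
constexpr unsigned char clearBits(unsigned char x, unsigned char mask) {
    // ~mask has a 0 wherever mask has a 1, so the AND forces those
    // positions to 0 and leaves all other bits alone.
    return static_cast<unsigned char>(x & ~mask);
}
```

Here clearBits(0xD5, 0x80) turns 11010101 into 01010101, which matches ANDing with the hand-written mask 0x7F.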

The exclusive or operator is most often used to flip bits to their opposite state. This is based on the observation that the exclusive or of any bit with 1 yields the opposite of that bit.

unsigned char mask = 0xFF; // 11111111
unsigned char x = 85;
x = x ^ mask; // All of the bits in x have been flipped to
// their opposite values.
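A mask does not have to be all ones: only the positions where the mask has a 1 are flipped. A small sketch, with toggleBits as a hypothetical helper name:

```cpp
// Hypothetical helper: returns x with every bit selected by mask flipped.
constexpr unsigned char toggleBits(unsigned char x, unsigned char mask) {
    return static_cast<unsigned char>(x ^ mask);
}
```

For x = 85 (01010101), toggleBits(x, 0xFF) gives 170 (10101010) and toggleBits(x, 0x0F) gives 90 (01011010). Flipping with the same mask twice returns the original value.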

Bit shift operators

The final operators that C++ offers for doing bit manipulations are the bit shift operators, << and >>. These operators shift bit sequences to the left or right by a given number of positions. For unsigned types the vacated positions are padded with 0 bits, and any bits that fall off the end of the number are dropped. Here are some examples.

unsigned char x = 0xE5; // 11100101
x = x << 2;    // x is now 10010100
x = x >> 3;    // x is now 00010010
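One detail worth knowing is where the dropped bits actually go. In C++ an unsigned char operand is promoted to int before the shift, so the high bits are only discarded when the result is stored back into an unsigned char. The sketch below makes that step explicit; the helper names are hypothetical, and the operand is written as a binary literal (C++14 or later).

```cpp
// The shift itself is done on an int, so no bits are lost yet.
constexpr int promoted = 0b11100101 << 2; // 0x394, wider than 8 bits

// Hypothetical helpers: converting back to unsigned char keeps only
// the low 8 bits, which is what drops the bits that "fall off".
constexpr unsigned char shiftLeft(unsigned char x, int n) {
    return static_cast<unsigned char>(x << n);
}
constexpr unsigned char shiftRight(unsigned char x, int n) {
    return static_cast<unsigned char>(x >> n);
}
```

shiftLeft(0b11100101, 2) yields 10010100, and shiftRight of that result by 3 yields 00010010.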

The bit shift operators are most often used in combination with masking operations to peel bits off a number one by one. The example below shows how to split an unsigned char into an array of individual bits:

unsigned char x = 0xD5;
unsigned char bits[8];
unsigned char mask = 1; // 00000001
for (int n = 7; n >= 0; n--) {
  // 1 if the last bit of x is 1, 0 if it is not
  bits[n] = x & mask;
  x = x >> 1; // Use a shift to drop the last bit of x
}
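The reverse operation, packing an array of bits back into a single unsigned char, combines a left shift with the bitwise or operator. This is a sketch under a few assumptions: the array uses the same layout as above (element 0 holds the first bit), the names joinBits and sampleBits are invented for the example, and the compiler supports C++14 or later so that a constexpr function may contain a loop.

```cpp
// Hypothetical helper: rebuilds a byte from 8 individual bits,
// where bits[0] is the first (leftmost) bit.
constexpr unsigned char joinBits(const unsigned char (&bits)[8]) {
    unsigned char result = 0;
    for (int n = 0; n < 8; n++) {
        // Shift left to make room on the right, then OR in the next bit.
        result = static_cast<unsigned char>((result << 1) | (bits[n] & 1));
    }
    return result;
}

// The bits of 0xD5 (11010101), in the order the loop above stores them.
constexpr unsigned char sampleBits[8] = {1, 1, 0, 1, 0, 1, 0, 1};
```

Calling joinBits(sampleBits) gives back 0xD5, the value that was split apart.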