Pointers as Member Variables

Now that we have seen that pointers can be used for a variety of applications in C++ programming, we have to be prepared to use them extensively in our programming.

One particular scenario which is going to occur frequently is the use of pointer variables as member variables in classes we write. This application of pointers has to be handled with some care, so I have prepared these notes to walk you through some important aspects of using pointers as member variables.

A class that uses a pointer as a member variable

The class that I am going to use as the primary example for these notes is a class that loosely replicates the Java ArrayList class. An ArrayList is essentially a resizable array.

class ArrayList
{
private:
  int *A;
  int N;
public:
  ArrayList(); // Default constructor
  ArrayList(int size); // Standard constructor
  ArrayList(const ArrayList &other); // Copy constructor
  ~ArrayList(); // Destructor
  
  ArrayList& operator=(const ArrayList& other);

  int size() const;

  void resize(int newSize);

  void set(int n,int x);
  int get(int n) const;
};

The ArrayList maintains an internal array of ints that is accessed via the pointer member variable A. Since the array can be resized, we also maintain a second member variable N that records the present size of the array.

The ArrayList class has a size() member function that users can call to learn the size of the ArrayList, and set() and get() methods used to set and retrieve ints from the ArrayList. Additionally, the ArrayList class also has a resize() method that users can call to change the size of the array.

The standard constructor for this class initializes the internal array and records the size information.

ArrayList::ArrayList(int size)
{
  A = new int[size];
  N = size;
}

The set() and get() methods are used to access the array elements. Note that set() includes a test to ensure that the location whose value we are trying to set is a legitimate location. get() performs no such test, so it is the caller's responsibility to pass it a valid index.

void ArrayList::set(int n,int x)
{
  if(n >= 0 && n < N)
    A[n] = x;
}

int ArrayList::get(int n) const
{
  return A[n];
}

The most interesting member function is resize(). This function creates a new array with the desired size, copies the contents of the old array into the new array, and finally deletes the old array.

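void ArrayList::resize(int newSize)
{
  int *temp = new int[newSize];
  // max is the number of elements to carry over: the smaller of the
  // old size and the new size.
  int max = newSize;
  if(newSize > N)
    max = N;
  for(int k = 0;k < max;k++)
    temp[k] = A[k];
  delete[] A; // Release the old array
  A = temp;   // Adopt the new array
  N = newSize;
}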
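To see resize() in action, here is a minimal self-contained sketch. It assumes a trimmed-down ArrayList that keeps only the members needed for the demonstration, and it disables copying purely to keep the sketch safe, since copying is discussed later in these notes. The function name resizeDemo is hypothetical.

```cpp
#include <cassert>

// Trimmed-down version of the ArrayList from these notes: only the
// members needed to exercise resize().
class ArrayList
{
private:
  int *A;
  int N;
public:
  ArrayList(int size) : A(new int[size]), N(size) {}
  ~ArrayList() { delete[] A; }

  // Copying is disabled in this sketch (an assumption made for safety).
  ArrayList(const ArrayList &other) = delete;
  ArrayList& operator=(const ArrayList &other) = delete;

  int size() const { return N; }
  void set(int n,int x) { if(n >= 0 && n < N) A[n] = x; }
  int get(int n) const { return A[n]; }

  void resize(int newSize)
  {
    int *temp = new int[newSize];
    int count = (newSize < N) ? newSize : N; // elements that survive
    for(int k = 0;k < count;k++)
      temp[k] = A[k];
    delete[] A;
    A = temp;
    N = newSize;
  }
};

// Hypothetical helper: grows and then shrinks a list, checking that
// the surviving elements keep their values.
void resizeDemo()
{
  ArrayList list(3);
  for(int k = 0;k < 3;k++)
    list.set(k, 10*k);       // list holds 0, 10, 20

  list.resize(5);            // grow: the first three values are preserved
  assert(list.size() == 5);
  assert(list.get(2) == 20);

  list.resize(2);            // shrink: only the first two values remain
  assert(list.size() == 2);
  assert(list.get(1) == 10);
}
```

Note that when the list grows, the newly added locations hold unspecified values until set() is called on them, because new int[newSize] does not initialize the ints it creates.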

The const keyword

You may have noticed the use of the keyword const in several spots in the class declaration above. The original motivation for the const keyword in C++ is to declare that a certain variable is a constant. For example, you may see a declaration like this in a program:

const double pi = 3.14159265;

A variable declared const is a constant and is not allowed to subsequently change its value. With the declaration above, if we subsequently tried to write the statement

pi = 22.0/7.0;

the compiler would flag that statement as an error because pi was declared const.

The second application of const occurs in combination with reference parameters. Reference parameters are a very efficient method of passing data to a function, but they come with one serious danger. That danger is that the function may modify a reference parameter, having an undesirable effect on the thing that was passed in as the parameter. In order to retain the efficiency of reference parameters as a parameter-passing mechanism while precluding this undesirable side effect, C++ allows you to declare reference parameters to be const. Placing the const qualifier on the parameter locks it down and precludes the function from doing anything to modify the parameter.

This solves the problem of accidental modification, but it causes another problem. Parameters passed by reference are frequently objects, and one of the things one can do with an object is call one of its methods. What guarantee do we have that the method we are calling won't secretly modify the object, thus violating the const rule? The solution to this problem is to say that methods in a class can be declared const or non-const by the presence or absence of the keyword const at the end of the method declaration. Methods that are declared const give a public guarantee that they will do nothing to modify the object when called. The compiler will then enforce that guarantee: when you write the code for that method you will not be allowed to write any code that modifies the value of the member variables of the object. Methods that are not declared const offer no such guarantee, and are allowed to do things that will change the member data of the object. Given the ability to declare methods either const or non-const, the C++ compiler can now fully enforce the const rule for reference parameters: if the thing being passed in as a const reference parameter is an object, the only methods that can be called on that object are methods that are declared const.
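The interaction between const reference parameters and const methods can be sketched with a small example. The class Counter and the functions readOnly, bump, and constDemo are hypothetical names invented for this illustration; they are not part of the ArrayList class.

```cpp
#include <cassert>

class Counter   // hypothetical class, for illustration only
{
private:
  int count;
public:
  Counter() : count(0) {}
  int value() const { return count; } // const: promises not to modify the object
  void increment() { count++; }       // non-const: modifies the object
};

// c is a const reference: no copy is made, and only const member
// functions may be called on it.
int readOnly(const Counter &c)
{
  // c.increment();   // would not compile: increment() is not const
  return c.value();   // fine: value() is declared const
}

// c is a plain reference: the function may modify the caller's object.
void bump(Counter &c)
{
  c.increment();
}

// Hypothetical usage of the two functions above.
void constDemo()
{
  Counter c;
  bump(c);
  bump(c);
  assert(readOnly(c) == 2);
}
```

Uncommenting the call to increment() inside readOnly should produce a compile-time error, which is exactly the enforcement described above.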

As we cover the various member functions of the ArrayList class in the discussion below, pay attention to the use of the const keyword in various contexts and note that all of the code for this class is consistent with the correct usage of that keyword.

The destructor

Classes with pointer member variables will frequently construct the things those member variables point to by calling new in the constructor. For example, the ArrayList constructor creates an array of ints:

ArrayList::ArrayList(int size)
{
  A = new int[size];
  N = size;
}

Whenever you write code that calls new to create an array or an object, you are obligated to make sure that you call the corresponding delete command at some later time.

The appropriate place to call delete for some thing you created in a constructor is from the class's destructor. This is the destructor for the ArrayList class.

ArrayList::~ArrayList()
{
  delete[] A;
}

Note that it cleans up the array created by the constructor by calling delete[] on the array.

When are destructors triggered? C++ automatically generates a call to the destructor any time an object ceases to exist. Two things can cause an object to disappear. If the object was created as an automatic variable, the object disappears when the program leaves the scope the object was declared in. If the object was created dynamically through a call to new, the destructor will get called when we subsequently call delete on the object. Note that if we forget to call delete on the object, the destructor will never get called, and the array in the ArrayList will never get properly cleaned up. (This situation is commonly referred to as a memory leak.)
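The two triggers can be observed with a small sketch. The class Tracked, the counter liveObjects, and the function destructorDemo are hypothetical names used only for this illustration.

```cpp
#include <cassert>

int liveObjects = 0; // hypothetical counter of objects currently alive

class Tracked   // hypothetical class, for illustration only
{
public:
  Tracked()  { liveObjects++; }
  ~Tracked() { liveObjects--; }
};

void destructorDemo()
{
  {
    Tracked a;                 // automatic variable
    assert(liveObjects == 1);
  }                            // scope ends here: a's destructor runs
  assert(liveObjects == 0);

  Tracked *p = new Tracked;    // object created dynamically
  assert(liveObjects == 1);
  delete p;                    // destructor runs here and not before
  assert(liveObjects == 0);
}
```

If the delete p line were omitted, liveObjects would remain at 1 when the function returns, which is the memory leak situation described above.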

The copy constructor

When we create a class that has a pointer variable as a member variable, we have to be especially careful about situations in which copies of these objects will be made. If we make a copy of an ArrayList object, what we will get is a new object that has a pointer and an int in it. Both of those member variables will be copies of the member variables found in the original. This sets up a dangerous situation, because the pointer in the copy points to exactly the same array that the pointer in the original points to. If we call the set() member function on the copy, set() will change an element in the array. Since both objects have pointers that point to the same array, we are going to end up effectively modifying both the original ArrayList and the copy ArrayList at the same time. This is not what we want when we set out to make a copy.

An even more dangerous situation would occur if we had two ArrayLists with pointers that pointed to the same array and we deleted one of the ArrayLists. The deletion would trigger a call to the destructor for that ArrayList, which would delete the array. As soon as that happens, the other ArrayList will be left holding a pointer to an array that is no longer valid. When that second ArrayList is eventually destroyed, its destructor will try to delete the same array a second time.
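The sharing problem can be demonstrated safely with a small sketch. It assumes a hypothetical struct ShallowList that has the same two members as ArrayList but deliberately has no destructor, so that the shared array can be released exactly once by hand. The function name aliasingDemo is also hypothetical.

```cpp
#include <cassert>

// Hypothetical struct with the same two members as ArrayList but no
// destructor, so the shared array can be released exactly once by hand.
struct ShallowList
{
  int *A;
  int N;
};

// Returns true if writing through the copy also changed the original.
bool aliasingDemo()
{
  ShallowList original;
  original.N = 3;
  original.A = new int[3];
  original.A[0] = 1;

  ShallowList copy = original;   // member-by-member copy: same pointer value
  assert(copy.A == original.A);  // both objects point to one array

  copy.A[0] = 99;                        // write through the copy...
  bool aliased = (original.A[0] == 99);  // ...and the original changes too

  delete[] original.A;           // release the shared array exactly once
  return aliased;
}
```

If ShallowList had a destructor like ArrayList's, both original and copy would call delete[] on the same array when they went out of scope, which is the invalid situation described above.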

The fix for this situation is to write some code that ensures that when we copy an ArrayList we actually copy the array contained in the original, so that the copy is a true copy, completely unconnected with the original. To handle this, C++ has a concept called a copy constructor. Here is the code for the ArrayList class's copy constructor.

ArrayList::ArrayList(const ArrayList &other)
{
  N = other.size();
  A = new int[N];
  for(int k = 0;k < N;k++)
    A[k] = other.get(k);
}

The copy constructor is a constructor whose parameter is a const reference to an object whose contents we wish to copy. Note that the code in this constructor makes a completely new array of ints with the same size as that used by the parameter ArrayList. That new array gets a copy of all of the elements found in the original.

Note also that the parameter other in this constructor is a const parameter. Notice that both methods this code calls on other are declared const and thus are safe to use with this type of parameter.

Copy constructors will get triggered automatically in any situation where you would normally make a copy of an ArrayList. Most commonly, copy constructors trigger automatically when you pass an ArrayList as a value parameter to a function. Here is some code to demonstrate how this works.

Suppose we wrote a function

void f(ArrayList x)
{
  x.resize(100);
  cout << "x has size " << x.size() << endl;
}

that includes code that clearly modifies the parameter x. Here is some code that demonstrates what happens when you call this function.

ArrayList original(20);
cout << "original has size " << original.size() << endl;
f(original);
cout << "original has size " << original.size() << endl;

Running this code prints this:

original has size 20
x has size 100
original has size 20

When we pass original as a parameter to f, a copy gets made, and the copy constructor will get called automatically to initialize the parameter x as a copy of original. Anything that f does to x applies only to the copy, and the ArrayList original is completely unaffected.

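If we were to remove the copy constructor, C++ would revert to a plain member-by-member copy. x would end up with a pointer that points to the same array that original's pointer points to, and a member variable N with a value of 20. Calling resize on x in that case leads to disaster: resize deletes the array that both objects share, so original is left holding a pointer to memory that has already been freed. Any later use of that array through original is invalid, including the delete[] that original's destructor performs when original goes out of scope.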

The bottom line here is that whenever you create a class that has one or more pointers as member variables, you need to write a copy constructor that makes a copy of the thing the pointers point to and doesn't just copy the pointer.

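Copy constructors get triggered whenever you pass an object as a value parameter or return an object by value from a function (although the compiler is allowed to optimize away the copy of a returned object in many cases). You can also trigger a copy by explicitly using copy constructor syntax.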

ArrayList one(100);
ArrayList two(one); // two is a copy of one

Note also that the copy constructor will not get triggered when you pass an object as a reference parameter or pass a pointer to that object, because in both instances you are passing location information and not an object that needs to be copied.
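One way to confirm which kinds of parameter passing make a copy is to count copy constructor calls. The sketch below assumes a hypothetical class Probe with a static counter; the function names byValue, byReference, byPointer, and copyCountDemo are likewise invented for this illustration.

```cpp
#include <cassert>

class Probe   // hypothetical class, for illustration only
{
public:
  static int copies;                    // copy-constructor calls so far
  Probe() {}
  Probe(const Probe &) { copies++; }    // count every copy that is made
};
int Probe::copies = 0;

void byValue(Probe) {}            // parameter is a fresh copy
void byReference(const Probe &) {} // parameter refers to the caller's object
void byPointer(const Probe *) {}   // parameter is only an address

void copyCountDemo()
{
  int before = Probe::copies;

  Probe original;
  byReference(original);          // no copy
  byPointer(&original);           // no copy
  assert(Probe::copies == before);

  byValue(original);              // copy constructor initializes the parameter
  assert(Probe::copies == before + 1);

  Probe two(original);            // explicit copy constructor syntax
  assert(Probe::copies == before + 2);
}
```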

operator=

Another obvious situation that should trigger copying behavior is ordinary assignment. Consider this code.

ArrayList one(100);
ArrayList two;
two = one;

In this case, we would expect two to end up as a copy of one. After all, assignment should involve copying.

The only problem with this is that having a copy constructor defined for the class being copied by the assignment is not sufficient. Unlike other situations, using assignment will not trigger the copy constructor. Instead, we have to essentially write some code that defines how assignment should work for ArrayLists. We do this by adding a special member function to the ArrayList class called operator=.

ArrayList& ArrayList::operator=(const ArrayList &other)
{
  if(this == &other) // Self-assignment such as one = one: nothing to do
    return *this;

  if(N > 0)
    delete[] A; // Release the array this object currently holds

  N = other.size();
  A = new int[N];
  for(int k = 0;k < N;k++)
    A[k] = other.get(k);

  return *this;
}

A call to this member function gets triggered automatically by the line

two = one;

because the compiler will transform that code into the equivalent

two.operator=(one);

Note that a declaration with an initializer, such as

ArrayList two = one;

is not an assignment, even though it uses the = sign. It creates a new object, so it invokes the copy constructor, exactly like ArrayList two(one).

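Why do we need to write a member function in addition to the copy constructor we already have? We need a completely new function to handle assignment because there are some subtle differences between assignment and copying. A copy constructor initializes a brand-new object that does not hold an array yet, whereas assignment overwrites an object that already exists and may already hold an array that has to be released first.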
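A quick way to see that these really are two separate functions is to count how often each one runs. The sketch below uses a hypothetical class Tracer with two static counters; the name assignmentDemo is also invented for this illustration. It relies on the language rule that a declaration with an initializer constructs a new object, while = applied to an existing object is an assignment.

```cpp
#include <cassert>

class Tracer   // hypothetical class, for illustration only
{
public:
  static int copyConstructions;  // copy-constructor calls so far
  static int assignments;        // operator= calls so far
  Tracer() {}
  Tracer(const Tracer &) { copyConstructions++; }
  Tracer& operator=(const Tracer &) { assignments++; return *this; }
};
int Tracer::copyConstructions = 0;
int Tracer::assignments = 0;

void assignmentDemo()
{
  int copiesBefore = Tracer::copyConstructions;
  int assignsBefore = Tracer::assignments;

  Tracer one;
  Tracer two = one;   // two is being created: the copy constructor runs
  assert(Tracer::copyConstructions == copiesBefore + 1);
  assert(Tracer::assignments == assignsBefore);

  two = one;          // two already exists: operator= runs
  assert(Tracer::copyConstructions == copiesBefore + 1);
  assert(Tracer::assignments == assignsBefore + 1);
}
```

Both statements use the = sign, but only the second one is an assignment, because only there does the object on the left already exist.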