Working with C style strings

C++ is an extension of the C language. Because C++ seeks to be backward compatible with C code, most code written in the original C idiom continues to work just fine. For example, a widely used convention in C programming is that text strings are represented via null terminated character arrays. Text strings are stored in arrays of char, with the special null character, '\0', placed at the end of the array to serve as an end marker. Those char arrays that store null terminated text are almost universally accessed via char* pointers.

To support this convention, the C standard library provided a number of standard functions to perform common tasks with C style text strings. For example, the function

size_t strlen(const char* str)

computes and returns the length of the character string stored in the array pointed to by str.
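To make the role of the '\0' end marker concrete, here is a minimal sketch of how a function like strlen could be written by hand. The name myStrlen is my own invention for this illustration, and real library implementations are typically more heavily optimized.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::strlen: counts the characters that
// come before the terminating '\0'. The '\0' itself is not counted.
std::size_t myStrlen(const char* str) {
  assert(str != nullptr);  // a null pointer is not a valid C-style string
  std::size_t n = 0;
  while (str[n] != '\0')
    ++n;
  return n;
}
```

Note that the function has no way to know how large the array really is. It simply trusts that a '\0' will eventually appear.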

To support the use of the C-style text idiom in C++, the C++ standard library makes all of the legacy text manipulation functions available via a header file <cstring>.

Here is an example program that uses C-style strings to solve a simple problem involving strings. The files words.txt and test.txt contain two word lists. In each list words are stored one word per line. We would like to construct a program that finds all of the words in test.txt that do not appear in words.txt.

Here is the code for a simple program to do this.

#include <iostream>
#include <fstream>
#include <vector>
#include <cstring>

int main(int argc, const char * argv[]) {
  std::vector<char*> strs;
  std::vector<char*> testStrs;
  std::vector<char*> diffStrs;

  std::ifstream in;
  char buffer[64];


  // Read the word list
  in.open("words.txt");
  while(in.getline(buffer,63)) {
    char* newStr = new char[std::strlen(buffer)+1];
    std::strcpy(newStr,buffer);
    strs.push_back(newStr);
  }
  in.close();

  // Read the test words
  in.open("test.txt");
  while(in.getline(buffer,63)) {
    char* newStr = new char[std::strlen(buffer)+1];
    std::strcpy(newStr,buffer);
    testStrs.push_back(newStr);
  }
  in.close();

  // Find all the test words that do not appear on the word list
  for(auto otr = testStrs.begin();otr != testStrs.end();otr++) {
    char* testStr = *otr;
    bool found = false;
    for(auto itr = strs.begin();!found && itr != strs.end();itr++)
      if(std::strcmp(*itr,testStr) == 0)
        found = true;

    if(!found)
      diffStrs.push_back(testStr);
  }

  // Print the test words that did not appear on the word list
  for(auto itr = diffStrs.begin();itr != diffStrs.end();itr++)
    std::cout << *itr << std::endl;

  // Free the memory used by the character arrays.
  for(auto itr = strs.begin();itr != strs.end();itr++)
    delete[] *itr;
  for(auto itr = testStrs.begin();itr != testStrs.end();itr++)
    delete[] *itr;
    
  return 0;
}

The program makes use of a mix of C-style strings and simple C++ ideas to get its work done.

The first thing we need to be able to do here is to read the words from an input text file. As usual in C++, we set up an ifstream object to do the reading from the text files. The ifstream class offers a method getline() that reads a line of text from a text file and stores that text in a C-style character array. getline() takes two parameters: the first is a pointer to a character array where the characters will be stored, and the second is the size of that array, counting the element needed for the terminating '\0'. Passing 63 for a 64 element buffer, as this program does, is safe but leaves the last element unused. The program uses getline() to read words via an arrangement that looks like this:

char buffer[64];
in.getline(buffer,63);

As the program reads the words from the two files, it will want to store those words in a pair of vectors. Those vectors are declared to have type vector<char*> and will actually be storing pointers to the arrays for each word. Here is the portion of the code that reads words from an ifstream and stores them in a vector:

in.open("words.txt");
while(in.getline(buffer,63)) {
  char* newStr = new char[std::strlen(buffer)+1];
  std::strcpy(newStr,buffer);
  strs.push_back(newStr);
}
in.close();

This code uses getline() to read a word and store it in the buffer array. Then, using strlen() to determine the length of the word, it allocates a second array with one extra element for the terminating '\0' and copies the word from the buffer array into the new array with the strcpy() function. Finally, it stores a pointer to the copy in a vector by using the vector class's push_back() method.
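The measure, allocate, and copy steps can be packaged into a single helper. The sketch below assumes a function name, duplicate, that does not appear in the program above. It is only meant to show the pattern in isolation.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns a heap-allocated copy of a C-style string.
// The caller owns the result and must release it with delete[].
char* duplicate(const char* source) {
  assert(source != nullptr);
  char* copy = new char[std::strlen(source) + 1];  // +1 leaves room for '\0'
  std::strcpy(copy, source);                       // copies the '\0' as well
  return copy;
}
```

The copy is what makes it safe to reuse buffer for the next line: once the word has been duplicated, overwriting buffer no longer affects the word we saved.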

Here is the portion of the code that finds the words in the second vector that do not appear in the first.

// Find all the test words that do not appear on the word list
for(auto otr = testStrs.begin();otr != testStrs.end();otr++) {
  char* testStr = *otr;
  bool found = false;
  for(auto itr = strs.begin();!found && itr != strs.end();itr++)
    if(std::strcmp(*itr,testStr) == 0)
      found = true;

  if(!found)
    diffStrs.push_back(testStr);
}

The code uses a fairly obvious double loop structure. The outer loop iterates over all of the words from the test file. For each word the inner loop scans the list of words read from words.txt looking for a match. If no match is found, we push the current test word onto a third vector, diffStrs. At the end of the program we print all of the strings in diffStrs out to cout.

A key bit of logic is the test needed to compare our test string against a string from words.txt. This if statement implements that test:

if(std::strcmp(*itr,testStr) == 0)
  found = true;

The test makes use of the C-style string comparison function strcmp(). That function takes two pointers to string arrays as its parameters and returns 0 if the text stored in the two strings is the same.
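It is worth seeing why we need strcmp() at all. Applying == directly to two char* values compares the pointers, not the text they point to. Here is a small sketch of the difference. The helper names sameText and sameArray are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when two C-style strings hold the same text.
bool sameText(const char* a, const char* b) {
  assert(a != nullptr && b != nullptr);
  return std::strcmp(a, b) == 0;
}

// Hypothetical helper: true only when both pointers refer to the same array.
bool sameArray(const char* a, const char* b) {
  return a == b;
}
```

Two separate arrays that both contain "cat" satisfy sameText but not sameArray. When the strings differ, strcmp() returns a negative or positive value depending on which string comes first in character order.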

Switching to the C++ string class

Since C++ is an object oriented language, it naturally offers a string class. The C++ string class can be used as a replacement for the older C-style string code. One of the benefits of replacing C-style character arrays with the string class is that the string class allows us to write more natural code and frees us from having to know the specifics of functions such as strlen, strcpy, and strcmp.

Here is the example program rewritten to use the string class:

#include <iostream>
#include <fstream>
#include <vector>
#include <string>

int main(int argc, const char * argv[]) {
  std::vector<std::string> strs;
  std::vector<std::string> testStrs;
  std::vector<std::string> diffStrs;

  std::ifstream in;
  char buffer[64];


  // Read the word list
  in.open("words.txt");
  while(in.getline(buffer,63)) {
    std::string newStr(buffer);
    strs.push_back(newStr);
  }
  in.close();

  // Read the test words
  in.open("test.txt");
  while(in.getline(buffer,63)) {
    std::string newStr(buffer);
    testStrs.push_back(newStr);
  }
  in.close();

  // Find all the test words that do not appear on the word list
  for(auto otr = testStrs.begin();otr != testStrs.end();otr++) {
    std::string testStr = *otr;
    bool found = false;
    for(auto itr = strs.begin();!found && itr != strs.end();itr++)
      if(testStr == *itr)
        found = true;

    if(!found)
      diffStrs.push_back(testStr);
  }

  // Print the test words that did not appear on the word list
  for(auto itr = diffStrs.begin();itr != diffStrs.end();itr++)
    std::cout << *itr << std::endl;

  return 0;
}

The string class provides three immediate benefits. The first is that the constructor for the string class that takes a pointer to a C-style string as its parameter automatically makes a copy of the C-style string for us. This simplifies the code for reading the strings from the files:

in.open("words.txt");
while(in.getline(buffer,63)) {
  std::string newStr(buffer);
  strs.push_back(newStr);
}
in.close();

The second benefit is that the string class defines a version of the comparison operator == that allows us to compare two strings. This makes the code for string searches simpler:

for(auto itr = strs.begin();!found && itr != strs.end();itr++)
  if(testStr == *itr)
    found = true;

The third benefit is that the string class manages its own storage. When the vectors containing the strings go away at the end of main, the vectors will automatically destroy all the strings they contain, and the strings will free the memory they used to store their text.

Making our own string class

The C++ string class makes it possible to write cleaner code. However, like all classes in the C++ standard library, the C++ string class is a bit of a black box. To begin to develop some insight into how classes like std::string actually work, we are going to engage in the exercise of writing our own string class.

One obvious difference between classes in C++ and classes in Java is that a class definition in Java typically appears in a single file. Most commonly, a Java class definition will appear in a source code file whose name matches the name of the class. C++ typically splits a class into two parts. The first part is the class declaration. The class declaration typically appears in a header file, which is a source code file with the .h file extension. The header file can have the same name as the class, but is not required to have the same name. On some occasions you will see multiple class declarations in a single header file.

A class declaration consists of a listing of the member variables and prototypes for the methods in the class.

Here is the class declaration for our string class. This class declaration appears in a header file named String.h:

#include <iostream>

class String {
  friend bool operator==(const String&,const String&);
  friend std::ostream& operator<<(std::ostream&,const String&);
  public:
    String();
    String(const char* otherStr);
    String(const String& other);
    String(String&& other) noexcept;
    ~String();

    String& operator=(const String& other);
    String& operator=(String&& other);
    
    int length() const;

    static void initString();
    static int memoryUsage();
    static int maxMemoryUsage();
    static int copyCount();
    static int moveCount();
  private:
    static int memoryUsed;
    static int maxMemoryUsed;
    static int copiesMade;
    static int movesMade;

    char* allocMemory(int size);

    int len;
    char* str;
};
    
bool operator==(const String& one,const String& two);
std::ostream& operator<<(std::ostream& out,const String& str);

Here are some things to note here:

  1. C++ does not make a distinction between public and private classes as Java does. Hence the first line of the declaration simply says class String.
  2. Just as in Java, methods and member variables can be declared public or private. Java does this by attaching the public and private qualifiers to each method or member variable. C++ instead divides the class into public and private sections.
  3. Just as in Java, C++ also has the concept of a constructor. This class has four constructors: a default constructor, a constructor that initializes a String from an array of characters, and two constructors that initialize a String from another String.
  4. C++ adds the concept of a destructor. The destructor is the method whose name is a tilde (~) followed by the class name. The destructor is a method that gets called automatically whenever we delete a String object or when a String variable goes out of scope. As we will see below, the destructor typically includes code that frees up any memory or other resources used by the object just before the object goes away.
  5. Some of these methods are declared const. A const declaration is a declaration that the method in question will not do anything to change the object the method gets called on. You may also notice that some of these methods take parameters that are declared const. You can only call const methods on those const parameters. Calling a non-const method on a parameter or variable that has been declared const results in a compiler error.
  6. A C++ class declaration ends with a semicolon.
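The rule about const methods is easiest to see in a small example. The class below is a hypothetical one made up for this illustration, and it is not related to the String class.

```cpp
#include <string>

// Hypothetical class used only to illustrate const member functions.
class Label {
  public:
    explicit Label(const std::string& t) : text(t) {}
    int length() const { return static_cast<int>(text.size()); }  // read-only
    void append(char c) { text.push_back(c); }                    // modifies
  private:
    std::string text;
};

// A const reference parameter only permits calls to const methods.
int measure(const Label& label) {
  // label.append('!');  // would not compile: append() is not const
  return label.length();
}
```

Removing the comment marker from the call to append() should produce a compiler error, since label is declared const inside measure().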

Method definitions

As you saw above, a C++ class declaration typically only lists the methods in the class without providing any code for those methods. The implementations of the methods are usually written out in a .cpp file whose name matches the name used for the header file.

In this case, I have created a file String.cpp that contains the code for the methods. Here are the contents of that file:

#include "String.h"

int String::memoryUsed;
int String::maxMemoryUsed;
int String::copiesMade;
int String::movesMade;

String::String() :
  len(0),str(nullptr) {
}

String::String(const char* otherStr) {
  len = 0;
  while(otherStr[len] != '\0')
    ++len;
  
  str = allocMemory(len);

  for(int n = 0;n < len;n++)
    str[n] = otherStr[n];
}

String::String(const String& other) {
  copiesMade++;
  len = other.len;
  str = allocMemory(len);
  for(int n = 0;n < len;n++)
    str[n] = other.str[n];
}

String::String(String&& other) noexcept {
  movesMade++;
  len = other.len;
  other.len = 0;
  str = other.str;
  other.str = nullptr;
}

String::~String() {
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
    len = 0;
    str = nullptr;
  }
}

String& String::operator=(const String& other) {
  if(this == &other)  // Assigning a String to itself: nothing to do
    return *this;
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
  }
  copiesMade++;
  len = other.len;
  str = allocMemory(len);
  for(int n = 0;n < len;n++)
    str[n] = other.str[n];
  return *this;
}

String& String::operator=(String&& other) {
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
  }
  movesMade++;
  len = other.len;
  other.len = 0;
  str = other.str;
  other.str = nullptr;
  return *this;
}

int String::length() const {
  return len;
}

void String::initString() {
  memoryUsed = 0;
  maxMemoryUsed = 0;
  copiesMade = 0;
  movesMade = 0;
}

int String::memoryUsage() {
  return memoryUsed;
}

int String::maxMemoryUsage() {
  return maxMemoryUsed;
}

int String::copyCount() {
  return copiesMade;
}

int String::moveCount() {
  return movesMade;
}

char* String::allocMemory(int size) {
  memoryUsed += size;
  if(memoryUsed > maxMemoryUsed)
    maxMemoryUsed = memoryUsed;
  return new char[size];
}

bool operator==(const String& one,const String& two)
{
  if(one.length() != two.length())
    return false;
  int max = one.length();
  for(int n = 0;n < max;n++)
    if(one.str[n] != two.str[n])
      return false;
  return true;
}

std::ostream& operator<<(std::ostream& out,const String& str)
{
  for(int n = 0;n < str.len;n++)
    out << str.str[n];
  return out;
}

Here are some things to note about this code.

  1. The first thing you see at the top of the file is an include statement that includes the header file where the class was declared. If we forget to include that class declaration the compiler will have no idea what we are talking about when we start to work with the String class.
  2. The code for each member function is listed separately. Note that the name of the function consists of the class name, ::, and the name of the function.
  3. In an effort to track how efficiently this class uses resources, I have added some debugging code. The most important bits of debugging code involve the static member variables memoryUsed, maxMemoryUsed, copiesMade, and movesMade. Each time the String class has to allocate a new chunk of memory to hold some characters, it adds the size of that array to the memoryUsed variable. Any time the destructor frees an array of characters it will subtract that array's size from the memoryUsed variable.
  4. In the header file and the cpp file you will also notice that I have set up a pair of free functions after the class. These two functions are operator overloads that allow us to use our new String class in combination with the == and << operators. I will discuss those two functions in greater detail below.

Using the new class

The final step in working with this class is to use it in a program.

Here is source code for a program that uses our String class to solve the problem we solved in the earlier version of the program.

#include <iostream>
#include <fstream>
#include <vector>
#include "String.h"

int main(int argc, const char * argv[]) {
  std::vector<String> strs;
  std::vector<String> testStrs;
  std::vector<String> diffStrs;

  std::ifstream in;
  char buffer[64];

  String::initString();

  // Read the word list
  in.open("words.txt");
  while(in.getline(buffer,64)) {
    String newStr(buffer);
    strs.push_back(newStr);
  }
  in.close();

  // Read the test words
  in.open("test.txt");
  while(in.getline(buffer,64)) {
    String newStr(buffer);
    testStrs.push_back(newStr);
  }
  in.close();

  // Find all the test words that do not appear on the word list
  for(auto otr = testStrs.begin();otr != testStrs.end();otr++) {
    String testStr(*otr);
    bool found = false;
    for(auto itr = strs.begin();!found && itr != strs.end();itr++)
      if(*itr == testStr)
        found = true;

    if(!found)
      diffStrs.push_back(testStr);
  }

  // Print the test words that did not appear on the word list
  for(auto itr = diffStrs.begin();itr != diffStrs.end();itr++)
    std::cout << *itr << std::endl;

  // Display memory usage statistics
  std::cout << "Using memory: " << String::memoryUsage() << std::endl;
  std::cout << "Max memory: " << String::maxMemoryUsage() << std::endl;
  std::cout << "Copies made: " << String::copyCount() << std::endl;
  
  return 0;
}

One additional feature I have added here takes advantage of the fact that I had complete control over the design of the String class. Since the String class is designed to automatically track its memory usage as it runs, the program ends by outputting some information about memory usage.

static member variables

Both Java and C++ feature the concept of static member variables and static member functions. In Java static member functions are invoked via the syntax

<class name>.<function>

C++ is similar, except the syntax for calling a static member function is

<class name>::<function>

Another small difference is the way that static member variables are implemented. In Java, a class definition combines everything having to do with a class into one file. In C++, on the other hand, classes are most often broken up into a class declaration, which goes into a header file, and member function definitions, which go into a separate cpp file.

Static member variables are usually used to store global information about a class and its behavior. In this example, the static member variables in the String class exist to track information about how much memory is being used across all instances of the String class.

When you put a static member variable in a class declaration you are simply declaring that such a variable exists. The statement that actually sets up the variable appears later, usually in the associated cpp file. If you look at the top of the String.cpp file you will see these statements:

int String::memoryUsed;
int String::maxMemoryUsed;
int String::copiesMade;
int String::movesMade;

These statements set up the four static member variables that were declared in the class declaration. If you forget to put these statements in the cpp file, your program will fail to build: the code will compile, but the linker will report an error complaining that those variables are not defined.

As always, static member functions are not invoked on a particular object, so the only member variables they can access directly are the static ones.
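Here is a compact sketch that puts the declaration, the definition, and the :: calling syntax side by side. The Ticket class is a hypothetical example invented for this purpose.

```cpp
// Hypothetical class: counts how many Ticket objects have been created.
class Ticket {
  public:
    Ticket() : id(++issued) {}  // non-static code may use static data
    int number() const { return id; }
    static int issuedSoFar() { return issued; }  // needs no object
  private:
    static int issued;  // declaration only: no storage is set up here
    int id;
};

// Definition of the static member; normally placed in the matching .cpp file.
int Ticket::issued = 0;
```

A call such as Ticket::issuedSoFar() works even before any Ticket exists, because the counter belongs to the class rather than to any one object.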

A tour of constructors

Both Java and C++ classes support the concept of constructors. A constructor is a specialized member function whose purpose in life is to properly initialize an object from some input data. The example String class above features four different constructor functions. This section of the notes will walk you through each of these constructors in turn.

The first constructor is a default constructor.

String::String() :
  len(0),str(nullptr) {
}

The default constructor is used to initialize String objects in cases where no data is available to initialize the object. For example, if we make a variable declaration

String myString;

the compiler will invoke the default constructor to initialize myString.

Another case where the default constructor will be used is when we construct a vector of String objects:

std::vector<String> V(100);

This declaration sets up a vector of 100 Strings. The compiler will invoke the default constructor to initialize each one of the 100 Strings in the vector.

This constructor makes use of a special constructor initializer syntax. The initializer expressions len(0) and str(nullptr) that appear after the colon specify how to initialize the two member variables len and str. len gets initialized to 0, and str gets initialized to the value nullptr, which is used to indicate that a pointer currently has nothing to point to. (nullptr was introduced to C++ in C++11.) Since the initializer expressions do all of the work that the constructor needs to do, the body of the constructor is empty.

The second constructor is designed to initialize a String object from a C-style character array:

String::String(const char* otherStr) {
  len = 0;
  while(otherStr[len] != '\0')
    ++len;
  
  str = allocMemory(len);

  for(int n = 0;n < len;n++)
    str[n] = otherStr[n];
}

As we saw in the first example program above, the smart thing to do when working with a C-style array is to start by immediately making a copy of that array. This code does precisely that. As a side effect of making that copy, the code also determines the length of the original C-style string and stores that length information in the member variable len.

To allocate the memory needed to store the copy, this constructor calls the private member function allocMemory. That function creates an array of chars of the desired size and returns a pointer to that array. In addition, it records the memory used to create the array in the class's static member variables so we can track this class's memory usage.

char* String::allocMemory(int size) {
  memoryUsed += size;
  if(memoryUsed > maxMemoryUsed)
    maxMemoryUsed = memoryUsed;
  return new char[size];
}

The third constructor is a copy constructor.

String::String(const String& other) {
  copiesMade++;
  len = other.len;
  str = allocMemory(len);
  for(int n = 0;n < len;n++)
    str[n] = other.str[n];
}

Copy constructors get invoked in situations where we clearly need to make a copy of some other object. For example, the code

String one("Hello");
String two = one;

will invoke the copy constructor to copy String one into String two.

Another situation that triggers the copy constructor is handing a String to a function that needs its own copy, either because the function takes its parameter by value or because it stores the String somewhere.

std::vector<String> vec;
String s("Hello");
vec.push_back(s);

In this example we want to push a String we just created onto a vector for storage. Since s is a named variable that we could go on using, push_back leaves s alone and uses the copy constructor to construct the copy that the vector stores.
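One way to see exactly which statements trigger a copy is to count them, much as the String class does with copiesMade. The sketch below uses a hypothetical CopyProbe class that does nothing except count copy constructor calls.

```cpp
// Hypothetical probe class: counts copy constructor calls.
struct CopyProbe {
  static int copies;
  CopyProbe() {}
  CopyProbe(const CopyProbe&) { copies++; }
};
int CopyProbe::copies = 0;

void byValue(CopyProbe p) { (void)p; }             // parameter is a fresh copy
void byReference(const CopyProbe& p) { (void)p; }  // refers to the caller's object

// Returns how many copies a few common statements trigger.
int countCopies() {
  CopyProbe::copies = 0;
  CopyProbe one;
  CopyProbe two = one;  // copy constructor
  byValue(one);         // copy constructor
  byReference(one);     // no copy
  (void)two;
  return CopyProbe::copies;
}
```

Only the initialization of two and the by-value call make copies. Passing by const reference avoids the copy entirely, which is why the operator overloads in String.h take their parameters that way.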

The fourth constructor is a move constructor.

String::String(String&& other) noexcept {
  movesMade++;
  len = other.len;
  other.len = 0;
  str = other.str;
  other.str = nullptr;
}

As you can see from the code for this constructor, this constructor is designed to move the data out of the String other into a new String that is being constructed. After moving the data out of other, we reset its member variables to indicate that other is now empty.

To distinguish a move constructor from a copy constructor, the move constructor uses a different parameter type, an rvalue reference. The pair of ampersands after the String in the parameter type indicate that other is an rvalue reference to a String, which can bind to temporary objects and to objects that the caller has explicitly marked as movable with std::move.

Move constructors were introduced in C++11, and are designed to be invoked in situations where an object is being moved from one location to another. One very common situation that involves doing a lot of move operations is the vector class's push_back method. The vector class stores its data internally in an array. When that array runs out of space in a push_back operation, the vector class will make a new, larger internal array and then attempt to move all of its data from the old array to the new one. If the internal array contains objects, the vector class will use the objects' move constructor to implement the moves from one array to another. The vector class further requires that any move constructors it uses not throw an exception: if the move constructors can not supply that guarantee, the vector class will revert to using the copy constructor to move objects from old array to the new array. After all of the objects have been copied from the old array to the new array, the vector will destroy all of the original objects. To inform the vector class that our move constructor will not throw an exception, we add the noexcept specification to the constructor.
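The effect of noexcept on vector growth can be observed directly. The sketch below defines two hypothetical probe classes that differ only in whether the move constructor is marked noexcept, and then forces a vector of each to reallocate. The names SafeMover, RiskyMover, and growOnce are assumptions made for this illustration.

```cpp
#include <vector>

struct Counts { int copies = 0; int moves = 0; };

// Hypothetical probe: the move constructor promises not to throw.
struct SafeMover {
  static Counts counts;
  SafeMover() = default;
  SafeMover(const SafeMover&) { counts.copies++; }
  SafeMover(SafeMover&&) noexcept { counts.moves++; }
};
Counts SafeMover::counts;

// Hypothetical probe: identical, except the move constructor lacks noexcept.
struct RiskyMover {
  static Counts counts;
  RiskyMover() = default;
  RiskyMover(const RiskyMover&) { counts.copies++; }
  RiskyMover(RiskyMover&&) { counts.moves++; }
};
Counts RiskyMover::counts;

// Fills a vector to capacity, then adds one more element so the vector
// must transfer its contents to a larger internal array. Returns the
// copies and moves counted during that one reallocation.
template <typename T>
Counts growOnce() {
  std::vector<T> v;
  v.emplace_back();
  while (v.size() < v.capacity())
    v.emplace_back();
  T::counts = Counts();  // start counting from zero
  v.emplace_back();      // size == capacity, so this reallocates
  return T::counts;
}
```

With the standard library implementations I am aware of, growOnce<SafeMover>() reports only moves and growOnce<RiskyMover>() reports only copies.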

Destructors

Any class that allocates resources in its constructors needs to also provide a destructor. The purpose of the destructor is to release any resources claimed by the constructors. Here is the destructor for the String class:

String::~String() {
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
    len = 0;
    str = nullptr;
  }
}

Most of the String class's constructors allocate memory to store the characters in the String. The destructor will check to see if the str pointer points to something. If it does, the destructor invokes the delete[] operation on that pointer to free the memory used by that array.

Destructors get triggered when objects cease to exist. For example, if we declare and initialize a String object in some scope

String example("Hello");

that String object will get destroyed automatically when we exit that scope.

We can also explicitly trigger a destructor by doing this:

String *example = new String("Hello");
// Do some stuff with example...
delete example; // ..then make it go away.

The explicit call to the delete operation calls the destructor for that object and then frees up the memory associated with the String object itself.

Another scenario that triggers calls to a destructor occurs when a container gets destroyed. In the example program above I declared and used a number of vector<String> objects in main. When we exit main, those vectors will be destroyed automatically. At the instant those vectors get destroyed, the vector class's destructor will call the destructor for each one of the String objects stored in that vector.
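The three scenarios described in this section can be checked with a class that counts how many of its instances are currently alive. The Tracked class and the observeScope function below are hypothetical names used only for this sketch.

```cpp
#include <vector>

// Hypothetical class: tracks how many instances currently exist.
struct Tracked {
  static int live;
  Tracked() { ++live; }
  Tracked(const Tracked&) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

struct Observation { int inside; int after; };

// Records the live count inside a scope and again after leaving it.
Observation observeScope() {
  Observation result;
  {
    std::vector<Tracked> items(3);   // the vector holds three elements
    Tracked* extra = new Tracked();  // lives until explicitly deleted
    result.inside = Tracked::live;   // three in the vector plus extra
    delete extra;                    // runs ~Tracked for the heap object
  }                                  // vector destroyed: ~Tracked runs 3 times
  result.after = Tracked::live;
  return result;
}
```

Assuming no other Tracked objects exist when observeScope() is called, the count is 4 inside the scope and drops back to 0 once the scope ends.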

Operator overloads

One more thing that we will need to do to make our String class useful is to define a set of operator overloads. Our program to compare words includes a couple of statements that attempt to apply standard C++ operators to Strings. The first statement is a statement that attempts to use the comparison operator, == , to compare two Strings with each other:

// Find all the test words that do not appear on the word list
for(auto otr = testStrs.begin();otr != testStrs.end();otr++) {
  String testStr(*otr);
  bool found = false;
  for(auto itr = strs.begin();!found && itr != strs.end();itr++)
    if(*itr == testStr) // Note the use of == here
      found = true;

  if(!found)
    diffStrs.push_back(testStr);
}

The second statement attempts to use the stream insertion operator, <<, to write some Strings out to cout.

// Print the test words that did not appear on the word list
for(auto itr = diffStrs.begin();itr != diffStrs.end();itr++)
    std::cout << *itr << std::endl;

Both of these statements will fail to compile if we don't define what these operators mean when used with Strings. We define these two operators for Strings by constructing a pair of special functions, called operator overloads:

// Define an overload for == for Strings
bool operator==(const String& one,const String& two)
{
  if(one.length() != two.length())
    return false;
  int max = one.length();
  for(int n = 0;n < max;n++)
    if(one.str[n] != two.str[n])
      return false;
  return true;
}

// Define an overload for << for Strings
std::ostream& operator<<(std::ostream& out,const String& str)
{
  for(int n = 0;n < str.len;n++)
    out << str.str[n];
  return out;
}

When the compiler encounters some statement involving one of these operators, such as

if(*itr == testStr)

it will try to translate that operator expression into an equivalent function expression:

if(operator==(*itr,testStr))

It will then go looking for a function named operator== that takes two String objects as its parameters. That is precisely the function we defined above.
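The same translation applies to any type, not just String. Here is a minimal sketch using a hypothetical Point struct, where the members are public so no friend declarations are needed.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type used only to illustrate operator overloads.
struct Point {
  int x;
  int y;
};

bool operator==(const Point& a, const Point& b) {
  return a.x == b.x && a.y == b.y;
}

std::ostream& operator<<(std::ostream& out, const Point& p) {
  out << '(' << p.x << ',' << p.y << ')';
  return out;  // returning the stream allows chained << expressions
}

// Formats a Point through the overloaded << operator.
std::string describe(const Point& p) {
  std::ostringstream out;
  out << p;
  return out.str();
}
```

For two Points a and b, the expressions a == b and operator==(a, b) call the same function. The second form is legal C++, although nobody writes it that way in practice.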

One final comment about these operator overloads. These functions are set up as free functions that are declared and defined outside the String class. At the same time, both of these functions need to access private data in the String objects they work with. To enable them to do this, the String class declares both of these operator functions to be friends of the String class, which gives them explicit permission to see the private data stored inside any String object.

class String {
  friend bool operator==(const String&,const String&);
  friend std::ostream& operator<<(std::ostream&,const String&);
  public:
    // Anyone can see this stuff
  private:
    // Only member functions and friends can see this stuff
};

The final pair of operator overloads provide versions of the assignment operator, =. This operator will be invoked whenever you assign a new value to a String that already exists:

String one;
one = two;

Note that a declaration such as String one = two; is an initialization rather than an assignment, so it invokes the copy constructor instead. When the compiler encounters an assignment involving objects, the first thing it will do is to translate the assignment into an equivalent syntax involving a call to a member function with the name operator=:

one.operator=(two);

The compiler will then go looking for a member function in the class with that name. Here is the operator= for our String class.

String& String::operator=(const String& other) {
  if(this == &other)  // Assigning a String to itself: nothing to do
    return *this;
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
  }
  copiesMade++;
  len = other.len;
  str = allocMemory(len);
  for(int n = 0;n < len;n++)
    str[n] = other.str[n];
  return *this;
}

Not surprisingly, this looks a lot like the code for a destructor followed by a copy constructor. This is natural, since a request to assign one String to another requires us to first delete any data held by the String we are assigning to and then copy the data from the other String into this String. Finally, operator= is required to return a reference to the thing being assigned to. This makes it possible to chain assignments. For example, the statement

one = two = three;

is legal in C++, provided one, two, and three are Strings that already exist. The compiler will translate this statement into

one.operator=(two.operator=(three));

For this to work properly, the assignment on the right that copies three into two must return a reference to two so we can subsequently assign that to one.
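Because the = symbol appears in both declarations and assignments, it is easy to lose track of which function actually runs. The sketch below uses a hypothetical Recorder class that notes whether the copy constructor or the copy assignment operator was called most recently.

```cpp
// Hypothetical probe class: records which function ran last.
struct Recorder {
  static char last;  // 'c' = copy constructor, 'a' = copy assignment
  Recorder() {}
  Recorder(const Recorder&) { last = 'c'; }
  Recorder& operator=(const Recorder&) { last = 'a'; return *this; }
};
char Recorder::last = '-';

char afterInitialization() {
  Recorder two;
  Recorder one = two;  // one is being created: copy constructor
  (void)one;
  return Recorder::last;
}

char afterAssignment() {
  Recorder one, two, three;
  one = two = three;   // all three already exist: operator= runs twice
  return Recorder::last;
}
```

The first function reports 'c' and the second reports 'a'. A declaration with = constructs a new object, while = applied to an existing object assigns to it.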

Just as there is both a copy constructor and a move constructor, we also have to have both a copy assignment operator and a move assignment operator. Here is the code for the move assignment operator.

String& String::operator=(String&& other) {
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
  }
  movesMade++;
  len = other.len;
  other.len = 0;
  str = other.str;
  other.str = nullptr;
  return *this;
}

Programming Assignment

Constructing a class is a very challenging process. When we construct something like the String class, we need to construct a class that ensures that any programs that use the class are both correct and efficient. The String class I constructed above is correct, but it is less efficient than it could possibly be. The inefficiency is caused by the String class's tendency to make lots of copies of the text it contains. You can trace this inefficiency directly to the String class's copy constructor:

String::String(const String& other) {
  copiesMade++;
  len = other.len;
  str = allocMemory(len);
  for(int n = 0;n < len;n++)
    str[n] = other.str[n];
}

Each time we invoke the copy constructor, it will make a copy of the character array in the String. Now consider this copy constructor, which looks a little more like the move constructor.

String::String(const String& other) {
  copiesMade++;
  len = other.len;
  str = other.str; // Copy the pointer, but not the array
}

If you take the program from above and make this one small change, the program will do much less unnecessary copying. However, this change introduces a new problem.

Because the copy constructor now will cause multiple String objects to share the same array of characters, we will now have trouble with the destructor. The current destructor

String::~String() {
  if(str != nullptr) {
    memoryUsed -= len;
    delete[] str;
    len = 0;
    str = nullptr;
  }
}

will attempt to delete the array that str points to. This will cause trouble in cases where more than one String object shares the same value of str. When the first one in a group of objects executes the destructor, the other objects will be left with str values that are no longer valid. When a second String in that group tries to execute its destructor, it will try deleting the str array that has already been deleted. Deleting the same array twice is undefined behavior, and in practice it usually crashes the program.

What I am going to propose next is a modification to the original String class that will make it possible to have some of the efficiency gains that the new copy constructor offers while at the same time making it possible to construct a version of the destructor that is correct.

The first step in the modification is to change the structure of the String class. Remove the len and str member variables in String and replace them with

StringData* data;

Where StringData is a new class:

class StringData {
  public:
    char* str;
    int len;
    int refCount;
    
    StringData(const char* other);
};

Here we have moved the str and len member variables into the StringData object and also added a third member variable, called a reference counter. The purpose of the reference counter is to track how many String objects are currently sharing this StringData.

With this change we rewrite the copy constructor for the String class to look like this:

String::String(const String& other) {
  data = other.data;
  data->refCount++;
}

The pictures below illustrate how this will work.

The first picture below illustrates the situation immediately after we execute the statement

String one("blah, blah");

The String object one now points to a StringData object that contains a pointer to an array of characters that hold a copy of the text. The reference count value of the StringData object is set to 1 initially, because only the String object one refers to this text.

Next, we execute a statement

String two(one);

which triggers a copy. Instead of copying the text, the copy constructor for the String class simply makes the String object two point to the same StringData object that one points to, while increasing the reference count in the StringData object from 1 to 2.

Finally, we exit the scope in which one was defined, which triggers a call to the String destructor function for one. The destructor sees that the StringData object that one points to currently has more than one reference to it, so it simply decreases the reference count from 2 to 1.

When the destructor for two eventually runs it will see that two points to a StringData object whose reference count is 1. At that point the destructor will delete[] the array containing the characters and then also delete the StringData object that two pointed to.
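The standard library's std::shared_ptr is built on the same reference counting idea, so it offers a convenient way to watch the count change through the steps just described. The sketch below is not the StringData design you are asked to write. It only traces the 1, 2, 1 sequence, using a hypothetical function named traceUseCounts.

```cpp
#include <memory>
#include <string>
#include <vector>

// Records the reference count after each step of the scenario.
std::vector<long> traceUseCounts() {
  std::vector<long> counts;
  std::shared_ptr<std::string> two;  // starts out pointing at nothing
  {
    std::shared_ptr<std::string> one =
        std::make_shared<std::string>("blah, blah");
    counts.push_back(one.use_count());  // only one refers to the text
    two = one;                          // share the text, do not copy it
    counts.push_back(one.use_count());  // one and two both refer to it
  }                                     // one goes out of scope here
  counts.push_back(two.use_count());    // only two is left
  return counts;
}
```

When two is finally destroyed the count reaches zero, and only then is the shared text released.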

Rewrite the code for the String class to use this new arrangement. The test program that uses the String class to solve the word problem should continue to function correctly after the change, and should report using much less memory than the original version. (A program that makes optimal use of memory should report using 28403 bytes of memory to store the strings.)

For your convenience, here is an archive containing the source code and data files for the original version of the String program.