In the last lecture we saw how to declare and use objects in C++. Here is a main function taken from the greeter example I showed on Monday.
int main()
{
// Construct a Greeter object and tell it to greet.
Greeter g("Hello, world");
g.greet();
}
The variable g we are creating in this example to hold the Greeter object is known as an automatic variable. Automatic variables are variables that appear automatically when we enter the scope in which they are declared. Most often, the scope for a variable consists of the particular block of curly braces within which the variable is declared. In the example above, the Greeter variable g is declared within the scope of the main function. Automatic variables disappear automatically and cease to exist when we leave the scope within which they are declared.
Here is a somewhat more sophisticated example to demonstrate how the scope rule and automatic variable declaration work.
int main()
{
string name;
cout << "Enter your name, or anon to be anonymous: ";
cin >> name;
if(name == "anon") {
Greeter genericGreeter("Hello, stranger");
genericGreeter.greet();
}
else {
Greeter specificGreeter(string("Hello, ") + name);
specificGreeter.greet();
}
}
The program creates two different Greeter objects for the two cases it encounters. In each case, the Greeter object exists only within the scope within which it is defined. Any attempt to access or use either of the Greeter objects outside their scopes will fail to compile.
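To make the lifetime of an automatic variable visible, here is a minimal sketch. It assumes a hypothetical helper class, Tracer, that is not part of the lecture's code. Tracer records a message when it is created and another when it is destroyed, using a destructor (a feature this section does not otherwise cover). The function name traceScopes is also made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the lecture): records when it is created
// and when it is destroyed, so we can observe automatic lifetime.
class Tracer {
public:
    Tracer(const std::string& name, std::vector<std::string>& log)
        : name_(name), log_(log) { log_.push_back("create " + name_); }
    ~Tracer() { log_.push_back("destroy " + name_); }
private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Runs one of two scopes and returns the events in the order they occurred.
std::vector<std::string> traceScopes(bool anonymous) {
    std::vector<std::string> log;
    if (anonymous) {
        Tracer generic("generic", log);    // exists only inside this block
        log.push_back("inside if");
    } else {
        Tracer specific("specific", log);  // exists only inside this block
        log.push_back("inside else");
    }
    // Neither generic nor specific can be named here; both are out of scope.
    log.push_back("after if/else");
    return log;
}
```

Calling traceScopes(true) should yield the events "create generic", "inside if", "destroy generic", "after if/else", in that order: the object is gone before the code after the if/else runs.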
As an alternative to automatic variables, C++ offers dynamic variables. A dynamic variable consists of two parts: a pointer variable that is used to access the item in question, and a dynamically allocated item for the pointer variable to point to.
A pointer variable is a variable that provides information about where something is located. Here is how to declare a pointer variable that is designed to point to a Greeter object:
Greeter *g;
The asterisk denotes g as a pointer variable. The purpose of pointer variables is to point to things. To use the pointer variable, you typically dereference the pointer to access the thing the pointer points to. C++ uses the * operator to do dereferencing. For example, if you were to set up g to point to a Greeter object, this is how you would tell that Greeter to run its greet() method:
(*g).greet();
To use a pointer variable correctly, you have to give the pointer something to point to. This is usually done by a process called dynamic allocation. The following code dynamically creates a Greeter object with the C++ new operator. new returns a pointer to the object created, so we can assign that pointer to a pointer variable.
Greeter *g; // Allocate a pointer variable
g = new Greeter("Hello, world"); // Dynamically allocate a Greeter and
// store a pointer to it in g
By separating the pointer variable from the act of creating the thing to point to, we allow ourselves greater flexibility. The following example shows another way to handle the situation we encountered in the previous example.
int main()
{
string name;
cout << "Enter your name, or anon to be anonymous: ";
cin >> name;
Greeter *g; // g is a pointer to a Greeter object
// g needs something to point to, so next we create a
// Greeter object.
if(name == "anon") {
g = new Greeter("Hello, stranger");
}
else {
g = new Greeter(string("Hello, ") + name);
}
// Now that g points to a valid Greeter, we can tell it to greet.
(*g).greet();
}
Since we are frequently going to be creating objects dynamically and using pointers to access those objects, we are very quickly going to find the usual syntax
(*g).greet();
to be clumsy and awkward. Fortunately, C++ offers a convenient alternative syntax.
g->greet();
This does exactly the same thing as the previous form, and is easier to type.
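The sketch below puts the two call syntaxes side by side in a form you can compile. The Greeter class here is a hypothetical stand-in, since the real class definition is not shown in this section. The message() accessor and the function name greetBothWays are assumptions added only so the result can be inspected.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-in for the lecture's Greeter class.
class Greeter {
public:
    Greeter(const std::string& message) : message_(message) {}
    void greet() const { std::cout << message_ << std::endl; }
    // Accessor added only so the stored message can be inspected.
    std::string message() const { return message_; }
private:
    std::string message_;
};

// Builds a Greeter dynamically, greets with both syntaxes, and returns
// the message that was used.
std::string greetBothWays(const std::string& name) {
    Greeter* g;
    if (name == "anon") {
        g = new Greeter("Hello, stranger");
    } else {
        g = new Greeter(std::string("Hello, ") + name);
    }
    (*g).greet();   // dereference, then call the method
    g->greet();     // the same call written with the arrow operator
    std::string used = g->message();
    delete g;       // release the object that new created
    return used;
}
```

Both calls print the same line, so a call such as greetBothWays("anon") prints the greeting twice.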
The flip side of dynamically allocating an object with new is the need to deallocate the object when we are done with it. When we create an object with new, we get an object that occupies some location in the memory, and a pointer we can use to access the object. When we are done using an object, we need to tell C++ to free up the space used by that object so that memory space can be reused for some other purpose. When you are done working with an object that you have created with new, you make it go away by using the delete operator:
Greeter *g;
g = new Greeter("Hello, world");
g->greet();
delete g;
If you forget to do that last step, you will generally not suffer immediate serious consequences, because on most systems all of a program's memory is reclaimed when the program exits. The problem shows up in programs that run for a long time or that constantly create and discard large numbers of objects. Objects that are never deleted simply clog up the available memory space, and you may find yourself running out of memory. This kind of bug is known as a memory leak.
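One way to see that a dynamically allocated object stays around until it is deleted is to count live objects. The sketch below assumes a hypothetical class, Tracked, that keeps such a count in a static member and uses a destructor to decrement it. None of these names come from the lecture's code.

```cpp
#include <cassert>

// Hypothetical class (not from the lecture) that counts how many of its
// objects currently exist.
class Tracked {
public:
    Tracked() { ++liveCount; }
    ~Tracked() { --liveCount; }
    static int liveCount;
};
int Tracked::liveCount = 0;

// Live-object counts observed at three points in liveAroundDynamic().
struct Counts {
    int afterNew;
    int afterScope;
    int afterDelete;
};

// Returns the live count once an automatic object's block has ended.
int liveAfterAutomatic() {
    {
        Tracked t;   // automatic: destroyed at the closing brace
    }
    return Tracked::liveCount;
}

// Records the live count around the lifetime of a dynamic object.
Counts liveAroundDynamic() {
    Counts c;
    Tracked* p;
    {
        p = new Tracked();
        c.afterNew = Tracked::liveCount;
    }
    c.afterScope = Tracked::liveCount;   // leaving the block deleted nothing
    delete p;                            // only delete removes the object
    c.afterDelete = Tracked::liveCount;
    return c;
}
```

Starting from a count of zero, the automatic object is gone as soon as its block ends, while the dynamic object is still counted after its block ends and disappears only after delete.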
Now that we have seen that there are two different ways to create and interact with objects in C++, it is interesting to compare those two forms with what Java does. Consider these three examples:
// Automatic object case, C++
Greeter g("Hello, world");
g.greet();
// Dynamic object case, C++
Greeter *ptr;
ptr = new Greeter("Hello, world");
ptr->greet();
delete ptr;
// Java
Greeter jGreet = new Greeter("Hello, world");
jGreet.greet();
The Java syntax is a mix of the syntax used for the C++ automatic and dynamic forms. In fact, an object reference variable in Java is actually a pointer. This explains why you always have to use new to create objects in Java - objects accessed via pointers have to be dynamically allocated. Since Java only has dynamic allocation for objects, it doesn't need two distinct syntaxes for calling methods - that allowed the Java language designers to adopt the simpler dot syntax that C++ uses with automatic object variables.
Java also does not have delete. The reason for this is that Java has a built-in system of garbage collection that can sense when a dynamically allocated object is no longer being used and dispose of it automatically.
The fact that we can create two different types of object variables, automatic and dynamic, has consequences when we start writing functions that take objects as parameters. Since automatic objects are typically accessed without pointers and dynamic objects are accessed through pointers, there are two initial methods for passing objects to functions as parameters: pass by value and pass by pointer, which gives pass by reference behavior.
To help illustrate the differences between the parameter passing mechanisms, I have prepared a simple example class.
class Counter
{
private:
int count;
public:
Counter() { count = 0; }
void increment() { count++; }
int getCount() const { return count; }
};
This class represents a simple counter that can be incremented and queried.
Here are two different functions that make use of the two different mechanisms available for passing objects to functions.
void fCopy(Counter c)
{
c.increment();
}
void fPointer(Counter *c)
{
c->increment();
}
The function fCopy uses the pass by value mechanism, while fPointer takes a pointer, which gives it pass by reference behavior.
When you pass an object to fCopy, fCopy gets a copy of the object you pass to it, essentially getting the object's current value. Anything that fCopy does to the parameter it receives applies only to the copy, and not to the original. Thus, the following code
Counter a;
a.increment();
a.increment();
cout << "a.getCount() = " << a.getCount() << endl;
fCopy(a);
cout << "After call to fCopy ";
cout << "a.getCount() = " << a.getCount() << endl;
will print the message
a.getCount() = 2
After call to fCopy a.getCount() = 2
when we run it. The increment that fCopy performed on its parameter has no effect on a, since passing a as a parameter to fCopy triggers a copy, and modifications to the copy have no effect on the original.
When you pass a pointer to fPointer, fPointer will use that pointer to access the object. Since the pointer c points to the same object as the pointer you passed in as the parameter, any modifications that fPointer makes through that pointer will affect the original object whose pointer you passed in. Thus, the following code
Counter *b = new Counter();
b->increment();
b->increment();
cout << "b->getCount() = " << b->getCount() << endl;
fPointer(b);
cout << "After call to fPointer ";
cout << "b->getCount() = " << b->getCount() << endl;
will print the message
b->getCount() = 2
After call to fPointer b->getCount() = 3
All other things being equal, the form that gives the behavior you will prefer in most instances is the pointer version. What do you do if you have a non-pointer object variable and you want to pass it as a parameter to a function that expects a pointer? C++ has a special operator &, called the address-of operator, that you can use to make a pointer to something. Here is one way you could use that address-of operator to pass a pointer to a non-pointer object.
Counter *d = &a; // Make the pointer d point to the address of object a.
fPointer(d); // d is a valid pointer, so we can pass it to a function
// that expects a pointer parameter.
Equivalently, you could use the more convenient short-cut form
fPointer(&a); // Get the address of a and pass it to fPointer.
Also, if you need to pass a pointer object to a function that expects a non-pointer parameter, you can use the dereferencing operator *.
fCopy(*b);
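The two conversions can be checked with a small compilable sketch. It repeats the Counter class and the fCopy and fPointer functions from above. The two wrapper functions and their names are assumptions added for illustration.

```cpp
#include <cassert>

// Same Counter class as in the lecture.
class Counter {
private:
    int count;
public:
    Counter() { count = 0; }
    void increment() { count++; }
    int getCount() const { return count; }
};

void fCopy(Counter c) { c.increment(); }
void fPointer(Counter* c) { c->increment(); }

// An automatic Counter passed by address: fPointer changes the original.
int automaticThroughAddressOf() {
    Counter a;
    a.increment();
    a.increment();
    fPointer(&a);               // &a makes a pointer to the automatic object
    return a.getCount();
}

// A dynamic Counter passed by dereferencing: fCopy changes only a copy.
int dynamicThroughDereference() {
    Counter* b = new Counter();
    b->increment();
    b->increment();
    fCopy(*b);                  // *b is the object itself, which gets copied
    int result = b->getCount();
    delete b;
    return result;
}
```

Here automaticThroughAddressOf() returns 3, because the increment reached the original object, and dynamicThroughDereference() returns 2, because fCopy incremented only its own copy.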
All other things being equal, in most instances where you want to pass objects as parameters, you will want to pass them as pointers. The only small problem with passing pointers is that pointers tie you to the annoying pointer syntax. Typing
(*c).increment();
or
c->increment();
is slightly more difficult than typing
c.increment();
To help alleviate this problem, C++ offers a third parameter passing mechanism, the reference parameter. A reference parameter is essentially a pointer parameter with non-pointer syntax. Inside the function that takes a reference parameter you manipulate that parameter as if it were a non-pointer parameter. Also, to pass an object as a reference parameter, you have to treat it as if you were passing a non-pointer parameter. The code below illustrates how this works. First, we define a function that takes a reference parameter. Note that the code inside the function is acting as if c were a non-pointer parameter.
void fReference(Counter &c)
{
c.increment();
}
Here is some code that illustrates how you would pass both non-pointer and pointer objects to this function as a parameter.
cout << "a.getCount() = " << a.getCount() << endl;
fReference(a);
cout << "After call to fReference ";
cout << "a.getCount() = " << a.getCount() << endl;
cout << "b->getCount() = " << b->getCount() << endl;
fReference(*b);
cout << "After call to fReference ";
cout << "b->getCount() = " << b->getCount() << endl;
Even though all the syntax above suggests that we are dealing with a non-pointer parameter passing mechanism, the behavior we get from a reference parameter is identical to what you would get with a pointer parameter. Here is what the code above ends up printing out:
a.getCount() = 2
After call to fReference a.getCount() = 3
b->getCount() = 3
After call to fReference b->getCount() = 4
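For a side-by-side comparison, the sketch below applies each of the three mechanisms to a fresh Counter and reports the count the caller sees afterwards. The Counter class and the three f functions are the ones from the lecture. The afterCopy, afterPointer and afterReference helpers are hypothetical names added for illustration.

```cpp
#include <cassert>

// Same Counter class as in the lecture.
class Counter {
private:
    int count;
public:
    Counter() { count = 0; }
    void increment() { count++; }
    int getCount() const { return count; }
};

void fCopy(Counter c)       { c.increment(); }   // pass by value
void fPointer(Counter* c)   { c->increment(); }  // pass a pointer
void fReference(Counter& c) { c.increment(); }   // reference parameter

// Each helper starts with a Counter at zero, calls one function, and
// returns the count the caller sees afterwards.
int afterCopy()      { Counter a; fCopy(a);      return a.getCount(); }
int afterPointer()   { Counter a; fPointer(&a);  return a.getCount(); }
int afterReference() { Counter a; fReference(a); return a.getCount(); }
```

afterCopy() returns 0, while afterPointer() and afterReference() both return 1. Note that the call fReference(a) looks exactly like the call fCopy(a), even though the effect on a is different.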
To summarize, here are some rules to follow when deciding which parameter passing mechanism to use:
Just like objects, arrays in C++ come in two varieties: automatic arrays and dynamic arrays. An automatic array is fixed in size and is created with a declaration like
int A[100];
A dynamic array is allocated with new. Here is the syntax used to allocate a dynamic array.
int *B = new int[100];
Note that the array variable in this case is actually a pointer variable. The pointer points to the first element in the array. Arrays created dynamically with new must be deleted with delete[].
delete[] B;
The primary advantage that dynamic arrays offer is that they allow you to specify the size of the array at run-time. Automatic arrays must have sizes that are determined at compile time.
cout << "How many numbers do you need to store? ";
int n;
cin >> n;
int A[n]; // Illegal - automatic arrays must have sizes fixed at compile time.
int *B = new int[n]; // Legal, and preferred.
Despite the fact that these arrays are set up in different ways, you access their elements in exactly the same way. In both cases you can use the bracket notation to access individual elements of the array.
A[0] = 2; // Standard array notation
B[0] = 2; // Same notation works with a dynamic array
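Here is a minimal sketch of the full life cycle of a dynamic array whose size is known only at run time: allocate with new, use bracket notation, and release with delete[]. The function sumOfSquares is a hypothetical example, not part of the lecture's code, and it assumes that n is not negative.

```cpp
#include <cassert>

// Hypothetical helper: stores the squares 0*0, 1*1, ..., (n-1)*(n-1) in a
// dynamic array whose size is chosen at run time, then returns their sum.
// Assumes n >= 0.
int sumOfSquares(int n) {
    int* B = new int[n];          // size comes from a run-time value
    for (int i = 0; i < n; i++)
        B[i] = i * i;             // ordinary bracket notation
    int total = 0;
    for (int i = 0; i < n; i++)
        total += B[i];
    delete[] B;                   // arrays made with new[] need delete[]
    return total;
}
```

For example, sumOfSquares(4) adds 0 + 1 + 4 + 9 and returns 14.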
You can pass both of these arrays to functions. For example, here is a function that will sort an array with selection sort.
void sort(int X[],int N) {
// Use selection sort to sort the array
for(int n = 0;n < N - 1;n++) {
int smallest = X[n];
int where = n;
for(int k = n + 1;k < N;k++)
if(X[k] < smallest) {
smallest = X[k];
where = k;
}
X[where] = X[n];
X[n] = smallest;
}
}
Note that since C++ does not allow you to ask an array its length via the X.length syntax, we have to explicitly pass the size of the array as a parameter to the function. To call this sort function with the array B created above we would do
sort(B,n);
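The sketch below shows the same function sorting both an automatic array and a dynamic array. The sorting code is a copy of the function above, renamed selectionSort here only to avoid any clash with the standard library's std::sort. The helpers isAscending and sortsBothKinds are hypothetical names added for illustration.

```cpp
#include <cassert>

// Copy of the lecture's sort function under a different name.
void selectionSort(int X[], int N) {
    for (int n = 0; n < N - 1; n++) {
        int smallest = X[n];
        int where = n;
        for (int k = n + 1; k < N; k++)
            if (X[k] < smallest) {
                smallest = X[k];
                where = k;
            }
        X[where] = X[n];
        X[n] = smallest;
    }
}

// Hypothetical helper: true if X[0..N-1] is in ascending order.
bool isAscending(const int X[], int N) {
    for (int i = 1; i < N; i++)
        if (X[i - 1] > X[i]) return false;
    return true;
}

// Sorts an automatic array and a dynamic array with the same function.
bool sortsBothKinds() {
    int A[5] = {4, 1, 5, 2, 3};           // automatic array
    int* B = new int[5];                  // dynamic array
    for (int i = 0; i < 5; i++)
        B[i] = 5 - i;                     // 5, 4, 3, 2, 1
    selectionSort(A, 5);
    selectionSort(B, 5);
    bool ok = isAscending(A, 5) && isAscending(B, 5);
    delete[] B;
    return ok;
}
```

The function does not need to know which kind of array it was given. In both cases it receives a pointer to the first element plus the element count.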
An array variable can also be used as a pointer, because in most expressions the array variable converts automatically to a pointer to the first element in the array.
*A = 5; // Same as A[0] = 5;
Also, an array variable can be passed as a parameter to any function that expects a pointer instead:
void f(int* x) { *x = 2; }
int main() {
int A[5];
f(A);
cout << A[0] << endl; // Outputs 2
}
You can also do pointer arithmetic with an array variable. Adding an offset to an array pointer moves you over that many spaces in the array.
*(A + i) = 5; // Equivalent to A[i] = 5;
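As a small sketch of pointer arithmetic in use, the functions below read and write array elements without bracket notation. Both sumWithPointers and writeThenRead are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: sums N elements starting at first, using pointer
// arithmetic instead of bracket notation.
int sumWithPointers(const int* first, int N) {
    int total = 0;
    for (int i = 0; i < N; i++)
        total += *(first + i);    // same element as first[i]
    return total;
}

// Writes through pointer arithmetic and reads back with brackets.
int writeThenRead() {
    int A[4] = {0, 0, 0, 0};
    *A = 5;                       // same as A[0] = 5
    *(A + 2) = 7;                 // same as A[2] = 7
    return A[0] + A[2];
}
```

Because A + 1 is itself a pointer to the second element, a call such as sumWithPointers(A + 1, 2) sums just the second and third elements of A.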
Here is some code to open a text file containing a list of integers and count how many integers are in the file.
int count(const string &fileName) {
int N = 0;
int x;
ifstream in;
in.open(fileName.c_str());
while(in >> x) N++;
in.close();
return N;
}
Use this code to write a program that does the following.