Background reading

The author discusses the AVL tree in section 6.7 of the textbook. You should begin by reading that section carefully.

The assignment

Implement the full AVL tree insert function, and write a program that demonstrates that your AVL tree is capable of handling a number of insertions while remaining balanced.

For this assignment you will only have to implement insertions, not deletions.

Below I am going to give you some code to get you started. You must use this code and follow the structure I outline below.

A skeleton for an AVL tree class

Here is code that will serve as the starting point for your AVL tree class.

template<class T>
class AVLNode {
public:
    AVLNode() { 
        left = right = nullptr;
        balance = 0;
    }
    AVLNode(const T& el, AVLNode *l = nullptr, AVLNode *r = nullptr) {
        key = el; left = l; right = r; 
        balance = 0;
    }
    T key;
    AVLNode *left, *right;
    int balance;
};

template<class T>
class AVLTree {
public:
  AVLTree() { root = nullptr; }
  
  void insert(const T& key) { insert(root,key); }
  
  void snapShot(std::ostream& out) { 
    out << "TreeForm[";
    snapShot(out,root); 
    out << "]" << std::endl; }
private:
  AVLNode<T>* root;

  void rightRotation(AVLNode<T>* &node) {
    AVLNode<T>* left = node->left;
    
    node->left = left->right;
    left->right = node;
    node = left;
  }
  
  void insert(AVLNode<T>* &node,const T& key) {
    if(node == nullptr) {
      node = new AVLNode<T>(key);
    } else if(key > node->key) {
      insert(node->right,key);
    } else {
      insert(node->left,key);
    }
  }
  
  void snapShot(std::ostream& out,AVLNode<T> *p) {
    out << '\"' << p->key << '\"';
    if(p->left != nullptr || p->right != nullptr) {
      out << '[';
    if(p->left==nullptr)
      out << "\"\"";
    else
      snapShot(out,p->left);
    out << ',';
    if(p->right==nullptr)
      out << "\"\"";
    else
      snapShot(out,p->right);
    out << ']';
    }
  }   
};

The most important thing to note here is that the AVL tree's insert method is recursive. This is essential for the proper functioning of the fully implemented insert method. The insert method I have provided does insertions correctly, but does not yet handle balance factors, and makes no attempt to rebalance the tree after an insertion. Adding these features will be the key thing you will contribute in this exercise.

The first significant change you need to make here is to change the method

void insert(AVLNode<T>* &node,const T& key)

to return a bool. This bool will tell the caller whether the height of the subtree rooted at node changed after the insertion. For example, the code in the first if in this insert method needs to change to

if(node == nullptr) {
  node = new AVLNode<T>(key);
  return true;
}

because in the case where we create a new node where one did not exist before, we go from having a subtree of height -1 to a having a subtree of height 0, which is a net increase in the height of the subtree. When insert calls itself recursively, it will have to pay attention to the change in height information coming back from the recursive call, since this information may cause the balance factor for the node making the call to change. Whenever balance factors change, you may then have to trigger rotations.

To give you a head start on implementing rotations, I have also provided code for a right rotation method. You will also need to write a symmetric left rotation method, and then call these rotation methods as needed in the insert method to put the tree back into balance locally after an insertion.

As the textbook points out, there are various different cases that will require rotations after an insertion. You should start by studying the material in section 6.7.2 that describes these cases carefully.

Finally, note that both the rotation and the insert methods work with references to pointers. This is a key requirement, and is essential to the proper implementation of these methods.

Visualizing your trees

In addition to the insert method, I have also provided code for a recursive snapShot method. This method will print a description of the tree to an output stream. You can use this method to print descriptions of a tree at various points in time to an output file, as this sample test program demonstrates:

#include <iostream>
#include <fstream>
#include "AVL.h"


int main() {
  AVLTree<int> x;

  x.insert(12);
  x.insert(8);
  x.insert(6);
  x.insert(10);
  x.insert(14);

  std::ofstream out("tree.txt");
  x.snapShot(out);

  x.insert(16);
  x.snapShot(out);

  out.close();
  
  return 0;
}

This test program prints the following to a file named "tree.txt":

TreeForm["12"["8"["6","10"],"14"]]
TreeForm["12"["8"["6","10"],"14"["","16"]]]

You can copy and paste these statements into Mathematica and evaluate them to generate graphical representations of the trees. This visualization capability may be useful to you as you are debugging your insertion method. Mathematica is available on the lab machines in Briggs 419. To display a tree in Mathematica, start up Mathematica, paste in a tree expression, click to the right of the expression and press shift-enter to evaluate the expression and generate a picture of the tree.

Test data

After writing your AVL insert program you will want to test it to determine whether or not it is doing the right thing. Here is a sequence of random integers:

41
67
34
0
69
24
78
58
62
64
5
45
81
27
61
91
95
42
27
36
91
4
2
53
92
82
21
16
18
95
47
26
71
38
69
12
67
99
35
94
3
11
22
33
73
64
41
11
53
68

Note that this list may contain duplicates. If you use this sequence of ints as an input to a test program that seeks to build a search tree from them, you should begin by searching the tree for each input item - if the item is already present in the tree you should not try to insert it again.

If we use the original insert method I provided above and output a snapshot of the tree after all the numbers have been inserted, Mathematica will generate a tree diagram that looks like this from the snapshot data:

After inserting the same list of items into an AVL tree and making a snapshot we instead get this:

If you would like even more data to test your program against, here is a text file containing snapshots made after every single insertion in the test program.

More about AVL trees

The description of the AVL tree insert algorithm found in the text should be sufficient to allow you to correctly implement the AVL tree insert as outlined above. In case you are still having difficulty implementing this algorithm after reading the text's description, here is some further discussion of the algorithm.

Observation

All insertions into a BST add a leaf node to the tree.

Definition

An insertion into an AVL tree is a 0-insertion if the path from the leaf where the insertion was made up to the root of the tree (including the root) passes only through nodes marked with a balance factor of 0.

0-insertion Lemma

Any 0-insertion increases the height of the tree by one.

Proof (By induction on the height of the tree before the insertion.) The base case is a tree with a height of 0. The root of such a tree automatically has a balance factor of 0, and inserting a node into such a tree always increases its height from 0 to 1. For the induction step, assume that the result is true for trees of height n-1. Consider a 0-insertion into a tree of height n. Since the root started out with a balance factor of 0, its left and right subtrees have equal heights before the insertion. Consider the subtree that the insertion gets made into. The insertion into that subtree is a 0-insertion into a tree of height n-1, and by the induction hypothesis the insertion causes that subtree's height to increase by 1. That means that the subtree that the insertion was made into is now higher than the other subtree, and the height of the tree as a whole increases by 1.

Definition of the ancestor node

Consider the path from the leaf where the new node gets inserted up to the root of the tree. The first node on that path with an original balance factor of -1 or +1 is the ancestor node.

Non-propagation

If we can convince ourselves that in every case below the height of the subtree rooted at the ancestor node does not change after our adjustments, then it will not be necessary to adjust any balance factors or do any rotations to nodes above the ancestor node.

Observation

We can dispense with two simple cases right away.

Case One There is no ancestor node, either because the tree was empty to start with, or because the insertion is a 0-insertion. In this case the tree is still a valid AVL tree after the insertion (although some balance factors may need adjusting). No rotations are needed. In this case there are no nodes above the ancestor node, so we don't have to worry about propagation upwards.

Case Two The balance factor of the ancestor is -1 and we insert into the right subtree of the ancestor, or the balance factor of the ancestor is +1 and we insert into the left subtree of the ancestor. In this case the insertion we make is a 0-insertion into a subtree of the ancestor that increases the height of that subtree by one and therefore brings the ancestor's subtree back into balance. In this case the balance factor of the ancestor node reverts to 0, and no rotations are needed. Also in this case the height of the subtree rooted at the ancestor node does not change, so no propagation upward takes place.

The difficult cases

The only remaining cases are insertion into the left subtree of an ancestor with balance factor of -1 and insertion into the right subtree of an ancestor node with balance factor of +1.

By symmetry, we only need to consider the +1 case. The -1 case is symmetric.

First of all, it is clear that some rotation has to take place, because the height of the right subtree was one greater than the height of the left subtree before insertion. The insertion into the right subtree is a 0-insertion and causes the height of the right subtree to increase by 1. Immediately after insertion the right subtree of the ancestor is now 2 taller than the left subtree. Thus a left rotation about the ancestor is absolutely necessary. The only question is whether or not a preliminary rotation is necessary.

Consider the right subtree of the ancestor. The root of that right subtree is a 0 node, which means that the heights of its two subtrees were equal before insertion. The subtree we insert into will have its height increase by 1. The only question that remains is which of the two subtrees of the 0 node we inserted into, and what we should do in those cases.

Here are pictures to cover the two cases along with height information just after the insertion.

In the right hand case an initial right rotation about the 0 node will convert the tree into a tree that looks like the left picture. A left rotation about the +1 node in the left picture produces this result:

After the rotations the balance factor on the +1 node becomes 0, and the entire subtree has height n+1, which was the same height it had before the changes. Once again, no changes propagate further up into the tree.