Background reading

Read about the B-tree data structure in section 7.1. Note that the author provides code outlines for the B-tree insertion and deletion methods in that section that you can use as a starting point.

What to do

Your assignment is to construct a B-tree class. You should use the following class for the nodes of the B-tree:

template <typename T,int M>
class BTreeNode {
  friend class BTree<T,M>;
public:
  BTreeNode();
  BTreeNode(const T& el);
private:
  bool leaf;
  int keyTally;
  T keys[M-1];
  BTreeNode* pointers[M];
};

The keyTally member variable indicates how many of the M-1 possible keys in the node are currently being used.

Your B-tree class should support the following operations:

You will also want to construct a test program to test your B-tree class. Have the test program read a list of integers from a text file and insert those integers into an initially empty B-tree. Then, read a second list of integers from a second text file and have the B-tree remove those data items. Finally, have your test program run a traversal of the B-tree to print its contents out to cout. To make sure that you are testing your B-tree class thoroughly enough, make sure that you add enough data items to the tree to get the tree to reach a height of at least two.

Additional resources

Here are some additional resources to help you with this assignment.

First, here are a pair of data sets for you to use. The first is a list of input data values you can use as data to insert into your BTree. The list contains 100 random numbers from the range 0-999. The list contains no duplicates. The second is a list of data items to remove from your BTree. This contains 40 random numbers from the range 0 - 999. Each one of these numbers is guaranteed to have already appeared in the list of data items to insert.

I suggest you use these input data values with a tree of arity M = 6. Once you get that arity working to your satisfaction you should also try using the same data sets with trees of arity M = 5 and M = 4. Your BTree class should be able handle all of these different arities correctly.

Next, here are a pair of helpful methods. These methods can be used to emit snapshots of your BTree structure that you can view in Mathematica.

template <typename T, int M>
void BTreeNode<T, M>::snapShot(std::ostream& out) {
  out << '\"' << keys[0];
  for (int n = 1; n < keyTally; n++) {
    out << ':' << keys[n];
  }
  out << '\"';
  if (!leaf) {
    out << '[';
    pointers[0]->snapShot(out);
    for (int n = 1; n <= keyTally; n++) {
      out << ',';
      pointers[n]->snapShot(out);
    }
    out << ']';
  }
}
template <typename T, int M>
void BTree<T, M>::snapShot(std::ostream& out) {
  out << "TreeForm[";
  root->snapShot(out);
  out << ']' << std::endl;
}

Finally, here is some advice on how to handle consolidations. One of the things you should be prepared for is cascading consolidations. One way that this can happen is that you remove a key from a leaf, which causes the leaf to underflow. After looking at both the left and right siblings of the leaf you see that the node and its sibling contain too few keys, so you should consolidate the leaf and its sibling into one leaf and also remove a key from the parent. In rare cases removing a key from the parent will cause the parent to also underflow, triggering the need to consolidate the parent with one of its siblings. In that case, you will need a pointer to the parent of the parent to be able to do that second consolidation. To obtain that grandparent pointer you will have no choice but to use a search process: back up to the root of the tree and conduct a search for one of the keys in the parent. As you do that search, maintain a trailing pointer that always lags one step behind the current search node. When the search node arrives at the parent, you will then be able to return the parent of that parent, which is the node you will need access to for the second consolidation.