Making the Good, Great

Enhancing Tree Data Management

As mentioned at the end of the previous chapter, Binary Search Trees (BSTs) are really powerful and fast ways to store and manage data but their flaw is that if the data is loaded in a certain order such as previously sorted by the keys, trees will devolve into essentially linked lists, losing all the value they might have provided. The data structure that should provide excellent Log₂(N) access falls back to N. In other words a tree with 1,000,000 items that could be searchable with 20 or fewer tests goes back to taking 500,000 or so tests on average.

Computer Scientists, being the folks they are, continue to look for better ways to store and manage data so some of them moved beyond simple BSTs into trees that can continue to perform at their best under all circumstances of data input. Some researchers took the BST and managed it looking down from each node on the way to the top to look for opportunities to balance the tree and came up with the AVL tree.

Some researchers took the BST, color coded each node, and made a simple set of rules for responding to color coded conditions to keep the tree balanced. These researchers invented the Red-Black Tree. And finally some researchers threw out the BST approach and made nodes that could handle a varying amount of data with a varying number of children. There are actually a wide variety of these but a good way to learn about them is to start with the simplest one which is the Two Three tree.

This chapter will provide an opportunity to learn about all three kinds of self-balancing trees and appreciate what Computer Scientists have done to make our data management lives easier, faster, and better.

Enjoy.

Section 18a

Considering Balanced Trees

Making the Good, Great

Enhancing Tree Data Management