===========================================================================
CSC 263H               Lecture Outline for Week 4               Winter 2004
===========================================================================

[[Q: denotes a question that you should think about and that will be
     answered during lecture.]]

------------------
2-3-4 Trees Search
------------------

Consider the following 2-3-4 tree.

                  ______17______
                 /              \
        __4____9_                20____30____41__
       /    |    \              /    |     |     \
   1_2_3   7_8    12          18    24    33    56_80
  / | | \  / | \  / \         / \   / \   / \   / | \

[[Q: In what order would the nodes be examined if you were searching for
     the key 7? Which keys are evaluated at each node and what questions
     are asked?]]

[[Q: What about searching for key = 11?]]

Let's try to write this up formally. Before we start, let's remind
ourselves that a d-node has values k_1, k_2, ..., k_{d-1} and children
v_1, v_2, ..., v_d. The ordering property states that
k_1 < k_2 < ... < k_{d-1} and every value k stored in the subtree rooted
at v_i satisfies k_{i-1} < k < k_i (with the convention that k_0 = -oo and
k_d = +oo).

Starting with node n, which is a d-node, we are searching for key k.

SEARCH(n,k):
    If n is a leaf, return NIL.  (k is not in the tree.)
    for i = 1 to d
        if k = k_i return x_i
        if k < k_i return SEARCH(v_i,k)

Here x_i denotes the element stored with key k_i. When i = d the
convention k_d = +oo makes the second test succeed, so the search always
descends into some child.

[[Q: Trace through this algorithm on the same keys we searched for
     earlier.]]

---------------------
2-3-4 Trees Insertion
---------------------

Consider inserting 6 into the above tree. First you would search to find
out whether 6 is already in the tree.

[[Q: If 6 isn't in the tree, what does our SEARCH routine return?]]

That isn't very useful to us. Instead we need to know the parent of the
leaf node. In this case we need to know the node (7,8).

[[Q: What changes do we need to make to the search method?]]

Now SEARCH returns the destination node into which we must insert the new
element. If this node is a 2-node or a 3-node, the insertion is easy.
We must put the new element into the correct position so that the order
property still holds.

[[Q: Do we need to worry about the sub-trees if we are shifting the keys?]]

If the destination node is already a 4-node, then inserting the new element
will violate the size property. Consider inserting 6 and then inserting 5.
When we insert the latter, the node (5,6,7,8) is formed, which has too many
keys. Whenever this happens, the violating node will have 4 keys. We
resolve the problem by doing a SPLIT. In this case we form the node (5,6)
and the node (8) and give them the parent (7). We insert the subtree

        __7__
       /     \
     5_6      8

into node (4,9), which was the original parent of (5,6,7,8).

[[Q: What does the final tree look like after the insertion?]]

Consider that more formally: Suppose an insertion results in a 5-node n
with keys k_1,k_2,k_3,k_4 and subtrees v_0,v_1,v_2,v_3,v_4. (In this
description subtrees are numbered from 0, so v_j lies between keys k_j and
k_{j+1}.) Let p represent the parent node of n. Assume for now that n is
not the root. Let i be the position in p where n is the subtree (v_i = n).

A split occurs which creates a 3-node s_1 with keys k_1,k_2 and subtrees
v_0, v_1 and v_2, and a 2-node s_2 with key k_4 and subtrees v_3 and v_4.
The original tree is reorganized by inserting k_3 into p as its k_{i+1}
and shifting the higher keys in p to the right. The original subtrees v_j
of p for j = 0..i-1 are unchanged. The original subtrees v_j of p for
j = i+1..3 are each shifted to the right by increasing their index by one.
Then p's subtrees v_i and v_{i+1} are assigned s_1 and s_2.

Notice that this may result in p itself becoming a 5-node. If p was
already a 4-node, then inserting the new key will cause it to violate the
size property. When this happens a SPLIT must be performed on p. This
could propagate all the way to the root.

Now consider the case where n has no parent p because n is the root. In
this case there is no node p into which we could insert k_3, so we create
a new 2-node as root. Its key is k_3 and its subtrees are s_1 and s_2.
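
The search and split steps above can be sketched in Python. This is only
a minimal sketch under stated assumptions, not the course's reference
code: the names Node, search, insert and _insert are hypothetical, a
bottom-level node is represented by an empty children list (its leaves are
implicit), search returns the node holding the key rather than an element,
and duplicate keys are silently ignored. Python lists are indexed from 0,
which matches the subtree numbering v_0..v_4 used in the split
description.

```python
import bisect


class Node:
    """A 2-3-4 tree node (hypothetical helper): 1 to 3 sorted keys.

    children is empty for a bottom-level node, otherwise it has
    len(keys) + 1 entries.
    """

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def search(n, k):
    """Return the node that holds key k, or None if k is not in the tree."""
    while n is not None:
        i = bisect.bisect_left(n.keys, k)   # first position with keys[i] >= k
        if i < len(n.keys) and n.keys[i] == k:
            return n
        n = n.children[i] if n.children else None
    return None


def _insert(n, k):
    """Insert k below n. Return (middle_key, right_node) if n split, else None."""
    i = bisect.bisect_left(n.keys, k)
    if i < len(n.keys) and n.keys[i] == k:
        return None                         # assumption: ignore duplicates
    if n.children:
        promoted = _insert(n.children[i], k)
        if promoted is None:
            return None
        mid, right = promoted
        n.keys.insert(i, mid)               # promoted key goes between s_1 and s_2
        n.children.insert(i + 1, right)     # s_1 stays at position i
    else:
        n.keys.insert(i, k)
    if len(n.keys) <= 3:
        return None
    # Overflow: n holds k_1..k_4. n becomes s_1, a new node becomes s_2,
    # and k_3 is handed to the parent.
    mid = n.keys[2]
    right = Node(n.keys[3:], n.children[3:])
    n.keys = n.keys[:2]
    n.children = n.children[:3]
    return mid, right


def insert(root, k):
    """Insert k and return the root, which is new if the old root split."""
    if root is None:
        return Node([k])
    promoted = _insert(root, k)
    if promoted is None:
        return root
    mid, right = promoted
    return Node([mid], [root, right])
```

Building the example tree with Node and calling insert(root, 6) followed
by insert(root, 5) should turn the node (4,9) into (4,7,9) with children
(1,2,3), (5,6), (8) and (12), as in the worked example.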

--------------------
2-3-4 Trees Deletion
--------------------

Consider the following 2-3-4 tree:

               ______17_____________
              /                     \
      __4____6____9_                 20____30____41___
     /    |     |    \              /    |     |      \
 1_2_3    5    7_8    12          18    24    33     56_80
/ | | \  / \  / | \  / \         / \   / \   / \    / | \

[[Q: How would the tree change if we deleted the element with key 2?]]

[[Q: What would be the problem if we deleted the element 4 in the same
     way?]]

Using the same approach as for BSTs, we find the predecessor of 4 and swap
the elements. In this case 4's predecessor is 3. If you are unfamiliar
with this, then you need to review this material (from your CSC148/A58
lecture notes or from section 3.1 of the textbook) before lecture time.

So we only need to consider the case where we are deleting an element
whose children are leaves.

[[Q: Which tree property is violated if we delete the element with key 12
     from the above tree?]]

The problem is called UNDERFLOW. We solve this problem by BORROWING a key
from a sibling. In this case the sibling (7,8) has 2 keys, so it can spare
one. But notice that if we shifted 8 over to the node 12 we would have the
following subtree

      _9_
     /   \
    7     8

which would violate the ordering property. So instead we ROTATE. We shift
the 9 from the parent into 12's old position and the 8 from (7,8) into the
hole left by moving 9.

Notice that this could happen from either sibling, so if we were to delete
5, we could rotate 3, 4 and 5 or else 7, 6 and 5.

[[Q: Will this work if we want to delete 24? Who will rotate or why won't
     it work?]]

In this case we don't borrow but we MERGE. We combine the node 24 with one
of its 2-node siblings and the key from their parent which divides them.
In this case, one choice is to combine 18, 20 and 24. We delete key 24
from this and attach the combined node as the subtree of the parent.
Here is the resulting tree:

               ______17_____________
              /                     \
      __4____6____9_              ___30____41___
     /    |     |    \           /      |       \
 1_2_3    5    7_8    12      18_20     33     56_80
/ | | \  / \  / | \  / \     / | \    / \    / | \

Notice that we only need to merge when the sibling is a 2-node, so the new
resulting node is always a 3-node (the sibling's key plus the dividing key
from the parent). Notice also that we could merge with either sibling.

[[Q: What is the net result to the parent when we merge?]]

This could cause the parent p to underflow. When this occurs we resolve
the underflow by borrowing from or merging with a sibling of p.

Consider a brand-new tree example:

          ______31________
         /                \
      __6___           ____78__
     /      \         /        \
  1_2_3     7_8     60         100
 / | | \   / | \   /  \       /  \

Delete 60 from this tree. 60 can't borrow from 100, because 100 is already
a 2-node. 60 can't borrow from (7,8), because they are not siblings. So 60
is deleted, 100 merges with 78, and the parent node that held 78 now
underflows.

          ______31________
         /                \
      __6___            ___?__
     /      \                 \
  1_2_3     7_8             78_100
 / | | \   / | \            / | \

Now ? could borrow from its sibling 6 if that sibling were a 3-node or
4-node, but because it is a 2-node, 6 is merged with 31, resulting in:

                ?
               /
        __6______31__
       /      |      \
   1_2_3     7_8    78_100

Since 31 came from a 2-node, the root now underflows. Because it is the
root, it is simply removed from the tree and (6,31) becomes the new root.

---------------------------------------
Relating 2-3-4 Trees to Red-Black Trees
---------------------------------------

Red-black trees are another type of balanced search tree. They are binary
trees where every node has an additional property: each node is coloured
either red or black. The following properties must hold:

  1. The root of the tree is black.
  2. Every external node is black.
  3. The children of a red node are black.
  4. All external nodes have the same black depth (the same number of
     black ancestors).

Using insertion and deletion algorithms that maintain these properties
guarantees that the tree remains balanced.

It is easier to understand red-black trees when we relate them to 2-3-4
trees. Consider the definition of RB trees above.
A black node can have red children or black children, but all red nodes
have only black children. So we can relate black nodes with no red
children to 2-nodes, black nodes with 1 red child to 3-nodes, and black
nodes with 2 red children to 4-nodes.
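
The correspondence above can be illustrated with a short Python sketch.
It is an illustration under assumptions rather than a standard routine:
the names RBNode, is_red and to_234 are hypothetical, external nodes are
represented by None, and the input is assumed to already satisfy the
red-black properties (in particular, red nodes have only black children).
Each black node absorbs its red children, and the result is returned as a
nested (keys, subtrees) pair describing the matching 2-3-4 node.

```python
class RBNode:
    """A red-black tree node (hypothetical helper); None is an external node."""

    def __init__(self, key, red=False, left=None, right=None):
        self.key = key
        self.red = red
        self.left = left
        self.right = right


def is_red(n):
    """External nodes (None) count as black."""
    return n is not None and n.red


def to_234(b):
    """Convert the subtree rooted at black node b into (keys, subtrees).

    A black node with 0, 1 or 2 red children yields 1, 2 or 3 keys,
    that is, a 2-node, 3-node or 4-node.
    """
    if b is None:
        return None
    keys, below = [], []
    if is_red(b.left):
        # The red left child contributes a smaller key and two subtrees.
        keys.append(b.left.key)
        below += [b.left.left, b.left.right]
    else:
        below.append(b.left)
    keys.append(b.key)
    if is_red(b.right):
        # The red right child contributes a larger key and two subtrees.
        keys.append(b.right.key)
        below += [b.right.left, b.right.right]
    else:
        below.append(b.right)
    return keys, [to_234(s) for s in below]
```

For instance, a black node 7 with red children 5 and 8 maps to the 4-node
(5,7,8), while a black node 4 whose only red child is 9 maps to the 3-node
(4,9).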