COMP9024 ♢ Week 03b ♢Search Tree Algorithms ♢ (22T0)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [0/55]
Binary search trees …
- data structures designed for O(log n) search
- consist of nodes containing item (incl. key) and two links
- can be viewed as recursive data structure (subtrees)
- have overall ordering (data(Left) < root < data(Right))
- insert new nodes as leaves (or as root), delete from anywhere
- have structure determined by insertion order (worst: O(n))
- operations: insert, delete, search, …
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [1/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [2/55]
❖ Balanced Binary Search Trees | |
Goal: build binary search trees which have
- minimum height ⇒ minimum worst case search cost
Best balance you can achieve for tree with
N nodes:
- abs(#nodes(LeftSubtree) - #nodes(RightSubtree)) ≤ 1, for every node
- height of log2N ⇒ worst case search O(log N)
Three strategies to improving worst case search in BSTs:
- randomise — reduce chance of worst-case scenario occuring
- amortise — do more work at insertion to make search faster
- optimise — implement all operations with performance bounds
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [3/55]
❖ Operations for Rebalancing | |
To assist with rebalancing, we consider new operations:
Left rotation
- move right child to root; rearrange links to retain order
Right rotation
- move left child to root; rearrange links to retain order
Insertion at root
- each new item is added as the new root node
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [4/55]
In tree below: t1 < n2 < t2 < n1 < t3
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [5/55]
Method for rotating tree T right:
- N1 is current root; N2 is root of N1's left subtree
- N1 gets new left subtree, which is N2's right subtree
- N1 becomes root of N2's new right subtree
- N2 becomes new root
Left rotation: swap left/right in the above.
Cost of tree rotation: O(1)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [6/55]
Algorithm for right rotation:
rotateRight(n1):
| Input tree n1
| Output n1 rotated to the right
|
| if n1 is empty or left(n1) is empty then
| return n1
| end if
| n2=left(n1)
| left(n1)=right(n2)
| right(n2)=n1
| return n2
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [7/55]
❖ Exercise : Tree Rotation | |
Consider the tree t
:
Show the result of rotateRight(t)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [8/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [9/55]
❖ Exercise : Tree Rotation | |
Write the algorithm for left rotation
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [10/55]
rotateLeft(n2):
| Input tree n2
| Output n2 rotated to the left
|
| if n2 is empty or right(n2) is empty then
| return n2
| end if
| n1=right(n2)
| right(n2)=left(n1)
| left(n1)=n2
| return n1
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [11/55]
Previous description of BSTs inserted at leaves.
Different approach: insert new item at root.
Potential disadvantages:
- large-scale rearrangement of tree for each insert
Potential advantages:
- recently-inserted items are close to root
- low cost if recent items more likely to be searched
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [12/55]
Method for inserting at root:
- base case:
- tree is empty; make new node and make it root
- recursive case:
- insert new node as root of appropriate subtree
- lift new node to root by rotation
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [13/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [14/55]
❖ Exercise : Insertion at Root | |
Consider the tree t
:
Show the result of insertAtRoot(t,24)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [15/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [16/55]
Analysis of insertion-at-root:
- same complexity as for insertion-at-leaf: O(height)
- tendency to be balanced, but no balance guarantee
- benefit comes in searching
- for some applications, search favours recently-added items
- insertion-at-root ensures these are close to root
- could even consider "move to root when found"
- effectively provides "self-tuning" search tree
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [17/55]
An approach to balanced trees:
- insert into leaves as for simple BST
- periodically, rebalance the tree
Question: how frequently/when/how to rebalance?
NewTreeInsert(tree,item):
| Input tree, item
| Output tree with item randomly inserted
|
| t=insertAtLeaf(tree,item)
| if #nodes(t) mod k = 0 then
| t=rebalance(t)
| end if
| return t
E.g. rebalance after every 20 insertions ⇒ choose k=20
Note: To do this efficiently we would need to change tree data structure and basic operations:
typedef struct Node {
int data;
int nnodes;
Tree left, right;
} Node;
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [18/55]
How to rebalance a BST? Move median item to root.
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [19/55]
Implementation of rebalance:
rebalance(t):
| Input tree t with n nodes
| Output t rebalanced
|
| if n≥3 then
| | t=partition(t,n/2)
| | left(t)=rebalance(left(t))
| | right(t)=rebalance(right(t))
| end if
| return t
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [20/55]
New operation on trees:
-
partition(tree,i)
: re-arrange tree so that element with index i becomes root
For tree with N nodes, indices are 0 .. N-1
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [21/55]
Partition: moves i th node to root
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [22/55]
Implementation of partition operation:
partition(tree,i):
| Input tree with n nodes, index i
| Output tree with item #i moved to the root
|
| m=#nodes(left(tree))
| if i < m then
| left(tree)=partition(left(tree),i)
| tree=rotateRight(tree)
| else if i > m then
| right(tree)=partition(right(tree),i-m-1)
| tree=rotateLeft(tree)
| end if
| return tree
Note: size(tree) = n, size(left(tree)) = m, size(right(tree)) = n-m-1 (why -1?)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [23/55]
Consider the tree t
:
Show the result of partition(t,3)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [24/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [25/55]
Analysis of rebalancing: visits every node ⇒ O(N)
Cost means not feasible to rebalance after each insertion.
When to rebalance? … Some possibilities:
- after every k insertions
- whenever "imbalance" exceeds threshold
Either way, we tolerate worse search performance for periods of time.
Does it solve the problem? … Not completely ⇒ Solution: real balanced trees (later)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [26/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [27/55]
❖ Better Balanced Binary Search Trees | |
So far, we have seen …
- randomised trees … make poor performance unlikely
- occasional rebalance … fix balance periodically
- but both types still have O(n) worst case
Ideally, we want both average/worst case to be
O(log n)
- AVL trees … fix imbalances as soon as they occur
- 2-3-4 trees … use varying-sized nodes to assist balance
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [28/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [29/55]
Invented by Georgy Adelson-Velsky and Evgenii Landis
Approach:
- insertion (at leaves) may cause imbalance
- repair balance as soon as we notice imbalance
- repairs done locally, not by overall tree restructure
A tree is unbalanced when: abs(height(left)-height(right)) > 1
This can be repaired by at most two rotations:
- if left subtree too deep …
- if data inserted in left-right grandchild ⇒ left-rotate left subtree
- rotate right
- if right subtree too deep …
- if data inserted in right-left grandchild ⇒ right-rotate right subtree
- rotate left
Problem: determining height/depth of subtrees may be expensive.
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [30/55]
Implementation of AVL insertion
insertAVL(tree,item):
| Input tree, item
| Output tree with item AVL-inserted
|
| if tree is empty then
| return new node containing item
| else if item=data(tree) then
| return tree
| else
| | if item<data(tree) then
| | left(tree)=insertAVL(left(tree),item)
| | else if item>data(tree) then
| | right(tree)=insertAVL(right(tree),item)
| | end if
| | if height(left(tree))-height(right(tree)) > 1 then
| | if item>data(left(tree)) then
| | left(tree)=rotateLeft(left(tree))
| | end if
| | tree=rotateRight(tree)
| | else if height(right(tree))-height(left(tree)) > 1 then
| | if item<data(right(tree)) then
| | right(tree)=rotateRight(right(tree))
| | end if
| | tree=rotateLeft(tree)
| | end if
| | return tree
| end if
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [31/55]
Insert 27
into the AVL tree
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [32/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [33/55]
Analysis of AVL trees:
- trees are height-balanced; subtree depths differ by +/-1
- average/worst-case search performance of O(log n)
- require extra data to be stored in each node ("height")
- may not be weight-balanced; subtree sizes may differ
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [34/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [35/55]
2-3-4 trees have three kinds of nodes
- 2-nodes, with two children (same as normal BSTs)
- 3-nodes, two values and three children
- 4-nodes, three values and four children
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [36/55]
2-3-4 trees are ordered similarly to BSTs
In a balanced 2-3-4 tree:
- all leaves are at same distance from the root
2-3-4 trees grow "upwards" by splitting 4-nodes.
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [37/55]
Possible 2-3-4 tree data structure:
typedef struct node {
int order;
int data[3];
struct node *child[4];
} node;
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [38/55]
Searching in 2-3-4 trees:
Search(tree,item):
| Input tree, item
| Output address of item if found in 2-3-4 tree
| NULL otherwise
|
| if tree is empty then
| return NULL
| else
| | i=0
| | while i<tree.order-1 and item>tree.data[i] do
| | i=i+1
| | end while
| | if item=tree.data[i] then
| | return address of tree.data[i]
| | else
| | return Search(tree.child[i],item)
| | end if
| end if
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [39/55]
2-3-4 tree searching cost analysis:
- as for other trees, worst case determined by height h
- 2-3-4 trees are always balanced ⇒ height is O(log n)
- worst case for height: all nodes are 2-nodes
same case as for balanced BSTs, i.e. h ≅ log2 n
- best case for height: all nodes are 4-nodes
balanced tree with branching factor 4, i.e. h ≅ log4 n
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [40/55]
❖ Insertion into 2-3-4 Trees | |
Starting with the root node:
repeat
- if current node is full (i.e. contains 3 items)
- split into two 2-nodes
- promote middle element to parent
- if no parent ⇒ middle element becomes the new root 2-node
- go back to parent node
- if current node is a leaf
- insert Item in this node, order++
- if current node is not a leaf
- go to child where Item belongs
until Item inserted
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [41/55]
❖ ... Insertion into 2-3-4 Trees | |
Insertion into a 2-node or 3-node:
Insertion into a 4-node (requires a split):
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [42/55]
❖ ... Insertion into 2-3-4 Trees | |
Splitting the root:
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [43/55]
❖ ... Insertion into 2-3-4 Trees | |
Building a 2-3-4 tree … 7 insertions:
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [44/55]
❖ Exercise : Insertion into 2-3-4 Tree | |
Show what happens when D, S, F, U are inserted into this tree:
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [45/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [46/55]
❖ ... Insertion into 2-3-4 Trees | |
Insertion algorithm:
insert(tree,item):
| Input 2-3-4 tree, item
| Output tree with item inserted
|
| node=root(tree), parent=NULL
| repeat
| | if node.order=4 then
| | | promote = node.data[1]
| | | nodeL = new node containing node.data[0]
| | | nodeR = new node containing node.data[2]
| | | if parent=NULL then
| | | make new 2-node root with promote,nodeL,nodeR
| | | else
| | | insert promote,nodeL,nodeR into parent
| | | increment parent.order
| | | end if
| | | node=parent
| | end if
| | if node is a leaf then
| | insert item into node
| | increment node.order
| | else
| | | parent=node
| | | if item<node.data[0] then
| | | node=node.child[0]
| | | else if item<node.data[1] then
| | | node=node.child[1]
| | | else
| | | node=node.child[2]
| | | end if
| | end if
| until item inserted
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [47/55]
❖ ... Insertion into 2-3-4 Trees | |
Variations on 2-3-4 trees …
Variation #1: why stop at 4? why not 2-3-4-5 trees? or M-way trees?
- allow nodes to hold up to M-1 items, and at least M/2
- if each node is a disk-page, then we have a B-tree (databases)
- for B-trees, depending on
Item
size, M > 100/200/400
Variation #2: don't have "variable-sized" nodes
- use standard BST nodes, augmented with one extra piece of data
- implement similar strategy as 2-3-4 trees → red-black trees.
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [48/55]
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [49/55]
Red-black trees are a representation of 2-3-4 trees using BST nodes.
- each node needs one extra value to encode link type
- but we no longer have to deal with different kinds of nodes
Link types:
- red links … combine nodes to represent 3- and 4-nodes
- black links … analogous to "ordinary" BST links (child links)
Advantages:
- standard BST search procedure works unmodified
- get benefits of 2-3-4 tree self-balancing (although deeper)
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [50/55]
Definition of a red-black tree
- a BST in which each node is marked red or black
- no two red nodes appear consecutively on any path
- a red node corresponds to a 2-3-4 sibling of its parent
- a black node corresponds to a 2-3-4 child of its parent
Balanced red-black tree
- all paths from root to leaf have same number of black nodes
Insertion algorithm: avoids worst case
O(n) behaviour
Search algorithm: standard BST search
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [51/55]
Representing 4-nodes in red-black trees:
Some texts colour the links rather than the nodes.
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [52/55]
Representing 3-nodes in red-black trees (two possibilities):
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [53/55]
Equivalent trees (one 2-3-4, one red-black):
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [54/55]
- Tree operations
- tree rotation
- tree partition
- joining trees
- Randomised insertion
- Self-adjusting trees
- Suggested reading:
- Sedgewick, Ch. 12.8-12.9
- Sedgewick, Ch. 13.1-13.4
COMP9024 (22T0) ♢ Week 03b ♢ Search Tree Algorithms ♢ [55/55]
Produced: 16 Jan 2022