Week 10


Deletion from BSTs1/58

Insertion into a binary search tree is easy.

Deletion from a binary search tree is harder.

Four cases to consider ...


... Deletion from BSTs2/58

Case 1: value to be deleted is a leaf (zero subtrees)

[Diagram:Pics/trees/del-k-small.png]


... Deletion from BSTs3/58

Case 1: value to be deleted is a leaf (zero subtrees)

[Diagram:Pics/trees/del1-k-small.png]


... Deletion from BSTs4/58

Case 2: value to be deleted has one subtree

[Diagram:Pics/trees/del-p-small.png]


... Deletion from BSTs5/58

Case 2: value to be deleted has one subtree

[Diagram:Pics/trees/del1-p-small.png]


... Deletion from BSTs6/58

Case 3a: value to be deleted has two subtrees

[Diagram:Pics/trees/del-m-small.png]

Replace deleted node by its immediate successor.


... Deletion from BSTs7/58

Case 3a: value to be deleted has two subtrees

[Diagram:Pics/trees/del2-m-small.png]


... Deletion from BSTs8/58

Case 3b: value to be deleted has two subtrees

[Diagram:Pics/trees/del0-m-small.png]

Merge subtrees of deleted node; replace deleted node by merged tree.


... Deletion from BSTs9/58

Case 3b: value to be deleted has two subtrees

[Diagram:Pics/trees/del1-m-small.png]


Exercise 1: Deletion from Search Trees10/58

Implement the deletion function

BSTree BSTreeDelete(BSTree t, Key k)

which returns a new tree that does not contain k.

Can be done recursively ...

But eventually we need to remove the root of some subtree.


New Binary Search Tree ADT11/58

Now called simply Tree, and defined as:

// Item, Key, Node, Link, Tree types as before

// operations on keys
#define cmp(k1,k2) ((k1) - (k2))
#define lt(k1,k2) (cmp(k1,k2) < 0)
#define eq(k1,k2) (cmp(k1,k2) == 0)
#define gt(k1,k2) (cmp(k1,k2) > 0)

// standard tree operations
Tree newTree();
Tree TreeInsert(Tree, Item);
Tree TreeDelete(Tree, Key);
int  TreeFind(Tree, Key);


... New Binary Search Tree ADT12/58

// more standard tree operations
void dropTree(Tree);
void showTree(Tree);
int  TreeDepth(Tree);
int  TreeNumNodes(Tree);

// normally, internal to ADT
Link rotateR(Link);
Link rotateL(Link);
Tree rebalance(Tree);
Item *get_ith(Tree, int);
Tree partition(Tree, int);

Tree insertAtRoot(Tree, Item);
Tree insertRandom(Tree, Item);


TreeLab13/58

A shell for manipulating binary search trees

Commands:

n N Ord Seed = make a new tree
i N = insert N into tree
I N = insert N into tree at root
d N = delete N from tree
f N = search for N in tree
g I = get the i'th element in tree
p I = partition tree around i'th element
R = rotate tree right around root
L = rotate tree left around root
q = quit


... TreeLab14/58

Usage:   ./tlab #Nodes Order Seed

Possible orders to supply values for insertion

Seed = starting value for pseudo-random number generator


Exercise 2: Generating Values in Prefix Order15/58

Write a function that generates prefix order sequence

Function interface:

void mkprefix(int *v, int N, int lo, int hi)

E.g.   lo..hi = 1..7   ⇒   4 2 1 3 6 5 7


Balanced Trees


Balanced Binary Search Trees17/58

Goal: build binary search trees which have

Perfectly balanced tree with N nodes has Three approaches to improving worst case search in BSTs:


Operations for Rebalancing18/58

New ops to assist with rebalancing: rotation, insert-at-root

[Diagram:Pics/trees/left-right-rotation-small.png]


... Operations for Rebalancing19/58

[Diagram:Pics/trees/left-right-rotation-small.png]

Link rotateR(Link n1)
{
   if (n1 == NULL) return NULL;
   Link n2 = n1->left;
   if (n2 == NULL) return n1;
   n1->left = n2->right;
   n2->right = n1;
   return n2;
}

Left rotation is similar with n1/n2 and left/right switched


Exercise 3: Tree Rotations20/58

Incorporate rotateL() and rotateR() into TreeLab


Insertion at Root21/58

Previous description of BSTs inserted at leaves.

Different approach: insert new value at root.

Method for inserting at root (recursive):


... Insertion at Root22/58

[Diagram:Pics/trees/insert-at-root-small.png]


... Insertion at Root23/58

Function for insert-at-root:

Tree insertAtRoot(Tree t, Item it)
{ 
   if (t == NULL) return newNode(item);
   int diff = cmp(key(it), key(t->value));
   if (diff == 0)
      t->value = it;
   else if (diff < 0) {
      t->left = insertAtRoot(t->left, it);
      t = rotateR(t);
   }
   else if (diff > 0) {
      t->right = insertAtRoot(t->right, it);
      t = rotateL(t);
   }
   return t;
}


Exercise 4: Insertion at Root24/58

Incorporate insertAtRoot() into TreeLab


... Insertion at Root25/58

Analysis of insertion-at-root:


Randomized BST Insertion26/58

Effects of order of insertion on BST shape:

BST ADT typically has no control over order keys supplied.

Can the algorithm itself introduce some randomness?

Exercise: check the best/worst/average cases in TreeLab


... Randomized BST Insertion27/58

Approach: normally do leaf insert, randomly do root insert.

Tree insertRandom(Tree t, Item it)
{
   if (t == NULL) return newNode(it);

   int chance = rand() % (size(t)+1);
   if (chance < K) return insertAtRoot(t, it);
   int diff = cmp(key(it), key(t->value));
   if (diff == 0)
      t->value = it;
   else if (diff < 0)
      t->left = insert(t->left, it);
   else if (diff > 0)
      t->right = insert(t->right, it);
   t->nnodes = count(t);
   return t;
}


Exercise 5: Randomized Insertion28/58

Incorporate insertRandom() into TreeLab


... Randomized BST Insertion29/58

Cost analysis:

Approach can also be applied to deletion:


Warning: Notation Change30/58

From now on, we use the following function names:

// return depth of Tree (was TreeDepth)
int depth(Tree t) { ... }

// return #nodes in Tree (was TreeNumNodes)
int size(Tree t) { ... }

// return #nodes in Tree (see later)
int count(Tree t) { ... }

This amounts to changing the Tree ADT interface ... bad for clients.


Tree Size31/58

New functions for determining tree size:

// efficient; use outside Tree-changing functions
int size(Tree t) 
{
   return (t == NULL) ? 0 : t->nnodes;
}
// inefficient; use while making chages to Tree
int count(Tree t)
{
   if (t == NULL)
      return 0;
   else
      return 1 + count(t->left) + count(t->right);
}


Joining Two Trees32/58

Tree operations so far have involved just one tree.

An operation on two trees:   t = join(t1,t2)

// Pre-conditions:
// takes two BSTs; returns a single BST
// max(key(t1)) < min(key(t2))

Tree join(Tree t1, Tree t2) { ...  }

// Post-conditions:
// result is a BST (i.e. fully ordered)
// result contains all items from t1 and t2

Allows an alternative formulation of delete()


... Joining Two Trees33/58

Method for performing tree-join:

Advantage: doesn't increase depth of tree significantly

x ≤ depth(t) ≤ x+1, where x = max(depth(t1),depth(t2))

Variation: choose deeper subtree; take root from there.


... Joining Two Trees34/58

Joining two trees:

[Diagram:Pics/trees/join-op-small.png]

 

Note: t2' may be less deep than t2


... Joining Two Trees35/58

Implementation of tree-join:

Tree join(Tree t1, Tree t2)
{
   if (t1 == NULL) return t2;
   if (t2 == NULL) return t1;
   Node *curr = t2; Node *parent = NULL;
   // find inorder successor
   while (curr->left != NULL) {
      parent = curr;
      curr = curr->left;
   }
   if (parent != NULL) {
      // unlink curr from parent
      parent->left = curr->right;
      curr->right = t2;
   }
   curr->left = t1;
   return curr;
}


... Joining Two Trees36/58

Implementation of deletion using tree join:

Tree delete(Tree t, Key k)
{
   if (t == NULL) return NULL;
   int diff = cmp(k, keyOf(t->value));
   if (diff < 0)
      t->left = delete(t->left, k);
   else if (diff > 0)
      t->right = delete(t->right, k);
   else { // found an item with key k
      Tree n;
      if (t->left == NULL && t->right == NULL)
         n = NULL;
      else if (t->left == NULL)
         n = t->right;
      else if (t->right == NULL)
         n = t->left;
      else
         n = join(t->left, t->right);
      free(t);
      t = n;
   }
   return t;
}


Exercise 6: Tree Join37/58

The tree join operation defined above required

How might we join two trees if this doesn't hold?

You may use existing operations; you must retain

Tree join(tree t1, Tree t2) { ... }


Splay Trees38/58

Splay tree insertion modifies insertion-at-root method:

The idea: appropriate double-rotations improve tree balance.

Splay tree implementations also do rotation-in-search:


... Splay Trees39/58

Cases for splay tree double-rotations:

[Diagram:Pics/trees/splay-cases-small.png]


... Splay Trees40/58

Double-rotation case for left-child of left-child:

[Diagram:Pics/trees/splay-left-left-small.png]


... Splay Trees41/58

Double-rotation case for right-child of left-child:

[Diagram:Pics/trees/splay-right-left-small.png]


... Splay Trees42/58

Gives good overall (amortized) cost:

Need empirical analysis to to determine how much better.

But ... still has worst case search cost O(N)


Real Balanced Trees


Better Balanced Binary Search Trees44/58

So far, we have seen ...

Ideally, we want both average/worst case to be O(logN)


AVL Trees


AVL Trees46/58

Approach:

A tree is unbalanced when: abs(depth(left)-depth(right)) > 1

This can be repaired by a single rotation:

Problem: determining height of subtrees may be expensive.


... AVL Trees47/58

Implementation of AVL insertion

Tree insertAVL(Tree t, Item it)
{
   if (t == NULL) return newNode(it);
   int diff = cmp(key(it), key(t->value));
   if (diff == 0)
      t->value = it;
   else if (diff < 0)
      t->left = insertAVL(t->left, it);
   else if (diff > 0)
      t->right = insertAVL(t->right, it);
   int dL = depth(t->left);
   int dR = depth(t->right);
   if ((dL - dR) > 1) t = rotateR(t);
   else if ((dR - dL) > 1) t = rotateL(t);
   return t;
}


Exercise 7: AVL insertion48/58

Incorporate insertAVL() into TreeLab

Why is it inefficient? Could assist by storing height of subtree in each node


... AVL Trees49/58

Analysis of AVL trees:

[Diagram:Pics/trees/height-weight-small.png]


2-3-4 Trees


2-3-4 Trees51/58

2-3-4 trees have three kinds of nodes

[Diagram:Pics/trees/2-3-4-tree-small.png]


... 2-3-4 Trees52/58

2-3-4 trees are ordered similarly to BSTs

[Diagram:Pics/trees/2-3-4-order-small.png]

In a balanced 2-3-4 tree:

2-3-4 trees grow "upwards" from the leaves.


... 2-3-4 Trees53/58

2-3-4 tree implementation:

typedef struct node Node;
typedef struct node *Tree;
struct node {
    int  order;    // 2, 3 or 4
    Item data[3];  // items in node
    Tree child[4]; // links to subtrees
};

Note: we change the name value to data here.


... 2-3-4 Trees54/58

Make a new 2-3-4 node (always order 2):

Node *newNode(Item it)
{
    Node *new = malloc(sizeof(Node));
    assert(new != NULL);
    new->order = 2;
    new->data[0] = it;
    new->child[0] = new->child[1] = NULL;
    return new;
}


... 2-3-4 Trees55/58

Searching in 2-3-4 trees:

Item *search(Tree t, Key k)
{
    if (t == NULL) return NULL;
    int i;  int diff;  int nitems = t->order-1;
    // find relevant slot in items
    for (i = 0; i < nitems; i++) {
        diff = compare(k, keyOf(t->data[i]));
        if (diff <= 0) break;
    }    
    if (diff == 0)
        // match; return result
        return &(t->data[i]);
    else
        // keep looking in relevant subtree
        return search(t->child[i], k);
}


... 2-3-4 Trees56/58

2-3-4 tree searching cost analysis:


... 2-3-4 Trees57/58

Building a 2-3-4 tree ... 7 insertions:

[Diagram:Pics/trees/2-3-4-insert-small.png]


Exercise 8: Insertion into 2-3-4 Tree58/58

Show what happens when S then U are inserted into this tree:

 

[Diagram:Pics/trees/2-3-4-exercise-small.png]


Produced: 4 Oct 2017