From charlesreid1

See also: Balanced Search Trees

Notes

Definitions

A rooted binary tree is recursively defined as being either (1) empty, or (2) consisting of a node called root, with two rooted binary trees called the left and right subtrees.

Skiena Section 3.4

Binary search requires fast access to two elements - the media elements above and below the given node.

Search trees utilize this fact and a linked structure to create binary search trees.

Binary search tree labels each node in the binary tree with a single keys such that for any node labeled x, all of the nodes in the left subtree of x will have keys < x,while all nodes in the right subtree of x will have keys > x.

For any binary tree on n nodes and any set of n keys there is exactly one labeling that makes it a binary tsearch tree.

Implementation

Implement via:

  • left and right pointer fields
  • optional parent pointer

Operations:

  • searching
  • finding minimum/maximum element
  • traversal (in order)
  • insertion
  • deletion

balanced search trees:

  • trees exist, and can be used as black boxes.
  • suppose that we have access to a balanced dictionary or tree data structure - and go from there.
  • insert should automatically balance the tree

Weiss

Weiss, Data Structures in C++, covers binary search trees in Chapter 4, covering trees generally. This book is a good advanced data structures textbook, so it moves through trees quickly and spends most of its time on detailed implementations of types of trees.

The binary search tree covered on page 133 of the textbook provides an entire abstract data type that is as follows:

  • Class name is BinarySearchTree
  • Data member is pointer to root node (null for empty trees)
  • public member functions use general technique of calling private recursive functions
  • insert, remove, contains
  • find min, find max
  • binary node class:
    • single element of data
    • left pointer
    • right pointer

These methods are covered in more detail below. Implementations are all here: https://git.charlesreid1.com/cs/java/src/master/trees/search-trees/Weiss.java

Contains

The contains method uses Recursion. The basic approach boils down to handling four possible cases as we recurse through all the nodes in the tree:

  • We have reached an empty node, and should return false to the method that called us
  • Our element is less than the search target, and we should go left
  • Our element is greater than the search target, and we should go right
  • Our element is just right, and this node is equal to the search target
def contains(target, node):
    if(node is null):
        return false
    elif target < node.getvalue():
        return contains(target, node.left)
    elif target > node.getvalue():
        return contains(target, node.right)
    else:
        return true

findMin and findMax

private routines returning pointers to the nodes containing the minimum and maximum elements in the tree.

These both assume a sorted binary tree. This means the find minimum method just has to return the left-most element, while the find maximum method just has to return the right-most element.

Here are two non-recursive implementations of findMin and findMax. Like many recursive patterns, this non-recursive implementation uses a while loop. The body of the while loop is typically converted directly to a recursive function body.

Here is the Java implementation of non-recursive versions:

	// Non-recursive versions of findMin/findMax

	/** Private method that returns the smallest element in the tree.
	 * This assumes the tree is sorted.
	 * */
	private BinaryNode findMin(BinaryNode node) { 
		if(node!=null) { 
			while(node.getLeft() != null) {
				node = node.getLeft();
			}
		}
		return node;
	}

	/** Private method that returns the largest element in the tree. 
	 * This assumes the tree is sorted.
	 * */
	private BinaryNode findMax(BinaryNode node) { 
		if(node!=null) { 
			while(node.getRight() != null) {
				node = node.getRight();
			}
		}
		return node;
	}

and the Java implementation of recursive versions:

	// Recursive versions of findMin/findMax

	/** Private method that returns the smallest element in the tree, recursively.
	 * This assumes the tree is sorted.
	 * */
	private BinaryNode findMin_r(BinaryNode node) { 
		if(node==null){
			// base case: we're nobody
			return null;
		}
		if(node.getLeft()==null) { 
			// base case: it's us
			return node;
		}
		// recursive case:
		// keep going
		return findMin_r(node.getLleft());
	}

	/** Private method that returns the largest element in the tree, recursively. 
	 * This assumes the tree is sorted.
	 * */
	private BinaryNode findMax_r(BinaryNode node) { 
		if(node==null){
			// base case: we're nobody
			return null;
		}
		if(node.getRight()==null) { 
			// base case: it's us
			return node;
		}
		// recursive case:
		// keep going
		return findMax_r(node.getRight());
	}

Link on git.charlesreid1.com: https://git.charlesreid1.com/cs/java/src/master/trees/search-trees/Weiss.java

Insert

Insertion is conceptually simple - you insert a new node X into a tree T. You proceed down the tree as you do when searching the tree via contains() method. (See above.) If we find X, we stop immediately (no duplicates in tree.) Otherwise we insert X at the last spot in the tree that we traversed. (Where we exited the tree.)

BinTreeInsert.png

Remove

Tricky to do in a sorted binary tree. Consider the three cases:

  • No children
  • One child
  • Two children (tricky case; needs to recursively replace the node with its smallest left-most replacement)

Binary Search Tree Implementation

Implementation following Weiss: https://git.charlesreid1.com/cs/java/src/master/trees/search-trees

See Java/Binary Search Tree


Algorithmic Analysis

Let's examine some of the big-O costs of operations on binary trees.

Operations

First, let's lay out some basic operations that a binary search tree will need to define. (See also: Search Trees)

The general categories of operations implemented:

  • Navigation (moving through the tree)
  • Find/Search (finding position corresponding to a particular key)
  • Get/Set/Delete (getting/setting/deleting a node with a particular key/value in the tree)

Here's a more detailed list:

Navigation methods:

  • First node - keep traversing left in the tree until we hit the left-most node. This will be the minimum key. (This uses find min subtree method below)
  • Last node - keep traversing right in the tree until we hit the right-most node. This will be the maximum key.
  • Before position - get the position of the node whose key immediately precedes this node's key
  • After position - get the position of the node whose key comes immediately after this node's key
  • In order traversal
  • Reverse order traversal

Find:

  • Find key - find the node corresponding to key k in SUBTREE
  • Find min - find the node corresponding to the minimum value in SUBTREE
  • Find max - find the value corresponding to the max value in SUBTREE
  • Find tree min - returns first
  • Find tree max - returns first
  • Find greater than or equal to
  • Find less than or equal to
  • Find range

Get/Set/Delete:

  • Get item - Search for a key, and rebalance the tree. If we find the key, return the value at that position
  • Set item - Take a key-value pair. Start with the empty case, adding as root; otherwise,
  • Delete item - Take a key. Start with the empty case - nothing. Search the subtree; if we find something, delete it. Otherwise, rebalance the search tree.

Summary of Operations

Navigation

first

last

before(p)

after(p)

Implementing first/last:

  • If we are looking for the first or last nodes, we want the left-most or right-most node in the tree.
  • So we simply implement a while loop: while there is still a node to the left, walk left.
  • This eventually yields us the minimum node in the tree.
  • However, keep it general.
  • We will want to implement this functionality in other algorithms: find the smallest node in this subtree.
  • Implement first and last as private method, first/last in the subtrees of a given node.

Implementing before/after:

  • Let's look at the algorithm. The before and after algorithms are symmetric, but we'll do both.
  • Both algorithms examine 2 cases:
  • Case 1: right child is not null
    • Walk right once
    • Walk left until left == null
    • Return walk
  • Case 2: right child is null
    • Walk up once
    • Walk up until parent == null or walk == left(parent)
    • Return parent

Case 1 steps right once, then left left left. This yields the next element in the sequence. Case 2 walks up until we switch direction or hit the top of the tree.

Searches

A very important property of binary trees is that their structure allows us to search the tree in the same manner as a binary search algorithm. The algorithm to perform a search for a node on a binary search tree can be structured almost precisely like the binary search algorithm for an array or a list.

Three cases:

  • If target key is equal to our key, return this node
  • If target key is less than our key, call a search on left subtree
  • If target key is greater than our key, call a search on right subtree

Here is the tree search algorithm:

def tree_search(T,p,k):
    if k==p.key():
        return p
    else if k<p.key() and T.left(p) is not None:
        return tree_search(T, T.left(p), k)
    else if k>p.key() and T.right(p) is not None:
        return tree_search(T, T.right(p), k)
    return p

Insertions and Deletions

These algorithms are mostly straightforward, but with a few details that need to be taken care of.

Insertion:

  • This will utilize the search mechanism
  • We search for a key in a tree
  • If we find it, update the value
  • If we don't find it, the search function will return the nearest node, so add it to the nearest neighbor node.
  • If insert key > nearest node key, add to node's right
  • If insert key < nearest node key, add to node's left

Deletion has three cases that we will run into: the node has no children, the node has 1 child, and the node has 2 children.

  • If there are 0 children, this is the easiest case.
  • If there is 1 child, we move the child up to the removed node's former position.
  • If there are 2 children,
    • Find the largest key smaller than us (in our subtree) using function to find first node in subtree
    • Replace the position p with the preceding node before(p)d

Deletion:

  • This will also utilize the search mechanism
  • We search for a key in the tree (throw a key error if we don't find it)
  • If position p has 0 or 1 children, replace it with its child
  • If position p has 2 children,
    • Find largest key smaller than us (in our subtree) using before(p)
    • Replace p with before(p) (THE NODE ONLY, NOT THE SUBTREE)
    • Remove node before(p) from its old position (MUST have 0-1 children)

Insertion and deletion both occur in O(h) time. (Worst case: height is n, so O(n).)

Big O Analysis

Nearly every operation associated with a binary search tree is O(h), since so many operations are working from the top down or bottom up.

Checking if a key is in a tree: O(h)

Getting/setting a value from a key: O(h)

Deleting a position or deleting a key: O(h)

Finding a position from a key: O(h)

First/last: O(h)

Find min/find max: O(h)

Before/afteR: O(h)

Find lt, find le, find gt, find ge: O(h)

Find range: O(s + h) (s is the number of items in the sequence)

Iterator/reverse iterator: O(n)


Note: n is the number of items stored in the map.


Implementation Notes

Notes from combing through Goodrich's implementation:

  • asdf

Flags