Trees
From charlesreid1
Contents
Notes
Goodrich Chapter 8 Trees
Definitions
Trees - most important nonlinear data structures in computing
Relationships in trees are hierarchical
Each element in a tree has a parent element and zero or more child elements
Formally, define tree T as a set of nodes storing elements such that the nodes have a parent-child relationship satisfying the following properties:
- If T is non-empty, it has a special node, called the root of T, that has no parent.
- Each node v of T different from the root has a unique parent node w; every node with parent w is a child of w.
Relationships
Node relationships:
- Sibling nodes are nodes that share a parent
- External nodes are nodes with no children - total number is n_E
- Internal nodes are nodes with 1 or more children - total number is n_I
- Node u is an ancestor of node v is u = v or if u is an ancestor of the parent of v.
- The stubtree of T that is rooted at node v is the tree consisting of all descendants of v in T - including v itself.
Edges Paths
Edges and paths:
- An edge of three T is a pair of nodes (u,v) such that u is the parent of v or vice-versa
- A path is a set of consecutive edges - or, a set of consecutive nodes connected by edges.
- A tree is ordered if there is a meaningful linear order among the children of each node.
Example:
- Documents like books are hierarchically organized as a tree whose internal nodes are parts, chapters, sections, and leaves are paragraphs, tables, figures
Ordered/Unordered
Ordered vs unordered
- Ordered tree - the order of the sibling nodes is important
- Unordered tree - there is no precedence with respect to one sibling or the other
- Examples of ordered trees occur when there are ordered operations (e.g., siblings ordered according to order of birth)
- Examples of unordered trees occur with enumerations or functionality (org charts, divisions in a company)
Tree implementation details
Tree abstract data types:
- Things get a bit confusing from here. Goodrich separates the concept of a "Position" from the concept of a "Node."
- The Position is a location in the tree, the Node is a four-pointer container. k
Abstract class:
- Tree is an example of an abstract class implementation - see also Abstract Data Types
- In Python, each virtual function will raise a NotImplementedError("Must be implemented by subclass")
- In Python, errors in arguments pointing to invalid positions should raise a ValueError
- In Java, we define the functions to be abstract and require that inheriting objects define them. Java/Abstract Classes
The actual implementation will differ depending on the type of data we are storing and the various requirements we might have for it (e.g., a faster algorithm may be rejected due to high memory usage and memory constraints).
There are some concrete methods that can be defined, if they are independent of implementation. Some examples include:
- is_root (parent is null)
- is_leaf (number of children is 0)
- is_empty (T if no nodes in tree)
Computing depth and height:
- Depth of p is the number of ancestors of p, excluding p itself
- Depth is the distance (in other nodes) to the root node
- The height of a node is the maximum height of each of the node's children
- The height of the entire tree is the maximum depth of its leaf positions.
Algorithmic analysis
To analyze the calculation of the height of a tree, we will analyze two methods.
The first calls the depth function on each leaf node, and determines the maximum result returned. This returns the total height of the tree. Using some Python-like pseudocode, we can express this using a list comprehension:
def _height(self): return max(self.depth(p) for p in self.positions() if self.is_leaf(p))
This will call the depth function on each tree, and the worst-case depth is n, the number of nodes in the tree.
That means we're doing
Height function runs in time
An alternative method utilizes recursion:
def _height2(self,p): if(self.is_leaf(p)): return 0 else: return 1 + max(self._height2(c) for c in self.children(p)) # for each child, compute height, and return the maximum height + 1
Binary trees
See Binary Trees.
Linked implementation: see Binary Trees/LinkedBinTree.
Array implementation: see Binary Trees/ArrayBinTree.
Tree Traversal
Algorithms for tree traversal:
- preorder - perform the visit action for a node, then perform the visit action for all of the node's children (top-down) (recursive)
- postorder - recurse to the leaf nodes first, perform visit action, and work way back toward root node (bottom-up) (recursive)
- inorder - recurse to the left nodes first, then the node itself, then the right nodes (recursive)
- breadth-first - move layer by layer, popping a node and putting its children into the queue.
Pesudocode for preorder traversal:
def preorder(tree, position): perform visit action for position for each child in tree.children(position) do: preorder(tree, child)
Pseudocode for postorder traversal:
def preorder(tree, position): for each child in tree.children(position) do: preorder(tree,child) perform visit action for position
Pseudocode for breadth first traversal:
def breadthfirst(tree): initialize queue q to contain tree.root while q not empty: p = q.dequeue() perform "visit" action for position p for each child in tree.children(p): q.enqueue(child)
Inorder traversal pseudocode:
def inorder(p): if p has left child: inorder(left_child) perform "visit" action for position p if p has right child: inorder(right_child)
Implementing traversals - iterators
The best way to implement a tree traversal for a Tree object is to use built-in iterators for the object.
In Python, that means using Tree.__iter__:
class Tree: def __iter__(self): for p in self.positions(): yield p.element()
Note that we return the element, not the node or position.
Tree Traversal Runtime Analysis
Preorder and postorder traversal algorithms are both efficient.
To do these traversals, at each position p, nonrecursive portion requires number of statements equal to number of children
This is performed n times, for an overall O(n) algorithm
Code
Implementation
Further notes from Goodrich et al, Data Structures in Python, Chapter 8 Trees.
charlesreid1 git repo: https://git.charlesreid1.com/cs/java/src/master/trees
We utilize the patterns that Goodrich utilized, to flex our OOP muscles, and create some interfaces and abstract classes for our Tree classes.
The interfaces defined are:
- Tree (abstract) - see Trees/ADT - top level interface, defines abstract methods that all trees should implement
- BinaryTree (abstract) - see Binary Trees/ADT - inherits from Tree ADT, creates more abstract methods that the tree should implement
Abstract classes are:
- Abstract Tree ("concrete" implementations of a few methods) - see Trees/ADT#Concrete
- Abstract Binary Tree (more "concrete" implementations of whatever is possible) - see Binary Trees/ADT
Concrete classes:
- Linked Binary Tree - linked memory structure for our binary tree, implements the interfaces and methods above in a concrete way - see Binary Trees/LinkedBinTree
- Array Binary Tree - array-based storage for binary tree - see Binary Trees/ArrayBinTree
When we write a class that implements these methods for a particular data structure, we call it a concrete class.
Example of concrete tree implementation: Binary Trees/LinkedBinTree
Trees in the wild
Some places where you can spot trees in the wild:
Open source software
Binary Tree Map implementation in Apache Ignite: https://github.com/apache/ignite/blob/cc6257f8c53da5f6da32b0f184065c1e35a75acf/modules/core/src/main/java/org/apache/ignite/internal/binary/BinaryTreeMap.java
Apache Ignite: open-source in-memory computing platform "for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies." https://ignite.apache.org/
Binary Tree and Red Black Tree implementations in Apache Uima: https://github.com/apache/uima-uimaj/blob/40c4aefb9c7e7cd0234e6bbc76e8f960ec8d7e7d/uimaj-core/src/main/java/org/apache/uima/internal/util/rb_trees/RedBlackTree.java
(UIMA = Unstructured Information Management Applications, woo, https://uima.apache.org/)
Java implementations
tree algos: https://github.com/ignl/BinarySearchTrees/tree/master/Trees/src/main/java/org/intelligentjava/algos/trees
very extensive abstract binary search tree: https://github.com/ignl/BinarySearchTrees/blob/master/Trees/src/main/java/org/intelligentjava/algos/trees/AbstractBinarySearchTree.java
red black tree: https://github.com/ignl/BinarySearchTrees/blob/master/Trees/src/main/java/org/intelligentjava/algos/trees/RedBlackTree.java
Flags
Trees Part of Computer Science Notes
Series on Data Structures Abstract data type: Trees/ADT Concrete implementations: Trees/LinkedTree · Trees/ArrayTree · SimpleTree
Tree Traversal Preorder traversal: Trees/Preorder Postorder traversal: Trees/Postorder In-Order traversal: Binary Trees/Inorder Breadth-First Search: BFS Breadth-First Traversal: BFT Depth-First Search: DFS Depth-First Traversal: DFT OOP Principles for Traversal: Tree Traversal/OOP · Tree Traversal/Traversal Method Template Tree operations: Trees/Operations Performance · Trees/Removal
Tree Applications Finding Minimum in Log N Time: Tree/LogN Min Search
Abstract data type: Binary Trees/ADT Concrete implementations: Binary Trees/LinkedBinTree · Binary Trees/ArrayBinTree Binary Trees/Cheat Sheet · Binary Trees/OOP · Binary Trees/Implementation Notes
|