Dictionaries: Difference between revisions
From charlesreid1
| Line 57: | Line 57: | ||
===Question 3-6=== | ===Question 3-6=== | ||
Question 3-6) Describe how to modify any balanced tree structure such that search, insert, delete, min, max each take the expected amount of time, O(log n), but sucessor and predecessor methods now take O(1) time each. | '''Question 3-6) Describe how to modify any balanced tree structure such that search, insert, delete, min, max each take the expected amount of time, O(log n), but sucessor and predecessor methods now take O(1) time each. What modifications are required, and to which methods?''' | ||
The methods: | |||
* Predecessor(D,k) - returns the predecessor key (in order) to the given key k | |||
* Successor(D,k) - returns the successor key to the given key k | |||
Question is, how to we implement neighbor lookup in O(1) time in a balanced tree? | |||
Two options: | |||
* Option one - add a previous and next node pointers. This increases the complexity of the bookkeeping, and of the add/insert/remove methods, which now have to traverse the tree to fix their links. | |||
* Option two - make a data tree that is twice as big, and only holds data in the bottom-most leaf nodes. This obviates the need for a next and previous pointer, because the tree is easy to navigate - up and over... at most log(n) operations to traverse tree. | |||
Option one: | |||
We would need to modify the add() and insert() method, the delete method, no need to modify search or min or max methods. | We would need to modify the add() and insert() method, the delete method, no need to modify search or min or max methods. | ||
We modify them to keep track of the previous node. Finding the previous node would be somewhat tricky logic. When we add a new node, find its next and previous nodes, and link them correctly. | We modify them to keep track of the previous node. Finding the previous node would be somewhat tricky logic. When we add a new node, find its next and previous nodes, and link them correctly. | ||
Option two: | |||
It would be algorithmically easier (but more expensive and more complicated implementation wise) to use a tree structure with only data in the leaf nodes. Finding/keeping track of next or previous node then doesn't require the extra two pointers and the extra logic of traversing a binary tree, it just requires twice the number of nodes (tree of size N has N/2 leaf nodes). | It would be algorithmically easier (but more expensive and more complicated implementation wise) to use a tree structure with only data in the leaf nodes. Finding/keeping track of next or previous node then doesn't require the extra two pointers and the extra logic of traversing a binary tree, it just requires twice the number of nodes (tree of size N has N/2 leaf nodes). | ||
This also requires implementation of the entire data structure again, and it is not so obvious how you keep a tree structure like that balanced and dynamically resized. | |||
=Flags= | =Flags= | ||
Revision as of 06:17, 5 June 2017
Notes
Dictionary data type - definition
Dictionaries are data structures that store data in such a way that it can be looked up by value, rather than by index/position/reference.
These types of data sets naturally lend themselves to a structure that partitions data based on its sort value, as in a binary tree. This allows operations to happen faster, because the lookup key itself guides the process of finding the corresponding data in the data structure.
Skiena Chapter 3 notes
This entire chapter is abound with practical advice about data structures. Principally - use black boxes; focus on big-O analysis; the data structure can always be made more efficient. And use binary trees and hash tables to store your variables by value! These are relentlessly practical and useful data structures because lookups are O(1).
Dictionaries definitions
Unlike stacks and queues, which allow access to items in the data structure independent of their value, a dictionary provides access to an object using its value as a way of obtaining a reference to it in the data structure.
Dictionaries interface
Dictionaries implement the following operations:
- search(d,k) - given a search key k, preeturn a pointer to the element in d corresponding to key value k (if one exists)
- insert(d,x) - given a data item, add it to the set of data
- delete(d,x) - given a pointer to a given data item x in the dictionary d, remove it from d
additional useful operations:
- max(d) / min(d) - retrieve item with largest or smallest key - enabling dictionary to serve as priority queue
- predecessor(d,k) / successor(d,k) - retrieve the item from d whose key is before or after k in sorted order (enables iteration)
Example: to remove all duplicate names from a mailing list, and print the results in sorted order, initialize an empty dictionary, iterate the through the list, and add each item as a key, unless it already exists in the dictionary. (Note that there are many operations happening here, so the more of them we can make O(1), the better.)
Once we are finished, extract the remaining names out of the dictionary to print them in sorted order. Get the first item via min(d) and repeatedly call successor until obtain max(d).
Dictionary implementations based on arrays
Dictionary implementations based on linked lists
Questions
Skiena Chapter 3 questions
Question 3-4
Question 3-4) Design a dictionary structure in which search, insertion, and deletion can be handled in O(1) time.
Assume set elements are from 1..n)
(Initialization can take O(n) time.)
- Well, if this search has to take O(1) time, we need to know EXACTLY where things are, given their value.
- The problem seems to indicate either a hash table, or an off-by-one array.
- Off-by-one array: the numbers passed in from 1 to n are represented in an array of length n, with the kth number represented by the k-1th cell.
- hash table: key value is given, turns into unique integer, looked up in O(1) time
Question 3-6
Question 3-6) Describe how to modify any balanced tree structure such that search, insert, delete, min, max each take the expected amount of time, O(log n), but sucessor and predecessor methods now take O(1) time each. What modifications are required, and to which methods?
The methods:
- Predecessor(D,k) - returns the predecessor key (in order) to the given key k
- Successor(D,k) - returns the successor key to the given key k
Question is, how to we implement neighbor lookup in O(1) time in a balanced tree?
Two options:
- Option one - add a previous and next node pointers. This increases the complexity of the bookkeeping, and of the add/insert/remove methods, which now have to traverse the tree to fix their links.
- Option two - make a data tree that is twice as big, and only holds data in the bottom-most leaf nodes. This obviates the need for a next and previous pointer, because the tree is easy to navigate - up and over... at most log(n) operations to traverse tree.
Option one:
We would need to modify the add() and insert() method, the delete method, no need to modify search or min or max methods.
We modify them to keep track of the previous node. Finding the previous node would be somewhat tricky logic. When we add a new node, find its next and previous nodes, and link them correctly.
Option two:
It would be algorithmically easier (but more expensive and more complicated implementation wise) to use a tree structure with only data in the leaf nodes. Finding/keeping track of next or previous node then doesn't require the extra two pointers and the extra logic of traversing a binary tree, it just requires twice the number of nodes (tree of size N has N/2 leaf nodes).
This also requires implementation of the entire data structure again, and it is not so obvious how you keep a tree structure like that balanced and dynamically resized.
Flags
| Data Structures Part of Computer Science Notes
This is the staging ground for computer science notes as part of the 2017 CS Study Plan.
Classes of data structures: Abstract Data Types Array-based and Link-based memory management: ArrayLists and Linked Lists Algorithmic Analysis of Data Structures: Algorithmic Analysis of Data Structures Advanced data structures: Advanced Data Structures
|
| Arrays Part of Computer Science Notes
Series on Data Structures Python: Arrays/Python · Arrays/Python/Sizeof · Arrays/Python/AppendCost · Arrays/Python/CaesarCipher · Arrays/Python/CompactArrays · Arrays/Python/DynamicArray Java: Arrays/Java · Arrays/Java/CaesarCipher · Arrays/Java/FisherYates · Arrays/Java/PythonList · Arrays/Java/Repeatedly_Remove Categories: Category:Python Arrays
|
| Stacks and Queues Part of Computer Science Notes
Series on Data Structures
Stacks and Queues: Python StacksQueues/Python · StacksQueues/Python/ArrayStack · StacksQueues/Python/ArrayQueue · StacksQueues/Python/ArrayDeque StacksQueues/Python/LinkedStack
Stacks and Queues: Java StacksQueues/Java · StacksQueues/Java/ArrayStack · StacksQueues/Java/ArrayQueue · StacksQueues/Java/ArrayQueueFS · StacksQueues/Java/ArrayDeque StacksQueues/Java/LinkedStack · StacksQueues/Java/LinkedQueue · StacksQueues/Java/LinkedDeque
Applications Postfix_Expressions#Stacks · StacksQueues/Subsets · StacksQueues/Subsets/Java
|
| Priority Queues and Heaps Part of Computer Science Notes
Series on Data Structures
Java: Priority Queues/Java · Priority Queues/ADT · Priority Queues/Sorted · Priority Queues/Unsorted Performance: Priority Queues/Timing and Performance Applications: Maximum Oriented Priority Queue · Priority Queues/Stack
Priority Queues/Heap · Priority Queues/Java · Priority Queues/Comparators
|
| Linked List Part of Computer Science Notes
Series on Data Structures Java: Linked Lists/Java · Linked Lists/Java/Single · Linked Lists/Java/Double · Linked Lists/Java/Circular Performance: Linked Lists/Java/Timing · Linked Lists/Java/Reverse Python: Linked Lists/Python · Linked Lists/Python/Single
|
| Trees Part of Computer Science Notes
Series on Data Structures Abstract data type: Trees/ADT Concrete implementations: Trees/LinkedTree · Trees/ArrayTree · SimpleTree
Tree Traversal Preorder traversal: Trees/Preorder Postorder traversal: Trees/Postorder In-Order traversal: Binary Trees/Inorder Breadth-First Search: BFS Breadth-First Traversal: BFT Depth-First Search: DFS Depth-First Traversal: DFT OOP Principles for Traversal: Tree Traversal/OOP · Tree Traversal/Traversal Method Template Tree operations: Trees/Operations Performance · Trees/Removal
Tree Applications Finding Minimum in Log N Time: Tree/LogN Min Search
Abstract data type: Binary Trees/ADT Concrete implementations: Binary Trees/LinkedBinTree · Binary Trees/ArrayBinTree Binary Trees/Cheat Sheet · Binary Trees/OOP · Binary Trees/Implementation Notes
|
| Search Trees Part of Computer Science Notes
Series on Data Structures
Binary Search Trees · Balanced Search Trees Trees/OOP · Search Trees/OOP · Tree Traversal/OOP · Binary Trees/Inorder
(Note that heaps are also value-sorting trees with minimums at the top. See Template:PriorityQueuesFlag and Priority Queues.)
|
| Maps and Dictionaries Part of Computer Science Notes
Series on Data Structures
Maps/Dictionaries Maps · Maps/ADT · Maps in Java · Maps/OOP · Maps/Operations and Performance Map implementations: Maps/AbstractMap · Maps/UnsortedArrayMap · Maps/SortedArrayMap Dictionary implementations: Dictionaries/LinkedDict · Dictionaries/ArrayDict
Hashes Hash Maps/OOP · Hash Maps/Operations and Performance Hash Maps/Dynamic Resizing · Hash Maps/Collision Handling with Chaining Hash functions: Hash Functions · Hash Functions/Cyclic Permutation Hash map implementations: Hash Maps/AbstractHashMap · Hash Maps/ChainedHashMap
Skip Lists · Java/ConcurrentSkipList · Java implementations: SkipList
Sets Sets · Sets/ADT · Sets in Java · Sets/OOP · Multisets
|