Arrays/Java/Finding Duplicates
From charlesreid1
Contents
Questions
Finding one duplicate
Question C-3.17) A is an array of size n>=2 containing integers from 1 to n-1, inclusive, one of which is repeated. Describe an algorithm for finding the integer in A that is repeated.
A special case of this question is when array A is sorted, since then we know that each array element should correspond to its own index plus 1, and we can do a binary search to locate the array index that does not match the criteria. In this case, the runtime is O(log n).
Finding multiple duplicates
Question C-3.18) Let B be an array of size n>=6 containing integers from 1 to n-5, inclusive. Five of the integers are repeated. Describe an algorithm for finding the five integers in B that are repeated.
Solution approach
There are a few different solution approaches, depending on the problem constraints:
- Is the array sorted, or unsorted?
- What if we have a one-time O(n) cost of construction?
- Are we after a space-efficient or operationally-efficient algorithm?
- What kind of data - primitive types or objects?
- Do we have to use an array? Why not use a collections object?
(Square-bracket indexing is nice to have.)
Single element case
The cost of finding duplicates depends on problem parameters and tradeoffs. Assuming we want the absolute fastest algorithm, we are looking for a constant-time or logarithmic algorithm. (There are no O(1) algorithms here.)
Log time algorithms:
- O(n log n) one-time cost to sort
- O(log n) lookup time in a binary tree (less memory)
- O(1) lookup time in a hash table (more memory)
In the particular case given, of the integers 1 through n corresponding to indices 0 through (n-1), we are looking for an element that meets a particular criteria: normally in binary sort the criteria is a (greater than/less than/equal to) condition. Here, the criteria is the place where the value of element i (at zero-indexed i) changes from i+1 to i. The binary search algorithm can be outfitted to find this change point, using its head, tail, and middle pointers.
Linear time algorithms:
- O(n) add each element to a set, checking membership each time. O(1) lookup for membership in a set, iterating one item at a time, leads to O(n) time.
One-time expense algorithms:
- O(1) lookup for O(n) cost of building a hash table, at a potentially high memory cost.
- O(log n) binary search lookup for O(n log n) cost of sorting into a binary tree or similar structure, more efficient memory usage.
If we are going to be moving items in and out, what's the best approach? It all depends... Let's explore.
NOTE: Remember the ghost log. The cost of doing something with frequency of 1/j for j=1 to n is the harmonic number, which is O(log n).
Java Implementation:
- Use a HashMap, with the key being the element in the array, the value being an integer counter.
Multi element case
In the multi-element duplicate case:
Code
| Data StructuresPart of Computer Science Notes This is the staging ground for computer science notes as part of the 2017 CS Study Plan. 
 Classes of data structures: Abstract Data Types Array-based and Link-based memory management: ArrayLists and Linked Lists Algorithmic Analysis of Data Structures: Algorithmic Analysis of Data Structures Advanced data structures: Advanced Data Structures 
 | 
| ArraysPart of Computer Science Notes Series on Data Structures Python: Arrays/Python · Arrays/Python/Sizeof · Arrays/Python/AppendCost · Arrays/Python/CaesarCipher · Arrays/Python/CompactArrays · Arrays/Python/DynamicArray Java: Arrays/Java · Arrays/Java/CaesarCipher · Arrays/Java/FisherYates · Arrays/Java/PythonList · Arrays/Java/Repeatedly_Remove Categories: Category:Python Arrays 
 | 
| Stacks and QueuesPart of Computer Science Notes Series on Data Structures 
 
 Stacks and Queues: Python StacksQueues/Python · StacksQueues/Python/ArrayStack · StacksQueues/Python/ArrayQueue · StacksQueues/Python/ArrayDeque StacksQueues/Python/LinkedStack 
 Stacks and Queues: Java StacksQueues/Java · StacksQueues/Java/ArrayStack · StacksQueues/Java/ArrayQueue · StacksQueues/Java/ArrayQueueFS · StacksQueues/Java/ArrayDeque StacksQueues/Java/LinkedStack · StacksQueues/Java/LinkedQueue · StacksQueues/Java/LinkedDeque 
 Applications Postfix_Expressions#Stacks · StacksQueues/Subsets · StacksQueues/Subsets/Java 
 | 
| Priority Queues and HeapsPart of Computer Science Notes Series on Data Structures 
 Java: Priority Queues/Java · Priority Queues/ADT · Priority Queues/Sorted · Priority Queues/Unsorted Performance: Priority Queues/Timing and Performance Applications: Maximum Oriented Priority Queue · Priority Queues/Stack 
 Priority Queues/Heap · Priority Queues/Java · Priority Queues/Comparators 
 | 
| Linked ListPart of Computer Science Notes Series on Data Structures Java: Linked Lists/Java · Linked Lists/Java/Single · Linked Lists/Java/Double · Linked Lists/Java/Circular Performance: Linked Lists/Java/Timing · Linked Lists/Java/Reverse Python: Linked Lists/Python · Linked Lists/Python/Single 
 | 
| TreesPart of Computer Science Notes Series on Data Structures Abstract data type: Trees/ADT Concrete implementations: Trees/LinkedTree · Trees/ArrayTree · SimpleTree 
 Tree Traversal Preorder traversal: Trees/Preorder Postorder traversal: Trees/Postorder In-Order traversal: Binary Trees/Inorder Breadth-First Search: BFS Breadth-First Traversal: BFT Depth-First Search: DFS Depth-First Traversal: DFT OOP Principles for Traversal: Tree Traversal/OOP · Tree Traversal/Traversal Method Template Tree operations: Trees/Operations Performance · Trees/Removal 
 Tree Applications Finding Minimum in Log N Time: Tree/LogN Min Search 
 Abstract data type: Binary Trees/ADT Concrete implementations: Binary Trees/LinkedBinTree · Binary Trees/ArrayBinTree Binary Trees/Cheat Sheet · Binary Trees/OOP · Binary Trees/Implementation Notes 
 | 
| Search TreesPart of Computer Science Notes Series on Data Structures 
 Binary Search Trees · Balanced Search Trees Trees/OOP · Search Trees/OOP · Tree Traversal/OOP · Binary Trees/Inorder 
 (Note that heaps are also value-sorting trees with minimums at the top. See Template:PriorityQueuesFlag and Priority Queues.) 
 | 
| Maps and DictionariesPart of Computer Science Notes Series on Data Structures 
 
 Maps/Dictionaries Maps · Maps/ADT · Maps in Java · Maps/OOP · Maps/Operations and Performance Map implementations: Maps/AbstractMap · Maps/UnsortedArrayMap · Maps/SortedArrayMap Dictionary implementations: Dictionaries/LinkedDict · Dictionaries/ArrayDict 
 Hashes Hash Maps/OOP · Hash Maps/Operations and Performance Hash Maps/Dynamic Resizing · Hash Maps/Collision Handling with Chaining Hash functions: Hash Functions · Hash Functions/Cyclic Permutation Hash map implementations: Hash Maps/AbstractHashMap · Hash Maps/ChainedHashMap 
 Skip Lists · Java/ConcurrentSkipList · Java implementations: SkipList 
 Sets Sets · Sets/ADT · Sets in Java · Sets/OOP · Multisets 
 | 

