# Arrays/Java/Finding Duplicates

### From charlesreid1

## Contents

## Questions

### Finding one duplicate

Question C-3.17) A is an array of size n>=2 containing integers from 1 to n-1, inclusive, one of which is repeated. Describe an algorithm for finding the integer in A that is repeated.

A special case of this question is when array A is sorted, since then we know that each array element should correspond to its own index plus 1, and we can do a binary search to locate the array index that does not match the criteria. In this case, the runtime is O(log n).

### Finding multiple duplicates

Question C-3.18) Let B be an array of size n>=6 containing integers from 1 to n-5, inclusive. Five of the integers are repeated. Describe an algorithm for finding the five integers in B that are repeated.

## Solution approach

There are a few different solution approaches, depending on the problem constraints:

- Is the array sorted, or unsorted?
- What if we have a one-time O(n) cost of construction?
- Are we after a space-efficient or operationally-efficient algorithm?
- What kind of data - primitive types or objects?
- Do we have to use an array? Why not use a collections object?

(Square-bracket indexing is nice to have.)

### Single element case

The cost of finding duplicates depends on problem parameters and tradeoffs. Assuming we want the absolute fastest algorithm, we are looking for a constant-time or logarithmic algorithm. (There are no O(1) algorithms here.)

**Log time algorithms:**

- O(n log n) one-time cost to sort
- O(log n) lookup time in a binary tree (less memory)
- O(1) lookup time in a hash table (more memory)

In the particular case given, of the integers 1 through n corresponding to indices 0 through (n-1), we are looking for an element that meets a particular criteria: normally in binary sort the criteria is a (greater than/less than/equal to) condition. Here, the criteria is the place where the value of element i (at zero-indexed i) changes from i+1 to i. The binary search algorithm can be outfitted to find this change point, using its head, tail, and middle pointers.

**Linear time algorithms:**

- O(n) add each element to a set, checking membership each time. O(1) lookup for membership in a set, iterating one item at a time, leads to O(n) time.

**One-time expense algorithms:**

- O(1) lookup for O(n) cost of building a hash table, at a potentially high memory cost.
- O(log n) binary search lookup for O(n log n) cost of sorting into a binary tree or similar structure, more efficient memory usage.

If we are going to be moving items in and out, what's the best approach? It all depends... Let's explore.

NOTE: Remember the ghost log. The cost of doing something with frequency of 1/j for j=1 to n is the harmonic number, which is O(log n).

**Java Implementation:**

- Use a HashMap, with the key being the element in the array, the value being an integer counter.

### Multi element case

In the multi-element duplicate case:

## Code

Data StructuresThis is the staging ground for computer science notes as part of the 2017 CS Study Plan.
Classes of data structures: Abstract Data Types Array-based and Link-based memory management: ArrayLists and Linked Lists Algorithmic Analysis of Data Structures: Algorithmic Analysis of Data Structures Advanced data structures: Advanced Data Structures
· Template:DataStructuresFlag · e |

ArraysSeries on Data Structures Python: Arrays/Python Java: Arrays/Java Categories: Category:Python Arrays
· Template:ArraysFlagBase · e |

Stacks and QueuesSeries on Data Structures
StacksQueues/Python StacksQueues/Python/LinkedStack
StacksQueues/Java StacksQueues/Java/LinkedStack
Postfix_Expressions#Stacks
· Template:StacksQueuesFlagBase · e |

Priority Queues and HeapsSeries on Data Structures
Priority Queues/Heap
· Template:PriorityQueuesFlagBase · e |

Linked ListSeries on Data Structures
· Template:LinkedListFlagBase · e |

TreesSeries on Data Structures Abstract data type: Trees/ADT Concrete implementations: Trees/LinkedTree
Preorder traversal: Trees/Preorder Postorder traversal: Trees/Postorder In-Order traversal: Binary Trees/Inorder Breadth-First Search: BFS Breadth-First Traversal: BFT Depth-First Search: DFS Depth-First Traversal: DFT OOP Principles for Traversal: Tree Traversal/OOP Tree operations: Trees/Operations Performance
Finding Minimum in Log N Time: Tree/LogN Min Search
Abstract data type: Binary Trees/ADT Concrete implementations: Binary Trees/LinkedBinTree Binary Trees/Cheat Sheet
· Template:TreesFlagBase · e |

Search TreesSeries on Data Structures
Binary Search Trees Trees/OOP
(Note that heaps are also value-sorting trees with minimums at the top. See Template:PriorityQueuesFlag and Priority Queues.)
· Template:SearchTreesFlagBase · e |

Maps and DictionariesSeries on Data Structures
Maps Map implementations: Maps/AbstractMap Dictionary implementations: Dictionaries/LinkedDict
Hash Maps/OOP Hash Maps/Dynamic Resizing Hash functions: Hash Functions Hash map implementations: Hash Maps/AbstractHashMap
Skip Lists Java implementations: SkipList
Sets
· Template:MapsFlagBase · e |