From charlesreid1


Dictionary data type - definition

Dictionaries are data structures that store data in such a way that it can be looked up by value, rather than by index/position/reference.

These types of data sets naturally lend themselves to a structure that partitions data based on its sort value, as in a binary tree. This allows operations to happen faster, because the lookup key itself guides the process of finding the corresponding data in the data structure.

Dictionaries and Binary Trees

Dictionaries can be implemented as binary trees. Dictionaries are designed as by-value lookup, and a binary tree is a way of storing values in such a way that they are maintained in sorted order, and therefore easy to find in log(n) time.

See Advanced Data Structures for more about Skiena coverage of data structures; Skiena discusses the connection between Dictionary types and binary trees in great detail, and the questions below explore these concepts as well.

See Binary Trees, particularly Binary_Trees#Question_3-6, for more.

Skiena Chapter 3 notes

This entire chapter is abound with practical advice about data structures. Principally - use black boxes; focus on big-O analysis; the data structure can always be made more efficient. And use binary trees and hash tables to store your variables by value! These are relentlessly practical and useful data structures because lookups are O(1).

Dictionaries definitions

Unlike stacks and queues, which allow access to items in the data structure independent of their value, a dictionary provides access to an object using its value as a way of obtaining a reference to it in the data structure.

Dictionaries interface

Dictionaries implement the following operations:

  • search(d,k) - given a search key k, return a pointer to the element in d corresponding to key value k (if one exists)
  • insert(d,x) - given a data item, add it to the set of data
  • delete(d,x) - given a pointer to a given data item x in the dictionary d, remove it from d

additional useful operations:

  • max(d) / min(d) - retrieve item with largest or smallest key - enabling dictionary to serve as priority queue
  • predecessor(d,k) / successor(d,k) - retrieve the item from d whose key is before or after k in sorted order (enables iteration)

Example: to remove all duplicate names from a mailing list, and print the results in sorted order, initialize an empty dictionary, iterate the through the list, and add each item as a key, unless it already exists in the dictionary. (Note that there are many operations happening here, so the more of them we can make O(1), the better.)

Once we are finished, extract the remaining names out of the dictionary to print them in sorted order. Get the first item via min(d) and repeatedly call successor until obtain max(d).

Dictionary implementations based on arrays

See Dictionaries/ArrayDict

Dictionary implementations based on linked lists

See Dictionaries/LinkedDict


Skiena Chapter 3 questions

Question 3-4

Question 3-4) Design a dictionary structure in which search, insertion, and deletion can be handled in O(1) time.

Assume set elements are from 1..n)

(Initialization can take O(n) time.)

  • Well, if this search has to take O(1) time, we need to know EXACTLY where things are, given their value.
  • The problem seems to indicate either a hash table, or an off-by-one array.
  • Off-by-one array: the numbers passed in from 1 to n are represented in an array of length n, with the kth number represented by the k-1th cell.
  • hash table: key value is given, turns into unique integer, looked up in O(1) time

Question 3-7

Question 3-7) Suppose you have access to a balanced dictionary structure that supports insert, delete, min, max, successor, predecessor, all in O(log n) time. How would you modify the insert/delete operations so that they take O(log n) time,but min/max now takes O(1)?

When inserting or deleting, check if we have a new minimum/maximum. Keeping track of minimum element or maximum element then costs O(1) to return.

    if(e < min):
        min = e
    else if(e > max):
        max = e

        inspect each element to find new min
        inspect each element to find new max