From charlesreid1

Notes on the ConcurrentSkipList class implementation in Java.

Link: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java

Opening

This class is basically concerned with concurrent skip lists for use in multi-threaded programs. This implements lock/protection mechanisms to make it thread-safe.

The opening comment of the file is quite extensive. It gives several references, and explains the overall logic of the class design.

Helpfully, it gives a notation guide for variable names:

     * Notation guide for local variables
     * Node:         b, n, f    for  predecessor, node, successor
     * Index:        q, r, d    for index node, right, down.
     *               t          for another index node
     * Head:         h
     * Levels:       j
     * Keys:         k, key
     * Values:       v, value
     * Comparisons:  c

Helpful because, when you're writing these algorithms, there are lots of nodes and pointers flying around, and no obvious names for any of them.

Fields

The field declarations for the class start around line 320: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L323

The class uses a random number generator to generate an overall (per-instance) random number.

The class implements the head pointer as a HeadIndex<K,V> object.

There is a Comparator object used to compare keys, that takes inputs of type K or derived from K:

 private final Comparator<? super K> comparator;

Methods

After defining the fields that are part of this class, it moves into defining some utility classes, the first of which is a key-value Node class that takes two generic types, K and V.

The Node class stores a key, stores a value, and stores a next pointer. It's like a simple Linked List node. Link: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L394

Next is an index class to represent the various levels of the skip list: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L540

Here is the HeadIndex item, which extends the index class and is the type of the ConcurrentSkipList's "head" pointer: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L614

The ComparableUsingComparator class, confusingly named, is basically a class to hold a Key-Comparator pair.

What makes it confusing is, the class holding the Key object and the Comparator object then implement the Comparable<K> interface, which means this object can be directly compared with other objects. Further confusing matters, the comparable interface that is defined allows this object (of type ComparableUsingComparator) to be compared to Key objects (not the same type). These comparisons are performed, no surprise, with the Comparator that the class stores.

This is an interesting pattern - the ComparableUsingComparator<K> class doesn't, strictly speaking, look like or work like a key, but it provides a bundle (that contains a key) that can be compared with keys using customizable comparison criteria.

Here's that last class:

    /**
     * Represents a key with a comparator as a Comparable.
     *
     * Because most sorted collections seem to use natural ordering on
     * Comparables (Strings, Integers, etc), most internal methods are
     * geared to use them. This is generally faster than checking
     * per-comparison whether to use comparator or comparable because
     * it doesn't require a (Comparable) cast for each comparison.
     * (Optimizers can only sometimes remove such redundant checks
     * themselves.) When Comparators are used,
     * ComparableUsingComparators are created so that they act in the
     * same way as natural orderings. This penalizes use of
     * Comparators vs Comparables, which seems like the right
     * tradeoff.
     */
    static final class ComparableUsingComparator<K> implements Comparable<K> {
        final K actualKey;
        final Comparator<? super K> cmp;
        ComparableUsingComparator(K key, Comparator<? super K> cmp) {
            this.actualKey = key;
            this.cmp = cmp;
        }
        public int compareTo(K k2) {
            return cmp.compare(actualKey, k2);
        }
}

Comparisons

More comparison stuff:

  • The ComparableUsingComparator class above is only used in some cases by the skip list, specifically, when the user specifies a Comparator to use to compare keys.
  • There is a comparable(Object key) method that attempts to return a comparable object - if the user has specified a key Comparator, it returns a ComparableUsingComparator object; otherwise it attempts to cast the key object as a Comparable: return (Comparable<? super K>)key;
  • A compare(Key k1, Key k2) method is implemented, which takes two keys, performs a comparison (again, using a Comparator if one was provided at construction) and returns an integer.

Further comparisons:

  • Two additional methods, called inHalfOpenRange and inOpenRange, take boundary keys as arguments and return whether a key of interest is between those key boundaries. These enable the creation of sub-maps. This type of operation would be extremely useful if implementing timestamp or date information as keys.

Traversal

Traversal section of the ConcurrentSkipListMap class: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L701

The principal (private) method implemented by this class is the findPredecessor method. This method takes a Comparable<K> object as an argument - hence the reason for the comparable method defined above, which takes a Key object and returns a Comparable version of that key.

(Note the notation convention listed above.)

    private Node<K,V> findPredecessor(Comparable<? super K> key) {
        if (key == null)
            throw new NullPointerException(); // don't postpone errors

Before reviewing the body of the method, let's review a comment from the beginning, which explains the difference between an Index and a Node object. Namely, the Node object holds the key-value pair, and is singly-linked, but the Index class has a comment that explains why:

     * This class implements a tree-like two-dimensionally linked skip
     * list in which the index levels are represented in separate
     * nodes from the base nodes holding data.  There are two reasons
     * for taking this approach instead of the usual array-based
     * structure: 1) Array based implementations seem to encounter
     * more complexity and overhead 2) We can use cheaper algorithms
     * for the heavily-traversed index lists than can be used for the
     * base lists.  Here's a picture of some of the basics for a
     * possible list with 2 levels of index:
     *
     * Head nodes          Index nodes
     * +-+    right        +-+                      +-+
     * |2|---------------->| |--------------------->| |->null
     * +-+                 +-+                      +-+
     *  | down              |                        |
     *  v                   v                        v
     * +-+            +-+  +-+       +-+            +-+       +-+
     * |1|----------->| |->| |------>| |----------->| |------>| |->null
     * +-+            +-+  +-+       +-+            +-+       +-+
     *  v              |    |         |              |         |
     * Nodes  next     v    v         v              v         v
     * +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+
     * | |->|A|->|B|->|C|->|D|->|E|->|F|->|G|->|H|->|I|->|J|->|K|->null
     * +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+

The Index class and the Node class are separate because of their "forward-pointing" fields, which have different types:

     * Index nodes represent the levels of the skip list.  Note that
     * even though both Nodes and Indexes have forward-pointing
     * fields, they have different types and are handled in different
     * ways, that can't nicely be captured by placing field in a
     * shared abstract class.

What follows is the loop. The letter q represents the current Index, r represents the next Index to the right (null if we are at the right side of the list). We declare q and r as the main operational pointers that we'll use in the algorithm.

        for (;;) {
            Index<K,V> q = head;
            Index<K,V> r = q.right;

The main operation we are doing in this algorithm is getting the Node (the object that stores the key-value pair) associated with a particular Index (the Index to the right, in particular) and compare it to the key that was passed in. If the Index+Node we're looking at has a key that's larger than us, we stop moving to the right and move down. Otherwise, we keep moving right.

            for (;;) {
                if (r != null) {
                    Node<K,V> n = r.node;
                    K k = n.key;
                    if (n.value == null) {
                        if (!q.unlink(r))
                            break;           // restart
                        r = q.right;         // reread r
                        continue;
                    }
                    if (key.compareTo(k) > 0) {
                        q = r;
                        r = r.right;
                        continue;
                    }
                }

Above is the part of the code where we keep moving right until we find an Index+Node whose key is greater than the key passed in.

Once that Index is found, q is pointing to the right-most node smaller than our key, and we move the pointer down. If we can't move down any further, we're at the very bottom of the skip list, and we should return the node at the Index q.

                Index<K,V> d = q.down;
                if (d != null) {
                    q = d;
                    r = d.right;
                } else
                    return q.node;
            }
        }
    }

'

Searching

Once traversal is defined, the class can implement a findNode method, which it does, and a doGet method, which turns a key into its corresponding value. This is a private method, but is an important part of implementing a map.

Note that if our traverse method is not logarithmic time, our get/find/retrieve methods will not be either.

Both of these methods simply wrap the skip list traversal method above.

Here is a link to the doGet method, with the findNode method right above it: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L820

Note on naming convention: the private get method used internally is called doGet, the private put method used internally is called doPut, etc.

Insertion

Here is a link to the insertion portion of the class: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L837

This starts with a doPut method, repeated below. We will cover this in detail.

The method header starts with a convenient boolean called onlyIfAbsent that keeps us from accidentally squashing useful data. We are passing a key and a value to put into the map. The map should take care of wrapping the key value in whatever class it is using to represent key-value pairs n the data structure, because those are all implementation details - the user's concern ends at the key-value pair.

    private V doPut(K kkey, V value, boolean onlyIfAbsent) {
        Comparable<? super K> key = comparable(kkey);

Now we have a while loop that starts by getting the Node object that comes before and after the position corresponding to the key passed in. It crawls along node by node, comparing to the keys of each node as it goes. There's some magic, and I don't follow the whole method.

(...this explanation needs a bit more work.)

        for (;;) {
            Node<K,V> b = findPredecessor(key);
            Node<K,V> n = b.next;
            for (;;) {
                if (n != null) {
                    Node<K,V> f = n.next;
                    if (n != b.next)               // inconsistent read
                        break;
                    Object v = n.value;
                    if (v == null) {               // n is deleted
                        n.helpDelete(b, f);
                        break;
                    }
                    if (v == n || b.value == null) // b is deleted
                        break;
                    int c = key.compareTo(n.key);
                    if (c > 0) {
                        b = n;
                        n = f;
                        continue;
                    }
                    if (c == 0) {
                        if (onlyIfAbsent || n.casValue(v, value))
                            return (V)v;
                        else
                            break; // restart if lost race to replace value
                    }
                    // else c < 0; fall through
                }

                Node<K,V> z = new Node<K,V>(kkey, value, n);
                if (!b.casNext(n, z))
                    break;         // restart if lost race to append to b
                int level = randomLevel();
                if (level > 0)
                    insertIndex(z, level);
                return null;
            }
        }
}

insertIndex method: Another confusing/mysterious insert method: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L916

addIndex method: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L970

Deletion

Link to deletion section: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1025

doRemove is the main removal method. Here is the comment that precedes it:

     * Main deletion method. Locates node, nulls value, appends a
     * deletion marker, unlinks predecessor, removes associated index
     * nodes, and possibly reduces head index level.
     *
     * Index nodes are cleared out simply by calling findPredecessor.
     * which unlinks indexes to deleted nodes found along path to key,
     * which will include the indexes to this node.  This is done
     * unconditionally. We can't check beforehand whether there are
     * index nodes because it might be the case that some or all
     * indexes hadn't been inserted yet for this node during initial
     * search for it, and we'd like to ensure lack of garbage
     * retention, so must call to be sure.

This relies on an understanding of some of the other details of the class - the marker and head nodes, for example, and how they are identified.

Special first-node and last-node versions of modifier methods

There are a few methods of the put/remove/get methods that deal with the case of it being the first node: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1123

Similarly, dealing with the case of finding the last/next-to-last node: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1188

Comparisons

An operation called findNear (finding nearest node that matches a condition) is defined: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1314

Constructors

All the constructors are defined on line 1376: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1376

This includes quite a few specialist methods/constructors for building skip lists from other structures.

Serialization

More methods for serialization of these objects: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1507

Map interface

There are several methods implemented to make this conform to the Map ADT: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1595

These all utilize the private versions of get/put/remove. For example, here is the containsKey method:

    public boolean containsKey(Object key) {
        return doGet(key) != null;
}

The get and put methods:

    public V get(Object key) {
        return doGet(key);
    }


    public V put(K key, V value) {
        if (value == null)
            throw new NullPointerException();
        return doPut(key, value, false);
    }

Finally, the remove method:

    public V remove(Object key) {
        return doRemove(key, null);
}

Size

The size method is not a simple integer that is being stored with the object, it is a more complicated method:

    public int size() {
        long count = 0;
        for (Node<K,V> n = findFirst(); n != null; n = n.next) {
            if (n.getValidValue() != null)
                ++count;
        }
        return (count >= Integer.MAX_VALUE) ? Integer.MAX_VALUE : (int) count;
}

View Methods

Here are the view methods of the ConcurrentSkipList class: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1725

Here is the keySet method:

    public NavigableSet<K> keySet() {
        KeySet ks = keySet;
        return (ks != null) ? ks : (keySet = new KeySet(this));
}

Likewise, the values set method:

    public Collection<V> values() {
        Values vs = values;
        return (vs != null) ? vs : (values = new Values(this));
    }

and the entry method:

    public Set<Map.Entry<K,V>> entrySet() {
        EntrySet es = entrySet;
        return (es != null) ? es : (entrySet = new EntrySet(this));
}

Later in the file, all three of these have corresponding iterator classes. Link: https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L2305

    static final class KeySet<E>
            extends AbstractSet<E> implements NavigableSet<E> {
        private final ConcurrentNavigableMap<E,Object> m;
        KeySet(ConcurrentNavigableMap<E,Object> map) { m = map; }
        public int size() { return m.size(); }
        public boolean isEmpty() { return m.isEmpty(); }
        public boolean contains(Object o) { return m.containsKey(o); }
        public boolean remove(Object o) { return m.remove(o) != null; }
        public void clear() { m.clear(); }
        public E lower(E e) { return m.lowerKey(e); }
        public E floor(E e) { return m.floorKey(e); }
        public E ceiling(E e) { return m.ceilingKey(e); }
        public E higher(E e) { return m.higherKey(e); }
        public Comparator<? super E> comparator() { return m.comparator(); }
        public E first() { return m.firstKey(); }
        public E last() { return m.lastKey(); }
        public E pollFirst() {
            Map.Entry<E,Object> e = m.pollFirstEntry();
            return (e == null) ? null : e.getKey();
        }
        public E pollLast() {
            Map.Entry<E,Object> e = m.pollLastEntry();
            return (e == null) ? null : e.getKey();
        }
        public Iterator<E> iterator() {
            if (m instanceof ConcurrentSkipListMap)
                return ((ConcurrentSkipListMap<E,Object>)m).keyIterator();
            else
                return ((ConcurrentSkipListMap.SubMap<E,Object>)m).keyIterator();
        }
        public boolean equals(Object o) {
            if (o == this)
                return true;
            if (!(o instanceof Set))
                return false;
            Collection<?> c = (Collection<?>) o;
            try {
                return containsAll(c) && c.containsAll(this);
            } catch (ClassCastException unused)   {
                return false;
            } catch (NullPointerException unused) {
                return false;
            }
        }
        public Object[] toArray()     { return toList(this).toArray();  }
        public <T> T[] toArray(T[] a) { return toList(this).toArray(a); }
        public Iterator<E> descendingIterator() {
            return descendingSet().iterator();
        }
        public NavigableSet<E> subSet(E fromElement,
                                      boolean fromInclusive,
                                      E toElement,
                                      boolean toInclusive) {
            return new KeySet<E>(m.subMap(fromElement, fromInclusive,
                                          toElement,   toInclusive));
        }
        public NavigableSet<E> headSet(E toElement, boolean inclusive) {
            return new KeySet<E>(m.headMap(toElement, inclusive));
        }
        public NavigableSet<E> tailSet(E fromElement, boolean inclusive) {
            return new KeySet<E>(m.tailMap(fromElement, inclusive));
        }
        public NavigableSet<E> subSet(E fromElement, E toElement) {
            return subSet(fromElement, true, toElement, false);
        }
        public NavigableSet<E> headSet(E toElement) {
            return headSet(toElement, false);
        }
        public NavigableSet<E> tailSet(E fromElement) {
            return tailSet(fromElement, true);
        }
        public NavigableSet<E> descendingSet() {
            return new KeySet(m.descendingMap());
        }
}

AbstractMap method overrides

Link to the abstract map method overrides (basic methods): https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1829

More remove and replace methods...

SortedMap method overrides

Some methods that are implemented for sorted maps (first/last keys, etc.): https://github.com/openjdk-mirror/jdk7u-jdk/blob/f4d80957e89a19a29bb9f9807d2a28351ed7f7df/src/share/classes/java/util/concurrent/ConcurrentSkipListMap.java#L1947

Submap methods

There are some nice submap methods available as well. Three sub-map methods are implemented: one method taking two keys (start/end), one method taking a single key and returning every item with a key above that key, and one method taking a single key and returning every item with a key below that key.

Iterator methods

The class implements an Iterator class, with the usual set of methods, plus a KeyIterator class, a ValueIterator class, and an EntryIterator class.

Random index

There is a method defined that returns a random level to use when inserting nodes. It is bit-stuffing magic:

   /**
     * Returns a random level for inserting a new node.
     * Hardwired to k=1, p=0.5, max 31 (see above and
     * Pugh's "Skip List Cookbook", sec 3.4).
     *
     * This uses the simplest of the generators described in George
     * Marsaglia's "Xorshift RNGs" paper.  This is not a high-quality
     * generator but is acceptable here.
     */
    private int randomLevel() {
        int x = randomSeed;
        x ^= x << 13;
        x ^= x >>> 17;
        randomSeed = x ^= x << 5;
        if ((x & 0x80000001) != 0) // test highest and lowest bits
            return 0;
        int level = 1;
        while (((x >>>= 1) & 1) != 0) ++level;
        return level;
}

Flags