MODE-2121 Corrected the concurrency problem within the LazyCachedNode

A new stress test case was added that makes a large number of calls to the cached nodes, and this highlighted a problem in the LazyCachedNode.
Although each LazyCachedNode is (mostly) immutable, there are still a number of fields that are lazily populated. Before this change, these fields were not being updated in an atomic and thread-safe fashion. Additionally, these fields were not volatile (or Atomic), and thus there was no guarantee when or in what order one thread would see the fields updated by another thread.
Most of the fields are self-contained and idempotent, so they did not pose much of a problem. However, the "parentReferenceToSelf" method used TWO fields to cache the information, and if a thread saw only one of these fields updated after both had been (correctly) set by another thread, the reading thread would see an inconsistent state and the logic would incorrectly return null for the ChildReference containing this node's name and SNS index.
The solution for most of the fields was simply to make them volatile (for most references that are simply set) or to use an AtomicReference in the one case where the value is somewhat expensive to compute and may not really be needed.
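As a minimal sketch of the two approaches described above (not the actual LazyCachedNode code), the following shows a lazily populated volatile field next to an AtomicReference that keeps only the first published value. The class name LazyNodeSketch, the fields cachedName and cachedPath, and the string values are all hypothetical and chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a mostly-immutable cached node; NOT the real LazyCachedNode.
final class LazyNodeSketch {
    private final String key;

    // Cheap, idempotent value: a volatile reference guarantees that a value
    // written by one thread is visible to later reads by other threads.
    private volatile String cachedName;

    // More expensive value (assumption): compareAndSet keeps only the first result.
    private final AtomicReference<String> cachedPath = new AtomicReference<>();

    LazyNodeSketch(String key) {
        this.key = key;
    }

    String name() {
        String result = cachedName;          // read the volatile field once
        if (result == null) {
            result = "name-of-" + key;       // idempotent: every thread computes an equal value
            cachedName = result;
        }
        return result;
    }

    String path() {
        String result = cachedPath.get();
        if (result == null) {
            String computed = "/root/" + name();
            // Publish only if no other thread got there first; otherwise use the winner's value.
            result = cachedPath.compareAndSet(null, computed) ? computed : cachedPath.get();
        }
        return result;
    }
}
```

Two threads may still compute the same value at nearly the same time, but each caller always gets a fully constructed result, and path() returns the same instance once one has been published.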
The solution for the "parentReferenceToSelf" method was to use a single field with an AtomicReference to an object that encapsulated the information previously stored in two fields. This also meant that some of the logic could be encapsulated into this new object, and that a specialized implementation could be used for the root node's ChildReference to itself (allowing the other implementation to be simplified). The "parentReferenceToSelf" method is carefully written to not require any synchronization or locking, and thus it is very fast. Strictly speaking, constructing the cached information is idempotent at any given instant, meaning that two nearly-concurrent calls might each have to do all of the work. But in the end, both calls will see consistent information and only one of the representations will be kept.
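The single-field approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this commit: the names ParentRefInfo and NodeSketch, the parentKey field, and the use of plain strings in place of ChildReference are all hypothetical. The point is that both pieces of cached state live in one immutable object, so a reader sees either all of it or none of it.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable holder for what was previously spread over two fields.
final class ParentRefInfo {
    final String parentKey;       // assumption: the parent the reference was computed against
    final String childReference;  // stand-in for the ChildReference (name plus SNS index)

    ParentRefInfo(String parentKey, String childReference) {
        this.parentKey = parentKey;
        this.childReference = childReference;
    }
}

// Hypothetical node; NOT the real LazyCachedNode.
final class NodeSketch {
    private final String name;
    private final AtomicReference<ParentRefInfo> parentRefToSelf = new AtomicReference<>();

    NodeSketch(String name) {
        this.name = name;
    }

    // No locks: one atomic read, an optional recompute, and one compareAndSet.
    String parentReferenceToSelf(String currentParentKey) {
        ParentRefInfo info = parentRefToSelf.get();
        if (info == null || !info.parentKey.equals(currentParentKey)) {
            // Idempotent work: concurrent callers may each build an equal object.
            ParentRefInfo computed = new ParentRefInfo(currentParentKey, name + "[1]");
            // Only one representation is kept; a losing caller still returns its own
            // consistent copy rather than a half-updated one.
            parentRefToSelf.compareAndSet(info, computed);
            info = computed;
        }
        return info.childReference;
    }
}
```

Because ParentRefInfo has only final fields and is swapped in as a whole, a reader can never observe one half of the cached state without the other, which is the inconsistency the two-field version allowed.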
After these changes, the stress test runs quite well (albeit with a lot of memory), and the previously-reported exception is seen no more.