How POJO Clustering Scales
At enterprise scale, application infrastructure must move large volumes of data across a cluster of hundreds of application servers while also handling disaster recovery and datacenter-level fault tolerance. At that scale, critical performance and availability features emerge from POJO clustering.
POJO clustering scales by increasing locality of reference, reducing the working set of data upon which any cluster node operates, and enabling on-the-fly lock optimization. All of these features reduce the number of messages that must be passed around the cluster as well as the amount of data those messages must contain.
Locality of Reference
Here, locality of reference describes where data is kept and accessed relative to two things: its home in the system of record, and the worker operating on it. In a cluster that maintains a centralized view of which nodes hold references to which objects, object data can be pushed where it's needed, when it's needed. This can reduce the latency perceived by each node, because the data it needs is already there. If cluster nodes also have a mechanism for maintaining a constrained window on the clustered data, the cluster can reduce the amount of data that must move across the network, since only the nodes that actually hold a particular object need to receive changes for it.
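The centralized view described above can be pictured as a directory that maps each clustered object to the nodes currently holding it. The following is a minimal sketch under stated assumptions, not the container's actual implementation: the class ReferenceDirectory, its method names, and the use of string identifiers for objects and nodes are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a central record of which nodes hold which clustered objects.
final class ReferenceDirectory {
    // objectId -> ids of the nodes that currently have the object in their heap
    private final Map<String, Set<String>> holdersByObject = new HashMap<>();

    // Record that a node has faulted the object into its heap.
    synchronized void addReference(String objectId, String nodeId) {
        holdersByObject.computeIfAbsent(objectId, k -> new HashSet<>()).add(nodeId);
    }

    // Record that a node has flushed the object out of its heap.
    synchronized void removeReference(String objectId, String nodeId) {
        Set<String> holders = holdersByObject.get(objectId);
        if (holders == null) {
            return;
        }
        holders.remove(nodeId);
        if (holders.isEmpty()) {
            holdersByObject.remove(objectId);
        }
    }

    // Nodes that must receive a change to the object: every holder except the writer.
    synchronized Set<String> recipientsOf(String objectId, String writerNodeId) {
        Set<String> recipients = new HashSet<>(
                holdersByObject.getOrDefault(objectId, Collections.<String>emptySet()));
        recipients.remove(writerNodeId);
        return recipients;
    }
}
```

With such a directory, a change to an object held by two nodes out of several hundred produces one outgoing message rather than a cluster-wide broadcast.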
Networked Virtual Memory
This data windowing functionality is provided by networked virtual memory. As an application running in the container traverses references to clustered POJOs, the container automatically faults the referenced objects in from the clustered object store as needed. If the referenced object is already in the heap, the reference traversal is a normal GETFIELD bytecode instruction. If the object is not currently in the heap, the container first pulls it into the heap and then allows the GETFIELD instruction to proceed normally.
Conversely, as in-heap clustered objects become less frequently used, the POJO container is free to flush those objects out of the heap. If they are ever needed again, they can be faulted back in, but flushing unused objects keeps heap usage bounded.
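The faulting and flushing behavior described above resembles a bounded cache in front of the clustered object store. The sketch below illustrates the idea under several assumptions: the class VirtualHeapWindow is hypothetical, the object store is modeled as a plain lookup function, eviction uses a least-recently-used policy (the text only says "less frequently used"), and access goes through an explicit get() call, whereas the container is described as doing this transparently during reference traversal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a bounded window of resident objects over a remote store.
final class VirtualHeapWindow<K, V> {
    private final Function<K, V> objectStore; // stands in for the clustered object store
    private final LinkedHashMap<K, V> resident;
    private int faults = 0;

    VirtualHeapWindow(final int maxResident, Function<K, V> objectStore) {
        this.objectStore = objectStore;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.resident = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Flush the least recently used object once the window is full.
                return size() > maxResident;
            }
        };
    }

    // Return the object, faulting it in from the store if it is not resident.
    synchronized V get(K id) {
        V value = resident.get(id);
        if (value == null) {
            value = objectStore.apply(id);
            faults++;
            resident.put(id, value);
        }
        return value;
    }

    synchronized boolean isResident(K id) {
        return resident.containsKey(id); // containsKey does not affect access order
    }

    synchronized int faultCount() {
        return faults;
    }
}
```

Because the window's contents follow the order in which objects are touched, the resident set drifts toward whatever the application actually uses, without the application declaring a working set up front.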
While a clustered object is in the heap, that container instance must subscribe to changes for that object to ensure that it stays up to date (this happens automatically as a service of the container). Keeping the virtual memory window constrained to just the objects the container needs therefore improves performance, because the container does not spend time processing updates to objects it isn't using.
Any object access patterns that emerge at runtime from the application are automatically reflected in the set of clustered objects resident in a particular container's heap. Windowing via dynamic, network-attached virtual memory is a simple and efficient way to give each container access to its working set of data. It keeps per-container heap usage under control and maintains a high degree of locality of reference, which can improve over time as the working set emerges.
Scalable Lock Performance
The same centralized view of the cluster that allows locality of reference to be optimized can also be used to optimize lock acquisition. A lock that is not contended for by other nodes can be granted to the node that wants it, so that subsequent releases and acquisitions of that lock are made locally instead of going out on the cluster. If no other thread contends for that lock, or if the centralized view can tell that a thread is working only on its own data frame, lock acquisition and release can be made a no-op.
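One way to picture the local-acquisition optimization is as a lease: the first uncontended request goes to a central lock server, and later acquisitions by the same node are satisfied without any network traffic. The sketch below is illustrative only. LockServer, NodeLockClient, and their methods are hypothetical names, a contended request simply fails instead of triggering a recall from the current holder, and mutual exclusion between threads on the same node is left out so that the cluster-level message count stays visible.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the central authority that hands out lock leases.
final class LockServer {
    private final Map<String, String> leaseHolder = new HashMap<>(); // lockId -> nodeId
    private int leaseRequests = 0;

    // Models one network round trip: grant the lease unless another node holds it.
    synchronized boolean requestLease(String lockId, String nodeId) {
        leaseRequests++;
        String holder = leaseHolder.get(lockId);
        if (holder == null || holder.equals(nodeId)) {
            leaseHolder.put(lockId, nodeId);
            return true;
        }
        return false;
    }

    // The holder hands the lease back so that other nodes can obtain it.
    synchronized void returnLease(String lockId, String nodeId) {
        leaseHolder.remove(lockId, nodeId);
    }

    synchronized int leaseRequests() {
        return leaseRequests;
    }
}

// Hypothetical sketch: the per-node side that remembers which leases it holds.
final class NodeLockClient {
    private final String nodeId;
    private final LockServer server;
    private final Set<String> leases = new HashSet<>();

    NodeLockClient(String nodeId, LockServer server) {
        this.nodeId = nodeId;
        this.server = server;
    }

    synchronized boolean acquire(String lockId) {
        if (leases.contains(lockId)) {
            return true; // lease already held: purely local, no message sent
        }
        if (server.requestLease(lockId, nodeId)) {
            leases.add(lockId);
            return true;
        }
        return false;
    }

    synchronized void release(String lockId) {
        // The lease is kept, so release is local and the next acquire is too.
    }

    synchronized void surrender(String lockId) {
        if (leases.remove(lockId)) {
            server.returnLease(lockId, nodeId);
        }
    }
}
```

In this model, a node that repeatedly acquires and releases an uncontended lock contacts the server once. The cost of going out on the cluster is paid again only when another node wants the same lock.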