Grid Enable Your Java Applications with GridGain

Allowing developers to leverage distributed computing resources has long been a goal of emerging technologies, including most recently RPC, CORBA, and RMI. However, these distributed computing technologies require considerable initial infrastructure setup and maintenance, and they face many hurdles to scalability. Enter grid computing, in which the individual parts of a code base are run remotely and in parallel among the available nodes of a grid (a computer cluster).

Grid computing involves setting up nodes and a network of topologies. Each node is an independent entity that has its own, or a shared, set of resources such as memory and CPU(s). When several nodes are pooled and work in conjunction, they form a topology. Because even applications that were designed and developed as parallel-processing grid applications required lots of custom code for network programming, synchronization, and assembly of the results, grid computing has been too costly for many enterprises. Enter GridGain, an open-source, Java-based framework for grid computing that can be a cost-effective introduction to grid application development. GridGain not only provides simple APIs for developing grid applications but also enables developers to grid-enable existing applications.

This article demonstrates grid computing with GridGain using the classic example of matrix multiplication.

Get to Know GridGain
GridGain provides a Java-based framework to handle network programming, synchronization, recovery on failure, and, more importantly, dynamic class loading for grid computing. Because it is written in Java, GridGain has all the write-once, run-anywhere benefits that come with Java. GridGain nodes can run on computers with different operating systems and still be part of the same topology.

To establish a grid-enabled, enterprise-wide network setup, GridGain provides a simple batch script to start up a node with a default no-name topology. The batch script takes a Spring XML configuration to set up the nodes as a named topology. At that point, a single topology or multiple topologies can be set up within an enterprise network. A single server can run multiple nodes, and each node can be part of a different topology.

At its core, GridGain is a very well architected system that uses all the features offered by Java 2 Platform Standard Edition (J2SE) 5.0 and resolves cross-cutting concerns using Aspect-Oriented Programming (AOP). GridGain provides various interfaces, along with helper interfaces and classes such as listeners and adapters, to enable developers to gain access to the grid.

GridGain also provides a pluggable service provider interface (SPI) for various fundamental, low-level functions of the grid, such as discovery, failover, communication, and load balancing. GridGain comes with certain pre-built SPI implementations, and new ones can be added with relative ease. Review the documentation to learn how to gain programmatic access to these SPI services.

The source code download includes the file GridTest.java, which contains the following four lines of code to demonstrate the network lookup and connection establishment to a default no-name grid:

    GridFactory.start();
    try {
        Grid grid = GridFactory.getGrid();
    }
    finally {
        GridFactory.stop(true);
    }

To connect to a specific grid, you would pass the name of the grid to the getGrid() method. If no other nodes are available, this program starts up as a grid node. The nodes that are already running will get a notification about the start of a new node. GridGain has in effect reduced what previously would have required several thousand lines of code to accomplish down to these four lines.

Grid Enabling with GridGain
Along the lines of the Einstein tenet, “Make everything as simple as possible, but not simpler,” GridGain has made grid computing easy. It implements all the low-level details, which the developer may or may not need to know. The following section walks through a code example from the downloadable source code to demonstrate the power of grid parallel processing with GridGain.

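The GridMatrixMultiplier.java file has the code to multiply any two given matrices. The code populates the matrices with sequential numbers for demonstration (but you could change that as needed). The work itself is done by the compute method, which multiplies two arrays element by element and sums the products: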

    public int compute(int[] arr1, int[] arr2) {
        int result = 0;
        for (int i = 0; i < arr1.length; i++) {
            result += arr1[i] * arr2[i];
        }
        return result;
    }

If you add the @Gridify annotation to the method, the byte code for the method is shipped to a remote node and executed there, and the result is returned to the caller. Analogous to the JVM inside a browser communicating with a web server to download and execute applet byte code, the @Gridify annotation ships the byte code of the class and its dependencies to a remote node within the topology, where it is executed.
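To make the unit of work concrete: each cell of the product matrix is one call to compute with a row of the first matrix and a column of the second, and the calls do not depend on each other, which is what makes them candidates for remote execution. The following is a minimal local sketch, assuming a thread pool as a stand-in for grid nodes. The class name ParallelMatrixSketch and the multiply and column helpers are hypothetical; they are not part of GridGain or of the article's download, and no byte code is shipped anywhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical local sketch: a thread pool stands in for grid nodes.
public class ParallelMatrixSketch {

    // Same dot product as the article's compute method.
    public static int compute(int[] arr1, int[] arr2) {
        int result = 0;
        for (int i = 0; i < arr1.length; i++) {
            result += arr1[i] * arr2[i];
        }
        return result;
    }

    // Copies column c of matrix m into a new array.
    static int[] column(int[][] m, int c) {
        int[] col = new int[m.length];
        for (int r = 0; r < m.length; r++) {
            col[r] = m[r][c];
        }
        return col;
    }

    // Multiplies a (n x k) by b (k x m), submitting one task per result cell.
    public static int[][] multiply(int[][] a, int[][] b, int workers) {
        int rows = a.length;
        int cols = b[0].length;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> cells = new ArrayList<>();
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    final int[] row = a[r];
                    final int[] col = column(b, c);
                    cells.add(pool.submit(() -> compute(row, col)));
                }
            }
            // Assemble the results in the order the tasks were submitted.
            int[][] result = new int[rows][cols];
            for (int i = 0; i < cells.size(); i++) {
                result[i / cols][i % cols] = cells.get(i).get();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b, 2)));
    }
}
```

One task per cell keeps the sketch close to the article's compute method. On a real grid the network cost per call is much higher than a thread hand-off, so coarser units of work (a whole row, for example) may be a better fit.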

To see how GridGain handles failover, increase the matrix size in GridMatrixMultiplier.java and start up multiple nodes. When the output of the print statement appears on all the nodes, kill one of them and observe the failover and recovery. GridGain will handle failover transparently without affecting the result of the computation.
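The idea behind transparent failover can be illustrated without the framework: if the node running a job goes away, the same job is handed to another node, and the caller sees only the final result. The sketch below is a local simulation under that assumption. The Node interface, the FailoverSketch class, and the runWithFailover method are hypothetical names; this is not GridGain's failover SPI and says nothing about how GridGain chooses the next node.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical simulation of failover; not GridGain's failover SPI.
public class FailoverSketch {

    // A node either runs the job or fails, as a killed node would.
    public interface Node {
        int run(IntSupplier job);
    }

    // A node that executes whatever it is given.
    public static final Node HEALTHY = job -> job.getAsInt();

    // A node that was killed before it could finish the job.
    public static final Node KILLED = job -> {
        throw new IllegalStateException("node left the topology");
    };

    // Tries each node in turn and returns the first successful result.
    public static int runWithFailover(List<Node> nodes, IntSupplier job) {
        IllegalStateException last = null;
        for (Node node : nodes) {
            try {
                return node.run(job);
            } catch (IllegalStateException e) {
                last = e; // remember the failure and try the next node
            }
        }
        throw new IllegalStateException("no node could run the job", last);
    }

    public static void main(String[] args) {
        int[] row = {1, 2, 3};
        int[] col = {4, 5, 6};
        IntSupplier job = () -> {
            int sum = 0;
            for (int i = 0; i < row.length; i++) {
                sum += row[i] * col[i];
            }
            return sum;
        };
        // The first node fails; the second one produces the result.
        System.out.println(runWithFailover(Arrays.asList(KILLED, HEALTHY), job));
    }
}
```

Because the job has no side effects, running it a second time on another node gives the same answer, which is why the result of the computation is unaffected.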

While a grid application is running, try adding a new node to the topology. The load balancer will immediately assign some processing to the new node. When all the other nodes in the topology go down, the main application will run as a local node.

GridHierarchyTest.java demonstrates that GridGain ships not only the class that has the @Gridify method to the remote node, but also all the dependent class hierarchies as needed. In this example, Child.java invokes the Parent.java print method, which in turn calls the PrintHelper.java print method.

In the examples above, the @Gridify method could be run on any node. If you need different sections of the code to be executed in different topologies, GridGain provides GridTask and GridJob classes along with the corresponding Adapter classes for custom processing. GridTask provides callback methods for defining the logic to split the work and aggregate the results, while GridJob is a single unit of distributable work.
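The split-and-aggregate pattern described here can be sketched in plain Java. The Task and Job interfaces below are hypothetical stand-ins written for this illustration; they are not GridGain's GridTask and GridJob, and their method signatures are assumptions. The run method executes every job locally, where a grid would dispatch the jobs to nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the split/aggregate idea; these are not
// GridGain's GridTask and GridJob interfaces.
public class SplitAggregateSketch {

    // A single unit of distributable work.
    public interface Job {
        int execute();
    }

    // Splits the input into jobs and combines their partial results.
    public interface Task {
        List<Job> split(int[] a, int[] b, int parts);
        int aggregate(List<Integer> results);
    }

    // Dot product split into contiguous slices, one job per slice.
    public static class DotProductTask implements Task {
        @Override
        public List<Job> split(int[] a, int[] b, int parts) {
            List<Job> jobs = new ArrayList<>();
            int chunk = (a.length + parts - 1) / parts; // ceiling division
            for (int start = 0; start < a.length; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, a.length);
                jobs.add(() -> {
                    int sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += a[i] * b[i];
                    }
                    return sum;
                });
            }
            return jobs;
        }

        @Override
        public int aggregate(List<Integer> results) {
            int total = 0;
            for (int r : results) {
                total += r;
            }
            return total;
        }
    }

    // Runs every job locally; a grid would dispatch them to nodes instead.
    public static int run(Task task, int[] a, int[] b, int parts) {
        List<Integer> results = new ArrayList<>();
        for (Job job : task.split(a, b, parts)) {
            results.add(job.execute());
        }
        return task.aggregate(results);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {5, 6, 7, 8};
        System.out.println(run(new DotProductTask(), a, b, 2));
    }
}
```

Keeping split and aggregate on the task and execute on the job mirrors the division of labor the article describes: the task knows how the problem decomposes, while each job only knows its own slice.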

Practical Performance Limitations
Because remote execution causes network delays, distributed computing may be overkill for relatively small applications. Time- and resource-intensive applications, such as those for simulation models, likely will benefit most from the GridGain framework, but the increased hardware utilization that it provides could impact IT hardware vendors.
