Grid Enable Your Java Applications with GridGain

Allowing developers to leverage distributed computing resources has long been a goal of emerging technologies, including most recently RPC, CORBA, and RMI. However, these distributed computing technologies require considerable initial infrastructure setup and maintenance, and they face many hurdles to scalability. Enter grid computing, in which the individual parts of a code base are run remotely and in parallel among the available nodes of a grid (a computer cluster).

Grid computing involves setting up nodes and a network of topologies. Each node is an independent entity with its own or a shared set of resources, such as memory and CPUs. When several nodes are pooled and work in conjunction, they form a topology. Grid computing has been too costly for many enterprises because even applications designed and developed as parallel-processing grid applications required a lot of custom code for network programming, synchronization, and assembly of the results. Enter GridGain, an open-source, Java-based framework for grid computing that can be a cost-effective introduction to grid application development. GridGain not only provides simple APIs for developing grid applications but also enables developers to grid-enable existing applications.

This article demonstrates grid computing with GridGain using the classic example of matrix multiplication.

Get to Know GridGain
GridGain provides a Java-based framework that handles network programming, synchronization, recovery on failure, and, most importantly, dynamic class loading for grid computing. Because it is written in Java, GridGain has all the write-once, run-anywhere benefits that come with Java. GridGain nodes can run on computers with different operating systems and still be part of the same topology.

To establish a grid-enabled, enterprise-wide network setup, GridGain provides a simple batch script that starts up a node in a default no-name topology. The batch script can also take a Spring XML configuration file to set up the nodes as a named topology. In this way, a single topology or multiple topologies can be set up within an enterprise network. A single server can run multiple nodes, and each node can be part of a different topology.

At its core, GridGain is a very well architected system that uses all the features offered by Java 2 Platform Standard Edition (J2SE) 5.0 and resolves cross-cutting concerns using Aspect-Oriented Programming (AOP). GridGain provides various interfaces, along with helper interfaces and classes such as listeners and adapters, to enable developers to gain access to the grid.

GridGain also provides a pluggable service provider interface (SPI) for various fundamental, low-level functions of the grid, such as discovery, failover, communication, and load balancing. GridGain comes with several pre-built SPI implementations, and new ones can be added with relative ease. Review the documentation to learn how to gain programmatic access to these SPI services.

The source code download includes the file GridTest.java, which contains the following four lines of code to demonstrate the network lookup and connection establishment to a default no-name grid:

GridFactory.start();
try {
    Grid grid = GridFactory.getGrid();
}
finally {
    GridFactory.stop(true);
}

To connect to a specific grid, you would pass the name of the grid to the getGrid() method. If no other nodes are available, this program starts up as a grid node. Nodes that are already running are notified when a new node starts. In effect, GridGain reduces what previously would have required several thousand lines of code to these four lines.

Grid Enabling with GridGain
Along the lines of the Einstein tenet, “Make everything as simple as possible, but not simpler,” GridGain has made grid computing easy. It implements all the low-level details, which the developer may or may not need to know. The following section walks through a code example from the downloadable source code to demonstrate the power of grid parallel processing with GridGain.

The GridMatrixMultiplier.java file has the code to multiply any two given matrices. The code populates the matrices with sequential numbers for demonstration (but you could change that as needed). Its compute method returns the sum of the element-wise products of two arrays:

public int compute(int[] arr1, int[] arr2) {
    int result = 0;
    for (int i = 0; i < arr1.length; i++) {
        result += arr1[i] * arr2[i];
    }
    return result;
}
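To see how a method like compute fits into a full matrix multiplication, here is a minimal local sketch that makes one compute call per cell of the result. It uses no GridGain classes; the class name MatrixMultiplySketch and the helper methods column and multiply are hypothetical names chosen for illustration and are not taken from GridMatrixMultiplier.java.

```java
import java.util.Arrays;

class MatrixMultiplySketch {

    // Sum of element-wise products, mirroring the article's compute() method.
    static int compute(int[] arr1, int[] arr2) {
        int result = 0;
        for (int i = 0; i < arr1.length; i++) {
            result += arr1[i] * arr2[i];
        }
        return result;
    }

    // Hypothetical helper: copies column j of matrix m into a flat array.
    static int[] column(int[][] m, int j) {
        int[] col = new int[m.length];
        for (int i = 0; i < m.length; i++) {
            col[i] = m[i][j];
        }
        return col;
    }

    // Hypothetical helper: multiplies a by b with one compute() call per result cell.
    static int[][] multiply(int[][] a, int[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("matrix dimensions do not match");
        }
        int[][] product = new int[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                product[i][j] = compute(a[i], column(b, j));
            }
        }
        return product;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        // prints [[19, 22], [43, 50]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

Because each compute call depends only on its two input arrays, the calls are independent of one another, which is what makes this method a natural candidate for remote, parallel execution.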

If you add the @Gridify annotation to the method, the bytecode for the method is shipped to a remote node and executed there, and the results are returned. This is analogous to the JVM inside a browser downloading applet bytecode from a web server and executing it: GridGain ships the bytecode of the class and its dependencies to a remote node within the topology, where it is executed.

To see how GridGain handles failover, increase the matrix size in GridMatrixMultiplier.java and start up multiple nodes. When you see the print statement on all the nodes, kill one of them and observe how the remaining nodes take over its work. GridGain handles failover transparently without affecting the result of the computation.

While a grid application is running, try adding a new node to the topology. The load balancer will immediately assign some processing to the new node. When all the other nodes in the topology go down, the main application runs as a local node.

GridHierarchyTest.java demonstrates that GridGain ships not only the class containing the Gridify method to the remote node but also all the dependent class hierarchies as needed. In this example, Child.java invokes the Parent.java print method, which in turn calls the PrintHelper.java print method.

In the examples above, the Gridify method could be run on any node. If you need different sections of the code to be executed in different topologies, GridGain provides GridTask and GridJob classes, along with the corresponding Adapter classes, for custom processing. GridTask provides callback methods for defining the logic that splits the work and aggregates the results, while GridJob is a single unit of distributable work.
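The split-and-aggregate pattern described above can be sketched locally with the standard java.util.concurrent API. This is not GridGain code and does not use GridTask or GridJob: a thread pool stands in for the grid, and the names SplitAggregateSketch, RowJob, split, and aggregate are hypothetical, chosen only to mirror the roles the article describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SplitAggregateSketch {

    // One unit of work: computes a single row of the product matrix.
    // It plays the role of a job; it is not GridGain's GridJob.
    static class RowJob implements Callable<int[]> {
        private final int[] row;
        private final int[][] b;

        RowJob(int[] row, int[][] b) {
            this.row = row;
            this.b = b;
        }

        @Override
        public int[] call() {
            int[] out = new int[b[0].length];
            for (int j = 0; j < out.length; j++) {
                for (int k = 0; k < row.length; k++) {
                    out[j] += row[k] * b[k][j];
                }
            }
            return out;
        }
    }

    // Split step: one job per row of the left-hand matrix.
    static List<RowJob> split(int[][] a, int[][] b) {
        List<RowJob> jobs = new ArrayList<>();
        for (int[] row : a) {
            jobs.add(new RowJob(row, b));
        }
        return jobs;
    }

    // Aggregate step: collects the row results in the order the jobs were submitted.
    static int[][] aggregate(List<Future<int[]>> futures)
            throws InterruptedException, ExecutionException {
        int[][] product = new int[futures.size()][];
        for (int i = 0; i < futures.size(); i++) {
            product[i] = futures.get(i).get();
        }
        return product;
    }

    // Runs the jobs on a local thread pool that stands in for the grid nodes.
    static int[][] multiply(int[][] a, int[][] b, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            return aggregate(pool.invokeAll(split(a, b)));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        // prints [[19, 22], [43, 50]]
        System.out.println(Arrays.deepToString(multiply(a, b, 2)));
    }
}
```

Splitting by row keeps each job independent, so the aggregate step only has to put the rows back in order. On a real grid, the framework rather than a local thread pool would decide where each job runs.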

Practical Performance Limitations
Because remote execution introduces network delays, distributed computing may be overkill for relatively small applications. Time- and resource-intensive applications, such as simulation models, will likely benefit most from the GridGain framework. The increased utilization of existing hardware that it provides could, in turn, affect IT hardware vendors.
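The trade-off between network delay and parallel speedup can be illustrated with a deliberately simplified model: assume the work divides evenly across the nodes and that distributing it adds a fixed overhead. The model, the class name GridSpeedupSketch, and all the timings below are hypothetical assumptions for illustration, not measurements of GridGain.

```java
class GridSpeedupSketch {

    // Estimated wall-clock time when the work is divided evenly across the
    // nodes and distribution adds a fixed overhead (an assumed, simplified model).
    static double distributedMillis(double localMillis, int nodes, double overheadMillis) {
        return localMillis / nodes + overheadMillis;
    }

    // Distribution pays off only if the estimate beats running locally.
    static boolean worthDistributing(double localMillis, int nodes, double overheadMillis) {
        return distributedMillis(localMillis, nodes, overheadMillis) < localMillis;
    }

    public static void main(String[] args) {
        // Hypothetical small job: 20 ms locally, 4 nodes, 50 ms overhead.
        System.out.println(distributedMillis(20, 4, 50));      // prints 55.0
        System.out.println(worthDistributing(20, 4, 50));      // prints false

        // Hypothetical large job: 60,000 ms locally, 4 nodes, 50 ms overhead.
        System.out.println(distributedMillis(60_000, 4, 50));  // prints 15050.0
        System.out.println(worthDistributing(60_000, 4, 50));  // prints true
    }
}
```

Under these assumptions the small job runs slower on the grid than locally, while the large job finishes in roughly a quarter of the time.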

 
