(Editor's note: This is the first article in a two-part series comparing the infrastructure of Grid, HPC, and Cloud computing. In the second part, we will look at these three classes of compute infrastructure from the developer's perspective.)
Infrastructure is important: if for nothing else, then for the fact that it sets a physical limit against which our performance is measured. We cannot achieve better performance than our infrastructure is capable of delivering. That sounds rather trivial, but it is probably the most important point I want to drive home here.
The Chicken and the Egg
The question is whether the infrastructure classifies the type, or vice versa. Which one determines the other? If we are looking for an HPC environment, do we need to start with a set of rules? Or do we make decisions based on our requirements, and whatever we end up with is considered an HPC environment? Many have asked this question, and that is why there is no clear set of guidelines one can follow in order to build such infrastructures. We will take the middle ground throughout this article: I will outline the common attributes that you may find in these infrastructures. I am of the philosophy that requirements dictate the final deployment, but there are a few things to consider:
* Hardware costs
* Infrastructure costs
* Software integration costs
* Risks: vendor lock-in, security, complexity of infrastructure
We will return to these considerations and relate them to a set of attributes of each deployment type.
Table 1 outlines our discussion.
You may notice that I have stayed away from anything directly application-related, since we are focusing on the infrastructure in this article. OK, let's get started on some of these attributes.
The one column that jumps out is the HPC column. Why? It is expensive, most likely built from proprietary hardware and software, small in size, and most likely tightly coupled with the application, backend, and so on. Why would we want such an infrastructure? Because we must have this type of infrastructure in order to meet our SLA requirements. HPC, or Cluster Computing (I use the terms interchangeably, despite the criticism I will receive for doing so), is "small" in size -- a few hundred nodes, and as many as a couple of thousand. Each node is highly optimized to perform at its peak, and the application is configured to take full advantage of the node. For example, if you have four cores per node and 4 GB of RAM per core, chances are your application was designed (or modified) to take that information into account. We do not like to implement any security within this type of infrastructure. We prefer security to live only at the edges, so the nodes can do what they are good at: computation, and computation only.
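To make that node-aware tuning a little more concrete, here is a minimal Python sketch (my own illustration, not taken from any particular HPC toolkit) of how an application might size per-core work chunks from the layout a node reports. The 4 GB per-core RAM figure is the hypothetical example from the text above.

```python
import os

def plan_per_core_work(total_items, ram_per_core_gb=4):
    """Split a workload evenly across the cores of a single node.

    Assumes the hypothetical node layout discussed in the text:
    each core has its own RAM budget (here 4 GB), which would bound
    how large each in-memory chunk may grow.
    """
    cores = os.cpu_count() or 1          # cores visible on this node
    chunk = -(-total_items // cores)     # ceiling division: items per core
    # One (core_index, chunk_size) assignment per core.
    return [(core, chunk) for core in range(cores)]

# Example: spread 1000 work items over whatever cores this node has.
plan = plan_per_core_work(1000)
```

The point is simply that the decomposition is derived from the node's actual shape rather than hard-coded, which is exactly the kind of coupling between application and hardware that makes HPC deployments so specialized.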