Software applications rarely experience a uniform workload. The workload varies across the months of a year, the weeks of a month, the days of a week, and the time slots within a day. An increased workload can cause an application to deviate from its expected behaviour. To perform well, applications should be able to absorb these variations so that users aren’t affected.
Managing resources to optimize performance in a traditional datacenter environment is inefficient because of the delay in reacting to unexpected application behaviour. The delay comes from the human intervention needed to take corrective measures, such as provisioning additional resources for the application. To reduce this delay, and hence optimize performance, there should be a way to manage the datacenter's resources dynamically. Cloud-based applications make this possible because they give us the flexibility to automate resource management.
In a traditional datacenter environment, dedicated staff are required to monitor the application and take corrective measures if anything goes wrong. For example, if the workload increases, someone has to introduce one more machine into the environment and add it to the cluster to handle the additional load. This process is time-consuming because it needs human intervention. By the time the corrective measures are implemented, the workload may already have changed again. So this approach works only when an increase in workload can be predicted well in advance. For this reason, most infrastructures are over-provisioned to handle any spikes in workload. This eventually results in increased maintenance costs and a largely under-utilized infrastructure.
A cloud-based infrastructure helps minimize this cost by providing the elasticity needed to handle workload spikes without the overheads that accompany a traditional datacenter environment. To achieve this, it should support automatic management of its resources. In a cloud computing environment, this is done by creating new virtual machines (VMs) on the fly to meet workload spikes. One can easily scale the infrastructure up or down by creating or terminating virtual machines. Most web servers available today support clustering and load balancing. Some web servers can also be ramped down without losing information such as session data. Choosing an appropriate web server speeds up scaling and hence improves the quality of service.
In this article, we explain how to dynamically manage the resource needs of cloud-based applications. We will see how to use monitored values to decide on an action that meets the Service Level Objectives (SLOs). We illustrate how to automatically trigger an action based on monitoring data (coming from a monitoring tool) using Drools, and then scale the resources up or down using XEN. XEN is virtualization software that allows computer hardware to run multiple guest operating systems concurrently. This forms the basis of cloud computing. Several cloud solutions use XEN, and Eucalyptus is one of them. We will use Eucalyptus as the provisioning software.
Let’s take an example of a web application deployed in a cloud environment. Suppose that in a particular season, the number of users accessing the application increases. This in turn increases the load on the servers on which the application is deployed, and the current number of servers can no longer handle the workload. Left unhandled, this may result in over-utilization of the CPU and gradually degrade the application’s performance. To manage this, we have to add more servers and cluster them with the existing ones so that the load is balanced across both the existing and the newly added servers. Similarly, in an off-season, when the load on the application decreases, the resources provisioned for the application should be released gracefully to reduce costs.
Let’s have a look at how we can dynamically manage the resources to stabilize the load on the web application to avoid poor performance.
Consider a web application deployed on a virtual machine that contains an application server. We have used Eucalyptus to set up a private cloud infrastructure. Eucalyptus is an open source cloud platform from Eucalyptus Systems. It enables enterprises to establish their own cloud computing environments.
Let’s see the steps involved in dynamically scaling resources up or down based on server load.
The complete solution can be broken down into three steps:
1. Monitor the virtual machine for any specific metric.
2. Optimize the monitored result and suggest corrective measures.
3. Execute the actions to meet SLOs.
See Figure 1 for an overview.
Figure 1: Dynamic resource management solution.
As Figure 1 shows, each virtual machine is equipped with a monitoring agent. The agent’s responsibility is to collect the desired metrics from the VM. The monitored data is then sent to an Optimizer. The Optimizer engine is responsible for smoothing sudden spikes in the workload pattern and deciding which action (increase or decrease) to take based on the values coming from the monitoring layer. Its decision is then sent to an Action engine, which does the actual work of increasing or decreasing the number of virtual machines according to the load.
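The flow in Figure 1 can be sketched as a single control step. This is a minimal illustration in plain Python, not the article's actual implementation: the names `Pool`, `optimize`, `act`, and `control_step`, the 80% and 30% CPU thresholds, and the pool bounds are all assumptions made for the example. Spike smoothing is left out here to keep the three stages visible.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Pool:
    """Hypothetical stand-in for the set of VMs serving the application."""
    vm_count: int
    min_vms: int = 1
    max_vms: int = 10


def optimize(cpu_by_vm: Dict[str, float], high: float = 80.0, low: float = 30.0) -> str:
    """Optimizer: reduce per-VM CPU readings (percent) to one decision."""
    if not cpu_by_vm:
        return "none"  # nothing was monitored, so suggest no change
    average = sum(cpu_by_vm.values()) / len(cpu_by_vm)
    if average > high:
        return "increase"
    if average < low:
        return "decrease"
    return "none"


def act(pool: Pool, decision: str) -> Pool:
    """Action engine: change the VM count while staying inside the pool bounds."""
    if decision == "increase" and pool.vm_count < pool.max_vms:
        return Pool(pool.vm_count + 1, pool.min_vms, pool.max_vms)
    if decision == "decrease" and pool.vm_count > pool.min_vms:
        return Pool(pool.vm_count - 1, pool.min_vms, pool.max_vms)
    return pool


def control_step(pool: Pool, collect: Callable[[], Dict[str, float]]) -> Pool:
    """One pass of monitor -> optimize -> act."""
    readings = collect()           # monitoring layer
    decision = optimize(readings)  # optimizer
    return act(pool, decision)     # action engine
```

In a real deployment, `collect` would read from the monitoring server and `act` would ask the provisioning software to start or terminate a VM. Here both are reduced to pure functions so that the decision logic can be checked in isolation.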
Let’s look at the solution in detail.
Step 1: Monitoring the virtual machine
Many monitoring tools, such as Hyperic, Nagios, Cacti, and Ganglia, are available today to monitor physical or virtual machines deployed on the cloud. These tools help collect several useful metrics, including CPU utilization, memory used, active thread count, and response time. In this use case, we use Hyperic to monitor the CPU utilization of the application servers, as it directly impacts the performance of an application.
Figure 2: Monitoring workflow.
As Figure 2 shows, each virtual machine is equipped with a monitoring agent (Hyperic in our case). The monitoring agent collects data for the enabled metrics and sends it to the Hyperic server. Our monitoring engine then collects the data from the server and passes it on to the optimizer engine. For more details on the monitoring engine, please refer to this tutorial.
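The role of the monitoring engine can be sketched as follows. This is only an illustration under stated assumptions: `fetch_cpu` is a hypothetical callable standing in for whatever query is made against the monitoring server (it is not a Hyperic API), and `Sample` and `collect_samples` are names invented for this example. The sketch skips VMs that return no reading and clamps values to the 0 to 100 percent range, which is a design choice made here rather than something the article prescribes.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Sample:
    """One CPU utilization reading for one VM."""
    vm_id: str
    cpu_percent: float
    timestamp: float


def collect_samples(
    vm_ids: Iterable[str],
    fetch_cpu: Callable[[str], Optional[float]],  # hypothetical server query
    now: Callable[[], float] = time.time,
) -> List[Sample]:
    """Gather the latest CPU reading for each VM, ready for the optimizer."""
    samples: List[Sample] = []
    for vm_id in vm_ids:
        value = fetch_cpu(vm_id)
        if value is None:
            continue  # no reading available for this VM, skip it
        clamped = max(0.0, min(100.0, value))  # keep values within 0..100 percent
        samples.append(Sample(vm_id, clamped, now()))
    return samples
```

Passing the clock in as `now` keeps the function easy to test, since a fixed timestamp can be supplied instead of the real time.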
Step 2: Optimize the monitored result and suggest corrective measures
Once we have the monitored values, we need to process them into a form on which the choice of corrective measure can be based.
As Figure 3 shows, the output of the monitoring engine goes to the Optimizer. The optimization process is further broken down into two steps:
a. Averaging the monitored values to remove spikes
b. Suggesting actions using the Drools engine
Figure 3: Optimizer.
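The two optimizer steps can be sketched in a few lines. This is a hedged illustration, not the article's code: the averaging step is shown as a simple moving average over a fixed window (the article does not say which averaging method it uses), and the rule step is written as plain Python conditions standing in for rules that would live in the Drools engine. The class `MovingAverage`, the function `suggest_action`, the window size, and the 80% and 30% thresholds are all assumptions.

```python
from collections import deque
from typing import Deque


class MovingAverage:
    """Step a: average the last `window` readings to damp short spikes."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque drops the oldest reading automatically.
        self._values: Deque[float] = deque(maxlen=window)

    def add(self, value: float) -> float:
        """Record a new reading and return the current average."""
        self._values.append(value)
        return sum(self._values) / len(self._values)


def suggest_action(avg_cpu: float, scale_up_at: float = 80.0,
                   scale_down_at: float = 30.0) -> str:
    """Step b: threshold rules, a plain-Python stand-in for rule-engine rules."""
    if avg_cpu >= scale_up_at:
        return "increase"
    if avg_cpu <= scale_down_at:
        return "decrease"
    return "none"


if __name__ == "__main__":
    smoother = MovingAverage(window=3)
    for reading in [40.0, 45.0, 100.0, 42.0]:
        average = smoother.add(reading)
        print(f"reading={reading:.1f} average={average:.2f} "
              f"action={suggest_action(average)}")
```

With a window of three, the single reading of 100% raises the average to roughly 62%, which stays below the assumed scale-up threshold, so no VM is added for a momentary spike. A sustained run of high readings would push the average over the threshold and yield "increase".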