Suppose you have to deploy a highly available and scalable database backend solution for an Internet application. A cluster immediately comes to mind. A cluster is a conglomerate of two or more machines that are capable of sharing their workload in a real-time, near real-time, or scheduled “toggle” mode. The following are its three main components:
- Machines running some form of an enterprise-level operating system, such as Linux, Windows, or Unix
- A network, a key component of the clustered architecture
- Cluster-capable software products running within machines that are designated as the cluster members
Buying some hardware to prototype and test the proposed solution seems to be a reasonable course of action, but before you embark on a hardware purchasing spree, consider an option that will let you build a prototype of your clustered backend application right on your desktop: virtualization. Virtualization is based on the concept of creating an environment that appears to a “guest” operating system as hardware (a virtual machine), yet is simulated in a contained software environment by the host system.
Virtualization products support the creation of the first two essential clustering components in your scenario: provisioning of machines and the establishment of a network between these machines. The third key component, cluster-capable software, is independent of the physical architecture and therefore has to be installed as part of the prototyping process.
This article demonstrates how to prototype machine clusters for the proposed solution utilizing the popular virtualization product, VMware Workstation version 5.5.
What You Need
- Windows 2000, 2003, or XP machine
- VMware Workstation version 5.5 for Windows
- ISO files for a popular Linux distribution
Establish the Virtual Environment
This section elaborates on the steps required to establish a multi-machine virtual cluster that could run pairs of Web servers, application servers, and databases (see Sidebar 1: Requirements for Virtual Cluster Prototype). It describes what it takes to establish the cluster topologies with and without hardware firewalls, as well as load balancers simulated through virtual machines (see Sidebar 2: Prototyping Strategy).
Step 1: Determine Guest Machine Template
To save time and expedite the effort, you should establish a “template machine” that contains all the required components for your virtual machines. The example prototype uses MySQL 5.1 with cluster extensions installed on the template machine.
Your configuration depends on your needs and the nature of the prototype (security, size, performance requirements, etc.). You could select a “hardened” installation with only the smallest, safest set of operating system components. In that event, you should still have at least one file-transfer protocol available, so that you can add components to your virtual machines later if needed. Another approach is installing the operating system with all the options you can imagine. This installation would certainly require more resources from the virtual machine.
Figure 1. Virtual CD Drive Mapped to ISO Image
Step 2: Install Guest Machine
VMware Workstation offers two convenient ways to install the operating system:
- From the host machine’s physical CD/DVD drive
- From the virtual CD/DVD drive (i.e., from the ISO image on the physical machine’s hard drive)
Virtual drive installation is very quick. You may find it more convenient to have ISO images for your operating system, but keep in mind that if you have a multi-CD installation you will need to remap the virtual CD (ISO image) every time you are asked to continue installation from the next CD. For this reason, I found downloading the Server ISO or DVD ISO images for Linux distributions (e.g., CentOS) very convenient (see Figure 1).
Each CentOS Linux 3 guest will be installed in a non-graphic mode, which occupies about 1GB of space on the hard drive and runs minimal kernel services requiring between 192MB and 256MB of memory per virtual machine.
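The per-guest memory figures above translate directly into a RAM budget for the host. The following is a minimal sketch of that arithmetic; the function name, the guest count of four, and the 512 MB host reserve are assumptions made for illustration and do not come from VMware.

```shell
#!/bin/sh
# Rough RAM budget for a virtual cluster (all figures in MB).
# guests:  number of virtual machines running at the same time
# per_vm:  memory assigned to each guest (192-256 MB in this setup)
# host_mb: memory kept free for the host OS (an assumed figure)
estimate_ram_mb() {
    guests=$1
    per_vm=$2
    host_mb=$3
    echo $(( guests * per_vm + host_mb ))
}

# Four guests at the upper bound, with an assumed 512 MB for the host:
estimate_ram_mb 4 256 512
```

If the result exceeds the physical memory of your desktop, lower the per-guest allocation or run fewer guests at once.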
When you install the guest operating system, VMware requires you to specify all the basic parameters for your virtual machine: memory, network, allocated drive storage. For the template operating system, I usually select the default options and NAT (Network Address Translation) networking. You can customize these options later.
Figure 2. Specifying the Type of the Clone
Step 3: Create Clones
The ability to clone the virtual machines is the primary reason for having a template operating system in the first place. You will use it to create clone machines (i.e., other members in the cluster).
With VMware 5.5, creating clones of the virtual guests is generally a simple process. It enables you to create a linked or a full clone (see Figure 2). Linked clone is an especially convenient feature for the type of prototype discussed here. As the name implies, it creates a clone whose installation files are linked to a “parent”, an original virtual machine, and for which VMware creates only the specific configuration files. If you do not plan to move these clone machines around, linked clones are probably the best solution. If you plan on moving the guest machines across multiple host machines, I recommend going with the full clone option. It creates a full replica of the parent virtual machine.
For the example, I created three clones of the templated Linux-with-MySQL installation.
Step 4: Configure and Customize Networking
The VMware Workstation installation process automatically configures two new (virtual) network adapters on your machine: VMNet1 and VMNet8. The VMNet1 adapter is used for the private networking between the host and the virtual machines. VMNet8 is used for NAT networking, which enables sharing of the host’s external network access with the virtual machines.
These adapters are essential for the interconnectivity and proper operation of the network between the virtual machines and the host, and for the virtual machines’ access to the Internet.
Through these adapters, VMware provides DHCP services to the virtual machines, as well as NAT access to the Internet.
VMware Network Configuration
It is time now to look at some basics of VMware network configuration and how they pertain to the cluster configuration.
VMware Workstation supports three network modes:
- Bridged networking: Virtual machines have full access to the host’s network. However, in order to gain access to the network they need to be assigned their own IP addresses.
- NAT: With NAT configuration, guest machines do not have their own IP addresses on the external network. They are assigned IP addresses in the context of the private network within the virtual environment. Virtual machines gain access to the external network via the host machine’s VMNet8 adapter. The host machine translates the traffic coming from the virtual machines via the VMNet8 adapter as well as external traffic.
- Host Only networking: This type of networking enables the connection only between the host machine and the virtual machine. Virtual machines do not have access to the external network.
The ability to create the virtual network adapters and configure network options as described above is essential to the cluster prototyping process.
For the example cluster, the best approach is to start with Host Only networking to establish the interconnectivity between the machines in the cluster, and then to test it from the host machine.
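The network mode is normally chosen in the virtual machine settings dialog. As an alternative, the sketch below prints the lines that would select a mode in a guest's .vmx configuration file. It assumes the commonly used `ethernet0.present` and `ethernet0.connectionType` keys and their `bridged`, `nat`, and `hostonly` values; the function name is hypothetical, and you should verify the keys against a .vmx file generated by your own VMware version before editing one by hand.

```shell
#!/bin/sh
# Print the .vmx lines that select a network mode for the first adapter.
# The key names and values are assumptions; compare with your own .vmx.
vmx_network_lines() {
    case "$1" in
        bridged)  conn="bridged" ;;
        nat)      conn="nat" ;;
        hostonly) conn="hostonly" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    printf 'ethernet0.present = "TRUE"\n'
    printf 'ethernet0.connectionType = "%s"\n' "$conn"
}

# Host Only networking, as recommended for the first cluster test:
vmx_network_lines hostonly
```

Edit the .vmx file only while the guest is powered off, and keep a backup copy of the original.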
For the cluster configuration, you need to create the subnet for the machines in the cluster and assign static IP addresses to them. To assign static IP addresses, you should manually assign them in the C-class network ranging from
On Linux, you could configure the subnet, range, and IP address for each machine. One way would be to add the following command (use the machine IP address as specified in the MySQL tutorial):
/sbin/ifconfig eth0 192.168.0.10 netmask 255.255.255.0 broadcast 192.168.0.255
and execute it on startup.
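On Red Hat-style guests such as CentOS, another way to make the address survive a reboot is an interface configuration file instead of a startup command. The sketch below prints such a file using the same values as the ifconfig command above; the function name is hypothetical, and the target path in the comment is the conventional location on these distributions.

```shell
#!/bin/sh
# Print an ifcfg-eth0 file for a static address on a Red Hat-style guest.
# Arguments: IP address, netmask, broadcast address.
ifcfg_static() {
    cat <<EOF
DEVICE=eth0
BOOTPROTO=static
IPADDR=$1
NETMASK=$2
BROADCAST=$3
ONBOOT=yes
EOF
}

# Redirect the output to /etc/sysconfig/network-scripts/ifcfg-eth0
# on the guest (as root), then restart networking.
ifcfg_static 192.168.0.10 255.255.255.0 192.168.0.255
```

Each clone needs its own IPADDR value; the netmask and broadcast address stay the same across the subnet.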
Since the VMware host does not automatically provide internal DNS service for the virtual machines, you need to manually configure some of the machines to serve the purpose of a DNS server or to configure the host files (which is outside the scope of this article).
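Although host-file setup is not covered in detail here, the idea fits in a few lines. The sketch below prints entries that could be appended to /etc/hosts on every guest. The host names (mgmt, sql1, data1, data2) and all addresses are hypothetical placeholders on the 192.168.0.x subnet.

```shell
#!/bin/sh
# Print /etc/hosts entries for the cluster members.
# Each argument is "address:name"; the names are hypothetical.
hosts_entries() {
    for pair in "$@"; do
        # Text before the first colon is the address, the rest is the name.
        printf '%s\t%s\n' "${pair%%:*}" "${pair#*:}"
    done
}

hosts_entries 192.168.0.10:mgmt 192.168.0.20:sql1 \
              192.168.0.30:data1 192.168.0.40:data2
```

Appending the same output to the hosts file of each guest keeps name resolution consistent without running a DNS server.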
The simplest cluster configuration has no firewalls between the machines, enabling software components to interact with each other based on your configuration preferences. In a more sophisticated configuration, you could configure special-purpose machines to serve as the routers/firewalls. (More on this option in Step 6).
Step 5: Customize Cluster-Aware Software Components
Once you have established the networking between the virtual machines, cluster-aware software running inside a guest “sees” the virtual machine, the network, and the surrounding software exactly as it would see their physical counterparts. From this point on, the example just follows the cluster setup steps for the components it prototypes: MySQL Management Server on machine 184.108.40.206, MySQL Server on 220.127.116.11, and data servers on 18.104.22.168 and 22.214.171.124 (see Figure 3). You can simply follow the MySQL 5.1 clustering instructions, as virtualization requires nothing extra except enough RAM to support all virtual machines running concurrently.
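For orientation, the topology described here (one management server, one MySQL server, two data servers) corresponds to a small cluster configuration file on the management node. The sketch below prints a minimal config.ini. The addresses are placeholder values on the 192.168.0.x subnet and are not necessarily those shown in Figure 3. The function name is hypothetical, and a real setup will likely need further parameters from the MySQL documentation (data directories, memory sizes).

```shell
#!/bin/sh
# Print a minimal MySQL Cluster config.ini for one management node,
# two data nodes, and one SQL node.
# Arguments: management IP, data node 1 IP, data node 2 IP, SQL node IP.
cluster_config() {
    cat <<EOF
[ndbd default]
NoOfReplicas=2

[ndb_mgmd]
HostName=$1

[ndbd]
HostName=$2

[ndbd]
HostName=$3

[mysqld]
HostName=$4
EOF
}

# Placeholder addresses; substitute the ones assigned in Step 4.
cluster_config 192.168.0.10 192.168.0.30 192.168.0.40 192.168.0.20
```

With NoOfReplicas=2 and two data nodes, each data node holds a full copy of the data, which is what makes the power-off failover test meaningful.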
Figure 3. MySQL 5 Virtual Cluster Topology
Note: Red Hat-based systems require a few extra steps in order to enable multicast-based clustering.
To enable multicast on your primary network card (likely eth0), use the following command:
ifconfig eth0 multicast
Use this command to add the multicast route:
route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
Step 6 (Optional): Configuring Firewall
To simulate a firewall, you could install a very small Linux virtual machine with two (virtual) network adapters (eth0, eth1) and configure Bridged networking on one adapter, allowing incoming traffic from the external network into that machine. You would configure the other adapter for membership in your virtual cluster’s subnet. Using iptables (or the older ipchains) on Linux, you could then configure the rules for traffic allowed from the external system (through the bridged adapter) to the adapter on the private subnet.
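A minimal sketch of such a rule set follows. It only prints the iptables commands so that you can review them before running anything as root on the firewall guest. The function name, the interface roles (eth0 bridged, eth1 on the cluster subnet), the placeholder address 192.168.0.20, and the choice of TCP port 3306 are assumptions for illustration.

```shell
#!/bin/sh
# Print (do not run) iptables rules for a two-adapter firewall guest.
# ext_if: bridged adapter; int_if: adapter on the cluster subnet;
# sql_ip: address of the MySQL Server guest; port: TCP port to allow.
firewall_rules() {
    ext_if=$1; int_if=$2; sql_ip=$3; port=$4
    # Drop forwarded traffic unless a later rule accepts it.
    echo "iptables -P FORWARD DROP"
    # Let replies to already accepted connections pass in both directions.
    echo "iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT"
    # Accept new connections from outside to the SQL node only.
    echo "iptables -A FORWARD -i $ext_if -o $int_if -p tcp -d $sql_ip --dport $port -j ACCEPT"
}

firewall_rules eth0 eth1 192.168.0.20 3306
```

The firewall guest also needs IP forwarding enabled in the kernel (for example through /proc/sys/net/ipv4/ip_forward), and external clients need a route to the private subnet through it.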
Now that you have configured the virtual machines to represent the cluster of physical machines, getting the clustered application up and running is entirely a matter of following the directions laid out in the MySQL documentation. From this point on, nothing is specific to virtual machine operation. Note that any configuration error you experience during setup will likely be related to improper network settings on Linux. It is essential that you understand the intricacies of the network configuration before embarking on cluster prototyping.
Once the virtual cluster is established, you can proceed with the testing and experimentation typical for this type of architecture: examining load balancing by generating load and observing the switching between the servers, suddenly bringing down (powering off) one of the servers, and so on.
Keep in mind that virtual machines in this configuration will not exhibit the same performance as their physical counterparts; they will run slower. However, the performance ratios, failures, and successes observed during experimentation on the virtual machines will be the same for the physical counterparts. If you experience performance issues with data replication between two virtual data servers, you will see the same issues in the physical environment. The same applies to all the positives that you may observe during testing.
In my practice, I was able to configure and prototype a very large enterprise application cluster (Web server cluster, application server cluster, database) completely in the virtual environment of the desktop. By following the configuration steps from the virtual environment and paying attention to the lessons learned there, I was able to build and configure the production-class physical environment based on my virtual prototype in record time.
Furthermore, I was able to replicate, diagnose, research, and resolve on the virtual cluster, with a high degree of fidelity, any issue that originally appeared in the physical environment, saving me hours of research and investigation in the less accessible physical environment.
Oh, the Possibilities…
The example in this article is only one idea for utilizing virtualization to prototype clustered, highly available applications. As you can imagine, it is only the tip of the iceberg. Here are some other interesting ideas that you may find useful.
Prototyping Different Database Designs
The database is the most critical performance component of almost any system. A proper relational design and a physical storage strategy often help make a difference in how well the complete application performs. With virtualization products capable of savepoints (an important feature that enables you to save the complete state of the virtual machine at a given point in time), you can establish a baseline database architecture and then explore how well the database performs under different data loads, storage strategies, and logical optimizations such as denormalization. Savepoints will enable you to safely fall back to the original state of the application, or to the one you liked best.
Securing the Network
As mentioned previously, virtualization does not cover only the installation and configuration of guest machines, but also the configuration of virtual networks. With some creativity, you could prototype and explore different configurations and elements of network security: machine hardening, setting up and operating honeypots and traps, probing the network for weaknesses, exploits, and data leaks, and do it all within the safe confines of your own machine.
As you can see, virtualization software opens the door to many professionally exciting prototyping opportunities. Although this article could not cover all the details involved in the relatively sophisticated prototyping process, the general concepts and ideas presented hopefully showed how helpful the virtualization concept can be, even in relatively complicated, multi-machine scenarios. So explore them. You will make yourself and your organization more agile and productive in accomplishing your technical goals.