

Load Balancing for Web Application Performance and Scalability: Page 2

Load balancing is a critical component for scaling web applications while maintaining performance levels.


DNS-Based Load Balancing

The Internet's Domain Name System (DNS) resolves a domain name into an IP address. Round-robin DNS load balancing is a popular method for load balancing that does not require special hardware or software components. To implement DNS load balancing, it's essential to have a good understanding of DNS and the DNS resolution process.

Imagine that an end user requests the URL www.iberindia.com from a browser. The browser needs to reach the servers of www.iberindia.com, but it must first translate the host name in the URL into the IP address that identifies an iberindia server. DNS performs this translation, mapping the host name (www.iberindia.com in this case) to an IP address.

The browser first checks its cache of resolved addresses. If www.iberindia.com was visited recently, the IP address might be available in the browser's local DNS cache. In this case, the browser will directly make a connection with the IP address in the cache. When the cache contains more than one IP address for www.iberindia.com, the browser will try them in sequence until one succeeds in making an HTTP connection with the server.
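The try-in-sequence behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser is implemented: the function name connect_first, the try_connect callback, and the addresses (taken from the documentation-only 192.0.2.0/24 range) are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional


def connect_first(addresses: Iterable[str],
                  try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first address for which try_connect succeeds, else None."""
    for address in addresses:
        if try_connect(address):
            return address
    return None


# Hypothetical cached addresses for one host name. The stand-in connection
# attempt below treats the first server as unreachable.
cached = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
reachable = {"192.0.2.11", "192.0.2.12"}

chosen = connect_first(cached, lambda ip: ip in reachable)
print(chosen)  # 192.0.2.11
```

A None result corresponds to the case where none of the cached addresses work and the browser has to ask for a fresh one.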

Author's Note: The preceding scenario is a generic description only, because browsers can behave differently, and browser behavior can be customized. The browser's DNS cache is different and distinct from the browser's HTTP content cache. An end user typically has more control over the HTTP content cache than the DNS cache.

When the IP address is not cached, or when none of the IP addresses work, the browser must obtain a fresh IP address. The browser contacts the operating system to resolve the name into an IP address. (The operating system may maintain a DNS cache of its own, and will return any cached IP address.)

If the operating system does not have a cached IP address for the domain, it issues a query. Figure 1 depicts the flow. The local resolver (on the end user's machine or, more commonly, a recursive resolver that the machine forwards to) sends the query to a root domain name server. The root server does not resolve the name itself; it refers the resolver to the name servers for the top-level domain, which in turn refer it to the name servers for the sub-domains, until one of them returns an address. That address is passed back to the browser. If no matching IP address is found, the resolver returns an error code.

Figure 1. Typical DNS Request Flow: Depending on cache availability, DNS requests flow from a client to a local resolver, to a root domain name server, to sub-domain servers, and eventually, back to the client.
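The layered lookup (browser cache, then operating system cache, then a DNS query) can be modeled as a chain of caches, each with a fallback. The sketch below is a simplification under stated assumptions: the CachingResolver class, the ZONE table, and the query_dns stand-in are hypothetical, cache expiry is ignored, and an empty list stands in for the resolver's error code.

```python
from typing import Callable, Dict, List

# Hypothetical records standing in for the real DNS hierarchy.
ZONE: Dict[str, List[str]] = {
    "www.iberindia.com": ["192.0.2.10", "192.0.2.11"],
}

queries: List[str] = []  # records every lookup that reaches "DNS"


def query_dns(host: str) -> List[str]:
    """Stand-in for a full DNS query; returns [] when no record exists."""
    queries.append(host)
    return ZONE.get(host, [])


class CachingResolver:
    """One layer of the chain: a cache in front of a fallback resolver."""

    def __init__(self, fallback: Callable[[str], List[str]]):
        self.cache: Dict[str, List[str]] = {}
        self.fallback = fallback

    def resolve(self, host: str) -> List[str]:
        if host in self.cache:
            return self.cache[host]
        addresses = self.fallback(host)
        if addresses:  # only successful lookups are cached in this sketch
            self.cache[host] = addresses
        return addresses


os_resolver = CachingResolver(query_dns)
browser = CachingResolver(os_resolver.resolve)

first = browser.resolve("www.iberindia.com")   # goes all the way to query_dns
second = browser.resolve("www.iberindia.com")  # answered from the browser cache
print(first, second, len(queries))
```

Only the first lookup reaches query_dns; the second is served from the browser-level cache, which is the shortcut the preceding paragraphs describe.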

With that understanding of DNS in place, let's return to the round-robin scheme. The idea is to manage the DNS responses to end-user requests so that a set of servers is engaged in round-robin fashion. One way to implement this is to respond to each DNS request with multiple IP addresses. The browser tries the first IP in the response; if that doesn't work, it tries the next IP, and so on. The DNS server changes the order of the returned IPs with each response it sends, so each requesting client sees a different IP first, which distributes load across the set of servers.
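As a rough illustration of the rotation, the following Python sketch keeps the address list in a deque and shifts it by one position after every response. The RoundRobinRecord class and the addresses are assumptions made for the example; real DNS servers implement rotation in their own ways.

```python
from collections import deque
from typing import Iterable, List


class RoundRobinRecord:
    """Holds the IPs for one host name and rotates them per response."""

    def __init__(self, addresses: Iterable[str]):
        self._addresses = deque(addresses)

    def answer(self) -> List[str]:
        response = list(self._addresses)
        # Move the first address to the end so the next response
        # starts with a different server.
        self._addresses.rotate(-1)
        return response


record = RoundRobinRecord(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
answers = [record.answer() for _ in range(4)]
for a in answers:
    print(a)
# ['192.0.2.10', '192.0.2.11', '192.0.2.12']
# ['192.0.2.11', '192.0.2.12', '192.0.2.10']
# ['192.0.2.12', '192.0.2.10', '192.0.2.11']
# ['192.0.2.10', '192.0.2.11', '192.0.2.12']
```

Because clients try the first address first, each of the three servers leads one response in every three.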

This technique works for distributing load across both web servers and FTP servers. It is most popular in geographic load balancing, where end users might be spread across different parts of the world. For timely request servicing, the servers might also be located in different parts of the world. Obviously, it's desirable to service a given request using a server located as close as possible to the requesting machine's location. You can configure this type of load balancing by providing the end user with an IP list in which the local or near-local server appears first among the IPs returned for the DNS request. When more than one server is available locally, you can permute the DNS list (and balance the load) by varying the sequence for the next request.
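The local-first ordering with rotation among local servers might look like the sketch below. It assumes the client's region is already known (how that is determined is outside the scope of this example), and the SERVERS table, region names, and geo_answer function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical server inventory, grouped by region.
SERVERS: Dict[str, List[str]] = {
    "europe": ["192.0.2.10", "192.0.2.11"],
    "asia": ["198.51.100.10"],
}

# Per-region counter used to vary the order of local servers.
_offsets: Dict[str, int] = defaultdict(int)


def geo_answer(client_region: str) -> List[str]:
    """Return all IPs with the client's local servers first, rotated."""
    local = SERVERS.get(client_region, [])
    if local:
        shift = _offsets[client_region] % len(local)
        local = local[shift:] + local[:shift]
        _offsets[client_region] += 1
    remote = [ip
              for region, ips in SERVERS.items()
              if region != client_region
              for ip in ips]
    return local + remote


eu_first = geo_answer("europe")
eu_second = geo_answer("europe")
asia_first = geo_answer("asia")
print(eu_first)    # ['192.0.2.10', '192.0.2.11', '198.51.100.10']
print(eu_second)   # ['192.0.2.11', '192.0.2.10', '198.51.100.10']
print(asia_first)  # ['198.51.100.10', '192.0.2.10', '192.0.2.11']
```

Remote servers stay at the end of the list as a fallback, so a client still has somewhere to go if every local server fails.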

Unfortunately, the simplicity of this scheme also has a downside. For example, there's no automatic health check, so if a local set of servers all go down, the DNS response containing the IP list for that particular location would still contain the downed server IPs. In addition, round-robin DNS load balancing doesn't account for the load that each end user actually generates; it merely distributes the sequence of end-user requests. To overcome these issues, you can implement ways to poll the servers for both availability and load. As you can imagine, such tasks grow in complexity very rapidly, so for large-scale portals, it might be better to rely on a commercial player in this space than to take a do-it-yourself (DIY) approach.
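A minimal sketch of the polling idea: drop servers that fail an availability check, then order the survivors by reported load. Everything here is an assumption for illustration, including the function names, the port, the timeout, and the load figures. The tcp_probe helper uses Python's standard socket.create_connection and is shown only as one possible availability check; the example itself uses in-memory stand-ins so that it runs without a network.

```python
import socket
from typing import Callable, Iterable, List


def tcp_probe(ip: str, port: int = 80, timeout: float = 1.0) -> bool:
    """One possible availability check: can a TCP connection be opened?"""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_by_load(addresses: Iterable[str],
                    is_up: Callable[[str], bool],
                    load_of: Callable[[str], float]) -> List[str]:
    """Keep only servers that pass is_up, least-loaded first."""
    alive = [ip for ip in addresses if is_up(ip)]
    return sorted(alive, key=load_of)


# Hypothetical poll results: the second server is down, and load is
# expressed as a fraction of capacity.
servers = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
up = {"192.0.2.10": True, "192.0.2.11": False, "192.0.2.12": True}
load = {"192.0.2.10": 0.80, "192.0.2.11": 0.10, "192.0.2.12": 0.35}

answer = healthy_by_load(servers, lambda ip: up[ip], lambda ip: load[ip])
print(answer)  # ['192.0.2.12', '192.0.2.10']
```

Even this small version hints at the complexity mentioned above: a production system also has to decide how often to poll, how to measure load, and what to return when no server is healthy.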

When considering a load balancing solution, look for three things: performance, reliability, and scalability. Then assess the tradeoffs between the various solutions based on your requirements and budget.

Puneet M. Sangal, Practice Manager, has 12 years' experience in global management, project management, selling services, consulting, and software development in Europe, Pacific, India, and the U.S. He has studied executive management at IIM Calcutta, and holds an MS from Northeastern University in Boston, MA, and a BE from BIT Ranchi.