

Load Balancing for Web Application Performance and Scalability

Load balancing is a critical component for scaling web applications while maintaining performance levels.


To create scalable enterprise web applications, you need to consider both client- and server-side components. A solid code base, judicious cache implementations, and content acceleration via compression help create a sound foundation for a high-performance application. This article focuses on the server side and introduces load balancing from a performance and scalability point of view. You'll see a high-level review of some common load-balancing schemes and how load balancing can help an application scale while maintaining a high level of performance for end users. This article doesn't cover the process of setting up load balancing (that's a subject in itself), but a wide variety of specialized products and books cover the topic.

Consider Yahoo as an example. Yahoo’s portal exposes web applications accessed by millions of users throughout the world. These are dynamic web applications that perform database transactions and render content in real time—more than just images or static HTML content. The end users often visit the same web applications on Yahoo repeatedly, and they expect the same (or better) performance each time. If they don't get that level of performance, Yahoo risks losing its user base to its competitors. Each user click causes a certain amount of load that goes to the servers. With millions of clicks, this server load multiplies rapidly.

Because any single server has finite capacity, a site at this scale needs a collection of servers (a server farm) to handle all the user requests. Each server runs independently, so as the load increases, requests must be spread across multiple servers to maintain the same level of performance for end users. Load balancing is the process that distributes the incoming requests across those servers.

If one or more servers go down, the load balancer must recognize the reduced capacity and redirect requests accordingly. This ability to give end users a seamless, unaffected experience is part of load balancing and is called high availability.

Simply put, load balancing provides a single point of entry for a server farm while distributing the load across all the servers in the farm. Load balancing is useful not only for HTTP web applications but also for applications that use other protocols, such as FTP and chat applications.

Load Balancing Schemes

A typical load-balancing solution has both hardware and software components, but solutions exist that are purely software or hardware based.

  • Software-based load balancers run on a server that all the clients connect to. The software listens on a port and determines which server in the server farm should handle each request. A software load balancer is like a reverse proxy cache (although usually without the cache) that acts on behalf of the servers, forwarding incoming requests to the back-end servers. This implies that users cannot reach the servers directly: the load balancer decouples the requesting client from the back-end servers, in essence providing a layer of security. One open source example is Apache's mod_proxy_balancer module.
  • Hardware-based load balancers can be based on routing, tunneling or IP translation. These can be complicated, so if you're considering that route, you should probably plan to use professional help for setup and configuration. 

Load balancers use a variety of methods to select the server to service a given request. The method can be as simple as random choice or a round-robin approach; however, more complex solutions, such as those offered by some commercial players, also factor in server load, current traffic conditions, the server's proximity to the end user, geographic location, recent response times, and so on. For example, if the load balancer knows that a particular server carries more load than another (e.g. because of its geographic location), it can assign a ratio so that the less-loaded server receives a greater share of new requests.
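
As a rough illustration of the selection methods just described, the following Python sketch shows round robin, a ratio-based (weighted) random choice, and a least-loaded pick. The server names, the 3:1 ratio, and the connection counts are hypothetical values chosen for the example; real products combine many more signals than this.

```python
import itertools
import random
from collections import Counter


def round_robin(servers):
    """Return an iterator that hands out servers in a fixed repeating order."""
    return itertools.cycle(servers)


def weighted_choice(ratios, rng):
    """Pick one server at random, in proportion to its assigned ratio."""
    names = list(ratios)
    return rng.choices(names, weights=[ratios[n] for n in names], k=1)[0]


def least_loaded(active_connections):
    """Pick the server currently handling the fewest connections."""
    return min(active_connections, key=active_connections.get)


servers = ["app1", "app2", "app3"]  # hypothetical back-end names

# Round robin: each request goes to the next server in turn.
rr = round_robin(servers)
first_five = [next(rr) for _ in range(5)]

# Ratio-based: app1 is assumed to merit three requests for every one sent to app2.
rng = random.Random(42)  # seeded only to make the demonstration repeatable
ratios = {"app1": 3, "app2": 1}
tally = Counter(weighted_choice(ratios, rng) for _ in range(10_000))

# Load-aware: send the next request to the server with the fewest connections.
pick = least_loaded({"app1": 12, "app2": 4, "app3": 9})

print(first_five)
print(dict(tally))
print(pick)
```

Round robin needs no knowledge of the servers at all, which is why it is a common default; the other two approaches trade that simplicity for better use of uneven capacity.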

Many applications require HTTPS (i.e. SSL-enabled connections). These applications are resource intensive for both hardware and software because of encryption and decryption overhead. You can mitigate that burden by deploying special hardware that performs the encryption and decryption for SSL connections, relieving the web servers of that work. Similarly, you can assign one server to handle security (user authentication and authorization), decoupling those tasks and leaving the other web servers dedicated solely to handling content requests and responses.

A load balancer can also buffer server responses. When the load becomes high, the load balancer can hand out responses from the buffer or cache, saving server capacity to perform other priority tasks.
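
The buffering idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming a simple time-to-live policy; the ResponseBuffer class, the fake_backend function, and the 30-second lifetime are hypothetical names and values invented for the example, not part of any particular product.

```python
import time


class ResponseBuffer:
    """Serve a stored response while it is fresh; otherwise ask the back end."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that contacts a back-end server
        self._ttl = ttl_seconds      # how long a stored response stays valid
        self._clock = clock          # injectable so the demo can control time
        self._store = {}             # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh copy: the back end is not contacted
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body


backend_calls = []


def fake_backend(path):
    """Stand-in for a real server; records each time it is actually called."""
    backend_calls.append(path)
    return f"content for {path}"


fake_now = [0.0]
buffer = ResponseBuffer(fake_backend, ttl_seconds=30, clock=lambda: fake_now[0])

first = buffer.get("/home")    # goes to the back end
second = buffer.get("/home")   # served from the buffer
fake_now[0] = 31.0             # pretend 31 seconds have passed
third = buffer.get("/home")    # stored copy expired, so the back end is asked again

print(len(backend_calls))
```

Three requests reach the load balancer here, but only two reach the back end. Under heavy load, that difference is the capacity the servers get back.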

The load balancer can use "health checks" to identify downed or overloaded servers. A server's health can be diagnosed either in near real time or on a scheduled basis. Failed servers can be replaced by healthy ones manually or automatically, depending on the requirements and capabilities of the particular solution.
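
A scheduled health check might look roughly like the following Python sketch. It assumes a server is taken out of rotation after two consecutive failed probes and restored after one successful probe; the HealthTracker class, the threshold, and the status dictionary standing in for real network probes are all hypothetical.

```python
class HealthTracker:
    """Track consecutive probe failures and report which servers are usable."""

    def __init__(self, servers, probe, max_failures=2):
        self._probe = probe                # callable: server name -> True if healthy
        self._max_failures = max_failures  # failures tolerated before removal
        self._failures = {server: 0 for server in servers}

    def run_checks(self):
        """Probe every server once; call this on a timer in a real deployment."""
        for server in self._failures:
            if self._probe(server):
                self._failures[server] = 0   # any success resets the count
            else:
                self._failures[server] += 1

    def healthy(self):
        """Return the servers that should still receive requests."""
        return [s for s, n in self._failures.items() if n < self._max_failures]


# The dictionary below stands in for real probes such as an HTTP request.
status = {"app1": True, "app2": True, "app3": True}
tracker = HealthTracker(list(status), probe=lambda server: status[server])

tracker.run_checks()
all_up = tracker.healthy()

status["app2"] = False            # simulate app2 going down
tracker.run_checks()
after_one_failure = tracker.healthy()
tracker.run_checks()
after_two_failures = tracker.healthy()

status["app2"] = True             # simulate app2 coming back
tracker.run_checks()
recovered = tracker.healthy()

print(after_two_failures)
```

Requiring more than one failure before removing a server is a design choice that avoids reacting to a single dropped probe, at the cost of detecting real outages slightly later.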

Large-scale portals are frequent targets for attackers, who employ various means to attack sites. For example, programs that continuously poll a URL to carry out Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks can create huge server loads. Load balancers can provide features to mitigate or prevent such attacks.
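
One mitigation of this kind is per-client rate limiting, sketched below in Python. This is an assumption about what such a feature could look like, not a description of any specific product: the RateLimiter class, the limit of three requests per second, and the documentation-range IP addresses are all made up for the example. It addresses a single client polling a URL; a distributed attack from many addresses needs additional defenses.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self._max = max_requests
        self._window = window_seconds
        self._hits = defaultdict(deque)   # client address -> recent request times

    def allow(self, client_ip, now):
        hits = self._hits[client_ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False                  # over the limit: reject or delay
        hits.append(now)
        return True


limiter = RateLimiter(max_requests=3, window_seconds=1.0)
poller = "203.0.113.7"      # hypothetical client hammering one URL
visitor = "198.51.100.9"    # hypothetical ordinary client

# Four requests from the same address within a fraction of a second.
decisions = [limiter.allow(poller, t) for t in (0.0, 0.1, 0.2, 0.3)]

# A different client is unaffected by the poller's behavior.
visitor_allowed = limiter.allow(visitor, 0.3)

# Once the window has passed, the first client is served again.
allowed_later = limiter.allow(poller, 1.25)

print(decisions, visitor_allowed, allowed_later)
```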
