Scalable Architectural Guidelines: Designing for Flexibility : Page 2

Both stateless and stateful components have a legitimate role in any scalable system architecture. This applies equally to client/server and n-tier architectures. This article covers basic and not-so-basic concepts related to scaling up and scaling out.


Scalability Issues
In the following paragraphs I’ll identify various scalability issues and present proposals which address these issues through specific software construction practices. By putting these architectural concepts into practice you’ll be able to develop your own highly scalable applications.

Outward Bound
So you’ve got the additional hardware and you’re ready to scale out. Before we begin to spread the software around, we need to consider two basic strategies: Replication and Distribution. Replication involves installing copies of the same software on multiple platforms. Some sort of load balancing mechanism is generally employed so that each incoming client request is routed to the server which is under the least load at that moment. Distribution involves breaking the software out into tiers and installing different tiers on different platforms. With this scheme, no load balancing is required since the client continues to be serviced by the single outermost tier. Whether or not the software is distributed across multiple servers is thus irrelevant to the client, since the distribution is invisible to it.

Replication: This strategy is fairly common for really large scale systems. There are many benefits to replication. One advantage is that it leaves the architecture of each individual deployment unchanged. Each deployment is essentially an entire system (except for the centralized database) which operates independently of any other installation. Another benefit of replication is redundancy. A replicated system may be maintained and upgraded in a phased manner, since one server can be taken offline without shutting down the entire system as long as at least one replicated server continues to operate. Similarly, the failure of a single server will not cause the entire system to fail.

The disadvantage of replication is that it can be a bit more expensive than distribution. Replication requires duplicating (at least) the existing application infrastructure (except for the database server) and adding one more component, the load balancer, to spread requests among the replicated servers. Different load balancing methods are available; however, the best arrangements use real-time load information to route each request to the server which is under the least stress at the time of the request. The implication is that, under this type of load balancing, successive requests by a specific client may be serviced on different machines. We will consider the software implications of this in a few moments.
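To make the routing behavior concrete, here is a minimal sketch of least-load routing. The `Server` and `LeastLoadBalancer` names are hypothetical, and a simple count of active requests stands in for whatever real-time load metric an actual load balancer would use.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active_requests: int = 0  # assumed stand-in for a real-time load metric


class LeastLoadBalancer:
    """Hypothetical balancer that routes each request to the least-loaded server."""

    def __init__(self, servers):
        self.servers = list(servers)

    def route(self):
        # Pick the server with the fewest active requests;
        # on a tie, min() returns the first server in the list.
        target = min(self.servers, key=lambda s: s.active_requests)
        target.active_requests += 1
        return target

    def finish(self, server):
        # Called when a request completes, freeing capacity on that server.
        server.active_requests -= 1


servers = [Server("A"), Server("B")]
balancer = LeastLoadBalancer(servers)

first = balancer.route()   # both servers idle, so "A" is chosen
second = balancer.route()  # "A" is now busier, so "B" is chosen
print(first.name, second.name)  # A B
```

Even in this toy version, two consecutive requests from the same client land on different servers, which is exactly the situation the server software has to tolerate.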

Distribution: The advantage to distribution is that distribution can be less expensive than replication. At the lowest end, distribution can be performed by spreading the software across the existing application and database servers. Distribution is usually taken as the first step to a scaled out deployment. Distribution without replication doesn’t require any load balancing since only one server is available for client connections. However, you don’t gain any of the redundancy advantages which come with scaled out replication.

Distribution can also be advantageous depending on the specific profile of your transactions. If you can determine that the backward facing data communication between your Business/Data layer and the database is heavy compared to its forward facing communication with the UI layer, then you might wish to distribute the Business/Data layer, or specific transactions, onto the database server. The heavier conversation then stays on one machine, which minimizes network traffic and reduces the associated latency.

Transaction: a unit of work measured from request to response. Not to be confused with an MTS/COM+ database transaction. To be sure, an MTS/COM+ database transaction qualifies as a transaction. But not every transaction is an MTS/COM+ database transaction.

This type of deployment architecture is fairly common for small and medium-sized systems, especially IIS applications, where you’ll frequently find the UI and Business layers deployed on the IIS server, with the Data layer installed on the database server.



Naturally, this scaled out distribution assumes that the database server has sufficient resources to bear the load of both the database and the Business/Data software layers. Another scenario which would benefit from a scaled out distribution is one where a particular component or logical tier consumes an inordinate amount of resources compared to the rest of the system. In such a case, moving that portion of the software onto its own dedicated server might very well alleviate the load on the system as a whole, resulting in increased system performance.

One more example of where it might make sense to distribute your application layers is where you can determine that the database server is idling while your IIS server is under heavy load. In this case it probably makes sense to distribute your software layers between these two servers so that they share the load more equally. Again, this is a preliminary scalability solution which addresses the current load problem with the hardware already available.

As you can see, scaled out distribution is a viable alternative for improving performance in a limited number of scenarios. Scaled out replication, on the other hand, duplicates practically the entire supporting hardware, which should result in an immediate 50% reduction in the load on each application server when a second server is added. (It is true that this will probably result in an immediate load increase on the database server. However, if my experiences have been typical, a database server which is adequately supporting a single application server should have power to spare to support a second application server. Standard disclaimer: your mileage may vary!) Nonetheless, both of these options are available when scaling out, and indeed many scaled out hardware configurations will ultimately contain a mix of replicated and distributed software deployments. Let’s use the mixed deployment architecture presented below when considering the software construction ramifications for both of these types of scaled deployments.
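The 50% figure is simple arithmetic, and it can be sketched as follows. This is an idealized calculation that assumes the load balancer spreads requests perfectly evenly; the function name and the request rate of 200 per second are made up for illustration.

```python
def per_server_load(total_requests_per_sec: float, app_servers: int) -> float:
    """Load seen by each application server, assuming perfectly even balancing."""
    if app_servers < 1:
        raise ValueError("need at least one application server")
    return total_requests_per_sec / app_servers


single = per_server_load(200.0, 1)   # one application server carries everything
doubled = per_server_load(200.0, 2)  # a second replica halves each server's share
reduction = 1 - doubled / single

print(f"{reduction:.0%}")  # 50%
```

The same formula shows the diminishing return of each further replica: going from two servers to three reduces each server's share by a third rather than a half.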

The system configuration depicted above shows a deployment configuration with both Replicated and Distributed deployments. As you can see, both the UI and Business layers are replicated with load balancing. You can see as well how the UI and Business layers are distributed over different physical servers. Browser based clients call into the IIS servers from across the Internet. Windows clients call straight into the business layer since they provide their own client-based user interfaces. Let’s assume also, that the Data layer is installed locally to the Database server for network efficiency as described above.



A couple of points regarding this architecture. First, my intention is not to recommend this, or any particular deployment configuration. Obviously, no deployment strategy can be proposed without a careful and comprehensive analysis of the particulars of the specific application being considered. This is certainly not the case here, since I don’t know anything about the systems you work with. I’m simply proposing this configuration as the basis of our discussion since it contains a variety of the different scaled out deployment strategies we’ve mentioned.

Second, architecture gurus will immediately note the lack of a replicated database server. While a large scale deployment will usually implement some sort of database replication, I have omitted this from the diagram. (In my defense, I’d like to point out that most large scale deployments will also include some sort of offline backup device, yet I’ve omitted this from the diagram as well, since it, like the replicated database, is largely irrelevant to our discussion.) I’d like to confine our discussion to specific coding practices which can be helpful in developing software so that it can evolve through various deployment architectures. In this context neither the backup device nor the replicated database seems relevant. The replicated databases which I’ve seen have used specific vendor supplied software to perform cross-replication. I haven’t encountered any specific application coding practices which are necessary to address this, and consequently I’ve chosen to omit the topic of replicated databases from our discussion.


As we previously discussed, the ideal load balancing scenario allows any client request to be handled by the server with the lowest level of stress at the actual time of the transaction. It is therefore quite possible that successive requests from the same client will be handled by different servers. This has a couple of ramifications. First of all, it is easy to see that client/server relationships must be stateless (on the server) for this to work. If state is accumulated on Server A, it will be of absolutely no use if the next client request is serviced by Server B.

There are two ways to address this. The first is to establish a state maintenance repository which is available to every machine that needs it. This would most probably imply additional hardware for state storage at the tier, or global application, level. State is generally stored in the repository under a unique key identifying a particular session or transaction. This key is handed back to the client and returned to the server on successive calls so that the appropriate state information can be retrieved from the repository. The diagram below shows the server architecture expanded to include this new state repository.
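The key-based lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `StateRepository` and `AppServer` classes are hypothetical, the repository is an in-memory dictionary standing in for a store on separate hardware, and the shopping cart is an invented example of session state.

```python
import copy
import uuid


class StateRepository:
    """In-memory stand-in for a shared state store reachable by every server."""

    def __init__(self):
        self._store = {}

    def create_session(self):
        # The unique key is what gets handed back to the client.
        key = uuid.uuid4().hex
        self._store[key] = {}
        return key

    def load(self, key):
        # Return a copy, as a store on another machine would.
        return copy.deepcopy(self._store[key])

    def save(self, key, state):
        self._store[key] = copy.deepcopy(state)


class AppServer:
    """A replicated server that keeps no session state of its own."""

    def __init__(self, name, repository):
        self.name = name
        self.repository = repository

    def add_to_cart(self, session_key, item):
        # Fetch state by key, change it, and write it back to the repository.
        state = self.repository.load(session_key)
        state.setdefault("cart", []).append(item)
        self.repository.save(session_key, state)
        return state["cart"]


repo = StateRepository()
server_a = AppServer("A", repo)
server_b = AppServer("B", repo)

key = repo.create_session()             # key is returned to the client
server_a.add_to_cart(key, "book")       # first request lands on Server A
cart = server_b.add_to_cart(key, "pen") # next request lands on Server B
print(cart)  # ['book', 'pen']
```

Because both servers read and write state only through the repository, it no longer matters which of them the load balancer picks for a given request.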



Theoretically, the application database can be used as the state repository; generally, though, complex relational capabilities are not required for the relatively simple task of state maintenance. LDAP, the Lightweight Directory Access Protocol, is commonly used to implement a server side state repository. One product which I have worked with, Microsoft Site Server / Personalization & Membership, is based on LDAP.


