Along with the proliferation of the Internet and large-scale intranets, the requirements of enterprise applications have evolved. Launching a service in an expanding enterprise can put an ever-increasing load on the application, forcing more server resources to be added to sustain a reasonable user experience. You are left with two choices: scale up or scale out.
An application that performs well is not guaranteed to scale well, and the only way to know for sure is to test.
To verify the scalability of applications employing the object-relational mapping technique implemented by Pragmatier Data Tier Builder, we have performed extensive scalability testing of the Pragmatier PetMarché sample project. This is an account of the results of these tests.
The test team at Pragmatier would like to thank all the people who enthusiastically helped realise these tests, especially Allan Knudsen and Lars Backhans at Microsoft Sweden. We would also like to extend special appreciation to Jimmy Nilsson, independent consultant, who kept an eye on the process and gave invaluable feedback.
Editor’s Note: The authors are principals of Pragmatier, makers of the Pragmatier Data Tier Builder.
The Test Lab
The tests were performed at a test lab provided by Microsoft Sweden during a couple of weeks in March 2003. The tests were conducted in co-operation with experienced Microsoft testing personnel.
The primary goal of the Pragmatier PetMarché scalability tests was to perform a so-called Proof of Concept, in other words to prove that applications using the robust object-relational mapping provided by Pragmatier Data Tier Builder can scale in an acceptable manner. Separate tests were performed for the Browse and Purchase scripts, Web services and Distributed Transactions.
The hardware configuration consisted of three physical tiers. The database server tier comprised one four-way 2GHz Xeon rack mounted server and one two-way 1.4GHz PIII rack mounted server. The former was used as the main database and the latter maintained shared state. Both servers in the database tier accessed a Compaq MSA 1000 shared storage disc of 1 TB through a 1Gbit/s fibre channel.
The application server tier was a network load balanced (NLB) web farm consisting of 1 to 4 two-way 1.4GHz PIII rack mounted servers.
The client simulation tier consisted of two subnets, one with four four-way Xeon rack mounted servers and one with four dual processor Xeon workstations.
The client simulation tier, application server tier and database server tier were connected through a 1Gbit/s Ethernet network.
The database server tier was running Windows 2000 Advanced Server with SQL Server 2000. The application server tier was running Network Load Balanced Windows 2000 Advanced Server with .NET framework 1.0 or, alternatively, Network Load Balanced Windows 2003 with .NET framework 1.0. Shared state was stored on the less powerful database server.
The client simulation tier was running Windows 2000 and Application Center Test (ACT), which is a web application stress tool. ACT was set up to run on several machines.
The Sample Application
The tested application was a slightly modified version of Pragmatier PetMarché, an enterprise e-commerce application written in VB.NET. It should be noted that the application would have to be slightly modified in a real-world case to integrate it with existing IT infrastructure and to add better security and stronger monitoring capabilities. Implemented properly, none of these extensions would affect the scalability of the application.
The shopping cart was stored in a shared session state server to allow Network Load Balancing (NLB) with network client affinity turned off. Each request was dynamically routed to a server regardless of which server handled the previous request, allowing fail-over and symmetric load balancing.
A series of tests was performed in order to identify scaling characteristics. The tests were run for 10 min with warm-up and cool-down periods of 30s. The test period was long enough for the setup to stabilise, and consistent readings were seen during all tests.
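The effect of turning client affinity off can be illustrated with a small Python sketch. It is a simplified model and not the PetMarché code: the class names, the in-memory store and the random routing are all assumptions made for illustration, and the random choice merely stands in for NLB's own distribution algorithm.

```python
import random


class SharedSessionStore:
    """Hypothetical in-memory stand-in for the shared session state server."""

    def __init__(self):
        self._carts = {}

    def cart(self, session_id):
        # Create the cart on first access, then always return the same list.
        return self._carts.setdefault(session_id, [])


class AppServer:
    """Hypothetical web farm node that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.handled = 0

    def add_to_cart(self, session_id, item_id):
        self.handled += 1
        self.store.cart(session_id).append(item_id)


def route(servers, rng):
    """No client affinity: any server may receive any request."""
    return rng.choice(servers)


if __name__ == "__main__":
    rng = random.Random(0)
    store = SharedSessionStore()
    farm = [AppServer(f"web{i}", store) for i in range(1, 5)]
    for item in (101, 202, 303):
        route(farm, rng).add_to_cart("session-1", item)
    # The cart is complete no matter which servers handled the requests.
    print(store.cart("session-1"))
```

Because no server holds the cart itself, any server can drop out of the farm without the visitor losing the cart, which is what makes fail-over possible in this model.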
There were also a number of endurance tests performed where the application was run for more than 12 hours to show that there weren’t any signs of performance degradation over time. These tests were run at a consistently high load, approximately half of the maximum limit.
Each simulated user had a 10 second think time once a request had been fully received. This means that each user performs one request every 10s + TTLB, where TTLB (Time To Last Byte) is the time from when the request is made until all the data has been received. With 1000 concurrent users you would have slightly less than 100 req/s.
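The relation between concurrent users, think time and request rate can be written down directly. The Python sketch below is an illustration only: the function name and the sample TTLB of 0.1 s are assumptions, not values taken from the test scripts.

```python
def requests_per_second(users: int, think_time_s: float, ttlb_s: float) -> float:
    """Closed-loop load: each user issues one request every think_time + TTLB seconds."""
    cycle_s = think_time_s + ttlb_s
    if users < 0 or cycle_s <= 0:
        raise ValueError("users must be non-negative and the cycle time positive")
    return users / cycle_s


if __name__ == "__main__":
    # 1000 users, 10 s think time, assumed 0.1 s TTLB
    print(f"{requests_per_second(1000, 10.0, 0.1):.1f} req/s")
```

With a TTLB of zero the rate would be exactly 100 req/s, so any real response time pushes it slightly below that figure.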
The database was reset before each test-run to contain the same default data:
• 1,000 products, with a total of 50,000 items
• 100,000 users
• 10,000 orders
The Browse and Purchase Test
During the “Browse and Purchase” test, two scripts were run simultaneously on separate ACT clients to simulate a near 50/50 mix of browsing and purchasing. A number of operations were performed to simulate a visitor browsing the website:
1 Go to the site home page
2 Perform three (3) free text searches on 1000 products
3 Examine the products in three (3) different categories, clicking the next button three (3) times
4 Examine the details of three (3) different product items
Similarly, a number of operations were performed to simulate a visitor who is making a purchase:
1 Perform two (2) free text searches on 1000 products
2 Sign in using a random UserID from one of 100,000 available customers
3 Add two (2) random items, from 50,000, to the shopping cart
4 Checkout, verify account information and place the order
5 The user signs out
6 Perform one last free text search
The Web service Test
A single web service test was performed where the clients used an HTTP GET call with a supplied OrderID, whereupon the web service returned that order including customer and delivery information. The following information was returned:
• Order Information
• Order Items
• Credit Card Information
• Shipping Address
• Billing Address
• Customer Contact Information
The Distributed Transaction Test
Finally, a test was performed using distributed transactions. The following actions were completed:
1 Sign in using a random UserID from one of 100,000 available customers
2 Add a random item from 50,000 to the cart, order, verify account information and check out. Do this ten (10) times in total
3 Sign out
Note: During the distributed transaction test, NLB was disabled because it was interfering with the DTC. Each client was assigned a specific server to make calls to.
The Test Results
The maximum number of pages served when running flat out with four application servers in the web farm was 66 million pages per day (24 billion pages per year) with sub-second page response times. If only 1 in 1000 of those page visits leads to an actual purchase of, let's say, 20 USD, this e-commerce company would be selling virtual pets with an annual turnover of 482 MUSD. Without discussing how reasonable this market forecast is, we can conclude that it would have been a major operation, and there is still a lot of room to scale up the database tier and scale out the web farm.
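The arithmetic behind these figures is easy to check. The following Python sketch uses only the numbers quoted above; the function and parameter names are illustrative assumptions.

```python
DAYS_PER_YEAR = 365


def pages_per_year(pages_per_day: int) -> int:
    """Scale a daily page count to a full year."""
    return pages_per_day * DAYS_PER_YEAR


def annual_turnover_usd(pages_per_day: int, visits_per_purchase: int, order_value_usd: int) -> int:
    """One purchase of order_value_usd for every visits_per_purchase page visits."""
    purchases = pages_per_year(pages_per_day) // visits_per_purchase
    return purchases * order_value_usd


if __name__ == "__main__":
    yearly = pages_per_year(66_000_000)
    turnover = annual_turnover_usd(66_000_000, 1000, 20)
    print(f"{yearly / 1e9:.1f} billion pages per year")
    print(f"{turnover / 1e6:.0f} MUSD per year")
```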
A note on the diagrams. Different diagrams can have different scales. Also note that req/s are measured on the left vertical axis, whereas response times (TTLB) are measured on the right vertical axis.
Another important thing to note is that we have used processor utilisation as a measure of how hard a server is working. This is of course not the only limiting factor, but we found it to be a pretty practical metric that is easy to understand. Rather than trying to weigh in all the components of the server to create a more accurate (and more cryptic) custom index, we trust your common sense to draw the right conclusions regarding this.
Browse and Purchase
The first test involves a simple Browse and Purchase experience. This is the most common activity on an e-commerce site. A snappy browsing experience is important in order to give the user a positive relationship with the web shop. A browser request shouldn’t take more than a second to fulfil, including the network roundtrip.
Test with one (1) application server: The diagram in figure 8 (immediately below) tells us how the average page response time, measured as Time To Last Byte (TTLB) received by the calling client, increases as the number of concurrent users increases. The average processor utilisation increases from 15% to 89%; beyond this point, response times will deteriorate as the processors are exhausted. The response times are fantastic: even with 2000 concurrent users each page is delivered in less than 100ms, leaving 900ms for the network roundtrip and for loading graphics before a full second has passed. This allows very fast browsing. The test also shows that the amount of available memory on the application server only decreases from 648 MB to 610 MB. We could have done with a lot less memory, and there is plenty of room if the application grows in complexity.
Test with four (4) application servers: The diagram in figure 9 (below) shows the same test running on 4 Network Load Balanced (NLB) application servers. Even running flat out with 8000 concurrent users making 771 req/s (the equivalent of 66 million requests per day), the response time is as low as 540ms leaving plenty of room for network roundtrip. At this point the application servers are running with 91% processor utilisation and the database server at 72% processor utilisation. The strain on the network is high at this point and the workstations in the client simulation tier were getting a small number of network socket errors (58 in all, equivalent to 0.01%).
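The daily equivalent and the error percentage can be reproduced with a few lines of Python. This is a rough check under a stated assumption: the total request count is not given in the text, so the sketch estimates it from the 771 req/s rate sustained over the 10 minute test period. The function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_requests(req_per_s: int) -> int:
    """Requests per day if the given rate were sustained around the clock."""
    return req_per_s * SECONDS_PER_DAY


def error_rate_percent(errors: int, req_per_s: int, duration_s: int) -> float:
    """Errors as a percentage of the estimated number of requests in the run."""
    return 100.0 * errors / (req_per_s * duration_s)


if __name__ == "__main__":
    print(f"{daily_requests(771):,} requests per day")
    # Assumption: 771 req/s held for the whole 10 minute (600 s) test period.
    print(f"{error_rate_percent(58, 771, 600):.2f}% socket errors")
```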
The average network bandwidth used by the client/application server communication during this test amounted to 11.6 Mb/s, which is slightly less than half of the maximum (25 Mb/s) possible with NLB on a single 1 Gigabit network.
An interesting note is that the shared state server used only 14.8% of its processing power when running 8000 concurrent users with 4 application servers in the web farm (figure 11). This indicates that using a shared session server won’t be a bottleneck when scaling the application further.
The diagram in figure 10 reinforces our findings. The number of pages served scales linearly when adding new servers to the web farm until the database server becomes the bottleneck. We used a four-way 2GHz Xeon server; when increasing the load further you will want to scale up the database server. Recent TPC tests running SQL Server on 32-way Itanium servers indicate that you will have plenty of room to grow.
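One way to put a number on "scales linearly" is a scaling efficiency ratio, where 1.0 means perfectly linear. The Python sketch below is a rough illustration and not part of the original analysis. The 771 req/s figure for four servers comes from the test above, but the single-server rate of 198 req/s is only an estimate (2000 users with a 10 s think time and roughly 0.1 s TTLB), not a measured value.

```python
def scaling_efficiency(throughput_n: float, servers: int, throughput_1: float) -> float:
    """Ratio of measured throughput to ideal linear throughput (1.0 = perfectly linear)."""
    if servers < 1 or throughput_1 <= 0:
        raise ValueError("need at least one server and a positive baseline")
    return throughput_n / (servers * throughput_1)


if __name__ == "__main__":
    # 771 req/s on four servers (from the test); 198 req/s on one server is an estimate.
    print(f"{scaling_efficiency(771, 4, 198.0):.0%} of linear scaling")
```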
As you can see in the diagram of figure 12, response times can be improved by adding more servers to the web farm when running with a constant load (in this case 2000 concurrent users). This is good news, proving that adding servers to the web farm can be used to improve response times, given that the servers in the web farm are the bottleneck.
The web service test mainly puts pressure on the database. With 1000 concurrent users, the database was running at 99.9% processor utilisation with the single application server only at 39.4%. This makes a lot of sense since we performed few and uncomplicated business rules, basically only gathering and presenting data (see the test description in section 8).
Note that the response times increased significantly when we passed 800 concurrent users. At this point it would be advisable to scale up the database server. This is further emphasised by the tests with 4 application servers, which show more or less the same curve.
The last test that was performed was a distributed transaction test. During the test, which ran for 12 hours on three dual processor PIII application servers with Windows 2003, a total of 310k distributed transactions were committed and none aborted. This corresponds to 433 transactions per minute.
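The per-minute rate follows from the total and the duration. In the Python sketch below the function names are illustrative. Note that the rounded total of 310k gives about 431 transactions per minute, while 433 per minute over 12 hours corresponds to 311,760 transactions; the difference is consistent with the total having been rounded, although the exact count is not stated here.

```python
def transactions_per_minute(total: int, hours: float) -> float:
    """Average commit rate over the whole run."""
    return total / (hours * 60)


def total_transactions(per_minute: int, hours: int) -> int:
    """Total commits implied by a per-minute rate sustained for the given hours."""
    return per_minute * hours * 60


if __name__ == "__main__":
    print(f"{transactions_per_minute(310_000, 12):.0f} per minute from the rounded total")
    print(f"{total_transactions(433, 12):,} transactions implied by 433 per minute")
```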
The test wasn’t designed to try the limits of the solution under stress; rather, it was performed to show how the application responded to extended use under normal pressure. There was no degradation of performance over time.
Pragmatier Data Tier Builder .NET Professional Edition allows you to rapidly develop massively scalable applications on the .NET platform. Increased coding efficiency, automation and a more natural object-oriented data model promise productivity boosts of more than 400%, lower cost of maintenance and shorter time to implement changing requirements. This translates to an excellent Total Cost of Ownership.
On top of that, this report shows that you achieve near linear scaling when adding more application servers to a Network Load Balanced web farm.
We believe that Pragmatier Data Tier Builder .NET Professional Edition, Visual Studio .NET, the .NET platform and the .NET framework together present the most cost-efficient solution to enterprise application development, offering superior flexibility and excellent total cost of ownership.
More information can be found at www.pragmatier.com.
Issues and Optimisations
Test results are only useful if you have a complete picture of the circumstances surrounding the tests. It is easy to draw the wrong conclusions because you lack knowledge about the specifics, hence the saying “lies, damn lies and benchmarks”. We experienced a couple of issues and implemented a small number of important optimisations during the tests.
• During browsing, a call is made to the URL “[…]/petmarchev10” which returns a “Resource has moved” message quickly (.02 msec) since it doesn’t require any processing. We have subtracted these requests from all results in order to get more accurate values.
• During the distributed transaction tests we could not use Network Load Balancing because it conflicted with the Distributed Transaction Coordinator. The reason for this was never uncovered during the tests. Instead, we ran the servers in the application server tier in parallel, simulating an application that directs each visitor to a specific server when placing orders. In practice this is only a minor setback, but it caused some frustration when trying to pin down the problems.
• ACT only generates 2000 concurrent users per controller. We had to manually start four controllers simultaneously to simulate 8000 concurrent users. This had no effect on the tests other than being cumbersome.
We made a handful of optimisations to improve the locking of resources in the database. These optimisations were implemented during the first days of the test and will be included in a free update of Pragmatier Data Tier Builder. Expect improved performance when running with many concurrent users.
If you have any comments regarding these tests, good or bad, please let us know. If you are interested in learning more about the tests, object-relational mapping on the .NET platform or, especially, Pragmatier Data Tier Builder, visit our website or contact us.
Pragmatier Data Tier Builder .NET Professional Edition
Pragmatier Data Tier Builder is an integrated modelling tool, code and database generator. It generates a complete data tier including an O/R DAL (data access layer with object-relational mapping) and a fully normalised back-end database schema. It can also wrap and extend existing databases.