Speed Up Your SQL Inserts

Database performance is one of the most critical characteristics of almost any application, and for the developer or DBA it usually depends on how fast they can retrieve data. Hence, many a performance optimization and tuning book discusses the ways to make queries faster. Even RDBMS makers understand the need for fast data retrieval and provide different tools (indexes, configuration options, and so on) to facilitate it. However, database performance is not always represented by speed of data retrieval. In some situations, database performance relies on the speed of data inserts.

Suppose you need to design a control measuring system or survey site that stores results in a database. The task seems to be pretty straightforward, but a closer look at the specifications reveals that everything is not as simple as you might assume:

  1. You need to process and insert a very high volume of incoming transactions into a database.
  2. You have to provide 24×7 availability.
  3. The number of fields (parameters) that has to be stored in the tables can vary widely. For example, the number of questions differs in different surveys (i.e., number of controls or elements on the Web pages), or the number of detectors or measuring devices can differ for different processes in the control measuring system.
  4. The application will be used not only for data inserts but also for data retrieval, though not very intensively. (For extensive analyses, you can create an OLAP system, but that is another story.)

Comprehensive analyses of all the possible solutions for this scenario are beyond the scope of this article. However, you can generally attack the requirements from different directions. For example, you can use all or some of the following options:

  • High-speed Internet and fast networks
  • Powerful servers with fast CPUs and lots of RAM
  • Fast disks and high-performance RAID(s)
  • Load balancing for the Web servers
  • Failover clustering for database servers
  • Table partitioning
  • Lots of storage space (SAN), and so on

All the above solutions, except for partitioning, are quite expensive. And even if you are ready to make a big investment, you still need to put together a few puzzles. For instance, any database needs to be maintained on a regular basis: indexes, backups, purges of old data, fragmentation, and so on. If you supply your data from the application (Web) server(s) directly, then during database maintenance you can lose some of your ready-for-insert data or crash your application (or database) server. So you need to provide some kind of buffer that can temporarily hold your data during periods of heavy resource consumption (which means slow inserts) on a database server. Of course, you can get plenty of maintenance time by adding more database servers. But in that case, you produce another problem: consolidating data from different servers into one database.

A good choice for a buffer would be Berkeley DB, which is very efficient for repetitive static queries and stores data in key/value pairs. (Recall that the survey site or control measuring system examples submit data as control (element) name/value or detector position/value pairs.) But no buffer can grow endlessly, and if you can’t transfer data to a regular database quickly enough, your servers will still end up crashing.

Thus, the speed of inserts becomes one of the most critical aspects of the example applications.

How to Store Data?

The design of your database can significantly affect the performance of inserts. So you should be careful when choosing your database (storage) structure. For example, you might want to store data as XML. That choice is very attractive, but in this case it will slow down the inserts and occupy a lot of storage space. You may also want to build the database in the best traditions of database design: each table reproduces the real world object and each column in the table corresponds to the object’s property. But in this case (survey site or control measuring system), the number of properties (columns) is dynamic. It can vary from tens to thousands, making the classical design unsuitable.

You most likely will choose the next solution: storing your data in name/value pairs, which map directly onto the HTML controls’ (elements’) name/value pairs and onto Berkeley DB field/value data. Since most survey (control device) data values can be interpreted as integers, you probably will find it convenient to split data by type. You can create two tables, let’s say tbl_integerData and tbl_textData. Both tables will have exactly the same structure with only one exception: the data type for the “value” column will be integer for the first table and text (varchar) for the second.
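A minimal sketch of that two-table layout might look like the following. Only the table names and the integer/varchar split come from the text above; the other column names (submissionID, elementName) and all column sizes are assumptions made purely for illustration.

```sql
-- Hypothetical name/value layout. Column names and sizes are assumptions.
CREATE TABLE tbl_integerData (
    submissionID int         NOT NULL,  -- assumed: identifies one form submission or measurement cycle
    elementName  varchar(50) NOT NULL,  -- control (element) or detector name
    [value]      int         NULL       -- integer-valued answers and readings
)

CREATE TABLE tbl_textData (
    submissionID int           NOT NULL,
    elementName  varchar(50)   NOT NULL,
    [value]      varchar(1000) NULL     -- free-text answers
)
```

Because the two tables differ only in the type of the value column, the same insert logic can serve both.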

Comparing Inserts

There are many ways to insert data into a table. Some of them are ANSI-compliant, while others are RDBMS-specific. But they all are either one-row inserts or many-rows inserts. Needless to say, the many-rows insert is much faster than repetitive one-row inserts, but how much faster? To figure that out, run the test in Listing 1.

Run all the batches from Listing 1 separately. Batch 1 creates and loads data into the table testInserts. The first insert (before the loop) loads 830 rows, selecting OrderID from the table Northwind..Orders. (If you are using SQL Server 2005 (SS2005) and haven’t installed the Northwind database yet, you can download it from the Microsoft Download Center.) Then each loop’s iteration doubles the number of rows in the testInserts table. The final number of rows for two iterations is 3,320.
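The load described above can be sketched roughly as follows. This is not Listing 1 itself: the column layout of testInserts (an identity column id plus orderID) is an assumption, and only the source table Northwind..Orders, its OrderID column, and the doubling loop come from the description.

```sql
-- Sketch only: the column layout of testInserts is assumed, not taken from Listing 1.
CREATE TABLE testInserts (
    id      int IDENTITY(1, 1) NOT NULL,  -- assumed surrogate key
    orderID int NOT NULL
)

-- Initial load from Northwind..Orders (830 rows, per the text).
INSERT INTO testInserts (orderID)
SELECT OrderID FROM Northwind..Orders

-- Each iteration re-inserts every existing row, doubling the table:
-- 830 -> 1,660 -> 3,320 after two iterations.
DECLARE @i int
SET @i = 0
WHILE @i < 2
BEGIN
    INSERT INTO testInserts (orderID)
    SELECT orderID FROM testInserts
    SET @i = @i + 1
END

SELECT COUNT(*) AS totalRows FROM testInserts
```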

To test one-row inserts, copy the result of Batch 3 into a new window in Query Analyzer or Management Studio and then run it. In my tests on a few boxes with different hardware configurations, the execution time of the many-rows insert (Batch 2) was about 46 ms; the execution time of the one-row inserts (produced by Batch 3) was approximately 36 sec. (These numbers relate to SS2000.) Thus, the many-rows insert was roughly 780 times faster than the repetitive one-row inserts (36 sec versus 46 ms).
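The two kinds of insert being compared can be sketched as shown below. The target tables manyRowsTarget and oneRowTarget are hypothetical names, and the sketch assumes testInserts has two integer columns called id and orderID; the actual batches in Listing 1 may differ.

```sql
-- Assumes testInserts(id int, orderID int) already exists.
-- Both target tables are hypothetical.
CREATE TABLE manyRowsTarget (id int NOT NULL, orderID int NOT NULL)
CREATE TABLE oneRowTarget   (id int NOT NULL, orderID int NOT NULL)
GO

-- Many-rows insert: a single statement moves every row.
INSERT INTO manyRowsTarget (id, orderID)
SELECT id, orderID FROM testInserts
GO

-- Generate one INSERT statement per source row. Copy the output
-- into a new window and run it to time the one-row inserts.
SELECT 'INSERT INTO oneRowTarget (id, orderID) VALUES ('
       + CAST(id AS varchar(10)) + ', ' + CAST(orderID AS varchar(10)) + ')'
FROM testInserts
GO
```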

A number of factors make repetitive one-row inserts slower. For example, the total number of locks, execution plans, and execution statements issued by SQL Server is much higher for repetitive one-row inserts. In addition, each insert (batch) needs to obtain object permissions, begin and commit transactions, and write data into a transaction log (even for a simple recovery model).

The following are just a few results that I got by using the Profiler and tracing the inserts:

  • BEGIN…COMMIT transaction pairs: 7,265 for one-row inserts versus one pair for many-rows inserts
  • Writes to transaction log: 11,045 and 6,360, respectively
  • Locks: 26,986 and 11,670, respectively

You also should remember that SQL Server has a pretty complicated mechanism for finding the space for new rows. For the heap tables, as in Listing 1, SQL Server uses IAM (Index Allocation Map) and PFS (Page Free Space) pages to find a data page with free space among the pages that have been already allocated to the table. If all the pages are full, SQL Server, using GAM (Global Allocation Map) and SGAM (Shared Global Allocation Map), tries to find a free page in a mixed extent or assign a new uniform extent to the table. For the Listing 1 example, which has a heap table and no deletes, SQL Server inserts data at the end of the last page allocated to the table. This may produce a “hot spot” at the end of the table in a multi-user environment or when a few application servers are talking to one database.

Thus, for repetitive one-row inserts, SQL Server will launch the allocation mechanism as many times as you have inserts. For the many-rows insert, the space will be allocated immediately to accommodate all the inserted rows. For tables with indexes, you can additionally expect splits of data pages for clustered indexes and/or the index updates for nonclustered indexes.

How to Make Your Inserts Faster

To make your inserts faster, the obvious solution is replacing the repetitive one-row inserts with many-rows inserts. Using the example in Listing 2, this section demonstrates how to do that.

To trace the inserts, I created an INSERT trigger on the table tmpInserts. Every time the trigger is fired, it just prints the word Hello. To transform one-row inserts into many-rows inserts, I ran an INSERT... SELECT statement, where the SELECT part consists of many simple SELECT statements connected by UNION (ALL). I placed everything in a string variable and executed it dynamically. As you can see, for row-by-row inserts, the trigger was fired once per insert (three times in this example). For the many-rows insert, the trigger was fired only once.
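The experiment can be sketched along these lines. The table layout, the sample values, and the trigger name trg_tmpInserts_hello are assumptions for illustration; only the table name tmpInserts, the Hello message, and the UNION ALL approach come from the description above.

```sql
-- Sketch: table layout and sample values are assumptions.
CREATE TABLE tmpInserts (a int NOT NULL, b int NOT NULL)
GO

-- SQL Server triggers fire once per INSERT statement,
-- however many rows that statement adds.
CREATE TRIGGER trg_tmpInserts_hello ON tmpInserts
FOR INSERT
AS
    PRINT 'Hello'
GO

-- Row-by-row: three statements, so Hello is printed three times.
INSERT INTO tmpInserts (a, b) VALUES (1, 10)
INSERT INTO tmpInserts (a, b) VALUES (2, 20)
INSERT INTO tmpInserts (a, b) VALUES (3, 30)
GO

-- Many-rows: one INSERT ... SELECT built as a string and executed
-- dynamically, so Hello is printed once for all three rows.
DECLARE @sql varchar(8000)
SET @sql = 'INSERT INTO tmpInserts (a, b) '
         + 'SELECT 4, 40 UNION ALL SELECT 5, 50 UNION ALL SELECT 6, 60'
EXEC (@sql)
GO
```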

So, how can you apply this insert technique to an application for a control measuring system or a Web site (e.g., a survey site with a very high volume of transactions)? Well, when a user submits the form, the application (Web) server receives it as a sequence of name-value pairs, corresponding to the controls (elements) on the form. All you need to do now is slightly modify that sequence and forward it to a database server, which will take care of the inserts.

The examples in Listing 3 and Listing 4 show how the string, transferred to a database server, should look and how it can be processed and inserted into the table(s).

I used the table testInserts that I created and loaded in Listing 1 (Batch 1). The value of the variable @numElements defines the number of name-value pairs, which is the length of the string that will be generated. The letter x serves as a placeholder. (I’ll explain its purpose later.)

Listing 4 is a stored procedure that will process and insert data submitted to a database server.

Here’s the whole trick. You need to replace each placeholder (x) in the string-parameter with the phrase UNION ALL SELECT, and then execute this modified string. Now you can test the solution as follows:

  1. Create and load the table testInserts, if you don’t have it yet (see Listing 1, Batch 1).
  2. Run the script in Listing 3.
  3. Create the test table t3.
  4. Create the stored procedure spu_insertStrings (Listing 4).
  5. Copy and paste the result of step 2 into a new Query Analyzer (Management Studio) window. You will get something like the following script:
    SET NOCOUNT ON
    GO
    SET QUOTED_IDENTIFIER OFF
    GO
    spu_InsertStrings "a=1,b=10249x2,10251x . . . . . x249,11071x250,10250"
    GO
    . . . . . . . . . . . . . . .
    spu_InsertStrings "a=3251,b=44050x3252,44053x . . . x3319,44289x3320,44292"
    GO

Don’t forget to include two SET statements in the very beginning of the script for NOCOUNT and QUOTED_IDENTIFIER. Then run the script and make a note of the execution time.

Using the string-inserts technique, I was able to insert 3,320 rows into the table t3 in 2 seconds. That was 18 times faster than the repetitive one-row inserts. (These numbers relate to SS2000. With SS2005, I saw improvement in the 60-70 percent range.)
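The placeholder replacement at the heart of the stored procedure can be sketched as follows. This is not Listing 4: the procedure name spu_insertStrings_sketch, the parameter name @pairs, and the layout of t3 (two integer columns named a and b, inferred from the sample strings) are assumptions.

```sql
-- Sketch only. Assumes t3 has two integer columns named a and b.
CREATE TABLE t3 (a int NOT NULL, b int NOT NULL)
GO

CREATE PROCEDURE spu_insertStrings_sketch   -- hypothetical name
    @pairs varchar(8000)
AS
    SET NOCOUNT ON
    -- 'a=1,b=10249x2,10251' becomes
    -- 'a=1,b=10249 UNION ALL SELECT 2,10251'.
    -- The expanded text must itself still fit in 8,000 characters.
    DECLARE @body varchar(8000)
    SET @body = REPLACE(@pairs, 'x', ' UNION ALL SELECT ')
    -- EXEC accepts only literals and variables, so the REPLACE
    -- result is stored in @body first.
    EXEC ('INSERT INTO t3 (a, b) SELECT ' + @body)
GO

-- Usage: three name/value pairs arrive as one string and are
-- inserted by a single statement.
EXEC spu_insertStrings_sketch 'a=1,b=10249x2,10251x3,10250'
GO
```

The leading a= and b= in the first pair are ordinary T-SQL column aliases, which is why the string needs no further editing before the SELECT is executed.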

Some Limitations

The many-rows insert has one serious side effect. It can lock a table for a long time, which is unacceptable in multi-user environments. However, this is not the case in the example scenario, where you insert just a few hundred rows in one shot. That can’t cause the locking problem.

The 8,000-byte limit to the length of a varchar variable produces another inconvenience. However, you can solve that problem by storing incoming strings in a separate table and running another process that checks for the completed sets of the strings that belong to the same survey and user submission. Then you can insert such a set into the working table asynchronously.

In SS2005, where the varchar(max) data type can store up to 2 GB, you have much more flexibility. You can adjust the length of the string to any size up to 2 GB and try to get the optimal performance of the string inserts.

One last note: validate data in the body of your stored procedures. Although it will make your stored procedures heavier, your string inserts still will be much faster than repetitive one-row inserts.
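As one possible form of such validation, the sketch below rejects any input string containing characters outside the expected set before executing it. The procedure name, the table t3 with integer columns a and b, and the allowed character set are all assumptions, based on integer-only strings such as a=1,b=10249x2,10251; text values would need a different check.

```sql
-- Hypothetical guard. The allowed characters are an assumption that
-- fits integer-only strings of the form 'a=1,b=10249x2,10251'.
CREATE PROCEDURE spu_insertStrings_checked   -- hypothetical name
    @pairs varchar(8000)
AS
    SET NOCOUNT ON
    -- Reject anything other than digits, commas, the x placeholder,
    -- and the characters used in the leading aliases a= and b=.
    IF @pairs LIKE '%[^0-9,xab=]%'
    BEGIN
        RAISERROR ('Input string contains unexpected characters.', 16, 1)
        RETURN
    END
    DECLARE @body varchar(8000)
    SET @body = REPLACE(@pairs, 'x', ' UNION ALL SELECT ')
    EXEC ('INSERT INTO t3 (a, b) SELECT ' + @body)
GO
```

A character check of this kind keeps quotes, semicolons, and keywords out of the dynamic statement, but it does not guarantee that the string is well formed. A malformed string that passes the check still fails with a syntax error when executed.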

devx-admin
