dding indexes to tables is the most effective way to optimize your SQL Server query performance. Tables that you query often may require several indexes in order to handle the various search arguments you send. There is one particular situation where indexes on large tables are a hindrance, though. When you’re bulk-copying data into the table, it’s best to have either no index at all, or just a clustered index. For those situations it is best to use Transact-SQL scripts to remove and add those indexes. In this article you’ll learn how to use Transact-SQL stored procedures to generate Transact-SQL indexing scripts that are more robust and less laborious than SQL Server’s Enterprise Manager scripting tool.
Recently I was involved in a project that needed to move a number of large database tables from one server to another. We decided to use the bcp.exe command to export the data, and BULK INSERT to import it into the new database. We did not want to have any indexes, or at a minimum a clustered index on the new tables, in order to optimize the import process.
Initially we used the Enterprise Manager scripting tool but ran into a number of problems, most related to the fact that we would have to do a lot of cut-and-paste to get a script that included just primary and unique constraints, as well as indexes, and nothing else. What we really wanted was a script to create the tables with no indexes, then a script to add all the indexes, clustered key first, once the tables were loaded.
What I needed was a scripting tool that we could modify and ensure that the order of the indexes was what we wanted. I could have tried to navigate the SQL-DMO object model and call the scripting from there, just as Enterprise Manager does, but we risked running into the same inflexibility. Instead, I made a tool using Transact-SQL, which you’ll see here, though in a much more refined and tested form.
Enterprise Manager Scripting
|Figure 1: You can use the Preview button to see the script that Enterprise Manager will generate.|
By default, it’s best to use the built-in tools that SQL Server provides rather than coding your own. You can use Enterprise Manager’s scripting tool to generate table and stored procedure scripts, and the results are very satisfactory. However, when you try to capture just indexes the results have some limits. To create an Enterprise Manager script you can first select a table in Enterprise Manager, right-click, choose the All Tasks option, and then choose Generate SQL Script. This brings up the scripting dialog. You can navigate the tabs to remove the table’s Create and Drop options and add the Indexes. You’ll get a table’s indexes that are not also primary key or unique constraints, as shown in Figure 1.
|Figure 2: When you choose constraints in addition to indexes, Enterprise Manager now gives additional information about defaults and check constraints; information that is unrelated to indexes.|
Missing the primary key and uniqueness constraints is a real problem. So what happens if you add in the constraints? The only option that Enterprise Manager gives is “Script PRIMARY keys, FOREIGN keys, defaults, and check constraints.” Unfortunately, checking this option gives back too much data, as shown in Figure 2.
If all you want to do is capture the indexes for one table, cutting and pasting the information from Enterprise Manager scripting may be enough. However, consider the following index script, generated for the authors table, with default and check constraint information removed:
ALTER TABLE [dbo].[authors] WITH NOCHECK ADD CONSTRAINT [UPKCL_auidind] PRIMARY KEY CLUSTERED ( [au_id] ) ON [PRIMARY] GO CREATE INDEX [aunmind] ON [dbo].[authors]([au_lname], [au_fname]) ON [PRIMARY] GO
First of all, you’re missing any DROP commands for the constraint and the index so you’ll have to add them in manually if you need them. Second, there’s no testing. That is, the script should test whether the index exists before trying to add it, otherwise you’ll get an error. What you really need is something that drops the old constraint or index if it’s there, then adds the new one. Here’s what would be better, following the model that Enterprise Manager uses for a table, in pseudo-code:
If the constraint or index exists Drop the constraint or index Go If the constraint or index does not exist Create the constraint or index Go
Also, you can’t predict what order the indexes will come out in. For large tables it’s wise to build the clustered index first and then all indexes later. Secondary indexes use the clustered key in their leaf nodes. You might even want the flexibility of clustered keys only, which Enterprise Manager scripting does not have.
Another potential problem arises when you try to drop primary key constraints that have foreign key references to them?you must first drop the primary key. Enterprise Manager scripting completely ignores this issue.
Even this is not enough. What about statistics? SQL Server keeps statistics on tables that can greatly aid query optimization for large tables, but Enterprise Manager will not script them. If you’re moving large amounts of data, though, you might want to recreate those statistics on a target database after you load your data.
In the code developed in this article you’ll see how you can use Transact-SQL custom stored procedures to generate index scripts in a safer and more flexible fashion?one that does not require cutting and pasting, and that has tests embedded in the code.
The Scripting Strategy
SQL Server 2000 gives you two major methods for determining indexing information: system functions and system tables. The system functions are OBJECTPROPERTY() and INDEXPROPERTY(), and the system tables are sysobjects, sysindexes, and sysindexkeys. Microsoft recommends using system functions whenever possible and the following code will do that. However, some index information can only be found in system tables so the strategy that follows uses both system tables and system functions. For more information, see the sidebar, Using System Functions Instead of System Tables.
The stored procedures that you’ll learn about in this article use the following strategy to script out index information. Here’s the strategy in outline form:
Driver procedure: usp_ScriptIndexes Create a temporary table Call usp_GetIndexInfo Fill the temporary table Loop: for each index, Call usp_GetIndexKeys Update temporary table Loop: for each table, Loop: for each index, Call usp_ScriptOneIndex Load options Generate drop script Generate create script
The outermost stored procedure, usp_ScriptIndexes, creates a temporary table to store index information, and then calls usp_GetIndexInfo to populate the temporary table. You have to take this approach because if the called procedure created the temporary table then it would fall out of scope when you return to the driver stored procedure. The usp_GetIndexInfo stored procedure calls a very specialized procedure, usp_GetIndexKeys, to get the index key information: that is to say, what the column names are for the index and whether they are ascending or descending. Once the temp table is populated, usp_ScriptIndexes can loop through the indexes in a flexible manner, table by table, generating the scripts for each index by calling usp_ScriptOneIndex.
Driving the Scripting
The outermost stored procedure, which I’m calling the driver, is usp_ScriptIndexes. Its interface is worth taking a look at before delving into the special tasks that need to be performed prior to its execution. Here’s the procedure declaration:
Use Utilities Set Nocount On Go If Object_ID('dbo.usp_ScriptIndexes', 'P') Is Not Null Drop Procedure dbo.usp_ScriptIndexes Go Create Procedure dbo.usp_ScriptIndexes @DbName varchar(40) , @OrderOption varchar(40) , @Create tinyint = 1 , @Drop tinyint = 1 , @IncludeStatistics tinyint = 1 , @Debug tinyint = 0 As
All of these procedures go in a database called “Utilities” that you can change to suit your needs. These procedures are more useful if they can sit in their own database. You can pass the database name that you want to have the indexes scripted. The second parameter is @OrderOption, which you can use to determine whether to generate the indexes with a primary key first, or a clustered key first. This is useful when the primary key and clustered index differ. The third and fourth parameters determine whether to generate create and/or drop commands in the script. The fifth parameter, @IncludeStatistics, lets you decide whether to include statistics in the output. You can modify this procedure to accept additional parameters if you would like more options.
Now that you know what the options are for the output, it’s time to dive into getting index information.
The Key: Getting Index Information
Getting complete and accurate information about all the indexes in a database is the key to scripting indexes. Once you have the index information, generating the index create and drop commands is relatively straightforward. Getting that information is the job of usp_GetIndexInfo, which fills up a temporary table called #TempSysindexes. Here’s the table definition:
Create Table dbo.#TempSysindexes ( DbName nvarchar(128) , TableID int , TableName nvarchar(128) , TableOwner nvarchar(128) , IndexName nvarchar(128) , IndexID int , IndexKeyString nvarchar(2048) , IsPrimaryKey tinyint , CnstIsClustKey tinyint , IsUniqueConstraint tinyint , IndexFillFactor tinyint , IsClustered tinyint , IsStatistics tinyint , IsAutoStatistics tinyint , IsHypothetical tinyint , IsPadIndex tinyint , IsUniqueIndex tinyint , IsPageLockDisallowed tinyint , IsRowLockDisallowed tinyint , CnstIsNonclustKey tinyint , CnstIsNotTrusted tinyint , FileGroupID int , FileGroupName nvarchar(128) )
There’s a lot of information here! The index-specific information is mostly picked up using the INDEXPROPERTY() function, which you can use to determine whether the index is a primary key, whether it’s clustered, or unique, what its fill factor is, whether it’s actually statistics and not an index, and so on. What the INDEXPROPERTY() function cannot deliver is what column names are in the index key string, and what the filegroup and filegroup name for the index is. In any case, Listing 1 shows usp_GetIndexInfo, and how it does its work.
A dynamic SQL String does the bulk of the work because the Transact-SQL ObjectProperty() and IndexProperty() system functions must be executed in the context of the database with the tables you’re trying to use them on. The first command of the dynamic string Uses the database, and even though you execute the procedure in the Utilities database, the Select command executes in the database you send as a parameter.
Once you load the index information, the procedure loops through each table id and then for each index belonging to that table id, loads the index key column and filegroup information. It was easy just to add that code to the dynamic string and execute the loop in the context of the target database.
Getting the Index Columns
Every index has a set of one or more columns from its table, and it stores the values of those columns in the actual index, in sorted order, for lookup purposes. There is no system function or system table that provides this information so you’ll call another procedure that will get it, usp_GetIndexColumns, which you can see in Listing 2.
usp_GetIndexColumns returns a comma-delimited string of the index key columns in a format that you can use to generate a create script for a constraint or index. In addition, SQL Server 2000 allows descending as well as ascending indexes, and this procedure captures that as well. In order to get that string it loads all the columns that belong to an index into a temporary table, in their proper order, along with whether the column has the Descending property. Then the procedure loops through the temporary table, concatenating the values to create the string. It loads that column data by joining on the sysobjects, syscolumns, sysindexes, and sysindexkeys tables, much in the same fashion that the SQL Server system stored procedure sp_helpindex does (which was the inspiration for this procedure?but unfortunately sp_helpindex could not be used to just get the column string).
Now that you’ve seen how to load the index information, it’s time to move back out to the outermost driver procedure and see how to generate the index script.
Scripting an Index
To generate the actual index script, usp_ScriptIndexes calls usp_ScriptOneIndex and calls it for each index that it wants to have scripted. This latter stored procedure also uses the temporary table #TempSysindexes to pick up specific information about each index so that the scripted output will be appropriate. Listing 3 shows usp_GenerateOneIndex.
The best way to understand how usp_GenerateOneIndex works is to see its output, in this case for the same authors table that you saw before. Listing 4 shows the output for the authors table, including one statistics entry.
Now you have a test for each action, and the script will not try to create or drop an index or constraint unless it really exists.
Foreign Key References
In the case of the primary key constraint UPKCL_auidind, the generated Transact-SQL will first check to see whether any foreign keys reference this table, and if so, print out a message that the foreign key must be dropped first. For instance, the publishers table’s primary key has a foreign key referencing it, and the generated script will send you the message:
Could not drop index UPKCL_pubind for table dbo.[publishers] because it is referenced by at least one foreign key constraint. Drop the foreign key constraint(s) first.
The test to do this is based on finding the primary key (or unique constraint) name in the INFORMATION_SCHEMA view called REFERENTIAL_CONSTRAINTS. Using the view allows you to bypass accessing any system tables, and insulates the code from changes to system tables down the road.
Is it an Index?
You need some way to determine whether a given name is really an index or not, and the IndexProperty() function will do the job. Every index will either be clustered or non-clustered, and if the name is not an index for the table in question, the result will be null. Consequently, you can just test for whether it’s not null:
And IndexProperty(Object_Id('dbo.[authors]'), 'UPKCL_auidind', 'IsClustered') Is Not Null
This applies to constraints as well as indexes.
Fillfactor, Nocheck, and Filegroup
Notice that usp_ScriptOneIndex has code in it detecting whether the Fillfactor is greater than zero, whether the CnstIsNotTrusted property is 1 from the temporary table, and what the Filegroup name is, and fills them in accordingly. So the primary key create command for the authors table also uses the Filegroup option:
Alter Table Pubscopy2.dbo.authors Add Constraint UPKCL_auidind Primary Key Clustered ([au_id]) On [PRIMARY]
I use the left and right square brackets to allow spaces in table and column names (see the Sidebar, Delimiting Strings).
There you have it! You can now create procedures that will generate Transact-SQL scripts to drop and create indexes in an orderly manner, with existence and foreign key testing embedded. And you can modify the driver procedure to customize it for your needs. Happy computing!