hen I was writing code for user interfaces, I needed to present data in different ways. Logically it wasn’t difficult, but since the presentation layout was often quite different from the storage (tables’) structure, I had to use numerous record sets, loops, and arrays to manipulate the X and Y axes of tables for display. This approach required a lot of code, made debugging more complicated, and bored me to death. So I came up with a simple back-end solution that allows fast and easy axes transformation. This article demonstrates my Y-to-X and X-to-Y transformation techniques.
Y-to-X or Vertical to Horizontal Transformation
The examples in this article use the employee table from the pubs sample database for SQL Server. If you are using SQL Server 2005 (SS2005) and didn’t install the pubs database yet, download it from the
Begin by running the following query:
SELECT DISTINCT DATEPART(yyyy, hire_date) hire_year FROM pubs..employee
You will get the following result:
hire_year 1988198919901991199219931994
Assume that you need to present the hire year horizontally, as follows:
hire_year 1988 1989 1990 1991 1992 1993 1994
You can do that pretty easily using the following script:
1 DECLARE @str nvarchar(100)2 SELECT @str = 'hire_year'3 SELECT @str = @str + CHAR(9) + hire_year FROM 4 (SELECT DISTINCT CAST(DATEPART(yyyy, hire_date) 5 AS varchar(10)) hire_year6 FROM pubs..employee) T17 SELECT @str
The whole trick is in line 3, where @str
accumulates data, appending the new values to the previous ones. The source of the values is a specified column, in this case the hire_date column. Char(9) serves as a delimiter (I selected TAB (ASCII code 9), but you can choose whatever you want: comma, expression, and so on).
Using the same technique, you can create a table where the values of the original column hire_date will represent the columns’ names:
DECLARE @str nvarchar(1000)SELECT @str = ''SELECT @str = @str + ',' + hire_year + ' [' + hire_year + ']' FROM (SELECT DISTINCT CAST(datepart(yy, hire_date)AS varchar(10)) as hire_year FROM pubs..employee ) T1-- delete the very first commaSELECT @str = SUBSTRING(@str,2,len(@str) - 1)SELECT @str = 'SELECT ' + @str + ' INTO hire_year FROM employee WHERE 1=0'EXECUTE sp_executesql @strSELECT * FROM hire_yearDROP TABLE hire_year
Create Your Own Aggregate Functions
Using the Y-to-X transformation, you can easily substitute a SQL standard aggregate function or create your own. For example, you can calculate the number of rows in the table and the summary on the column discount using a regular SELECT
statement:
SELECT COUNT(*), SUM(discount) FROM pubs..discounts
Or you can get the same result without the COUNT()
or SUM()
aggregate functions:
DECLARE @cnt int, @totalDiscount dec(5,2)SELECT @cnt = 0, @totalDiscount = 0SELECT @cnt = @cnt + 1, @totalDiscount = @totalDiscount + discount FROM pubs..discountsSELECT @cnt, @totalDiscount
I tested both solutions on the table with nearly 15,000,000 rows and found the solution without aggregate functions slower than the standard one by approximately 15 percent. But when I tried to implement logic that was a little more complicated (if the current quantity value can be divided by 10 without a reminder, then that value has to be subtracted from the summary amount), both solutions showed almost the same execution times on a large table:
-- 1) Using Y-to-X transformation techniqueDECLARE @sum int SELECT @sum = 0SELECT @sum = CASE WHEN qty%10 = 0 THEN @sum - qty ELSE @sum + qty END FROM pubs..salesSELECT @sum-- 2) Using standard aggregate functionSELECT SUM(CASE WHEN qty%10 = 0 THEN - qty ELSE + qty END) FROM pubs..sales
You’re probably wondering why you need to use a Y-to-X transformation technique for a task that can be done just as quickly or even faster using the traditional approach with aggregate functions. Well, because the two previous examples are very simple, they don’t let you feel the full power and flexibility of Y-to-X transformations. The next example will allow you to implement much more sophisticated logic.
Suppose you want to know the date and time when the total qty amount in the pubs..sales table exceeded 185. In addition, you want to know how many orders contributed to that amount. The following listing presents the solution:
DECLARE @cnt int, @sum int, @date datetimeSELECT @sum = 0, @cnt = 0, @date = 0SELECT * INTO #sales FROM pubs..sales ORDER BY ord_dateSELECT @date = CASE WHEN @sum < 185 THEN ord_date ELSE @date END, @sum = @sum + qty, @cnt = CASE WHEN @sum < 185 THEN @cnt + 1 ELSE @cnt END FROM #salesSELECT @cnt, @dateDROP TABLE #sales-- Result:--- -----------------------6 1993-05-24 00:00:00.000
The second task makes sense only when the data set is sorted by date. For that reason, I created the #sales table. Using Y-to-X transformation, the second (main) query calculates three values: order date, total quantity amount, and row counter. The first and third members of the select list stop updating their values when the value in @sum
exceeds 185.
Note that the result of the query depends on the order of variables in the select list. If you place the variables in order (@sum
, @cnt
, @date
), you will get another result: 6 and 1993-05-22 00:00:00.000. The @cnt
position was not changed relative to the @sum
position, and the @cnt
value still was calculated after the @sum
calculation. The result @cnt = 6 means six first orders have a total quantity amount of less than 185.
On the other hand, the position of the @date
variable was changed relative to the @sum
position, and the update of @date
occurred later than the calculation of @sum
. This is why you got the last order's date within the limit of 185 instead of the first order's date that exceeds the limit.
Row-by-Row Processing Without Cursors or While Loops
Using Y-to-X transformation, you can avoid cursors or loops when you need row-by-row data processing. Consider the following example: You need to drop indexes from the table. You can do it in many different ways, but all of them require you to apply some kind of loop operation. Using Y-to-X transformation, you can do it in one shot:
-- 1) Create testDropIndexes table and indexes IF EXISTS (SELECT * FROM sysobjects WHERE id = OBJECT_ID('testDropIndexes') AND OBJECTPROPERTY(id, 'IsUserTable') = 1)DROP TABLE testDropIndexesGOCREATE TABLE testDropIndexes ( col1 int NOT NULL, col2 int NOT NULL, col3 int NOT NULL, col4 int NOT NULL, col5 int NOT NULL) ON [PRIMARY]GOCREATE INDEX nci_col1 ON testDropIndexes(col1)CREATE INDEX nci_col2 ON testDropIndexes(col2)CREATE INDEX nci_col3 ON testDropIndexes(col3)CREATE INDEX nci_col4 ON testDropIndexes(col4)CREATE INDEX nci_col5 ON testDropIndexes(col5)-- 2) Drop the indexesDECLARE @str nvarchar(4000), @tblName varchar(100) SELECT @str = '', @tblName = 'testDropIndexes'SELECT @str = @str + 'DROP INDEX ' + @tblName + '.' + name + ';' FROM sysindexes WHERE object_name(id) = @tblName AND indID between 2 and 254 AND name NOT LIKE '_WA_SYS%'EXECUTE sp_executeSQL @str
Using the sysindexes table and Y-to-X transformation, the code accumulated the drop indexes statements in the variable @str
and executed them.
X-to-Y or Horizontal to Vertical Transformation
This section demonstrates using the X-to-Y transformation technique to pass many parameters to a stored procedure. As you know, passing a few parameters to a stored procedure is not a difficult task, but passing tens of parameters will produce many lines of additional code in the body of the stored procedure. If you go as far as passing hundreds of parameters with the requisite validation and processing (both SQL Server 2000 and 2005 allow a maximum of 2,100 parameters), it may become a non-trivial task.
One of the common ways to pass many parameters to stored procedures is collecting them into a long string and then separating them by delimiters. This is especially convenient when the number of parameters is dynamic. However, that approach may produce a problem: string processing is not the strongest feature in any RDBMS. How do you deal with such a situation? Well, you can try transforming a string into the table layout, which will make the validation and processing much easier. The following listing shows this technique using X-to-Y transformation:
SET QUOTED_IDENTIFIER OFFGOIF EXISTS (SELECT name FROM sysobjects WHERE name = 'spu_XtoYtransform' AND type ='P')DROP PROCEDURE spu_XtoYtransformGOCREATE PROCEDURE spu_XtoYtransform @iParam varchar(1000) = '', @sParam varchar(1000) = '', @del varchar(10) = '/'ASSET NOCOUNT ON-- X-to-Y transformationDECLARE @iStr nvarchar(4000), @sStr nvarchar(4000)CREATE TABLE #ip(ipID int IDENTITY(1,1) not null, ip int null) SELECT @iStr = 'INSERT INTO #ip(ip) SELECT A=' + REPLACE(@iParam,@del,' UNION ALL SELECT ')CREATE TABLE #sp(spID int IDENTITY(1,1) not null, sp VARCHAR(30) NULL) SELECT @sStr = 'INSERT INTO #sp(sp) SELECT A = "' + REPLACE(@sParam,@del,'" UNION ALL SELECT "') + '"'EXECUTE(@iStr + @sStr)SELECT * FROM #ip; SELECT * FROM #sp-- validation-- . . . . . . . . . . . . . . . . . . . . . . -- logic implementation-- . . . . . . . . . . . . . . . . . . . . . .GO-- How to run exampleEXEC spu_XtoYtransform '12/23/100/34/45/56/67','London/Paris/New-York/Toronto/Mexico/Lima/Tokyo','/'
Note that spu_XtoYtransform
receives two string parameters: one consists of integers and the other of alphabet characters.
To transform each string into the vertical table view, you need to replace the delimiters by using the phrase "UNION ALL SELECT" and dynamically executing that statement. By using "UNION ALL" instead of just "UNION", you'll get the values in the resulting column in exactly the same order as they were in the string.
The following listing provides the code for showing a table's row vertically as a column:
1 USE pubs2 GO3 DECLARE @str nvarchar(4000)4 SELECT @str = ''5 SELECT @str = @str + ','+ COLUMN_NAME + 6 ' FROM pubs..employee WHERE emp_id=' + '''' + 'PMA42628M' + ''''7 FROM pubs.INFORMATION_SCHEMA.COLUMNS8 WHERE TABLE_NAME = 'employee'9 SELECT @str = 'INSERT INTO #XtoY SELECT a=' + 10 SUBSTRING(@str,2,len(@str) - 1)11 CREATE TABLE #XtoY(X_values varchar(30) null)12 SELECT @Str = REPLACE(@str,',',' UNION ALL SELECT ')13 EXECUTE sp_executesql @str14 SELECT * FROM pubs..employee WHERE emp_id='PMA42628M'15 SELECT * FROM #XtoY16 DROP TABLE #XtoY
You can enumerate the columns manually or apply Y-to-X transformation to the INFORMATION_SCHEMA.COLUMNS
view as the example does. Then you just need to replace the comma delimiter using "UNION ALL SELECT" and execute the final statement.
Note: Code in the previous listing works in SQL Server 2000. To make it work in SS2005, you must explicitly convert each column into varchar, replacing line 5 with the following:
SELECT @str = @str + ',CAST('+ COLUMN_NAME + ' AS varchar(60))' +
Cross-Tab Queries and Axes Transformations
You may be wondering what the difference (or similarity) is between the new PIVOT (UNPIVOT) operators introduced in SS2005 and the axes transformation techniques in this article. Well, the PIVOT (UNPIVOT) operators are more suitable for cases when you need to aggregate data and present the result as a cross-tab report. You can get the same results using the SELECT...CASE
technique, the method that was commonly used to generate cross-tab queries in pre-SS2005 versions of SQL Server. However, PIVOT (UNPIVOT) operators are simpler and more flexible.The axes transformation technique does offer the same functionality:
-- Classical SELECT...CASE solution from BOL (SS2000)CREATE TABLE tblPivot (Year SMALLINT, Quarter TINYINT, Amount DECIMAL(2,1));GOINSERT INTO tblPivot VALUES (1990, 1, 1.1);INSERT INTO tblPivot VALUES (1990, 2, 1.2);INSERT INTO tblPivot VALUES (1990, 3, 1.3);INSERT INTO tblPivot VALUES (1990, 4, 1.4);INSERT INTO tblPivot VALUES (1991, 1, 2.1);INSERT INTO tblPivot VALUES (1991, 2, 2.2);INSERT INTO tblPivot VALUES (1991, 3, 2.3);INSERT INTO tblPivot VALUES (1991, 4, 2.4);GOSELECT Year, SUM(CASE Quarter WHEN 1 THEN Amount ELSE 0 END) AS Q1, SUM(CASE Quarter WHEN 2 THEN Amount ELSE 0 END) AS Q2, SUM(CASE Quarter WHEN 3 THEN Amount ELSE 0 END) AS Q3, SUM(CASE Quarter WHEN 4 THEN Amount ELSE 0 END) AS Q4 FROM tblPivot GROUP BY Year;-- Result:Year Q1 Q2 Q3 Q41990 1.1 1.2 1.3 1.41991 2.1 2.2 2.3 2.4 -- Solution with PIVOT operatorSELECT Year, [1] AS Q1, [2] AS Q2, [3] AS Q3, [4] AS Q4 FROM tblPivotPIVOT(SUM(Amount)FOR Quarter IN([1],[2],[3],[4]) ) AS pvt;-- Result:Year Q1 Q2 Q3 Q41990 1.1 1.2 1.3 1.41991 2.1 2.2 2.3 2.4 -- Solution with the axes transformationsDECLARE @a1 DEC(2,1), @a2 DEC(2,1), @a3 DEC(2,1), @a4 DEC(2,1), @a5 DEC(2,1), @a6 DEC(2,1), @a7 DEC(2,1), @a8 DEC(2,1);SELECT @a1 = 0,@a2 = 0,@a3 = 0,@a4 = 0,@a5 = 0,@a6 = 0,@a7 = 0,@a8 = 0;SELECT @a1 = CASE WHEN quarter = 1 AND Year = 1990 THEN @a1 + amount ELSE @a1 END, @a2 = CASE WHEN quarter = 2 AND Year = 1990 THEN @a2 + amount ELSE @a2 END, @a3 = CASE WHEN quarter = 3 AND Year = 1990 THEN @a3 + amount ELSE @a3 END, @a4 = CASE WHEN quarter = 4 AND Year = 1990 THEN @a4 + amount ELSE @a4 END, @a5 = CASE WHEN quarter = 1 AND Year = 1991 THEN @a5 + amount ELSE @a5 END , @a6 = CASE WHEN quarter = 2 AND Year = 1991 THEN @a6 + amount ELSE @a6 END , @a7 = CASE WHEN quarter = 3 AND Year = 1991 THEN @a7 + amount ELSE @a7 END , @a8= CASE WHEN quarter = 4 AND Year = 1991 THEN @a8 + amount ELSE @a8 END FROM tblPivot;SELECT Year = 1990, Q1 = @a1, Q2 = @a2, Q3 = @a3, Q4 = @a4UNION ALLSELECT 1991, @a5, @a6, @a7, @a8;-- Result:Year Q1 Q2 Q3 Q41990 1.1 1.2 1.3 1.41991 2.1 2.2 2.3 2.4
As you can see, the most elegant solution for generating cross-tab queries is a PIVOT solution. The solution with the axes transformation gives you the same result, but it looks pretty messy and has some limitations (e.g., number of variables equal to Years times Quarters). So, the axes transformation techniques described in this article are more suitable for tasks such as table axes rotations, user-defined aggregations, row-by-row processing without cursors (loops), and many others that are too numerous to include in one article.
Therefore, the axes transformation techniques don't replace or compete with PIVOT (UNPIVOT) operators. Rather, they give you additional functionality and flexibility.
Trusty Old Tricks
I started to using X-to-Y and Y-to-X transformation techniques a long time ago, and found they simplified my code and reduced development time. They also allowed me to apply a new approach to some well-known standard tasks. I'm sure they will do the same for you.