ometimes you need to generate sequences of date and/or time values. For example, take a data warehouse that has a multidimensional structure and stores the measures and dimensions in fact and dimension tables. Since one of the dimensions is almost always the time dimension, you would need to generate the sequence of date (time) values and load them into the time dimension table. This article demonstrates how to build date generators without the loops using SQL. In addition, it will show some useful techniques that can help you manipulate date and time data.
Date Generators
First, let’s create and load an auxiliary table that consists of one integer column:
SET NOCOUNT ON;DECLARE @startDate smalldatetime, @endDate smalldatetime, @i int;SELECT @i = 1, @startDate = 'Jan 01, 2007', @endDate = DATEADD(yy, 1, @startDate);IF EXISTS(SELECT * FROM sysobjects WHERE ID = (OBJECT_ID('dbo.sequence')) AND xtype = 'U') DROP TABLE dbo.sequence;CREATE TABLE dbo.sequence(num int NOT NULL);WHILE(@i <= (SELECT DATEDIFF(dd, @startDate, @endDate))) BEGIN INSERT INTO dbo.sequence VALUES(@i); SET @i = @i + 1; END
You can use SQL to generate sequences of numbers in many ways (I personally know about 10). The above listing keeps things uncomplicated by using a simple loop. The amount of numbers (365) in the auxiliary table is equal to the number of days in the year 2007 and they have been calculated automatically using the SQL Server DATEDIFF function. Now, you can generate the sequence of days in year 2007 as follows:
DECLARE @iniDate smalldatetime;SELECT @iniDate = 'Dec 31, 2006';SELECT DATEADD(dd, num, @iniDate) DayOfYearFROM sequence;
Results:
DayOfYear-----------------------2007-01-01 00:00:002007-01-02 00:00:002007-01-03 00:00:002007-01-04 00:00:00. . . . . . . . . . 2007-12-27 00:00:002007-12-28 00:00:002007-12-29 00:00:002007-12-30 00:00:002007-12-31 00:00:00
Generating the sequence of days for one year is very simple and fast. However, the task may become more complicated if days of the year don't give you enough granularity for your analyses and you need to dig deeper. You may want to generate the hours, minutes, or even seconds of the year, for example. In the last case, you would need to produce 365 * 24 * 60 * 60 = 31,536,000 date/time values.
Of course, you have a design option for your data warehouse. Instead of creating a star schema, where each dimension table will be directly connected to the fact table, you can employ a snowflake schema, where dimensions can have more complicated structures and consist of a few joined tables. An upcoming section talks about needing to generate a large amount of date/time values. From a technical perspective, you can do that using the idea shown in the previous code, but with a couple of changes. You need to load 31,536,000 numbers into an auxiliary table and then replace the days with seconds in the DATEADD function as follows:
DECLARE @startDate datetime, @iniDate datetime;SELECT @iniDate = 'Dec 31, 2006 23:59:59';SELECT DATEADD(ss, num, @iniDate) secOfYearFROM sequence order by 1;
Results:
secOfYear -----------------------2007-01-01 00:00:00.0002007-01-01 00:00:01.0002007-01-01 00:00:02.0002007-01-01 00:00:03.000. . . . . . . . . . 2007-12-31 23:59:56.0002007-12-31 23:59:57.0002007-12-31 23:59:58.0002007-12-31 23:59:59.000
Such a solution has a few drawbacks. Loading 31,536,000 numbers into a table?even without the loop?will take time. When loaded, that table will occupy about 0.5GB.Also, if you use the auxiliary table for many applications, tasks, and users, your single table will slow down the queries that don't need all the numbers and will use a WHERE clause to restrict the amount of returned data. To make it faster, you need to create a clustered index and maintain it. You can avoid all these drawbacks by slightly changing the approach as follows:
-- Hours of the year---------------------------------------------DECLARE @iniDate smalldatetime;SELECT @iniDate = 'Dec 31, 2006';SELECT DATEADD(hh, t2.num - 1, t1.d) YearHourFROM(SELECT DATEADD(dd, num, @iniDate) dFROM sequence) t1CROSS JOIN (SELECT num FROM sequence WHERE num < 25) t2ORDER BY 1;
Results:
YearHour-----------------------2007-01-01 00:00:002007-01-01 01:00:002007-01-01 02:00:00...................2007-12-31 20:00:002007-12-31 21:00:002007-12-31 22:00:002007-12-31 23:00:00(8760 row(s) affected)-- Minutes of yearDECLARE @iniDate smalldatetime;SELECT @iniDate = 'Dec 31, 2006';SELECT DATEADD(mi, t4.num - 1, t3.h)FROM(SELECT DATEADD(hh, t2.num - 1, t1.d) hFROM(SELECT DATEADD(dd, num, @iniDate) dFROM sequence) t1CROSS JOIN (SELECT num FROM sequence WHERE num < 25) t2) t3CROSS JOIN (SELECT num FROM sequence WHERE num < 61) t4ORDER BY 1
Results:
YearMin-----------------------2007-01-01 00:00:002007-01-01 00:01:002007-01-01 00:02:00.........................2007-12-31 23:56:002007-12-31 23:57:002007-12-31 23:58:002007-12-31 23:59:00(525600 row(s) affected)
Similarly, you can generate the seconds of the year. The solution in the above code is slower than solutions from the previous two. However, it gives you another alternative for producing sequences of dates in SQL.
Create Your Own Date Formats
Even though any programming language offers a lot of date/time formats, you can always face the situation when the format you need is missing. Using SQL, you can easily solve that problem and generate dates or times in your own format. As an example, suppose you require the unusual format of months represented by roman numerals (I've actually met people who used such a format), like this:
31/XII/2006 or 31-XII-2006
To generate dates in this format for a full year, you can use the following code:
SET NOCOUNT ON;IF EXISTS(SELECT * FROM sysobjects WHERE ID = (OBJECT_ID('romanNumbers')) AND xtype = 'U') DROP TABLE romanNumbers;CREATE TABLE romanNumbers( id int IDENTITY(1,1), romNum VARCHAR(4));INSERT INTO romanNumbers VALUES('I');INSERT INTO romanNumbers VALUES('II');INSERT INTO romanNumbers VALUES('III');INSERT INTO romanNumbers VALUES('IV');INSERT INTO romanNumbers VALUES('V');INSERT INTO romanNumbers VALUES('VI');INSERT INTO romanNumbers VALUES('VII');INSERT INTO romanNumbers VALUES('VIII');INSERT INTO romanNumbers VALUES('IX');INSERT INTO romanNumbers VALUES('X');INSERT INTO romanNumbers VALUES('XI');INSERT INTO romanNumbers VALUES('XII');SELECT CAST(num AS VARCHAR(12)) + '/' + romNum + '/' + '2007' AS MyFormatFROM (SELECT num FROM sequence WHERE num < 32) t1CROSS JOIN romanNumbers t2ORDER BY romNum, num;
Results:
MyFormat----------------------1/I/20072/I/20073/I/2007. . . . . 28/II/200729/II/200730/II/200731/II/2007. . . . . 28/XII/200729/XII/200730/XII/200731/XII/2007(372 row(s) affected)
Though the script works, you will find a few unpleasant surprises in the result. First, the query generated 372 days in the year 2007, which is incorrect. Next, each month has 31 days, including February. The reason for such a weird result is obvious: the code took the maximal possible number of days in the month (31) and created the pool of days. Now, you need to build and implement logic that will ensure the proper number of days in the month (year).
Notice that none of the previous examples contained anything about that logic. They used SQL Server built-in date functions that calculate the correct number of days in the month (year) internally, making that process transparent to the programmer. Fortunately, you can use these same functions to generate the correct result in your own date format, as follows:
SELECT MyFormat, YearDayFROM (SELECT id, num, CAST(num AS VARCHAR(12)) + '/' + romNum + '/' + '2007' AS myFormat FROM (SELECT num FROM sequence WHERE num < 32) t1 CROSS JOIN romanNumbers t2) t3INNER JOIN (SELECT DATEPART(dd,DATEADD(dd, num, 'Dec 31, 2006')) AS DayOfMonth, DATEADD(dd, num, 'Dec 31, 2006') YearDay FROM sequence) t4ON t3.num = t4.DayOfMonth AND t3.id = DATEPART(mm, t4.YearDay)
Results:
MyFormat YearDay---------------------- -----------------------1/I/2007 2007-01-01 00:00:00.0002/I/2007 2007-01-02 00:00:00.000. . . . . . . . . . . . . . . . . . . . . . . .27/II/2007 2007-02-27 00:00:00.00028/II/2007 2007-02-28 00:00:00.0001/III/2007 2007-03-01 00:00:00.000. . . . . . . . . . . . . . . . . . . . . . . .30/XII/2007 2007-12-30 00:00:00.00031/XII/2007 2007-12-31 00:00:00.000(365 row(s) affected)
The method in this section is very flexible and will come in handy when you can't find a suitable built-in date/time format.
Create Your Own Calendars
Almost every programmer has tried to create a calendar. In addition to the homemade variety, you'll find a lot of third-party calendars and many software products offer built-in calendars. However, those calendars have one significant drawback?you can't change their layout or traditional designation on the fly. Usually you can make changes only in their presentation layers.
This section discusses SQL Server 2005's new PIVOT operator and how you can use it to build traditional or unusual calendars. To start the example, create the following simple query for number of days in the months of 2007:
SELECT '#days' AS Month, * FROM (SELECT DATEADD(dd, num, 'Dec 31, 2006') days, DATENAME(mm, DATEADD(dd, num, 'Dec 31, 2006')) months FROM sequence) AS SourceTablePIVOT(COUNT(days)FOR months IN([January],[February],[March],[April],[May],[June],[July],[August], [September],[October],[November],[December])) AS PivotTable
Results:
Month January February March April May June July August September October November December----- ------- -------- ----- ----- --- ---- ---- ------ --------- ------- -------- --------#days 31 28 31 30 31 30 31 31 30 31 30 31
Next, you may want to know how many weekends are in each month. Instead of going to the calendar and counting the weekends, you can write the following simple query:
SELECT * FROM (SELECT '#weekends' Month, DateName(mm, DATEADD(dd, num, 'Dec 31, 2006')) MonthName, DATEPART(dd, DATEADD(dd, num, 'Dec 31, 2006')) YearDay FROM sequence WHERE DateName(dw, DATEADD(dd, num, 'Dec 31, 2006')) IN ('Sunday','Saturday')) sourceTablePIVOT (COUNT(YearDay)FOR MonthName IN ([January],[February],[March],[April],[May],[June],[July],[August], [September],[October],[November],[December])) AS pivotTable;
Results:
Month January February March April May June July August September October November December--------- ------- -------- ----- ----- --- ---- ---- ------ --------- ------- -------- --------#weekends 8 8 9 9 8 9 9 8 10 8 8 10
If you're a contract worker who is paid hourly and has an eight-hour workday, then you may want to know the number of working hours in each month. In that case, you will find the following script very helpful:
SELECT *FROM (SELECT '#WorkHours' AS [Month], DateName(mm, DATEADD(dd, num, 'Dec 31, 2006')) MonthName, COUNT(DATEADD(dd, num, 'Dec 31, 2006'))*8 AS WorkingHours FROM sequence WHERE DateName(dw, DATEADD(dd, num, 'Dec 31, 2006')) IN ('Monday','Tuesday','Wednesday','Thursday','Friday') GROUP BY DatePart(yy, DATEADD(dd, num, 'Dec 31, 2006')), DateName(mm, DATEADD(dd, num, 'Dec 31, 2006')) ) sourceTablePIVOT (SUM(WorkingHours)FOR MonthName IN([January],[February],[March],[April],[May],[June],[July],[August], [September],[October],[November],[December])) AS pivotTable;
Results:
Month January February March April May June July August September October November December---------- ------- -------- ----- ----- --- ---- ---- ------ --------- ------- -------- --------#WorkHours 184 160 176 168 184 168 176 184 160 184 176 168
In real life, you may have fewer working hours in a month because of the holidays. In that case, you can place the holiday dates in a separate table and modify the script from the above listing (try this as an exercise).
Finally, the following example shows you how to build a traditional calendar using the PIVOT operator:
SELECT * FROM (SELECT DatePart(mm, DATEADD(dd, num, 'Dec 31, 2006')) MonthNum, DateName(mm, DATEADD(dd, num, 'Dec 31, 2006')) MonthName, DatePart(wk, DATEADD(dd, num, 'Dec 31, 2006')) WeekOfYear, DATEPART(dd, DATEADD(dd, num, 'Dec 31, 2006')) YearDay, DateName(dw, DATEADD(dd, num, 'Dec 31, 2006')) weekDayName FROM sequence) sourceTablePIVOT (MIN(YearDay) FOR weekDayName IN(Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday)) AS pivotTableORDER BY MonthNum, WeekOfYear;
Results:
#Month MonthName WeekOfYear Sunday Monday Tuesday Wednesday Thursday Friday Saturday------ --------- ----------- ------- ------ ------- --------- -------- ------ --------1 January 1 NULL 1 2 3 4 5 61 January 2 7 8 9 10 11 12 131 January 3 14 15 16 17 18 19 201 January 4 21 22 23 24 25 26 271 January 5 28 29 30 31 NULL NULL NULL2 February 5 NULL NULL NULL NULL 1 2 32 February 6 4 5 6 7 8 9 102 February 7 11 12 13 14 15 16 172 February 8 18 19 20 21 22 23 242 February 9 25 26 27 28 NULL NULL NULL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 November 46 11 12 13 14 15 16 1711 November 47 18 19 20 21 22 23 2411 November 48 25 26 27 28 29 30 NULL12 December 48 NULL NULL NULL NULL NULL NULL 112 December 49 2 3 4 5 6 7 812 December 50 9 10 11 12 13 14 1512 December 51 16 17 18 19 20 21 2212 December 52 23 24 25 26 27 28 2912 December 53 30 31 NULL NULL NULL NULL NULL
Time and Date Manipulations
The flexible techniques shown in this article can be useful in projects when you need to generate and manipulate date/time data. Although the examples are implemented in SQL Server, you can easily adjust them for other RDBMS, because many RDBMS have similar date/time functions.