Question:
Joe Celko’s Store Survey Puzzle
You are collecting statistical information stored by the quarter hour. What your customer wants is to get information by the hour, but not on the hour. That is, we don’t want to know what the load was at 00:00 Hrs, at 01:00 Hrs, at 02:00 Hrs, and so forth. We want the average load for the first four quarter hours (00:00, 00:15, 00:30, 00:45), for the next four quarter hours (00:15, 00:30, 00:45, 01:00), and so forth. The sample table looks like this:
CREATE TABLE LoadData (calendar DATE NOT NULL, clock TIME NOT NULL, load REAL NOT NULL, PRIMARY KEY(calendar, clock));
Answer:
The best way is to add another column to hold the reporting period:
CREATE TABLE LoadData (calendar DATE NOT NULL, clock TIME NOT NULL, period INTEGER DEFAULT (0) CHECK (period BETWEEN 0 AND 96), load REAL DEFAULT (0) NOT NULL, PRIMARY KEY (calendar, clock));
Then update the table with a series of statements like this:
UPDATE LoadData SET period = 0 WHERE clock IN ('00:00', '00:15', '00:30', '00:45');
Now the report becomes:
SELECT period, MAX(clock), AVG(load) FROM LoadData GROUP BY period;
If you are not sure that the samples will fall exactly on the quarter hour, hedge your bets by letting the clock times fall within a time frame:
UPDATE LoadData SET period = 0 WHERE clock >= '00:00' AND clock < '01:00';
Just how you want to handle the query is up to you. The report query above still works unchanged, but it takes no account of how the samples are spaced, so samples that fall close together count as heavily in the average as evenly spaced ones.
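The period-column approach can be tried end to end with a small script. What follows is only a sketch under stated assumptions: SQLite (through Python's standard sqlite3 module) stands in for whatever database you actually use, the sample rows are made up, and the periods are taken to be plain hourly buckets numbered 0 to 23. Each time frame is written as a half-open range so that a sample taken exactly on the hour lands in one period only.

```python
import sqlite3

# Assumption: SQLite stands in for the real database. It keeps these DATE and
# TIME values as text, so 'HH:MM' strings compare correctly as strings.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE LoadData (
        calendar DATE NOT NULL,
        clock TIME NOT NULL,
        period INTEGER DEFAULT (0) CHECK (period BETWEEN 0 AND 96),
        load REAL DEFAULT (0) NOT NULL,
        PRIMARY KEY (calendar, clock))
""")

# Made-up sample data: two hours of readings, some slightly off the quarter hour.
samples = [
    ("2024-01-01", "00:00", 10.0),
    ("2024-01-01", "00:16", 20.0),
    ("2024-01-01", "00:29", 30.0),
    ("2024-01-01", "00:45", 40.0),
    ("2024-01-01", "01:00", 50.0),
    ("2024-01-01", "01:15", 60.0),
    ("2024-01-01", "01:31", 70.0),
    ("2024-01-01", "01:44", 80.0),
]
conn.executemany(
    "INSERT INTO LoadData (calendar, clock, load) VALUES (?, ?, ?)", samples)

# One UPDATE per hour, each with a half-open time frame [start, end).
for hour in range(24):
    start = f"{hour:02d}:00"
    end = f"{hour + 1:02d}:00"  # '24:00' for the last hour still sorts after '23:59'
    conn.execute(
        "UPDATE LoadData SET period = ? WHERE clock >= ? AND clock < ?",
        (hour, start, end))

# The report query from the answer, with an ORDER BY added for stable output.
report = conn.execute(
    "SELECT period, MAX(clock), AVG(load) FROM LoadData "
    "GROUP BY period ORDER BY period").fetchall()
for row in report:
    print(row)
```

With these rows the report has one line per hour: period 0 averages the four samples up to 00:45, and period 1 averages the four samples from 01:00 onward. Note that the report groups by period alone, as in the answer, so data for several days would be averaged together unless calendar is added to the GROUP BY.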
Puzzle provided courtesy of:
Joe Celko