Login | Register   
RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Calculate Percentiles with SQL Server 2005 : Page 2

Learn one of the techniques for calculating percentiles with SQL Server 2005 using the new SQL Server Common Table Expression and the latest ROW_NUMBER function.

Calculating Percentiles with Grouping
The query in the previous section evaluates only the overall percentile. What if you want to calculate the 75th percentile for each department? That query looks a little more complex because it contains two CTEs:

DECLARE @percentile FLOAT SELECT @percentile = .75; WITH partition_row_count(dept, row_cnt) AS ( SELECT dept, COUNT(*) FROM employee GROUP BY dept HAVING COUNT(*) > 1 ), emp_sal(dept, salary, prev_rank, curr_rank, next_rank) AS ( SELECT e.dept, e.salary, (ROW_NUMBER() OVER ( PARTITION BY e.dept ORDER BY e.salary) - 2.0) / (c.row_cnt - 1), (ROW_NUMBER() OVER ( PARTITION BY e.dept ORDER BY e.salary) - 1.0) / (c.row_cnt - 1), (ROW_NUMBER() OVER ( PARTITION BY e.dept ORDER BY e.salary) + 0.0) / (c.row_cnt - 1) FROM employee e JOIN partition_row_count c ON c.dept = e.dept ) SELECT t1.dept, CASE WHEN t1.salary = t2.salary THEN t1.salary ELSE t1.salary + (t2.salary - t1.salary) * ((@percentile - t1.curr_rank) / (t2.curr_rank - t1.curr_rank)) END FROM emp_sal t1 JOIN emp_sal t2 ON t1.dept = t2.dept WHERE (t1.curr_rank = @percentile OR (t1.curr_rank < @percentile AND t1.next_rank > @percentile)) AND (t2.curr_rank = @percentile OR (t2.curr_rank > @percentile AND t2.prev_rank < @percentile))

The first CTE partition_row_count returns the row count for each department, which the second CTE will utilize to calculate the rankings. Table 1 shows the results.

Table 1. 75th Percentile for Each Department

In the Real World
In 2005, I implemented an online survey application for PricewaterhouseCoopers. The application collects participants' data and compiles it into numerous performance measures, such as EBITA (Earnings Before the deduction of Interest, Tax, and Amortization expenses). Using the technique described in this article, the application provides participants with the quartiles of each performance measure and informs them of their position on the scale. I'll leave the job of modifying the SQL to find out the percentile position for any given value to you.

Cecil Lew is an independent consultant who specializes in database design and .NET framework development.
Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.



Thanks for your registration, follow us on our social networks to keep up-to-date