imiting the number of records in a group is a common request. For example, suppose your sales director requests a custom report showing the top salesperson for each region. At first, limiting the number of records in that group seems simple: Use GROUP BY and TOP, but that doesn’t work. Instead of limiting the number of records in each region, TOP retrieves just one record?whichever record happens to filter to the top of the recordset. You might work your way through a number of other possibilities with varying degrees of success, including:
- Combining GROUP BY and HAVING
- Adding a COUNT() function to the mix
- Using Transact SQL (T-SQL) to jump right to a subquery
The third solution goes in the right direction, but getting just the right syntax might be baffling. A subquery using IN is the way to go, unless you’re using MS Access to report the SQL Server data. Then, you don’t need code at all?the Report objects will limit the groups.
The SQL Server Solution
To limit the number of records in a group, you might need to limit the groups at the source: SQL Server. Most SQL Server solutions are a bit convoluted?stored procedures cycling through records to compare and count specific values. Fortunately, there’s a much easier solution. Try an IN subquery that uses the following syntax:
SELECT *FROM source d1WHERE d1.groupfieldIN (SELECT TOP n pkfield FROM source d2 WHERE d1.groupfield = d2.groupfield [ORDER BY groupsortfield])[ORDER BY tablesortfield]
Both ORDER BY clauses are optional, but in most cases you’ll use them. Unless you’re retrieving values at random, it doesn’t make much sense to omit ORDER BY from the subquery. Most of the arguments are self-explanatory, but just in case:
- Source is the data source.
- Groupfield is the field by which you’re grouping the records.
- N represents the value by which you want to limit each group.
- Pkfield is the primary key field in source.
- Groupsortfield is the field by which you sort the grouped records in the subquery.
- Tablesortfield is the field by which you sort the resulting recordset.
In a nutshell, the subquery returns the top n records from a grouped recordset. The recordset is based on groupfield, which exists in both the main and sub derived tables. The main SELECT retrieves a record for each matching record in the subquery and TOP limits the number of records. Without TOP, the subquery would return all the records. You could say it’s a sleight of hand, but it works!
|Figure 1. Using a Subquery to Limit the Number of Records in a Group: Here is the result of running code that uses a subquery to limit the number of records in a group.|
Figure 1 shows the result of running the following code against the PurchaseOrderDetail table in AdventureWorks:
SELECT PurchaseOrderID, UnitPriceFROM Purchasing.PurchaseOrderDetail d1WHERE d1.PurchaseOrderDetailIDIN (SELECT TOP 3 PurchaseOrderDetailID FROM Purchasing.PurchaseOrderDetail d2 WHERE d1.PurchaseOrderID = d2.PurchaseOrderID ORDER BY UnitPrice DESC)ORDER BY d1.PurchaseOrderDetailID
The ORDER BY clause in the subquery ensures that the subquery retrieves the highest-priced items for each purchase order. Since n is 3, the subquery returns three (or fewer) records for each PurchaseOrderDetailID value. The main query sorts the retrieved records by the purchase order number.
|Figure 2. Using a Subquery to Limit the Number of Records in a Group: A JOIN doesn’t complicate the basic syntax.|
Working with multiple tables is just as easy. Simply include a JOIN as you normally would (see Figure 2). As you can see, this syntax uses an INNER JOIN between Purchasing.PurchaseOrderHeader and Purchasing.PurchaseOrderDetail.
SELECT d1.*, d2.* FROM Purchasing.PurchaseOrderHeader d1INNER JOIN Purchasing.PurchaseOrderDetail d2 ON d1.PurchaseOrderID = d2.PurchaseOrderIDWHERE d2.PurchaseOrderDetailID IN ( SELECT TOP 3 PurchaseOrderDetailID FROM Purchasing.PurchaseOrderDetail dd WHERE dd.PurchaseOrderID = d2.PurchaseOrderID )ORDER BY d1.PurchaseOrderID
|Figure 3. Running the Enhanced Statement: You can enhance the basic statement.|
Once you’re familiar with the basic syntax, you can enhance the statement as necessary. For instance, Figure 3 shows the result of running the enhanced statement in the following code:
SELECT d1.*, d2.* FROM Purchasing.PurchaseOrderHeader d1INNER JOIN Purchasing.PurchaseOrderDetail d2 ON d1.PurchaseOrderID = d2.PurchaseOrderIDWHERE d2.PurchaseOrderDetailIDIN (SELECT TOP 3 PurchaseOrderDetailID FROM Purchasing.PurchaseOrderDetail dd WHERE dd.PurchaseOrderID = d2.PurchaseOrderID GROUP BY PurchaseOrderDetailID )AND d1.PurchaseOrderID IN (SELECT PurchaseOrderID FROM Purchasing.PurchaseOrderDetail dd WHERE dd.PurchaseOrderID = d1.PurchaseOrderID GROUP BY PurchaseOrderID HAVING COUNT(PurchaseOrderID) >= 3 )ORDER BY d1.PurchaseOrderID
The HAVING COUNT() component limits records to only those groups that have at least three records. In this syntax, the two limiting values are both 3, but they don’t have to be the same.
Turning the Code into a Stored Procedure
The group-limiting solution presented so far is adequate, but an input parameter would make it more flexible. By turning the code into a stored procedure, you can pass the number of records you want to see in each group when you call the stored procedure. To turn this code into a stored procedure, open a new query window against AdventureWorks and execute this code:
|Figure 4. Passing the Limiting Value Via a Call to the Stored Procedure: You can turn the code into a flexible stored procedure.|
CREATE PROCEDURE uspLimitGroup @n intASBEGIN SELECT H.*, D.* FROM Purchasing.PurchaseOrderHeader H INNER JOIN Purchasing.PurchaseOrderDetail D ON H.PurchaseOrderID = D.PurchaseOrderID WHERE D.PurchaseOrderDetailID IN ( SELECT TOP (@n) PurchaseOrderDetailID FROM Purchasing.PurchaseOrderDetail dd WHERE dd.PurchaseOrderID = D.PurchaseOrderID ) ORDER BY H.PurchaseOrderIDEND
Notice that the code substitutes the input parameter @n for the literal value 3 used by the previous code listings.
As you can see in Figure 4, passing the limiting value via a call to the stored procedure returns the same results. Simply pass the limiting value when you call the stored procedure. You could also pass the name of the tables, making the stored procedure even more flexible.
The Microsoft Access Solution
If you’re working with MS Access as your front end, you don’t need code to limit the number of records in a grouped report. Report objects can do the work for you. Use a main report to identify the group and a subreport to limit the grouped records. It’s an easy trick that works well.
First, you retrieve (link to) the SQL Server data. Then, base a Totals query on the SQL Server data, and include only the field by which you’re grouping the report. Base the main report on this Totals query. A second query is the basis for the subreport, which displays the data you need to see. By limiting this query to the number of records by which you want to limit each group, you get the effect of limiting a group without actually doing so. Technically, the limit belongs to the subreport, not the group.
Whether you’re using MS Access to link to SQL Server data or native tables, the following technique works:
- Create a new query on the table that contains the data that you want to limit.
- Drag the field by which you’re grouping the records to the design grid. This query will contain just this one field (groupfield from the SQL Server example in the previous section).
- In the MS Access Query Design window, choose Totals from the View menu. MS Access will add a GROUP BY aggregate to the field’s Total cell, as shown in Figure 5. You’ll base the main report on this query.
Figure 5. GROUP BY Aggregate Added to the Field’s Total Cell: Let a Totals query specify the group.
- Save the query (qryMainReport) and close it.
- Create a SELECT query and include all of the data you want to view in your report from the sources necessary, including the field by which you’re grouping the report (see Step 2). Both queries must include the grouping field. You’ll base the subreport on this query.
- In the grouping field’s Criteria cell, enter the name of the main report and its one field (it doesn’t exist yet, but that’s okay) as shown in Figure 6. Be sure to enter a complete identifier in the following form: Reports!reportname!groupfield.
- Set this query’s Top Values property to the number of records by which you’re limiting each group. For instance, if you want to limit each group to two records, you’d enter the value 2 (see Figure 7). To access this property, choose Properties from the View menu. Although you’re not going to sort these records in this exercise, if you did want to sort them, this is the right query for that: choose a Sort order from the Sort cell in the design grid.
- Save the second query (qrySubReport) and close it.
- Base a simple report on the first query (qryMainReport).
- With the main report in Design view, insert a Subform/Subreport control into the report’s Detail section. Use Figure 8 as a guide.
- Double-click the subreport to open its Properties window and set the following properties, as shown in Figure 9:
- Source Object: Query.qrySubReport (should be set for you)
- Link Child Field: OrderID
- Link Master Field: OrderID
- Resize the subreport to be about the size of a single row.
- Save the report.
|Figure 10. Print Preview of the Report: The report seemingly limits the number of grouped records.|
In Print Preview, the report seemingly limits the number of grouped records (see Figure 10). Despite the way it looks, this solution is simple and relies on the subreport’s query to limit the records. This technique doesn’t actually limit a grouped recordset or report. The one limitation is that you must change the query’s Top Value property to change the record limit.
Limiting the number of records in a group is one of those problems that seems simple until you start applying solutions. Regardless of which database you’re using, this article has shown you that both SQL Server and MS Access can handle the task.