Counting Duplicate Rows

How do I count the number of duplicate items in a table?

Let's break your question down into several steps. First, let's create a sample table using the following code:

create table dups
(
    i int
)
go

declare @i int
select @i = 0
while (@i < 35)
begin
    insert into dups(i) values (cast(rand() * 50 as int))
    select @i = @i + 1
end

Now, let's find the values that appear more than once. For that we can use a simple GROUP BY statement with a HAVING clause:

select i, count(*) as num_records
from dups
group by i
having count(*) > 1

My sample data produced the following result set:

i           num_records
----------- -----------
0           2
5           2
18          2
22          2
27          2
31          2
34          2
44          2
49          2

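This identifies the values that have duplicates, but it does not return the total number of duplicates in the table. The first change we must make is to recognize that each row above showing a count of 2 contains only one duplicate: one occurrence is the original, and only the extra copy counts. In other words, the number of duplicates for a value is count(*) - 1.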
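As a small sketch of that adjustment, the grouping query can report the number of extra copies per value instead of the raw count. It assumes the dups table created earlier. With the sample result shown above, every listed value would show 1.

```sql
-- Number of extra copies per value: total occurrences minus the one original.
select i, count(*) - 1 as num_dups
from dups
group by i
having count(*) > 1
```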

So we want a query that sums up the duplicates from the query above. To do so, we take the previous query and put it in the FROM clause as a derived table. We can then use the SUM function to compute the total:

select sum(num_dups)
from (
    select i, count(*) - 1 as num_dups
    from dups
    group by i
    having count(*) - 1 > 0
) as mydups
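Because the sample table is filled with rand(), its contents differ on every run, so the total is hard to verify by eye. The sketch below repeats the same query against a small hand-picked data set. The table name dups_check, its values, and the total_dups alias are hypothetical and chosen only for illustration, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical check table with known contents (not part of the original example).
create table dups_check
(
    i int
)
go

-- 1 appears twice (1 duplicate), 2 appears once (0), 3 appears three times (2).
insert into dups_check(i) values (1), (1), (2), (3), (3), (3)
go

-- Same derived-table query as above, pointed at the check table.
-- Expected total: 1 + 2 = 3.
select sum(num_dups) as total_dups
from (
    select i, count(*) - 1 as num_dups
    from dups_check
    group by i
    having count(*) - 1 > 0
) as mydups

-- Cross-check: total rows minus distinct values gives the same figure (6 - 3 = 3),
-- provided the column contains no NULLs, since count(distinct i) ignores them.
select count(*) - count(distinct i) as total_dups
from dups_check
```

One detail to keep in mind: when a table has no duplicates at all, the derived table is empty and SUM returns NULL rather than 0. Wrapping the result as isnull(sum(num_dups), 0) is one way to get a zero in that case.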
