devxlogo

Delete Duplicate Rows with a Single SQL Statement

Delete Duplicate Rows with a Single SQL Statement

Question:
Is it possible to delete duplicate rows in a table without using a temporary table (i.e., just do it with a single SQL statement)?

Answer:
All you need to do is compare the table to itself to find out which candidates are duplicates. Do this by assigning aliases to the table so you can use it twice, once as A and again as B, like this:

delete     from jobswhere     job desc in    (                select               a.job_desc         from             jobs a,             jobs b         where             a.job_desc = b.job_desc         group by             a.job_desc         having              count(a.job_desc) >1    )

When you do this you’ll get a count based on the column value you think is duplicated. I used “desc” because the IDs will be different, so the description is the thing that is the candidate for repetition.

Join the table to itself on that candidate to find matches of it. Everything will match to itself at least once?that’s why you group by the thing you think is a duplicate. Applying the HAVING clause to it squeezes out all the “ones” or singletons, leaving only the rows that have counts that are more than one?in other words, your duplicate rows.

By the way, this code trashes all the records that are duplicates. If you want to save one, add a comparison for the IDs to be different in the WHERE clause.

devxblackblue

About Our Editorial Process

At DevX, we’re dedicated to tech entrepreneurship. Our team closely follows industry shifts, new products, AI breakthroughs, technology trends, and funding announcements. Articles undergo thorough editing to ensure accuracy and clarity, reflecting DevX’s style and supporting entrepreneurs in the tech sphere.

See our full editorial policy.

About Our Journalist