Login | Register   
LinkedIn
Google+
Twitter
RSS Feed
Download our iPhone app
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


Tip of the Day
Language: SQL Server
Expertise: Beginner
Apr 19, 2000

Delete Duplicate Rows with a Single SQL Statement

Question:
Is it possible to delete duplicate rows in a table without using a temporary table (i.e., just do it with a single SQL statement)?

Answer:
All you need to do is compare the table to itself to find out which candidates are duplicates. Do this by assigning aliases to the table so you can use it twice, once as A and again as B, like this:

delete 
    from jobs
where 
    job desc in
    (        
        select 
              a.job_desc 
        from 
            jobs a, 
            jobs b 
        where 
            a.job_desc = b.job_desc 
        group by 
            a.job_desc 
        having 
             count(a.job_desc) >1
    )
When you do this you'll get a count based on the column value you think is duplicated. I used "desc" because the IDs will be different, so the description is the thing that is the candidate for repetition.

Join the table to itself on that candidate to find matches of it. Everything will match to itself at least once—that's why you group by the thing you think is a duplicate. Applying the HAVING clause to it squeezes out all the "ones" or singletons, leaving only the rows that have counts that are more than one—in other words, your duplicate rows.

By the way, this code trashes all the records that are duplicates. If you want to save one, add a comparison for the IDs to be different in the WHERE clause.

DevX Pro
 
Comment and Contribute

 

 

 

 

 


(Maximum characters: 1200). You have 1200 characters left.

 

 

Sitemap
Thanks for your registration, follow us on our social networks to keep up-to-date