Delete Duplicate Rows In SQL Server From A Table

Introduction

In this article, I am going to explain how to delete duplicate rows/records in SQL server using common table expression (CTE). This is one of the most common questions asked in a SQL interview.
Here we will be using SQL Server 2017 or you can use SQL Server 2008 or above.
Prerequisites
SQL Server 2017 or you can use SQL server 2008 or above version.
Now, first we will create a Database and a table.

Creating a Database and a Table

Step 1: Create a Database
Open your SQL Server and use the following script to create the “chittadb” Database.
Create database chittadb
Now, select the script query then press F5 or click on Execute button to execute the above script.
You should see a message, “Command(s) completed successfully.” This means your new database has been created.
Step 2: Create a table
Open your SQL Server and use the following script to create table “tbl_Mcastudents”.
  1. create table tbl_Mcastudents
  2. (
  3.    Id int primary key not null identity(1,1),
  4.    Name nvarchar(50),
  5.    Location nvarchar(30),
  6.    Gender varchar(10)
  7. )
Execute the above query to create “tbl_Mcastudents “.
You should see a message, “Command(s) completed successfully.”
Now, data has been  inserted into the table.
  1. Insert into tbl_Mcastudents values ('Chitta''Chennai''Male')
  2. Insert into tbl_Mcastudents values ('Chitta''Chennai''Male')
  3. Insert into tbl_Mcastudents values ('Chitta''Chennai''Male')
  4. Insert into tbl_Mcastudents values ('Rani''Puri''Female')
  5. Insert into tbl_Mcastudents values ('Rani''Puri''Female')
  6. Insert into tbl_Mcastudents values ('Mitu''BBSR''Male')
  7. Insert into tbl_Mcastudents values ('Mitu''BBSR''Male')
  8. Insert into tbl_Mcastudents values ('Mitu''BBSR''Male')
Execute the above query, you should see a message, “Command(s) completed successfully.”
Now retrieve all data from “tbl_Mcastudents” table.
  1. select * from tbl_Mcastudents
Output
Delete Duplicate Rows In SQL Server From A Table
There are many duplicate rows (10, 11, 12), (13, 14), and (15, 16, 17) for the tbl_Mcastudents that have the same Name, Location, and Gender.

Delete duplicate rows/records in SQL server using common table expression (CTE)

To delete the duplicate rows from the table in SQL Server, we follow these steps,
  • Find duplicate rows using GROUP BY clause or ROW_NUMBER()
  • Use DELETE statement to remove the duplicate rows.
Query
  1. WITH cte AS
  2. SELECT Id, Name, Location, Gender, ROW_NUMBER() OVER (PARTITION BY Name, Location, Gender
  3. ORDER BY Name, Location, Gender) row_num FROM tbl_Mcastudents
  4. )
  5. DELETE FROM cte WHERE row_num > 1;
In above query,
  • First, the CTE uses the ROW_NUMBER() function to find the duplicate rows specified by values in the Name, Location, and Gender
  • Then, the DELETEstatement deletes all the duplicate rows but keeps only one occurrence of each duplicate group.
To execute the above query you should see a message,
(5 rows affected)
The above message indicates that the duplicate rows have been removed from the table.
Now retrieve all data from “tbl_Mcastudents” table after the duplicate rows have been deleted.
  1. select * from tbl_Mcastudents
Output
Delete Duplicate Rows In SQL Server From A Table

Conclusion

In this article, we have learned how to delete duplicate rows from a table in SQL Server.
Rate this post
error: Content is protected !!