Saturday, April 10, 2010

Delete Duplicates from a Table

WITH DupCTE(Col1,Col2, Ranking)
AS
(
SELECT
Col1
, Col2
, DENSE_RANK() OVER
(PARTITION BY ID ORDER BY NEWID() ASC) AS Ranking
FROM TableName
)

DELETE FROM DupCTE
WHERE Ranking > 1;
GO

Few days back I had a table which had duplicates on the Id columns also, so i had to flip a bit to write a shortest code i can write to delete or remove duplicates from the table.
Below I will demonstrate it by create a temp table, inserting it with some duplicate data and then remove the dups using CTE( Common Table Expressions)

1. Create Table with duplicate data:
--Create Table
CREATE TABLE #Dups
(
Id INT
, Name VARCHAR(50)
);
GO
--Insert data with Duplicates
INSERT INTO #Dups VALUES ( 1, 'Rahul');
INSERT INTO #Dups VALUES ( 1, 'Rahul');
INSERT INTO #Dups VALUES ( 2, 'Divya');
INSERT INTO #Dups VALUES ( 2, 'Divya');
INSERT INTO #Dups VALUES ( 3, 'Jason');

GO

2. CTE to remove Dups:
WITH DupCTE(Id, Name, Ranking)
AS
(
SELECT
ID
, Name
, DENSE_RANK() OVER(PARTITION BY ID ORDER BY NEWID() ASC) AS Ranking
FROM #Dups
)

DELETE FROM DupCTE
WHERE Ranking > 1;
GO


You can download the complete sample code from here.

Cheers!!

1 comment:

Google Dataset Search

Google's Vision statement  is “ to provide access to the world's information in one click. ” Google’s mission Statement is “ ...