0% found this document useful (0 votes)
15 views6 pages

Deleting Duplicate Rows in SQL

Uploaded by

suresh p
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views6 pages

Deleting Duplicate Rows in SQL

Uploaded by

suresh p
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

deepdecide

.com

DELETE
DUPLICATE ROWS

IN SQL
Duplicate rows in a database can cause data
integrity issues and affect the accuracy of data
analysis. It’s essential to identify and remove
duplicates to ensure the dataset is clean and
reliable.

Shah Jahan
linkedin.com/in/shah907
deepdecide
.com

Steps to Identify and Delete Duplicate Rows


1. Identifying Duplicates Using GROUP BY
To identify duplicate rows, you can use the GROUP BY clause combined with aggregate
functions like COUNT(). This will help you find rows that have the same values in one
or more columns.

Example

SELECT Column1, Column2, COUNT(*)


FROM your_table
GROUP BY Column1, Column2
HAVING COUNT(*) > 1;

This query groups the data based on Column1 and Column2 and counts the number of
occurrences of each group. If the count is greater than 1, those rows are duplicates.

linkedin.com/in/shah907
deepdecide .com

2. Deleting Duplicates Using DELETE with a CTE


A Common Table Expression (CTE) is an effective way to delete duplicate rows while
retaining the original row. Using the ROW_NUMBER() function helps in identifying the
duplicates.

Example

WITH CTE AS (
SELECT Column1, Column2, ROW_NUMBER() OVER (PARTITION BY
Column1, Column2 ORDER BY Column1) AS RowNum
FROM your_table
)
DELETE FROM CTE
WHERE RowNum > 1;

The ROW_NUMBER() function assigns a unique row number to each row within a
partition of the result set. Rows that have the same values in Column1 and Column2
are partitioned together.

The DELETE statement removes all rows where the RowNum is greater than 1,
effectively keeping only one instance of each duplicate.

linkedin.com/in/shah907
deepdecide.com

3. Deleting Duplicates Using a Temporary Table


Another way to remove duplicates is by inserting distinct rows into a temporary table
and then replacing the original table with the deduplicated data.

Example
CREATE TABLE temp_table AS
SELECT DISTINCT Column1, Column2
FROM your_table;

TRUNCATE TABLE your_table;

INSERT INTO your_table (Column1, Column2)


SELECT Column1, Column2 FROM temp_table;

DROP TABLE temp_table;

First, the distinct rows are inserted into a temporary table.

Then, the original table is truncated (which removes all rows) and repopulated with
the distinct data.

Finally, the temporary table is dropped.

linkedin.com/in/shah907
deepdecide
4. Deleting Duplicates Based on
.com

a Single Column
If you want to delete duplicates based on a specific column, you can use the following
method:

Example

DELETE FROM your_table


WHERE id NOT IN (
SELECT MIN(id)
FROM your_table
GROUP BY Column1, Column2
);

The MIN(id) function selects the row with the smallest ID for each group of duplicates.

The DELETE query removes all other rows that are not in this set, effectively keeping
one row per duplicate.

linkedin.com/in/shah907
deepdecide.com

JOIN NOW
linkedin.com/in/shah907

deepdecide .com

Free Notes: https://deepdecide.com/resources

Shah Jahan

You might also like