SQL Data Cleaning

How to Delete Duplicate Rows in SQL (MySQL, PostgreSQL, SQL Server)

Remove duplicate rows in SQL while keeping one copy. Covers ROW_NUMBER, self-join, CTE, and temporary table methods for MySQL, PostgreSQL, and SQL Server.

Mar 12, 2026 6 min read

Introduction

Duplicate rows are one of the most common data quality issues. The trick is deleting the duplicates while keeping exactly one copy of each unique row.

Find Duplicates First

Before deleting, identify which rows are duplicated and how many copies exist.

-- Find duplicates by email:
SELECT email, COUNT(*) AS copies
FROM users
GROUP BY email
HAVING COUNT(*) > 1
ORDER BY copies DESC;

Tip: Always find duplicates first. Never delete without verifying what will be removed.
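One way to act on this tip is to list the exact rows a later DELETE would remove, next to the row that would survive in each group. The sketch below is only an illustration: it assumes a users table with id and email columns, uses made-up sample rows, and runs on SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for your real database.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "a@x.com"), (2, "b@x.com"), (3, "a@x.com"), (4, "a@x.com"), (5, "b@x.com")],
)

# For every row that would be deleted, show the id that would be kept instead.
PREVIEW_SQL = """
SELECT id AS delete_id, email, keep_id
FROM (
  SELECT id, email,
         ROW_NUMBER() OVER (PARTITION BY email ORDER BY id ASC) AS rn,
         MIN(id)      OVER (PARTITION BY email)                 AS keep_id
  FROM users
) ranked
WHERE rn > 1
ORDER BY email, delete_id
"""

to_delete = conn.execute(PREVIEW_SQL).fetchall()
for delete_id, email, keep_id in to_delete:
    print(f"would delete id={delete_id} ({email}), keeping id={keep_id}")
```

The inner query is the same ranking used by the ROW_NUMBER delete, so the preview and the eventual DELETE select the same rows as long as the table does not change in between.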

Delete with ROW_NUMBER (PostgreSQL & SQL Server)

The cleanest method: number the rows within each group of duplicates, then delete every row numbered above 1.

-- PostgreSQL:
DELETE FROM users
WHERE id IN (
  SELECT id FROM (
    SELECT id,
      ROW_NUMBER() OVER (
        PARTITION BY email
        ORDER BY id ASC
      ) AS rn
    FROM users
  ) ranked
  WHERE rn > 1
);

-- SQL Server (using CTE):
WITH duplicates AS (
  SELECT id,
    ROW_NUMBER() OVER (
      PARTITION BY email
      ORDER BY id ASC
    ) AS rn
  FROM users
)
DELETE FROM duplicates WHERE rn > 1;

Tip: ORDER BY id ASC keeps the row with the lowest id in each group, which is the oldest row when ids are assigned sequentially. Use ORDER BY id DESC to keep the newest instead.
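To see the effect of the sort direction, the following sketch runs the subquery form of the delete twice on the same sample data. It is a minimal illustration, not a production script: the dedupe helper, the sample rows, and the use of SQLite (3.25 or later) via Python's sqlite3 module are all assumptions made for the example.

```python
import sqlite3

def dedupe(conn, keep="oldest"):
    """Delete duplicate emails, keeping the lowest id ("oldest") or highest id ("newest")."""
    direction = {"oldest": "ASC", "newest": "DESC"}[keep]  # fixed whitelist, not user input
    sql = f"""
    DELETE FROM users
    WHERE id IN (
      SELECT id FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id {direction}) AS rn
        FROM users
      ) ranked
      WHERE rn > 1
    )
    """
    with conn:  # commits on success, rolls back if the DELETE raises
        return conn.execute(sql).rowcount

def sample_db():
    """Build a small in-memory table with two duplicated emails (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(1, "a@x.com"), (2, "b@x.com"), (3, "a@x.com"), (4, "b@x.com")],
    )
    conn.commit()
    return conn

oldest = sample_db()
removed_oldest = dedupe(oldest, keep="oldest")
kept_oldest = [row[0] for row in oldest.execute("SELECT id FROM users ORDER BY id")]

newest = sample_db()
dedupe(newest, keep="newest")
kept_newest = [row[0] for row in newest.execute("SELECT id FROM users ORDER BY id")]

print(kept_oldest, kept_newest)
```

Running the delete inside a transaction means a failure leaves the table untouched, which is worth keeping even when the SQL itself is correct.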

Delete with Self-Join (MySQL)

MySQL versions before 8.0 have no CTEs or window functions, so the ROW_NUMBER method is unavailable there. Use a self-join instead.

-- Delete duplicates keeping the lowest id:
DELETE u1
FROM users u1
INNER JOIN users u2
  ON u1.email = u2.email
  AND u1.id > u2.id;

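-- Alternative without a join: keep the lowest id per email.
-- Works in all MySQL versions; wrapping the subquery in the derived
-- table "keepers" avoids error 1093 (can't specify target table for update).
DELETE FROM users
WHERE id NOT IN (
  SELECT min_id FROM (
    SELECT MIN(id) AS min_id
    FROM users
    GROUP BY email
  ) AS keepers
);

Tip: Both statements above work in all MySQL versions. On MySQL 8.0+, which added window functions, the ROW_NUMBER subquery shown for PostgreSQL also works.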
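The self-join and the MIN(id) statements above differ in one case that is easy to miss: rows where email is NULL. The sketch below illustrates this on made-up data. SQLite (via Python's sqlite3 module) is used as a stand-in and does not accept MySQL's DELETE ... JOIN syntax, so the self-join is rewritten here as an EXISTS subquery with the same matching condition; treat it as an approximation of the MySQL behavior rather than the MySQL statement itself.

```python
import sqlite3

# Hypothetical rows: one duplicated email and two rows with no email at all.
ROWS = [(1, "a@x.com"), (2, "a@x.com"), (3, None), (4, None)]

def remaining_ids(delete_sql):
    """Load the sample rows into a fresh table, run the delete, return surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", ROWS)
    conn.execute(delete_sql)
    return [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]

# Same condition as the self-join: delete a row if a lower-id row has an equal email.
# NULL = NULL is not true, so rows with NULL emails never match each other.
JOIN_STYLE = """
DELETE FROM users
WHERE EXISTS (
  SELECT 1 FROM users u2
  WHERE u2.email = users.email AND u2.id < users.id
)
"""

# GROUP BY places all NULL emails in one group, so only one of them survives.
MIN_ID_STYLE = """
DELETE FROM users
WHERE id NOT IN (
  SELECT min_id FROM (
    SELECT MIN(id) AS min_id FROM users GROUP BY email
  ) AS keepers
)
"""

join_result = remaining_ids(JOIN_STYLE)
min_id_result = remaining_ids(MIN_ID_STYLE)
print(join_result, min_id_result)
```

If the column is nullable, decide up front whether rows with a missing email count as duplicates of each other, and pick the statement that matches that decision.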

Delete with Temporary Table (Safe Method)

The safest approach: copy one row per email into a new table, swap the tables by renaming, and drop the old table once the result is verified.

-- 1. Copy unique rows:
CREATE TABLE users_clean AS
SELECT DISTINCT ON (email) *
FROM users
ORDER BY email, id ASC;  -- PostgreSQL

-- For MySQL:
CREATE TABLE users_clean AS
SELECT * FROM users
WHERE id IN (
  SELECT MIN(id) FROM users GROUP BY email
);

-- 2. Swap tables (PostgreSQL and MySQL syntax; SQL Server uses sp_rename):
ALTER TABLE users RENAME TO users_old;
ALTER TABLE users_clean RENAME TO users;

-- 3. Verify, then drop old:
DROP TABLE users_old;

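Tip: This is the safest method for production: until you drop users_old, you can roll back by renaming the tables back. CREATE TABLE ... AS copies columns and data but not indexes or key constraints, so recreate the primary key and indexes on the new table.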
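The "verify" step is easier to trust when it is an explicit check rather than a glance at the data. The sketch below walks through copy, verify, and swap on made-up rows. It assumes SQLite through Python's sqlite3 module as a stand-in engine, and the two checks (row count equals the number of distinct emails, no duplicates left) are illustrative choices, not an exhaustive validation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@x.com"), (2, "b@x.com"), (3, "a@x.com"), (4, "c@x.com")],
)
conn.commit()

def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]

# 1. Copy one row per email (the lowest id) into a new table.
conn.execute("""
CREATE TABLE users_clean AS
SELECT * FROM users
WHERE id IN (SELECT MIN(id) FROM users GROUP BY email)
""")

# 2. Verify before swapping: one row per email group, and no duplicates left.
expected_rows = scalar("SELECT COUNT(*) FROM (SELECT email FROM users GROUP BY email)")
clean_rows = scalar("SELECT COUNT(*) FROM users_clean")
leftover_dupes = scalar("""
SELECT COUNT(*) FROM (
  SELECT email FROM users_clean GROUP BY email HAVING COUNT(*) > 1
)
""")
checks_pass = clean_rows == expected_rows and leftover_dupes == 0

# 3. Swap only if the checks pass; users_old stays around as the fallback copy.
if checks_pass:
    conn.execute("ALTER TABLE users RENAME TO users_old")
    conn.execute("ALTER TABLE users_clean RENAME TO users")
    conn.commit()

print(checks_pass, scalar("SELECT COUNT(*) FROM users"))
```

On a live system, rows written to the original table between the copy and the swap would be missing from the new one, so a maintenance window or a write lock around these steps is worth considering.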

Prevent Future Duplicates

After cleaning up, add constraints to prevent duplicates from happening again.

-- Add unique constraint:
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);

-- Or unique index:
CREATE UNIQUE INDEX idx_users_email ON users (email);

-- For soft-duplicate prevention (case insensitive):
CREATE UNIQUE INDEX idx_users_email_lower
ON users (LOWER(email));  -- PostgreSQL syntax; MySQL 8.0.13+ needs ((LOWER(email)))

Tip: Add the UNIQUE constraint only after deduplication. The statement fails while duplicate values remain, and once in place it prevents the problem from recurring.
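The ordering matters in practice: a unique index cannot be created while duplicates still exist, and afterwards it rejects new ones. The sketch below shows both behaviors with a case-insensitive index on made-up rows. It assumes SQLite through Python's sqlite3 module as a stand-in engine, and try_sql is a hypothetical helper written for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "a@x.com"), (2, "a@x.com")])
conn.commit()

def try_sql(sql):
    """Run one statement; return True on success, False if the database rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False

UNIQUE_INDEX = "CREATE UNIQUE INDEX idx_users_email_lower ON users (LOWER(email))"

index_before_cleanup = try_sql(UNIQUE_INDEX)   # rejected: duplicates still present
try_sql("DELETE FROM users WHERE id = 2")      # remove the duplicate row
index_after_cleanup = try_sql(UNIQUE_INDEX)    # now accepted

insert_same_case = try_sql("INSERT INTO users (email) VALUES ('a@x.com')")
insert_other_case = try_sql("INSERT INTO users (email) VALUES ('A@X.com')")
insert_new_email = try_sql("INSERT INTO users (email) VALUES ('b@x.com')")

print(index_before_cleanup, index_after_cleanup,
      insert_same_case, insert_other_case, insert_new_email)
```

Because the index is built on LOWER(email), the differently cased address is rejected as well, which a plain UNIQUE (email) constraint would not do on a case-sensitive column.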

Best Practices

  • Always back up the table before deleting duplicates
  • Find and review duplicates with SELECT before DELETE
  • Use the temporary table method for large tables in production
  • Add UNIQUE constraints after cleanup to prevent recurrence
  • Consider which copy to keep (oldest, newest, most complete)
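For the last point, "most complete" can be expressed in the ROW_NUMBER ordering itself. The sketch below ranks each email group by how many optional columns are filled in and keeps the top row. The name and phone columns, the sample rows, and the scoring rule are hypothetical, and SQLite (3.25 or later) via Python's sqlite3 module stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# name and phone are hypothetical columns added for this illustration.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, phone TEXT)"
)
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "a@x.com", None, None),
    (2, "a@x.com", "Ann", "555-0100"),
    (3, "a@x.com", "Ann", None),
    (4, "b@x.com", "Bob", None),
])
conn.commit()

# Rank rows in each email group by the number of non-NULL optional columns;
# ties fall back to the lowest id. Every row ranked below first is deleted.
KEEP_MOST_COMPLETE = """
DELETE FROM users
WHERE id IN (
  SELECT id FROM (
    SELECT id,
           ROW_NUMBER() OVER (
             PARTITION BY email
             ORDER BY (CASE WHEN name  IS NOT NULL THEN 1 ELSE 0 END
                     + CASE WHEN phone IS NOT NULL THEN 1 ELSE 0 END) DESC,
                      id ASC
           ) AS rn
    FROM users
  ) ranked
  WHERE rn > 1
)
"""
with conn:
    conn.execute(KEEP_MOST_COMPLETE)

kept = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
print(kept)
```

The explicit id ASC tiebreaker keeps the result deterministic when two rows are equally complete.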

Generate SQL Queries with AI2SQL

Skip the syntax lookup. Describe what you need in plain English and AI2SQL generates the correct SQL query instantly.

Frequently Asked Questions

How do I delete duplicate rows but keep one?

Use ROW_NUMBER() OVER (PARTITION BY duplicate_column ORDER BY id) to number the rows in each group, then delete every row whose number is greater than 1. This keeps the row with the lowest id in each group.

Which method is safest for production databases?

The temporary table method: copy unique rows to a new table, swap tables, verify, then drop the old table. It allows easy rollback.

Can AI2SQL help me deduplicate data?

Yes. Describe your situation like "delete duplicate users keeping the one with the lowest id" and AI2SQL generates the correct deduplication query.
