ORACLE @ Work: Delete duplicate rows from a table

Friday, August 06, 2010

Delete duplicate rows from a table

Following query would identify the duplicate rows on a table using self join using rowid among duplicates. For faster execution, this is suggested to use unique/primary key values in select and in joinings. If there is no PK/Unique key then its safe to specify all the columns associated with table in joining.

First way:

1. Identify Duplicates

select EMPNO,DEPT_ID, GRADE

from SCOTT.EMP a

where

rowid >

(select min(rowid) from SCOTT.EMP b

where

b.EMPNO = a.EMPNO

and

b.DEPT_ID = a.DEPT_ID

and b.GRADE = a.GRADE

);

2. Delete Duplicates

See the output, do some manual verification and go for deletion with following query.

delete from

SCOTT.EMP a

and

a.rowid >

any (select b.rowid

from

SCOTT.EMP b

where

b.EMPNO = a.EMPNO

and

b.DEPT_ID = a.DEPT_ID

and b.GRADE = a.GRADE

);

Second way:

1. Identify duplicates.

select count(*) from RF_BUSINESS_UNIT_HISTORY WHERE ROWID IN
(select rowid from RF_BUSINESS_UNIT_HISTORY
MINUS
select max(rowid) from RF_BUSINESS_UNIT_HISTORY
GROUP BY
CALENDAR_YEAR_MONTH,BUSINESS_UNIT_NO,BUSINESS_UNIT_NAME);

COUNT(*)
———-
2528

2. 2258 duplicates exist – these can be deleted with command below -

delete from RF_BUSINESS_UNIT_HISTORY WHERE ROWID IN
(select rowid from RF_BUSINESS_UNIT_HISTORY
MINUS
select max(rowid) from RF_BUSINESS_UNIT_HISTORY
GROUP BY
CALENDAR_YEAR_MONTH,BUSINESS_UNIT_NO,BUSINESS_UNIT_NAME);

Friday, August 06, 2010

Delete duplicate rows from a table

No comments: