Oracle查询某几个字段重复的记录,并删除重复记录,只保留不重复的记录
有某个数据表,由于数据表在设计阶段没有建立主键约束,跑了一段时间业务后导致有很多重复的记录,如何查询出重复的记录,并将重复的部分删除,保证某几个字段组合的唯一性。
只有将重复的记录删除,才能重写建立主键约束。
如上图:业务表中有TEMPLATE_NAME重复的记录,假设数据表中有很多数据,手动删除比较费时费力,需要用脚本删除。
最终效果为:重复的记录随便删除一条,保证TEMPLATE_NAME字段的唯一性。如下图:
--1.COUNT子查询,查询结果(记录a+与a重复的记录) SELECT * FROM MY_TABLE A WHERE (SELECT COUNT(*) FROM MY_TABLE B WHERE A.TEMPLATE_NAME=B.TEMPLATE_NAME)>1 ORDER BY A.TEMPLATE_NAME --2.INNER JOIN+ROWID过滤,查询结果(记录a+与a重复的记录) SELECT A.ROWID,B.ROWID, A.* FROM MY_TABLE A INNER JOIN MY_TABLE B ON A.TEMPLATE_NAME=B.TEMPLATE_NAME AND A.ROWID!=B.ROWID ORDER BY A.TEMPLATE_NAME --3.GROUP+HAVING过滤,查询结果(记录a+与a重复的记录) SELECT * FROM MY_TABLE A WHERE A.TEMPLATE_NAME IN ( SELECT B.TEMPLATE_NAME FROM MY_TABLE B GROUP BY B.TEMPLATE_NAME HAVING COUNT(B.TEMPLATE_NAME) > 1 ) ORDER BY A.TEMPLATE_NAME --4.OVER+PARTITION BY过滤,查询结果(仅查询与a重复的记录) SELECT * FROM ( SELECT A.TEMPLATE_NAME, ROWID,ROW_NUMBER() OVER (PARTITION BY A.TEMPLATE_NAME ORDER BY A.TEMPLATE_NAME) Rc FROM MY_TABLE A ) WHERE Rc>1 ORDER BY TEMPLATE_NAME --5.HAVING+MIN ,查询结果(仅查询与a重复的记录) SELECT * FROM MY_TABLE WHERE (TEMPLATE_NAME) IN ( SELECT TEMPLATE_NAME FROM MY_TABLE GROUP BY TEMPLATE_NAME HAVING COUNT(id) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FROM TEMPLATE_NAME GROUP BY TEMPLATE_NAME HAVING COUNT(*) > 1); --6.删除重复记录,只保留一条未重复 DELETE FROM MY_TABLE WHERE ROWID IN ( SELECT ROWID FROM (SELECT A.TEMPLATE_NAME, ROWID,ROW_NUMBER() OVER (PARTITION BY A.TEMPLATE_NAME ORDER BY A.TEMPLATE_NAME) Rc FROM MY_TABLE A) WHERE Rc>1 ) --7.删除重复记录,只保留一条未重复 DELETE MY_TABLE WHERE (TEMPLATE_NAME) IN ( SELECT TEMPLATE_NAME FROM MY_TABLE GROUP BY TEMPLATE_NAME HAVING COUNT(id) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FROM TEMPLATE_NAME GROUP BY TEMPLATE_NAME HAVING COUNT(*) > 1);