【删除重复记录】SQL删除重复记录
1.给name字段修改成唯一索引:
alter table test add unique index (name);
这样当向表中添加相同记录的时候,会返回1062的添加失败信息。
但是有一种情况是表中已经有n个重复的记录,这时候我们才想起来要添加唯一索引,再执行上面的操作时,数据库会告诉你已经有重复的记录了,建立索引失败,这时候,我们可以用下面的操作:
它会删除重复的记录(别怕,会保留一条)(但是这个命令在MySQL5.1.37之前是可以的,在5.1.48以后就行不通了),然后建立唯一索引,高效而且人性化。
另外需要注意的是alter ignore table 在percona版本的MySQL行不通,因为它创建索引的方式是:fast index creation
3.删除重复记录法:
创建一个表用来存放,要删除的记录的id信息:
`id` int ( 11 ),
`name` char(20)
) ENGINE = MyISAM;
如果要删除的记录不多的话,可以把这个表创建成内存表形式:
`id` int ( 11 ),
`name` char(20)
) ENGINE = HEAP;
然后在test表中删除重复记录:
delete a.* from test a,tmp_ids b where b.name=a.name and a.id>b.id ;
truncate table tmp_ids;
-- 检查重复code1 select count(identity) num, identity from event_log where code='code1' group by identity having count(identity) > 1 order by num desc
删除重复记录
DELETE FROM event_log WHERE `code`='code1' AND identity IN ( SELECT identity from ( SELECT identity FROM event_log WHERE code='code1' GROUP BY identity HAVING count(identity) > 1 ) a ) AND id NOT IN ( SELECT keepId FROM ( SELECT min(id) keepId FROM event_log WHERE code='code1' GROUP BY identity HAVING count(identity) > 1 ) b )
其中 a 和 b 两个中间表的作用是, 避免执行时出现 You can't specify target table 'xxxxx' for update in FROM clause 错误
分组按时间正序取第一条记录, 巧妙地使用了not exists
select d.* from t_charge d where not exists (select 1 from t_charge where user_id = d.user_id and created_at <</span> d.created_at)
按时间倒序则是
select f.* from t_charge f where not exists (select 1 from t_charge where user_id = f.user_id and created_at > f.created_at)
转自:https://www.cnblogs.com/milton/p/6354229.html
在生产环境中,我们有的列是不允许出现重复值的,亦或是某两列不允许同时重复,但由于前端未做限制,或者没限制住,出现了单列重复值,或者两列本应组成唯一组合却也出现重复,这两种情况都是不允许的。现在由于前端应用限制不住,要做删除操作后,添加唯一索引,从数据库层面进行限制,以下是处理过程:
mysql> select * from aixuan1;
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 1 | aa | 11 |
| 2 | bb | 22 |
| 3 | cc | 33 |
| 4 | cc | 44 |
| 5 | bb | 22 |
| 6 | aa | 11 |
| 7 | dd | 55 |
+----+------+-------+
7 rows in set (0.00 sec)
text字段全部重复的有:
mysql> select * from aixuan1 where text in (select text from aixuan1 GROUP BY text having count(*) > 1);
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 1 | aa | 11 |
| 2 | bb | 22 |
| 3 | cc | 33 |
| 4 | cc | 44 |
| 5 | bb | 22 |
| 6 | aa | 11 |
+----+------+-------+
6 rows in set (0.00 sec)
筛选出text单列重复值
select * from aixuan1 where `text` in (select `text` from aixuan1 GROUP BY `text` having count(*) > 1) and id not in (select min(id) from aixuan1 group by text having count(*)>1)
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 4 | cc | 44 |
| 5 | bb | 22 |
| 6 | aa | 11 |
+----+------+-------+
3 rows in set (0.00 sec)
还可以这么查
mysql> select * FROM aixuan1 WHERE id NOT IN ( SELECT temp.mid FROM ( SELECT min(id) as mid FROM aixuan1 em GROUP BY em.text) AS temp);
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 4 | cc | 44 |
| 5 | bb | 22 |
| 6 | aa | 11 |
+----+------+-------+
3 rows in set (0.00 sec)
筛选出text和text1同时重复的字段:
mysql> select * FROM aixuan1 WHERE id NOT IN ( SELECT temp.mid FROM ( SELECT min(id) as mid FROM aixuan1 em GROUP BY em.text,em.text1) AS temp);
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 5 | bb | 22 |
| 6 | aa | 11 |
+----+------+-------+
2 rows in set (0.00 sec)
查出来了,删就好办了,把select换成delete就Ok了,具体说保留大的id还是保留小的id那条,只要子查询的id函数用min(id)或者max(id)即可
本文出自 “岁伏” 博客,请务必保留此出处http://suifu.blog.51cto.com/9167728/1796055