mysql去除重复数据

今天一个同学问我mysql去除重复数据,自己做了个测试顺便记录下:

查看表结构:

mysql> desc testdelete;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| one   | varchar(40) | YES  |     | NULL    |                |
| two   | varchar(40) | YES  |     | NULL    |                |
| three | varchar(40) | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
4 rows in set (0.10 sec)

 

 

表的数据:

mysql> select * from testdelete;
+----+------+------+-------+
| id | one  | two  | three |
+----+------+------+-------+
|  1 | A    | A    | A     |
|  2 | B    | B    | B     |
|  3 | C    | C    | C     |
|  4 | D    | D    | D     |
|  5 | E    | E    | E     |
|  6 | A    | A    | B     |
| 12 | A    | A    | A     |
| 13 | A    | A    | A     |
| 14 | A    | A    | A     |
| 15 | A    | A    | A     |
+----+------+------+-------+
10 rows in set (0.00 sec)

 

 

 

接下来进行测试:

1.根据one列查询重复的数据(根据单列判断重复)

SELECT * FROM testdelete 
WHERE ONE IN (SELECT ONE FROM testdelete GROUP BY ONE HAVING COUNT(ONE) > 1) 

 

 

结果:

 

 2.删除表中的重复记录:(根据单列删除且保留ID最小的一条

DELETE
FROM testdelete
WHERE ONE IN(SELECT
               ONE
             FROM testdelete
             GROUP BY ONE
             HAVING COUNT(ONE) > 1)
    AND id NOT IN(SELECT
                    MIN(id)
                  FROM testdelete
                  GROUP BY ONE
                  HAVING COUNT(ONE) > 1)

 

 

 报错:

 

原因:大概是因为不能直接在查询的语句中进行操作。

解决办法:将查询包装一层:

DELETE
FROM testdelete
WHERE ONE IN(SELECT
               ONE
             FROM (SELECT
                     ONE
                   FROM testdelete
                   GROUP BY ONE
                   HAVING COUNT(ONE) > 1) a)
    AND id NOT IN(SELECT
                    *
                  FROM (SELECT
                          MIN(id)
                        FROM testdelete
                        GROUP BY ONE
                        HAVING COUNT(ONE) > 1) b)

 

 

 结果:

(5 row(s) affected)
Execution Time : 00:00:00:094
Transfer Time : 00:00:00:000
Total Time : 00:00:00:094

 

再次查看数据:

 

 

将数据还原。 

 

 3.根据one,two,three判断重复:(根据单多判断重复)

SELECT * FROM testdelete a 
WHERE (a.one,a.two,a.three) IN (SELECT ONE,two,three FROM testdelete GROUP BY ONE,two,three HAVING COUNT(*) > 1) 

 

结果;

 

 

4.删除表中的重复数据(根据多列进行删除且保留ID最小的一条)

 

DELETE
FROM testdelete
WHERE (ONE,two,three)IN(SELECT
                          ONE,
                          two,
                          three
                        FROM (SELECT
                                ONE,
                                two,
                                three
                              FROM testdelete
                              GROUP BY ONE,two,three
                              HAVING COUNT( * ) > 1) a)
    AND id NOT IN(SELECT
                    MIN(id)
                  FROM (SELECT
                          MIN(id) AS id
                        FROM testdelete
                        GROUP BY ONE,two,three
                        HAVING COUNT( * ) > 1) b)

 

 

 结果:

(4 row(s) affected)
Execution Time : 00:00:00:125
Transfer Time : 00:00:00:000
Total Time : 00:00:00:125

 

查看数据:

 

 

数据还原

 5. 查找表中多余的重复记录(多个字段),不包含id最小的记录 (根据多个字段查重复不包含id最小的)

SELECT *
FROM testdelete a
WHERE (a.one,a.two,a.three)IN(SELECT
                                ONE,
                                two,
                                three
                              FROM testdelete
                              GROUP BY ONE,two,three
                              HAVING COUNT( * ) > 1)
    AND id NOT IN(SELECT
                    MIN(id) AS id
                  FROM testdelete
                  GROUP BY ONE,two,three
                  HAVING COUNT( * ) > 1)

 

 结果:

 

posted @ 2018-01-11 18:46  QiaoZhi  阅读(78341)  评论(3编辑  收藏  举报