mysql - 删除重复数据

如下表:

 

单字段重复:

例如:删除仅stu_name一个字段重复的数据

全部删除

-- 查询全部重复数据
select * from tb_4 where stu_name in (select stu_name from tb_4 group by stu_name having count(1) > 1)
-- 删除全部重复数据
delete from tb_4 where stu_name in (select a.stu_name from (select stu_name from tb_4 group by stu_name having count(1) > 1)a)

部分删除(保留一行)

-- 查询多余重复数据
select * from tb_4 where id not in (select min(id) from tb_4 group by stu_name)
-- 删除多余重复数据(保留一行)
delete from tb_4 where id not in (select a.min_id from (select min(id) as min_id from tb_4 group by stu_name)a)

 

  

多字段重复:

 在单字段的基础上,将group by中的字段修改为你想要的即可

 例如:删除stu_name、score两个字段重复的数据

全部删除

-- 查询全部重复数据
select * from tb_4 where  stu_name in (select a.stu_name from (select stu_name from tb_4 group by stu_name, score having count(1) > 1)a)
and score in (select a.score from (select score from tb_4 group by stu_name, score having count(1) > 1)a)
-- 删除全部重复数据
delete from tb_4 where stu_name in (select a.stu_name from (select stu_name from tb_4 group by stu_name, score having count(1) > 1)a)
and score in (select a.score from (select score from tb_4 group by stu_name, score having count(1) > 1)a)

 

部分删除(保留一行)

-- 查询多余重复数据
select * from tb_4 where id not in (select min(id) from tb_4 group by stu_name, score)
-- 删除多余重复数据(保留一行)
delete from tb_4 where id not in (select a.min_id from (select min(id) as min_id from tb_4 group by stu_name, score)a)

 

 

 

 

 

 

 

保留多行

如下表:删除相同医院、相同科室5个以上的医生数据(删除红框中的数据)

 

 结合sql和python实现

1. 以下sql可以将相同医院相同科室符合条件的数据删除一行

delete from tb_test_doctor_1 where phone in 
(
select b.phone from 
(select phone from tb_test_doctor_1 group by hospital, department having count(1) > 5 ORDER BY count(1) desc) as b 
)

2. 使用python循环,直到删完

import MySQLdb

conn = MySQLdb.connect(
    host='192.168.1.0',
    port=3306,
    user='root',
    passwd='123',
    db='test'
)
cur = conn.cursor()

# sql = 'select version()'
#
# cur.execute(sql)
# a = cur.fetchall()
# print(a)


sql_sum = 'select count(1) from tb_test_doctor_1 group by hospital, department having count(1) > 5 ORDER BY count(1) DESC limit 1'
cur.execute(sql_sum)
sum_tuple = cur.fetchone()
sum = sum_tuple[0]
print(sum)

sql2 = """
delete from tb_test_doctor_1 where phone in 
(
select b.phone from 
(select phone from tb_test_doctor_1 group by hospital, department having count(1) > 5 ORDER BY count(1) desc) as b 

)
"""
for n in range(sum):
    cur.execute(sql2)
    conn.commit()


cur.close()
conn.close()

 

posted @ 2021-11-29 21:32  小虫虫大虫虫  阅读(534)  评论(0编辑  收藏  举报