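Troubleshooting a full disk
March 24, 2023: fixed a disk-full problem
Emptying the log file's contents is enough. Never delete the log file itself, or the program will run into errors; a file deleted while a process still holds it open also keeps occupying disk space until that process closes it.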
[root@manager /var/lib/docker/containers]#cd ./16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#du -h --max-depth=1
0       ./checkpoints
0       ./mounts
18G     .
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#ls
16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d-json.log  checkpoints  config.v2.json  hostconfig.json  hostname  hosts  mounts  resolv.conf  resolv.conf.hash
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#ll
total 18213436
-rw-r----- 1 root root 18650507394 Mar 24 18:36 16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d-json.log
drwx------ 2 root root           6 May 22  2022 checkpoints
-rw------- 1 root root        3500 Oct 15 09:41 config.v2.json
-rw-r--r-- 1 root root        1517 Oct 15 09:41 hostconfig.json
-rw-r--r-- 1 root root          13 Oct 15 09:41 hostname
-rw-r--r-- 1 root root         174 Oct 15 09:41 hosts
drwx--x--- 2 root root           6 May 22  2022 mounts
-rw-r--r-- 1 root root          38 Oct 15 09:41 resolv.conf
-rw-r--r-- 1 root root          71 Oct 15 09:41 resolv.conf.hash
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#pwd
/var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#ls
16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d-json.log  checkpoints  config.v2.json  hostconfig.json  hostname  hosts  mounts  resolv.conf  resolv.conf.hash
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#cat /dev/null >16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d-json.log
[root@manager /var/lib/docker/containers/16072e82adddf7cdb0eb599d6b39de229cbeb0d6ff517031b3678e174b71359d]#df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        7.6G     0  7.6G   0% /dev
tmpfs           7.6G     0  7.6G   0% /dev/shm
tmpfs           7.6G  628K  7.6G   1% /run
tmpfs           7.6G     0  7.6G   0% /sys/fs/cgroup
/dev/vda1        40G   22G   19G  55% /
tmpfs           1.6G     0  1.6G   0% /run/user/0
overlay          40G   22G   19G  55% /var/lib/docker/overlay2/3436553183a28f7fc96765336923eaa970e9b6db8af8d1c271111cc1e3146cbd/merged
overlay          40G   22G   19G  55% /var/lib/docker/overlay2/f29673ee60e79ffa41648100337e13433f90e077ce2943167d82db55e7740093/merged
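The truncation step above could be wrapped in a small helper so it does not have to be typed per container. This is only a sketch: the function names `truncate_in_place` and `truncate_large_logs` and the size threshold are my own assumptions, not part of the original fix. It empties each `*-json.log` above a given size with `: >`, which keeps the same inode, so the process writing the log keeps a valid file handle.

```shell
#!/bin/sh
# Sketch only: the function names and the threshold are hypothetical.

# Empty a file in place. The file keeps its inode, so a process that
# already has it open can keep writing to it.
truncate_in_place() {
    file=$1
    [ -f "$file" ] || { echo "not a regular file: $file" >&2; return 1; }
    : > "$file"
}

# Empty every *-json.log under DIR that is larger than MIN_MIB MiB.
# find -size counts 512-byte blocks, so 1 MiB = 2048 blocks.
truncate_large_logs() {
    dir=$1
    min_mib=$2
    find "$dir" -type f -name '*-json.log' -size +"$((min_mib * 2048))" -print |
    while IFS= read -r log; do
        truncate_in_place "$log" && echo "truncated: $log"
    done
}

# Hypothetical usage (as root):
#   truncate_large_logs /var/lib/docker/containers 1024
```

Truncating only treats the symptom; the log will grow again unless the container's output is reduced or its logging is limited some other way.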
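# The command below showed many processes still holding deleted files that took up a lot of space, MySQL most of all. Because I now use Alibaba Cloud's paid database service, I ran systemctl stop mysqld; when I ran the command again, the mysql entries were gone.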
[root@laiyue ~]#lsof -n |grep deleted
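On a machine where lsof is not installed, roughly the same information can be read from /proc. The sketch below assumes a Linux /proc filesystem and a `readlink` binary; the function name `list_deleted_open_files` is hypothetical. It prints the PID and the path of every open file descriptor whose target has been deleted. Run it as root to see all processes.

```shell
#!/bin/sh
# Sketch only: assumes Linux /proc; the function name is hypothetical.

# Print "PID<TAB>path (deleted)" for each open descriptor that points at
# a deleted file. Descriptors we are not allowed to read are skipped.
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                pid=${fd#/proc/}
                pid=${pid%%/*}
                printf '%s\t%s\n' "$pid" "$target"
                ;;
        esac
    done
}

# Hypothetical usage:
#   list_deleted_open_files | sort -n
```

The space held by such a file is only released once the owning process closes it or exits, which is why stopping mysqld made the entries disappear.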
# Count the files under each top-level directory (a file count, not memory usage)
[root@laiyue ~]#find */ ! -type l | cut -d / -f 1 | uniq -c
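A variation on the one-liner above, as a sketch: the function name `count_files_per_dir` is my own. It counts entries per immediate subdirectory with a separate `find` for each one and sorts the result, so the directory with the most files comes first. Like the original command, it skips symbolic links and hidden top-level directories, and each count includes the directory itself.

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical.

# Print "COUNT<TAB>DIR" for every immediate subdirectory of BASE,
# largest count first. Symbolic links are not counted.
count_files_per_dir() {
    base=${1:-.}
    for d in "$base"/*/; do
        [ -d "$d" ] || continue
        n=$(( $(find "$d" ! -type l | wc -l) ))
        printf '%s\t%s\n' "$n" "${d%/}"
    done | sort -rn
}

# Hypothetical usage:
#   count_files_per_dir /var
```

A directory with an unusually high count is a hint that inodes, rather than bytes, may be what is running out.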
# Check inode usage on all filesystems
[root@laiyue ~]#df -ia
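To pick out the filesystems that are close to running out of inodes, the df output can be filtered. This is a sketch with a hypothetical function name, `flag_inode_usage`. It reads `df -Pi` style output on standard input and assumes the usual column order (Filesystem, Inodes, IUsed, IFree, IUse%, Mounted on) and mount points without spaces.

```shell
#!/bin/sh
# Sketch only: the function name and the default threshold are hypothetical.

# Read df -Pi style output on stdin and print "MOUNT<TAB>IUse%" for every
# filesystem whose inode usage is at or above THRESHOLD percent.
# Rows without a numeric percentage (for example "-") are ignored.
flag_inode_usage() {
    threshold=${1:-90}
    awk -v t="$threshold" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)
        if (pct ~ /^[0-9]+$/ && pct + 0 >= t + 0) print $6 "\t" $5
    }'
}

# Hypothetical usage:
#   df -Pi | flag_inode_usage 80
```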
# Check how much disk space /var/lib/docker uses
[root@laiyue /var/lib/docker/overlay2]#du -hs /var/lib/docker
5.9G    /var/lib/docker
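When it is not yet clear which directory is the culprit, ranking the subdirectories by size narrows it down faster than checking one path at a time. A sketch, with a hypothetical function name `largest_dirs`: it runs `du -sk` on each immediate subdirectory and keeps the largest few. Sizes are in KiB, and hidden directories are not included.

```shell
#!/bin/sh
# Sketch only: the function name and the default count are hypothetical.

# Print the COUNT largest immediate subdirectories of DIR as
# "SIZE_KIB<TAB>PATH", largest first.
largest_dirs() {
    dir=${1:-.}
    count=${2:-10}
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}

# Hypothetical usage: repeat on the biggest entry to drill down.
#   largest_dirs / 5
#   largest_dirs /data 5
```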
# Check Docker's disk usage
[root@laiyue /var/lib/docker/overlay2]#docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          7         4         1.821GB   1.58GB (86%)
Containers      4         4         147.5MB   0B (0%)
Local Volumes   1         1         156.5kB   0B (0%)
Build Cache     0         0         0B        0B
# Clean up disk space: removes stopped containers, unused networks, dangling images (images with no tag) and dangling build cache. As the warning below shows, volumes are not removed by this command unless --volumes is added.
[root@laiyue /var/lib/docker/overlay2]#docker system prune
WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all dangling images
  - all dangling build cache
Are you sure you want to continue? [y/N] y
Deleted Networks:
query_order_default
Total reclaimed space: 0B
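# This command cleans up more thoroughly: it also removes every Docker image that no container is using. Note that it deletes containers you have only stopped temporarily, as well as images you are not using right now, so think carefully before running it. I have not used it myself, because it removes images that have no running container.
[root@laiyue /var/lib/docker/overlay2]#docker system prune -a
WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all images without at least one container associated to them
  - all build cache
Are you sure you want to continue? [y/N] y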
In the end I found the solution
1. Go to the root directory and look for large directories; the data directory takes up the most space
[root@laiyue /]#du -h --max-depth=1
0       ./dev
0       ./proc
668K    ./run
0       ./sys
26M     ./etc
860M    ./root
3.6G    ./var
3.1G    ./usr
269M    ./boot
12K     ./home
0       ./media
0       ./mnt
78M     ./opt
0       ./srv
0       ./tmp
990M    ./server
6.1G    ./application
0       ./backup
25G     ./data
12K     ./Users
40G     .
2. Go into the data directory and check again; the space is used by the mysql directory
[root@laiyue /]#cd data
[root@laiyue /data]#du -h --max-depth=1
25G     ./mysql
25G     .
3. Go into the mysql directory and check again; it is full of outdated backup files
[root@laiyue /data]#cd mysql
[root@laiyue /data/mysql]#du -h --max-depth=1
0       ./20220117
0       ./20220118
208M    ./19
209M    ./20220120
211M    ./20220121
212M    ./20220122
213M    ./20220123
213M    ./20220124
213M    ./20220125
213M    ./20220126
213M    ./20220127
213M    ./20220128
213M    ./20220129
213M    ./20220130
213M    ./20220131
213M    ./20220201
213M    ./20220202
213M    ./20220203
213M    ./20220204
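4. Delete them, and set up a scheduled job that removes old backups regularly
[root@laiyue /server/scripts]#cat baksql.sh
#!/bin/bash
# Delete backup directories directly under /data/mysql that are more than 3 days old.
# -mindepth 1 keeps /data/mysql itself from ever being matched; -exec avoids xargs problems with unusual names.
/usr/bin/find /data/mysql -mindepth 1 -maxdepth 1 -type d -mtime +3 -exec rm -rf {} +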
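Since the job deletes with rm -rf, a dry-run mode is a cheap safeguard before putting it on a schedule. The following is only a sketch and not the script used above: the function name `prune_backups`, the `--delete` switch and the cron schedule in the comment are all assumptions. It relies on `-mindepth` and `-maxdepth`, which GNU find supports but POSIX does not specify.

```shell
#!/bin/sh
# Sketch only: the function name, the --delete switch and the cron schedule
# are hypothetical and not taken from the original script.

# prune_backups DIR DAYS [--delete]
# Without --delete, only list the directories directly under DIR that are
# more than DAYS days old. With --delete, remove them.
prune_backups() {
    dir=$1
    days=$2
    mode=${3:-}
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    if [ "$mode" = "--delete" ]; then
        find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +
    else
        find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
    fi
}

# Hypothetical usage:
#   prune_backups /data/mysql 3            # dry run, prints what would go
#   prune_backups /data/mysql 3 --delete   # actually delete
#
# Hypothetical crontab entry (the schedule is an assumption):
#   0 3 * * * /server/scripts/baksql.sh
```

Running the dry run once by hand and checking the list against the du output from step 3 confirms that only the outdated backups match.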