Analyze a docker instance start failure on CentOS 6.x
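Error message: Cannot start container xxxxxxxxxxx | Error getting container xxxxxxxxxxxxxxx from driver devicemapper: Error mounting | invalid argument Error | failed to start containers
Symptom: of the 4 Docker instances, three (barely in use) start normally, while one (the one holding the most content) does not.
Trigger: the server (the Docker host) lost power unexpectedly.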
[root@bogon ~]# docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
[root@bogon ~]# docker ps -a
CONTAINER ID   IMAGE          COMMAND                CREATED        STATUS                      PORTS   NAMES
a91dadc56996   b1c89dd2c773   "/bin/auto_service.s   7 weeks ago    Exited (137) 23 hours ago           mawen
91a542541bb1   b1c89dd2c773   "/bin/auto_service.s   8 weeks ago    Exited (128) 28 hours ago           rgq
fc0a891e1861   68a34cb5482c   "/bin/auto_service.s   3 months ago   Exited (0) 28 hours ago             songheng
79177df3ddc2   b1c89dd2c773   "/bin/auto_service.s   5 months ago   Exited (137) 23 hours ago           guozhenya
[root@bogon ~]# docker start 91a542541bb1
Error response from daemon: Cannot start container 91a542541bb1: Error getting container 91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1 from driver devicemapper: Error mounting '/dev/mapper/docker-253:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1' on '/home/docker/images/devicemapper/mnt/91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1': invalid argument
Error: failed to start containers: [91a542541bb1]
The causes and fixes I found early on did not solve this problem.
https://access.redhat.com/solutions/1565673
https://segmentfault.com/q/1010000003003635
https://www.lsproc.com/post/docker-faq/
https://blog.csdn.net/wangjia184/article/details/43151041
The failure is a mount error: the mount fails whether I use a GUI disk management tool or the command line below.
[root@bogon mapper]# cd /dev/mapper/
[root@bogon mapper]# ll
total 0
crw-rw----. 1 root root 10, 236 Oct 16 20:08 control
lrwxrwxrwx. 1 root root       7 Oct 16 20:22 docker-253:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1 -> ../dm-4
lrwxrwxrwx. 1 root root       7 Oct 16 20:15 docker-253:2-13369361-pool -> ../dm-3
lrwxrwxrwx. 1 root root       7 Oct 16 20:08 VolGroup-lv_home -> ../dm-2
lrwxrwxrwx. 1 root root       7 Oct 16 20:08 VolGroup-lv_root -> ../dm-0
lrwxrwxrwx. 1 root root       7 Oct 16 20:08 VolGroup-lv_swap -> ../dm-1
[root@bogon mapper]# sudo mkdir -p /mnt/base
[root@bogon mapper]# ll
total 0
crw-rw----. 1 root root 10, 236 Oct 16 20:08 control
lrwxrwxrwx. 1 root root       7 Oct 16 20:22 docker-253:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1 -> ../dm-4
lrwxrwxrwx. 1 root root       7 Oct 16 20:15 docker-253:2-13369361-pool -> ../dm-3
lrwxrwxrwx. 1 root root       7 Oct 16 20:08 VolGroup-lv_home -> ../dm-2
lrwxrwxrwx. 1 root root       7 Oct 16 20:08 VolGroup-lv_root -> ../dm-0
lrwxrwxrwx. 1 root root       7 Oct 16 20:08 VolGroup-lv_swap -> ../dm-1
[root@bogon mapper]# mount docker-253:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1 /mnt/base
mount: wrong fs type, bad option, bad superblock on /dev/mapper/docker-253:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
[root@bogon mapper]# dmesg | tail
EXT4-fs (dm-4): bad geometry: block count 5242880 exceeds size of device (2621440 blocks)
EXT4-fs (dm-4): bad geometry: block count 5242880 exceeds size of device (2621440 blocks)
EXT4-fs (dm-4): bad geometry: block count 5242880 exceeds size of device (2621440 blocks)
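The dmesg line is the real clue: the ext4 superblock claims more blocks than the device has. The resize2fs output further down reports the device in 4k blocks, so the two numbers can be turned into sizes. A minimal sketch of that arithmetic, assuming the 4096-byte block size applies to both counts (blocks_to_gib is a hypothetical helper name, not an existing tool):

```shell
#!/bin/sh
# Convert a filesystem block count to GiB.
# Assumption: 4096-byte blocks, as the "(4k) blocks" wording in the
# resize2fs output suggests. The second argument overrides the block size.
blocks_to_gib() {
    blocks="$1"
    block_size="${2:-4096}"
    # bytes divided by 2^30, using integer division
    echo $(( blocks * block_size / 1073741824 ))
}

echo "superblock says: $(blocks_to_gib 5242880) GiB"
echo "device provides: $(blocks_to_gib 2621440) GiB"
```

Under that assumption the filesystem believes it is 20 GiB, while the device-mapper device behind it is only 10 GiB, so the kernel refuses to mount it.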
A detailed explanation of the device mapper driver, in a very accessible article:
https://coolshell.cn/articles/17200.html
The comments on that post are even better.
http://www.cnblogs.com/GarfieldEr007/p/5424629.html
Conclusion: DeviceMapper has far too many problems, and we should put it on the blacklist.
I tried various ways to repair Docker's disk file, but all of them failed.
[root@bogon mapper]# fsck.ext4 docker-253\:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1
e2fsck 1.41.12 (17-May-2010)
The filesystem size (according to the superblock) is 5242880 blocks
The physical size of the device is 2621440 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? yes
[root@bogon mapper]# e2fsck docker-253\:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1
e2fsck 1.41.12 (17-May-2010)
The filesystem size (according to the superblock) is 5242880 blocks
The physical size of the device is 2621440 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? yes
[root@bogon mapper]# resize2fs docker-253\:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1
resize2fs 1.41.12 (17-May-2010)
resize2fs: New size smaller than minimum (4808845)
[root@bogon mapper]# resize2fs docker-253\:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1 52428800
resize2fs 1.41.12 (17-May-2010)
The containing partition (or device) is only 2621440 (4k) blocks.
You requested a new size of 52428800 blocks.
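These failures share one cause. resize2fs with no size argument tries to fit the filesystem to the device, but it reports a minimum of 4808845 blocks, which is more than the 2621440 blocks the device has. The explicit request of 52428800 blocks (ten times the superblock's 5242880, so possibly one zero too many) is larger still. A minimal sketch of the comparison, where shrink_verdict is a hypothetical helper and not part of e2fsprogs:

```shell
#!/bin/sh
# Decide whether a filesystem could be shrunk to fit its device.
# Both arguments are block counts in the same block size (4k here).
shrink_verdict() {
    min_fs_blocks="$1"    # "New size smaller than minimum (N)" from resize2fs
    device_blocks="$2"    # "physical size of the device" from e2fsck
    if [ "$min_fs_blocks" -le "$device_blocks" ]; then
        echo "shrink possible"
    else
        echo "shrink impossible: short by $(( min_fs_blocks - device_blocks )) blocks"
    fi
}

# Numbers taken from the session above
shrink_verdict 4808845 2621440
```

With the numbers from this host the data needs at least 2187405 more blocks than the device offers, so neither shrinking nor growing the filesystem can succeed while the device stays at its current size.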
Approaches I tried:
https://access.redhat.com/solutions/55010
In the end the only option left was to reformat:
https://unix.stackexchange.com/questions/115698/fix-ext4-fs-bad-geometry-block-count-exceeds-size-of-device
https://serverfault.com/questions/548237/cant-mount-home-after-trying-to-resize-bad-geometry-block-count-exceeds-size
https://www.linuxquestions.org/questions/linux-hardware-18/size-in-superblock-is-different-from-the-physical-size-of-the-partition-298175/
mke2fs -t ext4 docker-253\:2-13369361-91a542541bb1478834df2c40796fbbbba4a0448063d4401871c7f2b63e5246f1
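mke2fs discards whatever is still on the device. Before running it, it may be worth copying the raw device to an image file so that recovery can be attempted later. This is a minimal sketch and not something I did at the time; image_device is a hypothetical helper, and the paths in the usage comment are placeholders:

```shell
#!/bin/sh
# Copy a block device (or any file) to an image file before destroying it.
image_device() {
    src="$1"
    dst="$2"
    # bs is 4 MiB written as a plain number for portability.
    # conv=noerror,sync continues past read errors and pads short or
    # unreadable blocks with zeros so that offsets stay aligned.
    dd if="$src" of="$dst" bs=4194304 conv=noerror,sync 2>/dev/null
}

# Usage (as root, with enough free space for the whole device):
#   image_device "$DEVICE" /path/to/backup.img
```

Because of the sync padding, the image is rounded up to a multiple of the block size. Note also that the image can only contain what the device still exposes, which on this host is half of what the superblock describes.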
Some other things I picked up along the way:
https://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html
https://bugzilla.redhat.com/show_bug.cgi?id=1121736
https://docs.docker.com/install/linux/docker-ee/rhel/#prerequisites
On Red Hat Enterprise Linux, Docker EE supports storage drivers, overlay2 and devicemapper. In Docker EE 17.06.2-ee-5 and higher, overlay2 is the recommended storage driver. The following limitations apply:
- OverlayFS: If selinux is enabled, the overlay2 storage driver is supported on RHEL 7.4 or higher. If selinux is disabled, overlay2 is supported on RHEL 7.2 or higher with kernel version 3.10.0-693 and higher.
- Device Mapper: On production systems using devicemapper, you must use direct-lvm mode, which requires one or more dedicated block devices. Fast storage such as solid-state media (SSD) is recommended. Do not start Docker until properly configured per the storage guide.
Next, a look at Docker storage drivers on each platform
Running docker info:
Win10+Hyper-V
C:\Users\RenGuoQiang>docker info
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 3
Server Version: 18.06.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d64c661f1d51c48782c9cec8fda7604785f93587
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.93-linuxkit-aufs
Operating System: Docker for Windows
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.934GiB
Name: linuxkit-00155df70119
ID: TPBZ:PK4T:IR52:NNN6:X4BI:2P4W:QBXD:T5ZH:4UAZ:HCPC:5QZY:HJ23
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 34
 Goroutines: 55
 System Time: 2018-10-17T05:48:24.8681549Z
 EventsListeners: 1
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Ubuntu xx:
Storage Driver: aufs
CentOS 6/7:
[root@bogon mapper]# docker info
Containers: 4
Images: 41
Storage Driver: devicemapper
 Pool Name: docker-253:2-13369361-pool
 Pool Blocksize: 65.54 kB
 Backing Filesystem: extfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 14.24 GB
 Data Space Total: 107.4 GB
 Data Space Available: 93.13 GB
 Metadata Space Used: 26.58 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.121 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Data loop file: /home/docker/images/devicemapper/devicemapper/data
 Metadata loop file: /home/docker/images/devicemapper/devicemapper/metadata
 Library Version: 1.02.117-RHEL6 (2016-12-13)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.4.161-1.el6.elrepo.x86_64
Operating System: <unknown>
CPUs: 4
Total Memory: 31.43 GiB
Name: bogon
ID: KU2E:PTFN:25CJ:234F:LTHQ:7IEB:JMT6:T4NQ:UPB7:BOCV:LKQF:6QKX
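The CentOS output lists a Data loop file and a Metadata loop file. That is how docker info shows that devicemapper is running in loop-lvm mode, backed by sparse files rather than by the dedicated block devices that the Docker documentation quoted earlier requires for production. A minimal sketch that picks this out of docker info output, where storage_summary is a hypothetical helper name:

```shell
#!/bin/sh
# Read `docker info` output on stdin and summarize the storage setup.
storage_summary() {
    awk -F': ' '
        /^[[:space:]]*Storage Driver:/ { driver = $2 }
        /^[[:space:]]*Data loop file:/ { loop = 1 }
        END {
            if (driver == "") { print "storage driver not found"; exit 1 }
            if (driver == "devicemapper" && loop)
                print driver " (loop-lvm: backed by a sparse file, not a real block device)"
            else
                print driver
        }'
}

# Usage: docker info 2>/dev/null | storage_summary
```

The helper only reads text, so it can also be fed a saved copy of the output from a host that is no longer running.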
https://stackoverflow.com/questions/27800340/error-starting-docker-containers
This is a known bug occurring with the devicemapper driver only.
Here is the reference of the bug: https://github.com/docker/docker/issues/4036
Best solution is to switch either to aufs or overlayfs drivers.
Note that this question seems to be a duplicate from this one: Docker building fails randomly with Error mounting
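On a newer Docker release, one way to follow that advice is to set the storage driver in the daemon configuration file. This is a minimal sketch under two assumptions: the Docker version is recent enough to read /etc/docker/daemon.json, and the kernel supports overlay. The old daemon shown on the CentOS 6 host in this post may need the --storage-driver flag in its startup options instead. write_storage_driver_config is a hypothetical helper, and it overwrites the target file:

```shell
#!/bin/sh
# Write a minimal daemon.json that selects a storage driver.
# Warning: this replaces any existing content of the target file.
write_storage_driver_config() {
    config_path="$1"
    driver="$2"
    printf '{\n  "storage-driver": "%s"\n}\n' "$driver" > "$config_path"
}

# Usage (as root, followed by a restart of the Docker daemon):
#   write_storage_driver_config /etc/docker/daemon.json overlay2
```

Images and containers created under the old driver are not visible after the switch, so anything worth keeping should be saved or exported first.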