Notes on a convoluted ops incident - 2020-04-23
Jenkins build job error:
ERROR: Build step failed with exception com.github.dockerjava.api.exception.DockerClientException: Could not build image: devmapper: Thin Pool has 38659 free data blocks which is less than minimum required 38911 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior
    at com.github.dockerjava.core.command.BuildImageResultCallback.getImageId(BuildImageResultCallback.java:79)
    at com.github.dockerjava.core.command.BuildImageResultCallback.awaitImageId(BuildImageResultCallback.java:51)
    at com.nirima.jenkins.plugins.docker.builder.DockerBuilderPublisher$Run.buildImage(DockerBuilderPublisher.java:387)
    at com.nirima.jenkins.plugins.docker.builder.DockerBuilderPublisher$Run.run(DockerBuilderPublisher.java:313)
    at com.nirima.jenkins.plugins.docker.builder.DockerBuilderPublisher.perform(DockerBuilderPublisher.java:463)
    at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
    at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.build(MavenModuleSetBuild.java:946)
    at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:896)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
    at hudson.model.Run.execute(Run.java:1816)
    at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:429)
Build step 'Build / Publish Docker Image' marked build as failure
Finished: FAILURE
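The two block counts in the message are what matter here. As a rough sketch, they can be pulled out of a build log and compared; thin_pool_shortfall is a hypothetical helper name introduced for illustration, and the pattern only matches the exact wording of the message above.

```shell
# thin_pool_shortfall: read log text on stdin, print free/required/short blocks.
# Hypothetical helper; it matches only the devmapper message wording shown above.
thin_pool_shortfall() {
  sed -n 's/.*Thin Pool has \([0-9][0-9]*\) free data blocks.*minimum required \([0-9][0-9]*\) free data blocks.*/\1 \2/p' |
  while read -r free required; do
    echo "free=$free required=$required short=$((required - free))"
  done
}

printf '%s\n' 'devmapper: Thin Pool has 38659 free data blocks which is less than minimum required 38911 free data blocks.' |
  thin_pool_shortfall
# prints: free=38659 required=38911 short=252
```

So the pool is only 252 blocks below the minimum that Docker insists on keeping free, which is why a modest cleanup can be enough to get builds going again.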
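Baidu's answer: device-mapper does not reclaim space when images are deleted; this is a kernel bug.
Solution:
The standard fix is to patch Linux so that the problem is solved at the kernel level.
If you don't want to apply a patch, don't know how to, or don't want to fiddle with it, you can free the space by hand as a temporary workaround.
Let's clean up the space by hand first:
1. Remove exited containers: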
docker rm $(docker ps -q -f status=exited)
2. Remove dangling volumes:
docker volume rm $(docker volume ls -qf dangling=true)
3. Remove dangling images:
docker rmi $(docker images --filter "dangling=true" -q --no-trunc)
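Each of the three commands above exits with a usage error when the inner command returns nothing, because docker rm, docker volume rm, and docker rmi all need at least one argument. A minimal sketch of a guard follows; remove_if_any and the DOCKER override variable are hypothetical names introduced here, not part of Docker.

```shell
# remove_if_any SUBCOMMAND...: read IDs on stdin and call docker only if
# at least one ID was given. Set DOCKER=echo for a dry run.
# Hypothetical helper for illustration.
remove_if_any() {
  ids=$(cat)
  if [ -z "$ids" ]; then
    echo "nothing to remove for: $*"
    return 0
  fi
  # Word splitting on $ids is intended: one ID per word.
  ${DOCKER:-docker} "$@" $ids
}

# Usage, equivalent to the three steps above:
#   docker ps -aq -f status=exited                      | remove_if_any rm
#   docker volume ls -qf dangling=true                  | remove_if_any volume rm
#   docker images --filter dangling=true -q --no-trunc  | remove_if_any rmi
```

Running with DOCKER=echo first shows what would be removed without touching anything.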
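I assumed that once the cleanup was done, Jenkins builds would stop hitting this error, and that the steps above would not affect running containers.
Unexpectedly, Jenkins went down. First things first: get Jenkins running again. OK, Jenkins came back up, but the existing jobs were gone, and the system management page showed a pile of error messages saying that Jenkins plugins needed to be upgraded.
After upgrading the plugins, the Jenkins build failed with a different error: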
CMD /data/tomcat/bin/startup.sh start
ERROR: Build step failed with exception java.lang.NullPointerException: uri was not specified
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
    at com.github.dockerjava.core.DefaultDockerClientConfig$Builder.withDockerHost(DefaultDockerClientConfig.java:337)
    at io.jenkins.docker.client.DockerAPI.makeClient(DockerAPI.java:244)
    at io.jenkins.docker.client.DockerAPI.getOrMakeClient(DockerAPI.java:200)
    at io.jenkins.docker.client.DockerAPI.getClient(DockerAPI.java:169)
    at io.jenkins.docker.client.DockerAPI.getClient(DockerAPI.java:152)
    at com.nirima.jenkins.plugins.docker.DockerCloud.isTriton(DockerCloud.java:743)
    at com.nirima.jenkins.plugins.docker.builder.DockerBuilderPublisher.getDockerAPI(DockerBuilderPublisher.java:254)
    at com.nirima.jenkins.plugins.docker.builder.DockerBuilderPublisher.perform(DockerBuilderPublisher.java:463)
    at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
    at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.build(MavenModuleSetBuild.java:946)
    at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:896)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
    at hudson.model.Run.execute(Run.java:1816)
    at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:429)
Build step 'Build / Publish Docker Image' marked build as failure
Finished: FAILURE
Under system management - cloud - add a new cloud, set:
Docker Host URI: tcp://<internal IP>:2375
2375 is Docker's remote connection port.
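Before entering the URI in Jenkins, it can help to confirm from the Jenkins host that the daemon actually answers on that address. A minimal sketch, assuming the docker CLI is installed there; check_docker_host and the DOCKER override are hypothetical names, and 10.0.0.5 is a placeholder rather than the address used in this incident.

```shell
# check_docker_host HOST [PORT]: ask the remote daemon for its version.
# Hypothetical helper; PORT defaults to 2375 as in the cloud setting above.
# Set DOCKER=echo to print the arguments instead of contacting a daemon.
check_docker_host() {
  ${DOCKER:-docker} -H "tcp://$1:${2:-2375}" version
}

# Usage (placeholder address):
#   check_docker_host 10.0.0.5
```

If the command prints both the client and server sections, the same URI should work as the Docker Host URI in the cloud configuration.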
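Building again in Jenkins no longer produced the error above, but the original problem was back.
I tried building other services to see whether they hit the same problem. Nothing similar showed up during the build, but a similar error appeared when starting the container.
To sum up, both docker build and docker run fail with the same error. So let's look at the details with docker info, where we find: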
Data Space Available: 20.27 GB
Metadata Space Available: 2.076 GB
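The pool was originally 200 GB, with about 20 GB still available.
df -h and free -hm both show plenty of room.
Out of space? That shouldn't be the case. When the disk filled up before, the error said there was no space left on the device, not this.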
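One way to reconcile "there is still room" with the error is to convert the block counts from the build error into bytes. The sketch below assumes a thin pool block size of 512 KiB, which is not stated anywhere above; it is chosen because it reproduces the 20.27 GB that docker info reports. It also assumes dm.min_free_space is at Docker's default of 10%. blocks_to_gb is a hypothetical helper name.

```shell
# blocks_to_gb BLOCKS [BLOCK_SIZE_BYTES]
# Convert thin pool data blocks to decimal GB (10^9 bytes).
# The 524288-byte (512 KiB) default is an assumption, not taken from the post.
blocks_to_gb() {
  awk -v b="$1" -v s="${2:-524288}" 'BEGIN { printf "%.2f\n", b * s / 1e9 }'
}

blocks_to_gb 38659   # free blocks from the build error       -> 20.27
blocks_to_gb 38911   # minimum required blocks from the error -> 20.40
```

Under these assumptions the pool has about 20.27 GB free while Docker wants about 20.40 GB, roughly 10% of a 204 GB pool, kept free. That would explain why it refuses to create new images and containers even though the disk is far from full.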
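Docker is using a thin pool. A thin pool consists of a metadata device and a data device; it allocates data blocks on demand and reclaims them on deletion, which improves storage utilization.
Use lsblk and lvs to look at the current allocation, and at the same time collect the device-mapper logs to keep track of how direct-lvm is behaving. You can simply use: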
# journalctl -fu dm-event.service
-- Logs begin at Mon 2020-02-10 22:41:43 CST. --
Apr 23 16:48:40 VM_4_11_centos lvm[1529]: Insufficient free space: 9728 extents needed, but only 1538 available
Apr 23 16:48:40 VM_4_11_centos lvm[1529]: Failed command for vgdocker-thinpool.
Apr 23 17:31:30 VM_4_11_centos lvm[1529]: Insufficient free space: 9728 extents needed, but only 1538 available
Apr 23 17:31:30 VM_4_11_centos lvm[1529]: Failed command for vgdocker-thinpool.
Apr 23 18:14:20 VM_4_11_centos lvm[1529]: Insufficient free space: 9728 extents needed, but only 1538 available
Apr 23 18:14:20 VM_4_11_centos lvm[1529]: Failed command for vgdocker-thinpool.
Apr 23 18:57:10 VM_4_11_centos lvm[1529]: Insufficient free space: 9728 extents needed, but only 1538 available
Apr 23 18:57:10 VM_4_11_centos lvm[1529]: Failed command for vgdocker-thinpool.
Apr 23 19:40:00 VM_4_11_centos lvm[1529]: Insufficient free space: 9728 extents needed, but only 1538 available
Apr 23 19:40:00 VM_4_11_centos lvm[1529]: Failed command for vgdocker-thinpool.
The log says there is not enough free space: 9728 extents are needed, but only 1538 are available.
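To put those extent counts into more familiar units, here is a minimal sketch that assumes LVM's default extent size of 4 MiB. The actual extent size of this volume group is not shown above; the vgs command in the comments reports it. extents_to_mib is a hypothetical helper name.

```shell
# extents_to_mib EXTENTS [EXTENT_SIZE_MIB]
# 4 MiB is LVM's default extent size; assumed here, not confirmed by the log.
extents_to_mib() {
  echo $(( $1 * ${2:-4} ))
}

extents_to_mib 9728   # needed    -> 38912
extents_to_mib 1538   # available -> 6152

# On the host itself (needs root; the volume group name vgdocker is read off
# the "vgdocker-thinpool" in the log):
#   vgs -o vg_name,vg_extent_size,vg_free_count vgdocker
#   lvs -a -o lv_name,lv_size,data_percent,metadata_percent vgdocker
```

With 4 MiB extents that is 38 GiB requested against about 6 GiB free in the volume group, which suggests the pool could not be grown and that the only way forward was to free blocks inside it.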
***After deleting the junk images and containers and rebuilding the service, the problem was fixed.
Learning is a faith.