Recently, running garbage collection on our Harbor server did not free any disk space.

Check the logs:

docker logs harbor-jobservice
[root@i-cnyu8n9j harbor]# docker logs harbor-jobservice
Appending trust CA to ca-bundle ...
 /harbor_cust_cert/harbor_ca.crt Appended ...
CA appending is Done.
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.oci.image.index.v1+json registered
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.docker.distribution.manifest.list.v2+json registered
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.docker.distribution.manifest.v1+prettyjws registered
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.oci.image.config.v1+json registered
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.docker.container.image.v1+json registered
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.cncf.helm.config.v1+json registered
2024-04-09T15:46:27Z [INFO] [/controller/artifact/processor/processor.go:58]: the processor to process media type application/vnd.cnab.manifest.v1 registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/native/adapter.go:36]: the factory for adapter docker-registry registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/harbor/adaper.go:31]: the factory for adapter harbor registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/dockerhub/adapter.go:25]: Factory for adapter docker-hub registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/huawei/huawei_adapter.go:27]: the factory of Huawei adapter was registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/googlegcr/adapter.go:29]: the factory for adapter google-gcr registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/awsecr/adapter.go:47]: the factory for adapter aws-ecr registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/azurecr/adapter.go:15]: Factory for adapter azure-acr registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/aliacr/adapter.go:31]: the factory for adapter ali-acr registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/jfrog/adapter.go:30]: the factory of jfrog artifactory adapter was registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/quayio/adapter.go:38]: the factory of Quay.io adapter was registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/helmhub/adapter.go:30]: the factory for adapter helm-hub registered
2024-04-09T15:46:27Z [INFO] [/replication/adapter/gitlab/adapter.go:17]: the factory for adapter gitlab registered
2024-04-09T15:46:27Z [INFO] [/common/config/store/driver/rest.go:31]: get configuration from url: http://core:8080/api/internal/configurations
2024-04-09T15:46:27Z [INFO] [/jobservice/logger/sweeper_controller.go:97]: 0 outdated log entries are sweepped by sweeper *sweeper.FileSweeper
2024-04-09T15:46:27Z [INFO] [/common/dao/base.go:84]: Registering database: type-PostgreSQL host-postgresql port-5432 databse-registry sslmode-"disable"
2024-04-09T15:46:27Z [INFO] [/common/dao/base.go:89]: Register database completed
2024-04-09T15:46:27Z [INFO] [/jobservice/migration/manager.go:111]: No migration needed
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *notification.SlackJob with name SLACK
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *sample.Job with name DEMO
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *scan.Job with name IMAGE_SCAN
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *replication.Replication with name REPLICATION
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *replication.Scheduler with name IMAGE_REPLICATE
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *retention.Job with name RETENTION
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *scheduler.PeriodicJob with name SCHEDULER
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *notification.WebhookJob with name WEBHOOK
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *all.Job with name IMAGE_SCAN_ALL
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:426]: Register job *gc.GarbageCollector with name IMAGE_GC
2024-04-09T15:46:27Z [INFO] [/jobservice/period/enqueuer.go:71]: Scheduler: periodic enqueuer is started
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/c_worker.go:151]: Basic worker is started
2024-04-09T15:46:27Z [INFO] [/jobservice/lcm/controller.go:86]: Status restoring loop is started
2024-04-09T15:46:27Z [INFO] [/jobservice/hook/hook_agent.go:146]: Hook event retrying loop is started
2024-04-09T15:46:27Z [INFO] [/jobservice/worker/cworker/reaper.go:57]: Reaper is started
2024-04-09T15:46:27Z [INFO] [/jobservice/runtime/bootstrap.go:205]: API server is serving at 8080 with [http] mode at node [6c94d5b1e81b:172.19.0.9]
2024-04-09T15:47:23Z [INFO] [/jobservice/worker/cworker/c_worker.go:74]: Job incoming: {"name":"IMAGE_GC","id":"7e0ca3b6fcfbee9a9af62929","t":1712677639,"args":null,"unique":true}
2024-04-09T15:47:23Z [INFO] [/common/config/store/driver/rest.go:31]: get configuration from url: http://core:8080/api/internal/configurations
2024-04-09T15:47:23Z [INFO] [/common/config/store/driver/rest.go:31]: get configuration from url: http://core:8080/api/internal/configurations
2024-04-09T15:47:23Z [INFO] [/common/registryctl/client.go:41]: initializing client for registry http://registryctl:8080 ...
2024-04-09T15:47:23Z [INFO] [/common/config/store/driver/rest.go:31]: get configuration from url: http://core:8080/api/internal/configurations
2024-04-09T15:47:23Z [INFO] [/jobservice/runner/redis.go:182]: start to run gc in job.
2024-04-09T15:47:23Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:132]: start to delete untagged artifact.
2024-04-09T15:47:23Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:132]: end to delete untagged artifact.
2024-04-09T15:47:23Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:132]: required candidate: %+v[]
2024-04-09T15:47:23Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:132]: end to delete required artifact.
2024-04-09T15:47:23Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:281]: flush artifact trash
2024-04-09T15:47:24Z [ERROR] [/registryctl/client/client.go:96]: Failed to start gc: 500
2024-04-09T15:47:24Z [ERROR] [/jobservice/runner/redis.go:182]: failed to get gc result: Failed to start GC: 500
2024-04-09T15:47:24Z [ERROR] [/jobservice/runner/redis.go:87]: Job 'IMAGE_GC:7e0ca3b6fcfbee9a9af62929' exit with error: run error: Failed to start GC: 500

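The log shows that the garbage collection (GC) job was triggered and started, but it ultimately failed with an internal server error (HTTP 500). The error occurred while jobservice was asking the registryctl service to start the garbage collection.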
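
Most of the output above is startup noise at INFO level, so it helps to filter for the failing lines only. The following is a minimal sketch; the function name only_errors is made up for illustration, and it assumes the log format shown above, where failures are tagged with the literal string [ERROR].

```shell
# Keep only the lines tagged [ERROR] from whatever is piped in.
# -F makes grep treat the pattern as a fixed string, not a regex.
only_errors() {
    grep -F '[ERROR]'
}

# Usage (docker logs may write to stderr, hence 2>&1):
#   docker logs harbor-jobservice 2>&1 | only_errors
```

Applied to the log above, this should leave just the three "Failed to start gc: 500" lines.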

Check the logs of the registryctl service:

docker logs registryctl

The log contains the following error:

2024-04-09T15:47:24Z [ERROR] [/registryctl/api/registry.go:49]: Fail to execute GC: exit status 1, command err: failed to garbage collect: failed to mark: filesystem: filesystem: invalid checksum digest format

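This one is a real trap: the error means the registry data on disk is inconsistent. I had to search GitHub for an answer, and sure enough someone had hit exactly the same error. It looks like a bug in Harbor. As far as I can tell, each file named link in the registry storage is supposed to contain a digest, so a zero-byte link file cannot be parsed and the mark phase of GC aborts. The fix is to delete the empty link files on disk: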

rm $(find /path/to/registry -iname link -size 0)

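After running the command above and restarting Harbor, the problem was solved. /path/to/registry is Harbor's storage directory; substitute the actual path for your installation.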
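
Since the rm above deletes files straight away, it may be worth listing the empty link files first and deleting them only after a look at the list. The sketch below is one way to do that, not the fix from the GitHub issue verbatim: the function names are hypothetical, and it matches regular files named exactly link (case-sensitive) rather than using -iname. It also avoids the "missing operand" error that rm $(find ...) gives when nothing is found, and it copes with paths that contain spaces.

```shell
# Print every zero-byte regular file named "link" under the directory in $1.
list_empty_links() {
    find "$1" -type f -name link -size 0 -print
}

# Delete the same set of files. "-exec rm -f -- {} +" passes the paths
# to rm directly, so nothing is run when there are no matches.
remove_empty_links() {
    find "$1" -type f -name link -size 0 -exec rm -f -- {} +
}

# Usage (replace /path/to/registry with the real Harbor storage directory):
#   list_empty_links /path/to/registry      # review the list first
#   remove_empty_links /path/to/registry    # then delete and restart Harbor
```

Backing up the storage directory before deleting anything is a sensible precaution.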

See the related issue:

https://github.com/distribution/distribution/issues/1960