problems_docker

1 Error when starting a container

Error output:

root@asus:~# docker create -ti --name python centos:latest python
root@asus:~# docker start -ai python
Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "exec: \"python\": executable file not found in $PATH": unknown

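Cause: the container uses the centos:latest image, which does not include a python executable.
Solution: remove the failed container (its name python is still taken), pull an image that includes python, and run the commands again with that image:

docker rm python
docker pull python
docker create -ti --name python python python
docker start -ai python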

2 Error running docker-compose up -d (1)

Error output:

kibana    |  FATAL  Error: [server.host]: string.base

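Cause: the setting SERVER_HOST: 0 in docker-compose.yml is invalid. YAML reads the bare 0 as a number, while the string.base error shows that Kibana expects server.host to be a string.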

kibana:
    environment:
      SERVER_NAME: kibana
      SERVER_HOST: 0

Solution: change SERVER_HOST: 0 to SERVER_HOST: 0.0.0.0

3 Error running docker-compose up -d (2)

Error output:

kibana    | {"type":"log","@timestamp":"2020-02-04T04:28:36Z","tags":["warning","elasticsearch","admin"],"pid":9,"message":"Unable to revive connection: http://elasticsearch:9200/"}

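Cause: the Elasticsearch host in this docker setup is named es01, not the default elasticsearch, so the Elasticsearch host that kibana connects to must be configured by hand in docker-compose.yml.

Solution: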

vim docker-compose.yml
# Add the following to the kibana service
kibana:
    environment:
      ELASTICSEARCH_HOSTS: http://es01:9200

4 Error running docker-compose up -d (3)

Error output:

ls01      | [2020-02-04T09:30:21,691][ERROR][logstash.licensechecker.licensereader] Unable to retrieve license information from license server {:message=>"No Available connections"}
ls01      | [2020-02-04T09:30:21,872][WARN ][logstash.licensechecker.licensereader] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch: Name or service not known"}
ls01      | [2020-02-04T09:30:51,690][ERROR][logstash.licensechecker.licensereader] Unable to retrieve license information from license server {:message=>"No Available connections"}
ls01      | [2020-02-04T09:30:51,923][WARN ][logstash.licensechecker.licensereader] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch: Name or service not known"}

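Cause: the Elasticsearch host in this docker setup is named es01, not the default elasticsearch, so the Elasticsearch host that logstash connects to must be configured by hand in docker-compose.yml.

Solution: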

vim docker-compose.yml
# Add the following to the logstash service
ls01:
    environment:
      - XPACK_MONITORING_ELASTICSEARCH_HOSTS=http://es01:9200

5 Error running curl lenovo:8000/v2/_catalog

Error output:

root@asus:~# curl lenovo:8000/v2/_catalog
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"registry","Class":"","Name":"catalog","Action":"*"}]}]}

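Cause: the registry requires authentication with a user name and password.

Solution: add the -u option; curl then prompts for the password of user root: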

root@asus:~# curl -u root lenovo:8000/v2/_catalog
Enter host password for user 'root':
{"repositories":["centos-latest"]}

6 Error running docker stack deploy -c docker-compose.yml elk

Error output:

elk_kibana.1.87xp10k2smwo@asus    | {"type":"log","@timestamp":"2020-02-05T06:54:13Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch for the [data] cluster. TypeError: Cannot use 'in' operator to search for 'type' in null"}

Unresolved.

7 Error running docker build . -t centos-halo:v1.0

Error output:

COPY failed: stat /var/lib/docker/tmp/docker-builder015587688/usr/java/jdk1.8.0_231/README.html: no such file or directory

For the cause and the solution, see: https://blog.csdn.net/small_to_large/article/details/77435541

8 ping, vim, ifconfig and similar commands are missing in the container, and installing them fails

Problem: apt-get install vim fails with the following error:

   Reading package lists... Done
   Building dependency tree       
   Reading state information... Done
   E: Unable to locate package vim

Solution: run apt-get update first; after that, vim and other common tools install normally.
apt-get update synchronizes the package indexes of the sources listed in /etc/apt/sources.list and /etc/apt/sources.list.d, so that the latest packages can be found.
Other tools:
ping: apt-get install -y iputils-ping (ubuntu); yum -y install iputils (centos)
ifconfig: apt-get install net-tools (ubuntu); yum -y install net-tools (centos)
vim: apt-get install vim (ubuntu); yum -y install vim (centos)
telnet: apt-get install telnet (ubuntu); yum -y install telnet (centos)
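
The install commands above differ only by package manager. A minimal sketch of choosing between them follows; install_cmd is a hypothetical helper name, not part of any tool, and it only prints the command rather than running it:

```shell
# install_cmd is a hypothetical helper: it prints the command to run.
install_cmd() {
    pm=$1
    pkg=$2
    case "$pm" in
        apt-get)
            # Fresh Debian/Ubuntu images have no package index, so update first.
            echo "apt-get update && apt-get install -y $pkg"
            ;;
        yum)
            echo "yum -y install $pkg"
            ;;
        *)
            echo "unsupported package manager: $pm" >&2
            return 1
            ;;
    esac
}

# Print the command for whichever manager this system provides.
for pm in apt-get yum; do
    if command -v "$pm" >/dev/null 2>&1; then
        install_cmd "$pm" vim
        break
    fi
done
```

Package names can still differ between distributions, so the name passed in has to match the target image.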

9 PyCharm cannot connect to and use the remote Python interpreter

RCA: the firewall was off, or PyCharm needed a restart after the interpreter was configured.
action:

  1. turn on the firewall.
  2. restart PyCharm.

After these two steps it worked.
solution: the same two steps as above.
note: next time, try restarting PyCharm alone first. It may be the only step needed.

10 "pip install requests" failed

action&desc:

  1. ran "pip install requests" in a Dockerfile from PyCharm, to install requests remotely; it failed with the error log below:
Step 2/2 : RUN pip install requests
 ---> [Warning] IPv4 forwarding is disabled. Networking will not work.
 ---> Running in 1b982dc6b85f
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fdcfa23efa0>: Failed to establish a new connection: [Errno -3] Try again')': /simple/requests/
ERROR: No matching distribution found for requests
Error response from daemon: The command '/bin/sh -c pip install requests' returned a non-zero code: 1
Failed to deploy '<unknown> Dockerfile: dockerdir/Dockerfile': Can't retrieve image ID from build stream
  2. ran "pip install requests" directly in the container; it also failed, with the following error log:
/ # pip install requests
Collecting requests
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f34098b4710>: Failed to establish a new connection: [Errno -3] Try again',)': /simple/requests/

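RCA1:
the network is unavailable inside the docker container.
solution1:
restart the remote server; the container is restarted along with it.

RCA2:
after the restart, the same error still appears.
solution2:
wait for a while; the network then recovers on its own. The "[Warning] IPv4 forwarding is disabled" line in the build log points at a likely cause, which solution8 in section 12 addresses.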

11 failed to execute "docker run --entrypoint htpasswd registry -Bbn testuser testpw > /my-registry/auth/htpasswd"

error log:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: exec: "htpasswd": executable file not found in $PATH: unknown.
ERRO[0000] error waiting for container: context canceled

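RCA:
the registry:latest image does not include the htpasswd executable (see the error above), while registry:2.7.0 does.

solution:
use registry:2.7.0 instead of registry:latest, both when pulling and when running:
[root@vserver ~]# docker pull registry:2.7.0
docker run --entrypoint htpasswd registry:2.7.0 -Bbn testuser testpw > /my-registry/auth/htpasswd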

12 Installing scrapy and pyspider failed

desc:
Error response from daemon: The command '/bin/sh -c pip install twisted && pip install gevent' returned a non-zero code: 1

action:

docker run -dti {python:alpine3.13's container}  
docker exec -ti {python:alpine3.13's container} /bin/sh  
execute "pip install twisted" separately, succeeded.  
execute "pip install gevent" separately, failed.

error log1:

execute "pip install gevent", failed with the error below:   
  error: command 'gcc' failed with exit status 1  
  ----------------------------------------  
  ERROR: Failed building wheel for yarl  
Successfully built aiohttp idna-ssl  
Failed to build multidict yarl 

solution1:
install gcc first, then install the remaining modules.

error log2:

execute "pip install gevent", failed with the error below:   
socket.timeout: The read operation timed out  
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.  
Error response from daemon: The command '/bin/sh -c pip install twisted &&    pip install gevent' returned a non-zero code: 2 

solution2:

use the option --default-timeout=1000 to lengthen pip's network timeout.  
execute "pip --default-timeout=1000 install gevent"  

error log3:

execute "pip install gevent", failed with the error below:   
c/_cffi_backend.c:15:17: fatal error: ffi.h: No such file or directory  
     #include <ffi.h>  
                     ^  
    compilation terminated.  
    error: command 'gcc' failed with exit status 1  

solution3:
execute "apk add --no-cache libffi-dev"

error log4:

execute "pip install gevent", failed with the error below:   
gcc: error trying to exec 'cc1plus': execvp: No such file or directory  
      error: command 'gcc' failed with exit status 1  

solution4:

execute "apk add --no-cache gcc-c++", failed with the error below:  
ERROR: unsatisfiable constraints:  
  gcc-c++ (missing):  
    required by: world[gcc-c++]  
After searching online, the package turned out to be named g++ on Alpine, so execute "apk add --no-cache g++" instead.  

error log5:

execute "pip install gevent", failed with the error below:   
running build_ext  
  generating cffi module 'build/temp.linux-x86_64-3.6/gevent.libuv._corecffi.c'  
  creating build/temp.linux-x86_64-3.6  
  Running '(cd  "/tmp/pip-install-_hofsllg/gevent_f657e0f166474603bedd0a08a891171b/deps/libev"  && sh ./configure -C > configure-output.txt )' in /tmp/pip-install-_hofsllg/gevent_f657e0f166474603bedd0a08a891171b  
  ./configure: ./configure.lineno: line 1: /usr/bin/file: not found  
  config.status: error: in `/tmp/pip-install-_hofsllg/gevent_f657e0f166474603bedd0a08a891171b/deps/libev':  
  config.status: error: Something went wrong bootstrapping makefile fragments  
      for automatic dependency tracking.  Try re-running configure with the  
      '--disable-dependency-tracking' option to at least be able to build  
      the package (albeit without support for automatic dependency tracking).  

solution5:
changed from alpine:3.6 to alpine:3.13 in Dockerfile

error log6:

Step 3/9 : RUN apk add --no-cache gcc g++ musl-dev  
 ---> Running in f27cf42cad40  
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/x86_64/APKINDEX.tar.gz  
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/main: Permission denied  
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/community/x86_64/APKINDEX.tar.gz  
ERROR: unable to select packages:  

solution6:
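This problem came up while building the sample project from the official Docker Compose documentation. The cause is that the external mirror used in that example could not be reached; the same problem may also show up as WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/main: network error
Adding the following two instructions to the Dockerfile resolves it:
RUN sed -i 's/https/http/' /etc/apk/repositories
RUN apk add curl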
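
The effect of the sed instruction can be checked outside a container. This is a small sketch, assuming a scratch file that stands in for /etc/apk/repositories and holds the two v3.13 entries seen in the log above. The quotes around the sed expression must be plain ASCII single quotes.

```shell
# Scratch stand-in for /etc/apk/repositories (contents assumed from the log).
repo_file=$(mktemp)
cat > "$repo_file" <<'EOF'
https://dl-cdn.alpinelinux.org/alpine/v3.13/main
https://dl-cdn.alpinelinux.org/alpine/v3.13/community
EOF

# Same substitution as in the Dockerfile, written to stdout instead of in place.
rewritten=$(sed 's/https/http/' "$repo_file")
echo "$rewritten"
rm -f "$repo_file"
```

Each line has its first https replaced by http, so apk fetches the index over plain HTTP.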

error log7:
execute "pip install gevent", error log5 occurs again:
./configure: ./configure.lineno: line 1: /usr/bin/file: not found

solution7:
add the following instruction to the Dockerfile:

RUN apk add --no-cache build-base &&\
    pip --default-timeout=10000 install gevent &&\
    apk del build-base

error log8:
[Warning] IPv4 forwarding is disabled. Networking will not work.

solution8:
echo net.ipv4.ip_forward=1 >> /usr/lib/sysctl.d/00-system.conf
systemctl restart network && systemctl restart docker
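
Before and after applying the change, it helps to check the value the kernel is actually using. A minimal sketch, assuming a Linux host that exposes /proc/sys/net/ipv4/ip_forward; the function name forwarding_enabled is only illustrative:

```shell
# Returns 0 when the given file (default: the live kernel setting) contains 1.
forwarding_enabled() {
    f=${1:-/proc/sys/net/ipv4/ip_forward}
    [ -r "$f" ] && [ "$(cat "$f")" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is disabled (or could not be read)"
fi
```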

error log9:
error: [Errno 2] No such file or directory: 'cargo'

solution9:
add "RUN apk add --no-cache cargo" to the Dockerfile

error log10:

Error response from daemon: The command '/bin/sh -c pip install scrapy' returned a non-zero code: 1  
Building wheel for cryptography (PEP 517): finished with status 'error'  
error: can't find Rust compiler  
  This package requires Rust >=1.41.0.  
  ----------------------------------------  
  ERROR: Failed building wheel for cryptography  

solution10:

# Point rust's crates.io at the USTC mirror in China, then install rust and cargo
RUN mkdir -p ~/.cargo &&\
    cd ~/.cargo &&\
    rm -rf config &&\
    touch config &&\
    echo "[source.crates-io]" >> config &&\
    echo "replace-with = 'ustc'" >> config &&\
    echo "" >> config &&\
    echo "[source.ustc]" >> config &&\
    echo 'registry = "https://mirrors.ustc.edu.cn/crates.io-index"' >> config &&\
    apk add --no-cache rust cargo
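
The registry line needs care with quoting: cargo's config file is TOML, so the URL has to stay wrapped in double quotes in the written file. The short sketch below shows how the shell treats nested double quotes compared with a format string in single quotes:

```shell
url="https://mirrors.ustc.edu.cn/crates.io-index"

# Nested double quotes: the inner quotes end and restart the string,
# so the shell consumes them and they never reach the output.
broken=$(echo "registry = "$url"")

# A single-quoted format string keeps the inner double quotes literally.
fixed=$(printf 'registry = "%s"\n' "$url")

echo "$broken"
echo "$fixed"
```

Only the second form prints the URL inside double quotes.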

13 Docker image built for nacos always fails to start

The following instructions in the Dockerfile produce an error:

SENTINEL_HOME=/develop/sentinel
COPY sentinel-dashboard-1.8.1.jar $SENTINEL_HOME
java -jar $SENTINEL_HOME/sentinel-dashboard-1.8.1.jar

ERRORLOG: Error: Unable to access jarfile /sentinel-dashboard-1.8.1.jar

RCA: the instruction COPY sentinel-dashboard-1.8.1.jar $SENTINEL_HOME caused this error, because COPY treats $SENTINEL_HOME as a file name instead of a directory.
The Linux cp command handles a destination that does not exist yet in the same way. Here the directory $SENTINEL_HOME=/develop/sentinel does not exist. Docker creates the missing parent path, but because sentinel has no trailing slash, it is taken as a file name. The COPY therefore succeeds, but the jar ends up as a file named sentinel. The later java -jar then fails, because the expected file $SENTINEL_HOME/sentinel-dashboard-1.8.1.jar does not exist.

SOLUTION: change the command to COPY sentinel-dashboard-1.8.1.jar $SENTINEL_HOME/
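
The file-versus-directory behaviour can be reproduced with plain cp in a scratch directory. A sketch under stated assumptions: the empty file merely stands in for the real jar, the directory names a and b are made up, and unlike COPY, cp does not create missing directories, so they are created up front:

```shell
work=$(mktemp -d)
: > "$work/sentinel-dashboard-1.8.1.jar"   # empty stand-in for the real jar
mkdir "$work/a" "$work/b" "$work/b/sentinel"

# Destination does not exist and has no trailing slash: "sentinel" becomes a file.
cp "$work/sentinel-dashboard-1.8.1.jar" "$work/a/sentinel"

# Destination is a directory (trailing slash): the jar keeps its own name.
cp "$work/sentinel-dashboard-1.8.1.jar" "$work/b/sentinel/"

ls -lR "$work/a" "$work/b"
```

In the first case a/sentinel is a regular file; in the second the jar sits inside b/sentinel/ under its original name.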

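14 An approach to troubleshooting a Dockerfile

There are two overall approaches:

  1. Subtraction (already fairly quick, but slower than approach 2):

     1. If the actual scenario gives no hint where to look, bisect: remove the last half of the instructions, build the image, start the container, and check whether the first half is correct.

     2. If there is an error, remove half of the remaining first half and verify again; if there is none, add back half of the instructions removed in the previous step and verify again. Repeat until the failing instruction is found.

     3. Remove the first instruction that causes an error, build the image and start the container, log in to the container, look at the error message there, and apply the matching fix.

     4. Repeat until no errors remain.

  2. Addition (the fastest way, but the commands run this way are not exactly the ones in the Dockerfile: some are used differently and some exist only in a Dockerfile, so it is slightly less accurate, though not by much):

     1. Do not build from the Dockerfile at first. Run docker run -itd --name=centos_test centos bash to create a blank container.

     2. If step 1 used the -d option (start in the background), log in to the container's terminal with docker exec -it centos_test bash; otherwise the terminal is attached directly once the container starts.

     3. Run the commands one by one and see which one fails.

     4. Note: these commands are not identical to the instructions in the Dockerfile; each Dockerfile instruction has to be translated into the equivalent linux command.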
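
The halving step of the subtraction approach can be scripted. A rough sketch, assuming every instruction sits on a single line (no backslash continuations); the name first_half and the sample Dockerfile are made up for illustration:

```shell
# Print the first half of a Dockerfile's lines (rounded up), so that the
# truncated file can be built on its own, e.g. with: docker build -f <file> .
first_half() {
    total=$(wc -l < "$1")
    keep=$(( (total + 1) / 2 ))
    head -n "$keep" "$1"
}

# Made-up sample Dockerfile with four single-line instructions.
sample=$(mktemp)
cat > "$sample" <<'EOF'
FROM python:alpine3.13
RUN apk add --no-cache build-base
RUN pip install twisted
RUN pip install gevent
EOF

first_half "$sample" > "$sample.half"
cat "$sample.half"
```

If the truncated file builds cleanly, the failing instruction is in the removed half; otherwise halve the kept part again.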

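15 Container built from a nacos Dockerfile exits right after starting

ACTION:
My first instruction was: CMD startup.sh -m standalone
That did not work: the startup.sh command could not be found. I then switched to:
CMD cd $NACOS_HOME/bin/ && ./startup.sh -m standalone
The error went away, but after docker run the container kept exiting on its own. According to material found online:
a docker container needs at least one process that keeps running; when its main process ends, the container exits.
Here startup.sh presumably returns as soon as it has launched nacos in the background, so the CMD finishes and the container stops.
The suggested workaround is to keep a process alive with top or tail -f, so I ended up with:
CMD cd $NACOS_HOME/bin/ && ./startup.sh -m standalone && tail -f $NACOS_HOME/logs/start.out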

16 Error when starting a docker container

# ERROR LOG
docker logs -f nexus
# Error output
Unable to delete file: /nexus-data/cache/bundle63/version0.0/revision.location

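The file /nexus-data/cache/bundle63/version0.0/revision.location lives in a docker volume; its physical path on the host is: /home/witt/bak/docker/cache/bundle63/version0.0/revision.location
I deleted that file on the host as root, but then the same error appeared for another file: /nexus-data/cache/bundle50/version0.0/revision.location
Finally I moved everything under /home/witt/bak/docker/cache/ into a backup directory and started the container again with docker start nexus, which succeeded.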

posted @ 2021-08-25 17:32  mediocrep