Reference:
https://blog.csdn.net/qq_42906753/article/details/105138596

1. While following "Manage Docker as a non-root user", this error appeared:

docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.

Cause: the Docker daemon had not been started.
Fix:

service  docker start
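
A minimal sketch of checking the daemon before starting it. `ensure_docker_running` is a hypothetical helper name used only for illustration; it relies on `docker info` exiting with a non-zero status when the client cannot reach the daemon.

```shell
# Hypothetical helper: start the Docker daemon only if it is not answering.
ensure_docker_running() {
    if docker info >/dev/null 2>&1; then
        echo "docker daemon is running"
    else
        echo "docker daemon is not running; starting it"
        service docker start
    fi
}
```

Calling `ensure_docker_running` as root before any `docker run` avoids the "Cannot connect to the Docker daemon" message when the service is simply stopped.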

Deployment machine IP: 192.168.170.142/24; target machine IP: 192.168.170.145/24

Target machine:

Deployment machine:

Reference: https://blog.csdn.net/weixin_44002829/article/details/97619826
Download Python 3.6 and build it from source
Command: wget https://www.python.org/ftp/python/3.6.0/Python-3.6.0.tgz
(If wget is not installed, install it first: yum install wget)

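Extract: tar -xzvf Python-3.6.0.tgz (extracted into the home directory)

Change into the source directory: cd Python-3.6.0 (if you are not sure where the folder is, use ls to find which directory it is in, then cd)

Configure: ./configure --prefix=/usr/local

If you get configure: error: no acceptable C compiler found in $PATH
Fix: yum install gcc

Then continue with the build and install: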

make altinstall

Change the /usr/bin/python link

cd /usr/bin

mv python python.backup

ln -s /usr/local/bin/python3.6 /usr/bin/python
ln -s /usr/local/bin/python3.6 /usr/bin/python3

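Change the Python interpreter used by the yum scripts
(yum is written in Python 2, so once /usr/bin/python points to Python 3.6 these scripts need to call python2 explicitly)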

ls yum*
vi /usr/bin/yum
vi /usr/libexec/urlgrabber-ext-down

(In each file opened by the commands above, change the first line from

#!/usr/bin/python to #!/usr/bin/python2)
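
The same shebang edit can be done without opening an editor. This is a sketch only: `pin_python2_shebang` is a hypothetical helper name, and it assumes the first line is exactly `#!/usr/bin/python`, optionally with a space after `#!`. The `-i.bak` option keeps a backup copy next to each edited file.

```shell
# Hypothetical helper: rewrite a "#!/usr/bin/python" first line to python2.
# A file whose first line already names python2 is left unchanged.
pin_python2_shebang() {
    sed -i.bak '1s|^#! */usr/bin/python[[:space:]]*$|#!/usr/bin/python2|' "$1"
}
```

For the two files named above, that would be `pin_python2_shebang /usr/bin/yum` and `pin_python2_shebang /usr/libexec/urlgrabber-ext-down`, run as root.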

Python 3.6 is now installed.
Download FATE

curl -OL https://github.com/FederatedAI/KubeFATE/releases/download/v1.3.0/kubefate-docker-compose.tar.gz   # download
tar -xvzf kubefate-docker-compose.tar.gz   # extract

Enter the docker-deploy directory and edit parties.conf.

Installing the virtualenv tooling (pip install virtualenvwrapper) failed with this error:

Could not fetch URL https://pypi.org/simple/virtualenvwrapper/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/virtualenvwrapper/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)) - skipping
  Could not find a version that satisfies the requirement virtualenvwrapper (from versions: )
No matching distribution found for virtualenvwrapper
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
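
This message usually means Python was compiled on a machine without the OpenSSL development headers, so the ssl module was skipped during the build. The notes do not record a fix; the following is only a sketch of one common approach on a yum-based system. The package name `openssl-devel` and both helper names are assumptions, not something taken from the notes.

```shell
# Returns success if the given interpreter can import the ssl module.
python_has_ssl() {
    "$1" -c 'import ssl' >/dev/null 2>&1
}

# Assumed fix (not from the original notes): install the OpenSSL headers,
# then configure and install again from the same source directory.
# The subshell keeps the caller's working directory unchanged.
rebuild_python_with_ssl() {
    (
        yum install -y openssl-devel &&
        cd "$1" &&
        ./configure --prefix=/usr/local &&
        make altinstall
    )
}
```

A possible use, with the paths from the build steps: `python_has_ssl /usr/local/bin/python3.6 || rebuild_python_with_ssl ~/Python-3.6.0`.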

Using ssh

Running ssh root@192.168.170.145 on the deployment machine connects to the target machine.

On the deployment machine, download and extract the KubeFATE v1.3.0 package kubefate-docker-compose.tar.gz

# curl -OL https://github.com/FederatedAI/KubeFATE/releases/download/v1.3.0/kubefate-docker-compose.tar.gz

# tar -xzf kubefate-docker-compose.tar.gz

Define the number of instances to deploy

Enter the docker-deploy directory

# cd docker-deploy/

Edit parties.conf as follows

vi parties.conf 
user=root                                   
dir=/data/projects/fate                     
partylist=(10000 9999)                      
partyiplist=(192.168.170.142 192.168.170.145)       
servingiplist=(192.168.170.142 192.168.170.145)     
exchangeip=  
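
Each party ID in partylist is paired with the IP at the same position in partyiplist, so the two lists should have the same number of entries. A small sketch of a check that can be run before generating the configuration; `count_entries` and `check_parties_conf` are hypothetical helper names, and the parsing assumes each list sits on a single line in the `name=(a b c)` form shown above.

```shell
# Count the entries of an assignment such as: name=(a b c)
# $1 = variable name, $2 = path to parties.conf
count_entries() {
    line=$(grep "^$1=(" "$2" | head -n 1)
    items=${line#*\(}      # drop everything up to and including "("
    items=${items%%\)*}    # drop ")" and anything after it
    set -- $items          # split the remaining text on whitespace
    echo $#
}

# Compare the number of party IDs with the number of party IPs.
check_parties_conf() {
    parties=$(count_entries partylist "$1")
    ips=$(count_entries partyiplist "$1")
    if [ "$parties" -eq "$ips" ]; then
        echo "ok: $parties parties, $ips IPs"
    else
        echo "mismatch: $parties parties, $ips IPs" >&2
        return 1
    fi
}
```

Running `check_parties_conf parties.conf` inside docker-deploy reports either the matching counts or the mismatch.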

Run the script that generates the cluster startup files
# bash generate_config.sh

Run the cluster startup script

# bash docker_deploy.sh all

After the command is entered, the root user's password has to be typed 4 times for each user.
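
The password prompts presumably come from the ssh and scp calls the deploy script makes to each host. One way to avoid them is key-based login, sketched below. `copy_key_to_hosts` is a hypothetical helper name, and the RSA key with an empty passphrase is an assumption chosen for brevity, not a recommendation from the notes.

```shell
# Hypothetical helper: push this machine's public key to each given host,
# so later ssh/scp calls to root@<host> no longer ask for a password.
copy_key_to_hosts() {
    # Create a key pair only if none exists yet (empty passphrase: an assumption).
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    for host in "$@"; do
        ssh-copy-id "root@$host"   # prompts for the root password one last time
    done
}
```

With the addresses used here, the call would be `copy_key_to_hosts 192.168.170.142 192.168.170.145`.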

Verify basic cluster functionality

# docker exec -it confs-10000_python_1 bash

This then failed with the error:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Fix: restart Docker

# systemctl daemon-reload

# systemctl restart docker.service

Problem encountered:

Status: Downloaded newer image for federatedai/fateboard:1.3.0-release
Creating docker-deploy_proxy_1        ... done
Creating docker-deploy_redis_1      ... done
Creating docker-deploy_mysql_1      ... done
Creating docker-deploy_federation_1 ... done
Creating docker-deploy_egg_1          ... done
Creating docker-deploy_meta-service_1 ... done
Creating docker-deploy_roll_1         ... done
Creating docker-deploy_python_1       ... error

ERROR: for docker-deploy_python_1  Cannot create container for service python: failed to mount local volume: mount /path/to/host/dir/examples:/var/lib/docker/volumes/docker-deploy_shared_dir_examples/_data, flags: 0x1000: no such file or directory

ERROR: for python  Cannot create container for service python: failed to mount local volume: mount /path/to/host/dir/examples:/var/lib/docker/volumes/docker-deploy_shared_dir_examples/_data, flags: 0x1000: no such file or directory
ERROR: Encountered errors while bringing up the project.
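
The mount source in the error, /path/to/host/dir/examples, looks like a template placeholder that was never replaced with a real directory on the host. The notes stop here without a fix, so the following is only a sketch for locating such leftovers. `find_placeholder_paths` is a hypothetical helper name, and which compose file to pass to it is an assumption.

```shell
# Hypothetical helper: list every line of a compose file that still
# contains the placeholder path seen in the error message.
# Exit status 0 means at least one placeholder was found.
find_placeholder_paths() {
    grep -n '/path/to/host/dir' "$1"
}
```

For example, `find_placeholder_paths docker-compose.yml` prints the matching lines with their line numbers. After pointing those entries at an existing directory, the already-created volume named in the error (docker-deploy_shared_dir_examples) may also need to be removed with `docker volume rm` so that it is recreated with the new path.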
posted on 2020-10-29 09:04  20199302