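Notes on troubleshooting an SSH error (Ansible)
Background
The ansible node manages three k8s nodes. Running ansible UCA -m ping -o succeeded on only one of them and failed on the other two.
Ansible logs in to the target hosts with key-based authentication.
Nodes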
ansible
k8s1
k8s2
k8s3
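Troubleshooting process:
By default, ansible connects to the target hosts over ssh and runs the specified commands or scripts there.
This time, running ansible UCA -m ping -o produced the following output: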
[root@ansible ~]# ansible UCA -m ping -o
[WARNING]: Unable to parse /root/hosts as an inventory source
Wednesday 10 June 2020 16:16:48 +0800 (0:00:00.062) 0:00:00.062 ********
[WARNING]: Unhandled error in Python interpreter discovery for host HD1-SHMY1-UCA-K8S-Node2: Failed to connect to the host via ssh: Warning: Permanently
added '100.66.0.2' (ECDSA) to the list of known hosts. Permission denied (publickey).
[WARNING]: Unhandled error in Python interpreter discovery for host HD1-SHMY1-UCA-K8S-Node3: Failed to connect to the host via ssh: Warning: Permanently
added '100.66.0.3' (ECDSA) to the list of known hosts. Permission denied (publickey).
HD1-SHMY1-UCA-K8S-Node2 | UNREACHABLE!: Data could not be sent to remote host "100.66.0.2". Make sure this host can be reached over ssh: Permission denied (publickey).
HD1-SHMY1-UCA-K8S-Node3 | UNREACHABLE!: Data could not be sent to remote host "100.66.0.3". Make sure this host can be reached over ssh: Permission denied (publickey).
HD1-SHMY1-UCA-K8S-Node1 | SUCCESS => {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python"}, "changed": false, "ping": "pong"}
Wednesday 10 June 2020 16:16:49 +0800 (0:00:00.924) 0:00:00.986 ********
===============================================================================
ping ---------------------------------------------------------------------------------------------------------------------------------------------- 0.92s
Playbook run took 0 days, 0 hours, 0 minutes, 0 seconds
The output points to a permission problem, so I tried connecting with ssh directly:
[root@ansible ~]# ssh 100.66.0.2 -vvv
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
debug1: Reading configuration data /root/.ssh/config
debug1: /root/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug2: resolving "100.66.0.2" port 22
debug2: ssh_connect_direct: needpriv 0
debug1: Connecting to 100.66.0.2 [100.66.0.2] port 22.
debug1: Connection established.
debug1: permanently_set_uid: 0/0
... (intermediate output omitted)
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug3: send packet: type 50
debug3: receive packet: type 51
debug1: Authentications that can continue: publickey
debug3: start over, passed a different list publickey
debug3: preferred gssapi-keyex,gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /root/.ssh/id_rsa
debug3: send_pubkey_test
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: publickey
debug1: Trying private key: /root/.ssh/id_dsa
debug3: no such identity: /root/.ssh/id_dsa: No such file or directory
debug1: Trying private key: /root/.ssh/id_ecdsa
debug3: no such identity: /root/.ssh/id_ecdsa: No such file or directory
debug1: Trying private key: /root/.ssh/id_ed25519
debug3: no such identity: /root/.ssh/id_ed25519: No such file or directory
debug2: we did not send a packet, disable method
debug1: No more authentication methods to try.
Permission denied (publickey).
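Since I log in with a key, this is the part of the output above to focus on:
debug1: Offering RSA public key: /root/.ssh/id_rsa
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 51
After the debug1 line offered the path to my key file, the client sent a type 50 packet and, while waiting for the reply, received a type 51 packet. Some googling turned up the following: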
http://www.snailbook.com/docs/assigned-numbers.txt
4.1.2. Initial Assignments
Message ID Value Reference
----------- ----- ---------
SSH_MSG_USERAUTH_REQUEST 50 [SSH-USERAUTH]
SSH_MSG_USERAUTH_FAILURE 51 [SSH-USERAUTH]
http://www.snailbook.com/docs/userauth.txt
6. Authentication Protocol Message Numbers
These are the general authentication message codes:
SSH_MSG_USERAUTH_REQUEST 50
SSH_MSG_USERAUTH_FAILURE 51
SSH_MSG_USERAUTH_SUCCESS 52
SSH_MSG_USERAUTH_BANNER 53
7. Public Key Authentication Method: "publickey"
The following method-specific message numbers are used by the
"publickey" authentication method.
SSH_MSG_USERAUTH_PK_OK 60
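In other words, after the client sent an SSH user authentication request (type 50), the server replied with an authentication failure (type 51). For an accepted public key, the reply to this query should have been a type 60 packet (SSH_MSG_USERAUTH_PK_OK).
So I then ran ssh against k8s1 (the node that works normally). The output was: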
debug1: Offering RSA public key: /root/.ssh/id_rsa
debug3: send_pubkey_test
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 60
Sure enough, when the key is accepted, the server replies with a type 60 packet.
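The comparison above (type 51 on the failing nodes, type 60 on the working one) can be automated when there are many hosts to check. The following is a minimal sketch, assuming the ssh -vvv output has been captured as text; the function names annotate_packets and key_was_accepted are made up for illustration, and the table only covers the message numbers quoted above.

```python
import re

# Message numbers taken from the tables quoted above.
# Note: 60 is method-specific; it means PK_OK only for the "publickey" method.
SSH_MSG_NAMES = {
    50: "SSH_MSG_USERAUTH_REQUEST",
    51: "SSH_MSG_USERAUTH_FAILURE",
    52: "SSH_MSG_USERAUTH_SUCCESS",
    53: "SSH_MSG_USERAUTH_BANNER",
    60: "SSH_MSG_USERAUTH_PK_OK",
}

# Matches the debug3 packet lines printed by "ssh -vvv".
PACKET_RE = re.compile(r"debug3: (send|receive) packet: type (\d+)")


def annotate_packets(log_text):
    """Return (direction, number, name) for every packet line in the log."""
    events = []
    for match in PACKET_RE.finditer(log_text):
        direction = match.group(1)
        number = int(match.group(2))
        events.append((direction, number, SSH_MSG_NAMES.get(number, "unknown")))
    return events


def key_was_accepted(log_text):
    """True if the server ever answered with a type 60 packet."""
    return any(
        direction == "receive" and number == 60
        for direction, number, _ in annotate_packets(log_text)
    )


if __name__ == "__main__":
    sample = (
        "debug1: Offering RSA public key: /root/.ssh/id_rsa\n"
        "debug3: send packet: type 50\n"
        "debug2: we sent a publickey packet, wait for reply\n"
        "debug3: receive packet: type 51\n"
    )
    for event in annotate_packets(sample):
        print(event)
    print("key accepted:", key_was_accepted(sample))
```

Feeding it the log of a failing node and the log of a working node makes the difference visible without reading the full debug output by hand.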
A pass over the usual configuration found nothing wrong. I then ran ls -l /root/.ssh on the three k8s nodes and noticed something odd about the files:
k8s1
[root@k8s-node1 .ssh]# ll
total 12
-rw------- 1 root root 2569 Jun 10 16:37 authorized_keys
-rw------- 1 root root 1679 Apr 17 09:34 id_rsa
-rw-r--r-- 1 root root 410 Apr 17 09:34 id_rsa.pub
k8s2
[root@k8s-node2 .ssh]# ll
total 8
-rw------- 1 root root 2571 Jun 10 16:37 authorized_keys
-rw------- 1 root root 1679 Jun 10 16:05 id_rsa
k8s3
[root@k8s-node3 .ssh]# ll
total 8
-rw------- 1 root root 2571 Jun 10 16:37 authorized_keys
-rw------- 1 root root 1679 Jun 10 16:05 id_rsa
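The authorized_keys file on k8s1 differs in size from the ones on k8s2 and k8s3 (2569 bytes versus 2571 bytes), yet the contents looked exactly the same when printed with cat.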
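Two files that look identical in cat but differ in size must differ in bytes that the terminal does not show. The sketch below is one way to look at the raw bytes instead; it was not part of the original investigation, and the helper names describe_bytes and first_difference as well as the file names in the usage comment are hypothetical.

```python
import hashlib


def describe_bytes(data):
    """Summarize properties of raw file bytes that cat does not make visible."""
    lines = data.split(b"\n")
    return {
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "newlines": data.count(b"\n"),
        "carriage_returns": data.count(b"\r"),
        "non_ascii_bytes": sum(1 for b in data if b > 127),
        # Control bytes other than tab, newline and carriage return.
        "other_control_bytes": sum(1 for b in data if b < 32 and b not in (9, 10, 13)),
        "lines_with_trailing_blank": sum(
            1 for line in lines if line.endswith((b" ", b"\t"))
        ),
        "ends_with_newline": data.endswith(b"\n"),
    }


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if a == b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a prefix of the other.
        return min(len(a), len(b))
    return None


# Hypothetical usage on copies of the two files fetched from the nodes:
#   with open("authorized_keys.k8s1", "rb") as f1, open("authorized_keys.k8s2", "rb") as f2:
#       a, b = f1.read(), f2.read()
#   print(describe_bytes(a))
#   print(describe_bytes(b))
#   print(first_difference(a, b))
```

Reading the files in binary mode matters here, because text mode may translate line endings and hide exactly the kind of difference being looked for.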
So I copied the authorized_keys file from k8s1 to k8s2 and k8s3 with scp, after which both nodes could be accessed with the key.
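Conclusion
After ssh sent the user authentication request, the server rejected the key. Why the authorized_keys files differed in size remains unknown, because the problem could not be reproduced.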
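Since the root cause could not be reproduced, a structural check of authorized_keys is one way to catch a damaged entry next time, for example a key that was broken across two lines. The sketch below is an illustration under simplifying assumptions, not how sshd itself parses the file: it splits each line on whitespace (so quoted options containing spaces are not handled), knows only the key types listed in KEY_TYPES, and the names check_line and check_file are made up.

```python
import base64
import binascii
import struct

# Common key type names; this list is an assumption and not exhaustive.
KEY_TYPES = {
    "ssh-rsa",
    "ssh-dss",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def check_line(line):
    """Return None if the line looks like a valid entry, else a short reason."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # blank lines and comments are fine
    tokens = stripped.split()
    for index, token in enumerate(tokens):
        if token in KEY_TYPES:
            break
    else:
        return "no known key type found"
    if index + 1 >= len(tokens):
        return "key type without key data"
    try:
        blob = base64.b64decode(tokens[index + 1], validate=True)
    except (binascii.Error, ValueError):
        return "key data is not valid base64"
    if len(blob) < 4:
        return "key data too short"
    # The decoded key starts with a 4-byte big-endian length and the type name.
    (name_len,) = struct.unpack(">I", blob[:4])
    if blob[4:4 + name_len] != tokens[index].encode("ascii"):
        return "key data does not match the declared key type"
    return None


def check_file(text):
    """Return a list of (line_number, reason) for every suspicious line."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        reason = check_line(line)
        if reason is not None:
            problems.append((number, reason))
    return problems
```

A key that was wrapped onto a second line would typically show up as "no known key type found" on the continuation line, which is a difference cat alone makes easy to miss.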