
pve_ceph issue roundup


Two clusters with the same name were created on the same network

Jun 24 11:56:08 cu-pve05 kyc_zabbix_ceph[2419970]: ]}
Jun 24 11:56:08 cu-pve05 corosync[3954]: error   [TOTEM ] Digest does not match
Jun 24 11:56:08 cu-pve05 corosync[3954]: alert   [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:08 cu-pve05 corosync[3954]: alert   [TOTEM ] Invalid packet data
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Digest does not match
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Invalid packet data
Jun 24 11:56:08 cu-pve05 kyc_zabbix_ceph[2419970]: Response from "192.168.7.114:10051": "processed: 3; failed: 48; total: 51; seconds spent: 0.001189"
Jun 24 11:56:08 cu-pve05 kyc_zabbix_ceph[2419970]: sent: 51; skipped: 0; total: 51
Jun 24 11:56:08 cu-pve05 corosync[3954]: error   [TOTEM ] Digest does not match
Jun 24 11:56:08 cu-pve05 corosync[3954]: alert   [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:08 cu-pve05 corosync[3954]: alert   [TOTEM ] Invalid packet data
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Digest does not match
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Invalid packet data
Jun 24 11:56:08 cu-pve05 corosync[3954]: error   [TOTEM ] Digest does not match
Jun 24 11:56:08 cu-pve05 corosync[3954]: alert   [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:08 cu-pve05 corosync[3954]: alert   [TOTEM ] Invalid packet data
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Digest does not match
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:08 cu-pve05 corosync[3954]:  [TOTEM ] Invalid packet data
Jun 24 11:56:09 cu-pve05 corosync[3954]: error   [TOTEM ] Digest does not match
Jun 24 11:56:09 cu-pve05 corosync[3954]: alert   [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:09 cu-pve05 corosync[3954]: alert   [TOTEM ] Invalid packet data
Jun 24 11:56:09 cu-pve05 corosync[3954]:  [TOTEM ] Digest does not match
Jun 24 11:56:09 cu-pve05 corosync[3954]:  [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:09 cu-pve05 corosync[3954]:  [TOTEM ] Invalid packet data
Jun 24 11:56:09 cu-pve05 corosync[3954]: error   [TOTEM ] Digest does not match
Jun 24 11:56:09 cu-pve05 corosync[3954]: alert   [TOTEM ] Received message has invalid digest... ignoring.
Jun 24 11:56:09 cu-pve05 corosync[3954]: alert   [TOTEM ] Invalid packet data


After deleting one of the two clusters, the errors below appeared (seen under syslog in the node view)

Jul 11 18:48:01 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on write
Jul 11 18:48:04 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:48:07 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on read
Jul 11 18:48:35 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on read
Jul 11 18:48:39 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on read
Jul 11 18:48:42 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:48:45 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:48:51 cu-pve03 pvedaemon[4111390]: worker exit
Jul 11 18:48:51 cu-pve03 pvedaemon[4692]: worker 4111390 finished
Jul 11 18:48:51 cu-pve03 pvedaemon[4692]: starting 1 worker(s)
Jul 11 18:48:51 cu-pve03 pvedaemon[4692]: worker 4148787 started
Jul 11 18:49:00 cu-pve03 systemd[1]: Starting Proxmox VE replication runner...
Jul 11 18:49:00 cu-pve03 systemd[1]: Started Proxmox VE replication runner.
Jul 11 18:49:06 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on read
Jul 11 18:49:10 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:49:13 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on read
Jul 11 18:49:16 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:49:23 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:49:30 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket closed (con state CONNECTING)
Jul 11 18:49:36 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on write
Jul 11 18:49:39 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket closed (con state CONNECTING)
Jul 11 18:49:42 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on read
Jul 11 18:49:50 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on write
Jul 11 18:50:00 cu-pve03 systemd[1]: Starting Proxmox VE replication runner...
Jul 11 18:50:00 cu-pve03 systemd[1]: Started Proxmox VE replication runner.
Jul 11 18:50:00 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on read
Jul 11 18:50:07 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on write
Jul 11 18:50:14 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:50:26 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:50:31 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on write
Jul 11 18:50:34 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:50:37 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on write
Jul 11 18:50:40 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on write
Jul 11 18:50:55 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:51:00 cu-pve03 systemd[1]: Starting Proxmox VE replication runner...
Jul 11 18:51:00 cu-pve03 systemd[1]: Started Proxmox VE replication runner.
Jul 11 18:51:02 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket closed (con state CONNECTING)
Jul 11 18:51:09 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on read
Jul 11 18:51:16 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket closed (con state CONNECTING)
Jul 11 18:51:19 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on read
Jul 11 18:51:33 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:51:43 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on read
Jul 11 18:51:48 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on read
Jul 11 18:51:51 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on write
Jul 11 18:52:00 cu-pve03 systemd[1]: Starting Proxmox VE replication runner...
Jul 11 18:52:00 cu-pve03 systemd[1]: Started Proxmox VE replication runner.
Jul 11 18:52:03 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:52:14 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on write
Jul 11 18:52:26 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on write
Jul 11 18:52:34 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:52:37 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on write
Jul 11 18:52:40 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:52:43 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:52:50 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:53:00 cu-pve03 systemd[1]: Starting Proxmox VE replication runner...
Jul 11 18:53:00 cu-pve03 systemd[1]: Started Proxmox VE replication runner.
Jul 11 18:53:01 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:53:05 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on write
Jul 11 18:53:08 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on write
Jul 11 18:53:11 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on write
Jul 11 18:53:14 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket closed (con state CONNECTING)
Jul 11 18:53:28 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket closed (con state CONNECTING)
Jul 11 18:53:51 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on read
Jul 11 18:53:55 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket error on read
Jul 11 18:53:58 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:54:00 cu-pve03 systemd[1]: Starting Proxmox VE replication runner...
Jul 11 18:54:00 cu-pve03 systemd[1]: Started Proxmox VE replication runner.
Jul 11 18:54:01 cu-pve03 kernel: libceph: mon0 192.168.7.4:6789 socket closed (con state CONNECTING)
Jul 11 18:54:06 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket error on read
Jul 11 18:54:09 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:54:17 cu-pve03 kernel: libceph: mon2 192.168.7.6:6789 socket closed (con state CONNECTING)
Jul 11 18:54:18 cu-pve03 kernel: libceph: mds0 192.168.7.5:6800 socket closed (con state CONNECTING)
Jul 11 18:54:32 cu-pve03 pveproxy[4118327]: worker exit
Jul 11 18:54:32 cu-pve03 pveproxy[7729]: worker 4118327 finished
Jul 11 18:54:32 cu-pve03 pveproxy[7729]: starting 1 worker(s)
Jul 11 18:54:32 cu-pve03 pveproxy[7729]: worker 4150738 started
Jul 11 18:54:37 cu-pve03 kernel: libceph: mon1 192.168.7.5:6789 socket error on read


Open issues:
1. Backup in the te environment only reaches 40 MB/s.
2. Database VM backups: consider the mount-point question, /ceph/fileserver/...
3. Copying the files on kycfs.
vzdump 202 --compress lzo  --storage kycfs --mode snapshot --node cu-pve05 --remove 0
vzdump 151 --mode stop --remove 0 --storage kycfs --compress lzo --node cu-pve02 --bwlimit 200000
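A note on the second command: `--bwlimit` caps backup I/O and, per the vzdump documentation, is given in KiB/s, so `200000` works out to roughly 200 MB/s, which barely throttles the read rates seen in the logs below. Quick unit check:

```python
# vzdump's --bwlimit is specified in KiB/s; convert the 200000 used above.
bwlimit_kib = 200000
bytes_per_s = bwlimit_kib * 1024        # 204_800_000 B/s
mib_per_s = bytes_per_s / 2**20         # 195.3125 MiB/s
mb_per_s = bytes_per_s / 1e6            # 204.8 MB/s
print(mib_per_s, mb_per_s)
```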
--------------------------------------------------------
INFO: starting new backup job: vzdump 151 --mode stop --remove 0 --storage kycfs --compress lzo --node cu-pve02
INFO: Starting Backup of VM 151 (qemu)
INFO: Backup started at 2019-07-10 16:26:44
INFO: status = stopped
INFO: update VM 151: -lock backup
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: cu-dbs-151
INFO: include disk 'scsi0' 'kycrbd:vm-151-disk-0' 100G
INFO: include disk 'scsi1' 'kycrbd:vm-151-disk-1' 300G
INFO: snapshots found (not included into backup)
INFO: creating archive '/mnt/pve/kycfs/dump/vzdump-qemu-151-2019_07_10-16_26_44.vma.lzo'
INFO: starting kvm to execute backup task
INFO: started backup task 'a29c0ebc-52ee-4823-a5e6-56e7443c2cae'
INFO: status: 0% (499122176/429496729600), sparse 0% (423092224), duration 3, read/write 166/25 MB/s
INFO: status: 1% (4353687552/429496729600), sparse 0% (4277657600), duration 22, read/write 202/0 MB/s

---------------------------------------------------------
INFO: starting new backup job: vzdump 192 --compress lzo --bwlimit --storage kycfs --mode snapshot --node cu-pve06 --remove 0
INFO: Starting Backup of VM 192 (qemu)
INFO: Backup started at 2019-07-10 16:28:53
INFO: status = stopped
INFO: update VM 192: -lock backup
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: cu-tpl-192
INFO: include disk 'ide0' 'kycrbd:vm-192-disk-0' 100G
INFO: creating archive '/mnt/pve/kycfs/dump/vzdump-qemu-192-2019_07_10-16_28_53.vma.lzo'
INFO: starting kvm to execute backup task
INFO: started backup task '58adf55a-971c-49aa-b42d-595f8e3a0cf3'
INFO: status: 0% (197656576/107374182400), sparse 0% (114630656), duration 3, read/write 65/27 MB/s
INFO: status: 1% (1090519040/107374182400), sparse 0% (556826624), duration 15, read/write 74/37 MB/s
INFO: status: 2% (2181038080/107374182400), sparse 0% (563113984), duration 42, read/write 40/40 MB/s
INFO: status: 3% (3257532416/107374182400), sparse 0% (581787648), duration 69, read/write 39/39 MB/s
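The status lines carry enough data to re-derive the reported rates: cumulative bytes processed and elapsed seconds. Using the first two samples of the VM 192 backup above:

```python
# Two "INFO: status" samples from the VM 192 backup:
#   0%: 197656576 bytes read, duration 3 s
#   1%: 1090519040 bytes read, duration 15 s
b0, t0 = 197656576, 3
b1, t1 = 1090519040, 15
rate_mb_s = (b1 - b0) / (t1 - t0) / 1e6
print(round(rate_mb_s))  # 74, matching the "read/write 74/37 MB/s" read figure
```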

 

posted on 2019-07-11 18:59 by 阳光-源泉