Debian 11: network interface reports carrier-changed

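How it started

Shortly after 12:00 today, the internal IP address of one of our company's Debian 11 machines, which has two network interfaces (one internal, one external), changed. Everyone I asked said they had not touched the machine's IP configuration.

That puzzled me: if nobody changed anything, how could the IP address change?

Background

This Debian 11 machine has two network interfaces: one facing the public network and one facing the internal network. For business reasons, several IP addresses are bound to the internal interface. (The customer limits the number of requests allowed from a single IP address, so our program spreads its requests across multiple addresses to reach the volume we need.)

Investigation

Did someone make a mistake?

At first I did not quite believe that nobody had touched the machine, so I started by checking whether someone had changed something and was reluctant to admit it.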

root@jiangm01:~# last
...omitted...
root     pts/0        122.224.95.83    Thu Mar 21 13:35 - 13:47  (00:11)
root     pts/0        122.224.95.83    Thu Mar 21 13:31 - 13:35  (00:04)
root     pts/0        122.224.95.83    Wed Mar 13 20:54 - 20:54  (00:00)
...omitted...

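The last output shows that nobody was logged in shortly after 12:00; nobody logged in to this machine until after 13:00.

That rules out an operator mistake as the cause of the address change.

Checking the system logs

Since nobody changed it by hand, the next step was the system logs, to see what they could tell me.

The logs can be read from /var/log/messages or /var/log/syslog, or with journalctl.

The system log did contain useful information. The full excerpt follows: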

Mar 21 12:04:48 jiangm01 kernel: [15296614.913697] tg3 0000:02:00.2 enp2s0f2: Link is down
Mar 21 12:04:48 jiangm01 NetworkManager[1871]: <info>  [1710993888.1016] policy: set-hostname: current hostname was changed outside NetworkManager: 'jiangm01'
Mar 21 12:04:54 jiangm01 NetworkManager[1871]: <info>  [1710993894.1061] device (enp2s0f2): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.212.50 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.212.49 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.212.48 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.212.47 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.212.46 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.158.216 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Withdrawing address record for 10.216.212.44 on enp2s0f2.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Leaving mDNS multicast group on interface enp2s0f2.IPv4 with address 10.216.212.44.
Mar 21 12:04:54 jiangm01 avahi-daemon[931]: Interface enp2s0f2.IPv4 no longer relevant for mDNS.
Mar 21 12:04:54 jiangm01 NetworkManager[1871]: <info>  [1710993894.1394] policy: set-hostname: current hostname was changed outside NetworkManager: 'jiangm01'
Mar 21 12:04:54 jiangm01 dbus-daemon[933]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.64' (uid=0 pid=1871 comm="/usr/sbin/NetworkManager --no-daemon ")
Mar 21 12:04:54 jiangm01 systemd[1]: Starting Network Manager Script Dispatcher Service...
Mar 21 12:04:54 jiangm01 dbus-daemon[933]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Mar 21 12:04:54 jiangm01 systemd[1]: Started Network Manager Script Dispatcher Service.
Mar 21 12:05:04 jiangm01 systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Mar 21 12:05:31 jiangm01 kernel: [15296658.233501] tg3 0000:02:00.2 enp2s0f2: Link is up at 1000 Mbps, full duplex
Mar 21 12:05:31 jiangm01 kernel: [15296658.233513] tg3 0000:02:00.2 enp2s0f2: Flow control is off for TX and off for RX
Mar 21 12:05:31 jiangm01 kernel: [15296658.233515] tg3 0000:02:00.2 enp2s0f2: EEE is disabled
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4185] device (enp2s0f2): carrier: link connected
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4187] device (enp2s0f2): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4195] policy: auto-activating connection 'Wired connection 1' (5459ba21-39ce-4013-81d8-1aa4de536c07)
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4200] device (enp2s0f2): Activation: starting connection 'Wired connection 1' (5459ba21-39ce-4013-81d8-1aa4de536c07)
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4201] device (enp2s0f2): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4206] device (enp2s0f2): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4287] device (enp2s0f2): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 avahi-daemon[931]: Joining mDNS multicast group on interface enp2s0f2.IPv6 with address fe80::a6dc:beff:fefa:69fa.
Mar 21 12:05:31 jiangm01 avahi-daemon[931]: New relevant interface enp2s0f2.IPv6 for mDNS.
Mar 21 12:05:31 jiangm01 avahi-daemon[931]: Registering new address record for fe80::a6dc:beff:fefa:69fa on enp2s0f2.*.
Mar 21 12:05:31 jiangm01 avahi-daemon[931]: Joining mDNS multicast group on interface enp2s0f2.IPv4 with address 192.168.123.89.
Mar 21 12:05:31 jiangm01 avahi-daemon[931]: New relevant interface enp2s0f2.IPv4 for mDNS.
Mar 21 12:05:31 jiangm01 avahi-daemon[931]: Registering new address record for 192.168.123.89 on enp2s0f2.IPv4.
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4302] device (enp2s0f2): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 dbus-daemon[933]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.64' (uid=0 pid=1871 comm="/usr/sbin/NetworkManager --no-daemon ")
Mar 21 12:05:31 jiangm01 systemd[1]: Starting Network Manager Script Dispatcher Service...
Mar 21 12:05:31 jiangm01 dbus-daemon[933]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Mar 21 12:05:31 jiangm01 systemd[1]: Started Network Manager Script Dispatcher Service.
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4719] device (enp2s0f2): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4721] device (enp2s0f2): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4743] policy: set 'Wired connection 1' (enp2s0f2) as default for IPv4 routing and DNS
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4744] policy: set-hostname: current hostname was changed outside NetworkManager: 'jiangm01'
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4744] device (enp2s0f2): Activation: successful, device activated.
Mar 21 12:05:33 jiangm01 NetworkManager[1871]: <info>  [1710993933.0713] policy: set-hostname: current hostname was changed outside NetworkManager: 'jiangm01'
Mar 21 12:05:41 jiangm01 systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Mar 21 12:07:52 jiangm01 systemd[1]: fwupd.service: Succeeded.
Mar 21 12:07:52 jiangm01 systemd[1]: fwupd.service: Consumed 1.281s CPU time.

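What the log entries mean (explanation from Baidu's ERNIE Bot, 文心一言)

The log shows the enp2s0f2 network interface going through a series of state changes while NetworkManager tries to manage them. In detail:

Hostname message (first occurrence)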

Mar 21 12:04:48 jiangm01 NetworkManager[1871]: <info>  [1710993888.1016] policy: set-hostname: current hostname was changed outside NetworkManager: 'jiangm01'
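NetworkManager notes that the hostname 'jiangm01' was set outside NetworkManager. The same name appears in every occurrence of this message, so it reads as informational rather than as evidence that the host was renamed.

Network interface becomes unavailable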

Mar 21 12:04:54 jiangm01 NetworkManager[1871]: <info>  [1710993894.1061] device (enp2s0f2): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
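The enp2s0f2 interface goes from activated to unavailable. The reason given is carrier-changed, meaning the link (carrier) state changed; this follows the kernel's "Link is down" message at 12:04:48.

Hostname message (second occurrence)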

Mar 21 12:04:54 jiangm01 NetworkManager[1871]: <info>  [1710993894.1394] policy: set-hostname: current hostname was changed outside NetworkManager: 'jiangm01'
The same hostname message for 'jiangm01' again.

Network interface link restored

Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4185] device (enp2s0f2): carrier: link connected
The carrier on enp2s0f2 is back, i.e. the link is up again.

Network interface state change

Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4187] device (enp2s0f2): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
The interface state goes from unavailable to disconnected.

Auto-activating the wired connection

Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4195] policy: auto-activating connection 'Wired connection 1' (5459ba21-39ce-4013-81d8-1aa4de536c07)
NetworkManager begins auto-activating the wired connection profile named 'Wired connection 1'.

Starting the wired connection

Mar 21 12:05:31 jiangm01 NetworkManager[1871]: <info>  [1710993931.4200] device (enp2s0f2): Activation: starting connection 'Wired connection 1' (5459ba21-39ce-4013-81d8-1aa4de536c07)
NetworkManager starts activating the connection 'Wired connection 1' on the device.

Sequence of interface state changes

Mar 21 12:05:31 jiangm01 NetworkManager[1871]: ... state change: disconnected -> prepare ...  
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: ... state change: prepare -> config ...  
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: ... state change: config -> ip-config ...  
Mar 21 12:05:31 jiangm01 NetworkManager[1871]: ... state change: ip-config -> ip-check

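So the log tells us that the state of the internal interface enp2s0f2 changed several times and that NetworkManager was involved: the link went down at 12:04:48, the 10.216.x.x addresses were withdrawn, and when the link came back at 12:05:31 NetworkManager activated 'Wired connection 1', after which the interface carried 192.168.123.89.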
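With a log this long, it can help to reduce it to just the state transitions. The following is a minimal sketch, assuming syslog-format lines like the ones above; nm_state_changes is a hypothetical helper name, not something NetworkManager ships.

```shell
# nm_state_changes IFACE
# Reads syslog-format text on stdin and prints one line per NetworkManager
# state change of IFACE, in the form "HH:MM:SS old -> new (reason)".
# Hypothetical helper; it only reformats lines that are already in the log.
nm_state_changes() {
    iface=$1
    grep -F "device ($iface): state change:" |
        sed -E "s/^[A-Za-z]+ +[0-9]+ ([0-9:]+) .*state change: ([a-z-]+) -> ([a-z-]+) \(reason '([^']*)'.*/\1 \2 -> \3 (\4)/"
}

# Example, on the affected host:
#   nm_state_changes enp2s0f2 < /var/log/syslog
```

Run against the excerpt above, this should list the activated -> unavailable transition at 12:04:54 followed by the activation sequence at 12:05:31.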

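Checking whether the internal interface is healthy

Having seen the interface change state in the log, I wanted to check whether the internal interface was working properly now.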

root@jiangm01:~# ethtool enp2s0f2
Settings for enp2s0f2:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: No
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 3
        Transceiver: internal
        MDI-X: off
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes

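Link detected: yes means the cable is connected and the link is currently up (negotiated at 1000Mb/s, full duplex), so the interface itself looks healthy.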
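ethtool only reports the link state at the moment it runs. The kernel also exposes the current carrier state and a counter of carrier transitions under /sys/class/net, and a counter that keeps rising would hint at a flapping link. A minimal sketch, assuming those sysfs files are readable; link_summary and its optional second argument (an alternative sysfs directory, useful for dry runs) are hypothetical.

```shell
# link_summary IFACE [SYSFS_NET_DIR]
# Prints "carrier=<0|1> carrier_changes=<n>" for IFACE.
# carrier is 1 when the link is up; carrier_changes counts up/down
# transitions. Reading carrier fails if the interface is administratively
# down, in which case the first value is printed empty.
link_summary() {
    dir=${2:-/sys/class/net}/$1
    printf 'carrier=%s carrier_changes=%s\n' \
        "$(cat "$dir/carrier")" "$(cat "$dir/carrier_changes")"
}

# Example, on the affected host:
#   link_summary enp2s0f2
```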

noprefixroute

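Comparing the output of ip a, ifconfig and similar commands turned up a difference.

Addresses on a normal interface are shown as "scope global <interface>", whereas the changed address was shown as "scope global noprefixroute <interface>".

So I looked up what noprefixroute means.

noprefixroute is an address flag, separate from the scope global part: it tells the kernel not to create the prefix (subnet) route automatically when the address is added. From the blog posts I read, it is associated with NetworkManager, which sets the flag on addresses it configures because it manages the routes itself, and it is therefore also tied to routing, including the default route.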
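Since the flag is a handy marker for addresses that were probably not configured through /etc/network/interfaces, one can filter for it. This is a small sketch that parses the one-line output format of ip -o addr show; noprefixroute_addrs is a hypothetical helper name.

```shell
# noprefixroute_addrs
# Reads `ip -o addr show` output on stdin and prints
# "<interface> <address/prefix>" for every address carrying the
# noprefixroute flag.
noprefixroute_addrs() {
    awk '{
        for (i = 5; i <= NF; i++)
            if ($i == "noprefixroute") { print $2, $4; break }
    }'
}

# Example, on the affected host:
#   ip -o addr show | noprefixroute_addrs
```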

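Restarting the network

Changing network settings can easily cut off remote access, so we had notified the IDC staff when the fault occurred. They were now on standby, so I could restart the network without worrying.

Disabling NetworkManager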

systemctl stop NetworkManager
systemctl disable NetworkManager
systemctl status NetworkManager
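Stopping and disabling the service is what was done here. A less drastic alternative, which was not used in this incident, is to keep NetworkManager running but tell it to ignore the one interface through the unmanaged-devices setting in the [keyfile] section of its configuration. The sketch below writes such a drop-in file; the function name, the file name 90-unmanaged-<iface>.conf and the optional directory argument are assumptions made for illustration.

```shell
# write_unmanaged_conf IFACE [CONF_DIR]
# Writes a NetworkManager drop-in that marks IFACE as unmanaged.
# CONF_DIR defaults to /etc/NetworkManager/conf.d; the file name is an
# arbitrary choice.
write_unmanaged_conf() {
    conf_dir=${2:-/etc/NetworkManager/conf.d}
    cat > "$conf_dir/90-unmanaged-$1.conf" <<EOF
[keyfile]
unmanaged-devices=interface-name:$1
EOF
}

# Example (as root), then reload or restart NetworkManager:
#   write_unmanaged_conf enp2s0f2
```

Whether this fits depends on whether anything else on the host still relies on NetworkManager; if nothing does, disabling it as above is simpler.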

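After stopping it, check the network configuration file /etc/network/interfaces by hand. It contains company IP addresses, so I am not pasting it here.

Restarting the network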

systemctl restart networking
ip a

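After running ip a, the multiple IP addresses we had bound to the internal interface were back. The extra 192.x address flagged noprefixroute was still there, though. At the time I did not know how it had been added; the log above suggests it arrived when NetworkManager activated 'Wired connection 1' at 12:05:31.

All I knew was that it had to do with NetworkManager and the default route.

Removing the default route

Idea: delete the internal network's default route, i.e. the route belonging to the 192.x address.

Ways to view the routing table: netstat -rn or ip route

ip route is recommended.

Routing table (public network entries removed to anonymize the data)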

root@jiangm01:~# ip route
default via 183.237.134.129 dev enp2s0f1 onlink 
10.216.0.0/16 via 10.216.158.193 dev enp2s0f2 
10.216.158.192/27 dev enp2s0f2 proto kernel scope link src 10.216.158.216 
10.216.212.32/27 dev enp2s0f2 proto kernel scope link src 10.216.212.44 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
(public network entries omitted)

Deleting the routes

root@jiangm01:~# ip route del 10.216.0.0/16 via 10.216.158.193 dev enp2s0f2  
root@jiangm01:~# ip route del 10.216.158.192/27 dev enp2s0f2 proto kernel scope link src 10.216.158.216  
root@jiangm01:~# ip route del 10.216.212.32/27 dev enp2s0f2 proto kernel scope link src 10.216.212.44 

Restarting the interface

systemctl restart networking

Error:

Mar 21 14:35:45 jiangm01 ifup[733834]: RTNETLINK answers: File exists
Mar 21 14:35:45 jiangm01 ifup[733819]: ifup: failed to bring up enp2s0f2:0
Mar 21 14:35:45 jiangm01 ifup[733843]: RTNETLINK answers: File exists
Mar 21 14:35:45 jiangm01 ifup[733819]: ifup: failed to bring up enp2s0f2:1

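I ran into something similar on another machine: the interface configuration file had not been changed at all, the first restart succeeded, and the second restart failed with this error. The material I found says it is related to routes, but after clearing the routes a restart still produced the same error. The message itself means the kernel rejected an add request because the object (an address or a route) already exists. My feeling is that Debian 11 may handle some of this configuration in its own way.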
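A commonly suggested way around "RTNETLINK answers: File exists" is to remove the leftover addresses (not only the routes) from the interface before bringing it up again. This was not tried in this incident, so treat the following as an untested sketch under that assumption. flush_and_restart is a hypothetical helper that only prints the commands unless --apply is given, because flushing an interface over SSH can cut the session.

```shell
# flush_and_restart IFACE [--apply]
# Without --apply, prints the commands it would run (dry run).
# With --apply, runs them: removes every address and route on IFACE,
# then restarts ifupdown so /etc/network/interfaces is applied afresh.
flush_and_restart() {
    iface=$1
    if [ "${2:-}" = "--apply" ]; then run=; else run=echo; fi
    $run ip addr flush dev "$iface"
    $run ip route flush dev "$iface"
    $run systemctl restart networking
}

# Example: review first, then apply with out-of-band access available.
#   flush_and_restart enp2s0f2
#   flush_and_restart enp2s0f2 --apply
```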

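Rebooting the machine

The network could not be restarted to reload the configuration file, but NetworkManager was already disabled, the routes had been cleaned up, the previously configured internal IP addresses were active again, and the business was running normally.

Only the 192.x network had not gone away. Since the data center staff were standing by and losing the network would not be a disaster, I decided to reboot and see whether that would remove the 192.x network.

Verification

After the reboot the network was fine and the 192.x network was gone. The data center staff never had to step in, and that was the end of it.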

# ip a
4: enp2s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a4:dc:be:fa:69:fa brd ff:ff:ff:ff:ff:ff
    inet 10.216.158.216/27 brd 10.216.158.223 scope global enp2s0f2:0
       valid_lft forever preferred_lft forever
    inet 10.216.212.44/27 brd 10.216.212.63 scope global enp2s0f2:1
       valid_lft forever preferred_lft forever
    inet 10.216.212.45/27 brd 10.216.212.63 scope global secondary enp2s0f2:2
       valid_lft forever preferred_lft forever
    inet 10.216.212.46/27 brd 10.216.212.63 scope global secondary enp2s0f2:3
       valid_lft forever preferred_lft forever
    inet 10.216.212.47/27 brd 10.216.212.63 scope global secondary enp2s0f2:4
       valid_lft forever preferred_lft forever
    inet 10.216.212.48/27 brd 10.216.212.63 scope global secondary enp2s0f2:5
       valid_lft forever preferred_lft forever
    inet 10.216.212.49/27 brd 10.216.212.63 scope global secondary enp2s0f2:6
       valid_lft forever preferred_lft forever
    inet 10.216.212.50/27 brd 10.216.212.63 scope global secondary enp2s0f2:7
       valid_lft forever preferred_lft forever
    inet6 fe80::a6dc:beff:fefa:69fa/64 scope link 
       valid_lft forever preferred_lft forever
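Because the addresses disappeared silently, a small check that compares the configured addresses with the expected ones could catch a repeat earlier, for example when run periodically. A minimal sketch, assuming the one-line output of ip -o -4 addr show; missing_addrs is a hypothetical helper name.

```shell
# missing_addrs IFACE ADDR...
# Reads `ip -o -4 addr show` output on stdin and prints every expected
# ADDR (given without prefix length) that is not configured on IFACE.
# Prints nothing when all expected addresses are present.
missing_addrs() {
    iface=$1; shift
    have=$(awk -v dev="$iface" \
        '$2 == dev && $3 == "inet" { sub(/\/.*/, "", $4); print $4 }')
    for addr in "$@"; do
        printf '%s\n' "$have" | grep -qxF "$addr" || printf '%s\n' "$addr"
    done
}

# Example, on the affected host:
#   ip -o -4 addr show | missing_addrs enp2s0f2 10.216.158.216 10.216.212.44
```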
posted @ 2024-03-21 16:56  热气球!