NFS client error leaves df -h hanging: NFSD: client xx.xx.xx.xx testing state ID with incorrect client ID

Mar  8 16:10:00 HKT-SW6-E5-STG-1-55 kernel: [19956403.500413] nfsd4_validate_stateid: 21 callbacks suppressed
Mar  8 16:10:00 HKT-SW6-E5-STG-1-55 kernel: [19956403.500415] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:00 HKT-SW6-E5-STG-1-55 kernel: [19956403.500819] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:00 HKT-SW6-E5-STG-1-55 kernel: [19956403.708314] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:00 HKT-SW6-E5-STG-1-55 kernel: [19956403.915963] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:01 HKT-SW6-E5-STG-1-55 kernel: [19956404.123970] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:01 HKT-SW6-E5-STG-1-55 kernel: [19956404.331941] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:01 HKT-SW6-E5-STG-1-55 kernel: [19956404.332545] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:01 HKT-SW6-E5-STG-1-55 kernel: [19956404.332893] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:01 HKT-SW6-E5-STG-1-55 kernel: [19956404.333147] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID
Mar  8 16:10:01 HKT-SW6-E5-STG-1-55 kernel: [19956404.333436] NFSD: client XX.XX.XX.XX testing state ID with incorrect client ID

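The errors above come from the system log /var/log/syslog; running dmesg -T shows the same messages.
Taken literally, the message means the state ID held by the client is invalid: the client ID embedded in it is not the one the server expects.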
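To get a feel for how many of these messages each client triggers, the log can be summarized per client address. The following is a minimal sketch; the function name count_stateid_errors is made up for this example, and it assumes the message format shown in the log excerpt above. Because the kernel rate-limits the message (note the "callbacks suppressed" line), the counts are a lower bound.

```shell
# Count "incorrect client ID" messages per client address in a log stream.
# Reads log lines on stdin and prints "<count> <client>" pairs, busiest first.
count_stateid_errors() {
    grep 'NFSD: client .* testing state ID with incorrect client ID' \
        | sed -E 's/.*NFSD: client ([^ ]+) testing.*/\1/' \
        | sort | uniq -c | sort -rn
}

# Usage (not run here):
#   count_stateid_errors < /var/log/syslog
#   dmesg -T | count_stateid_errors
```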

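# Here is the answer GPT gave
Based on the information you provided, this appears to be a log message from NFSD (the Network File System daemon). It indicates that the client is testing a state ID using an incorrect client ID. This may mean there is some communication or authentication problem between the client and the server. You can try checking the configuration on both the client and the server to make sure they communicate with the correct client ID. Further investigation may be needed to resolve the problem.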

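The fix below is based on other blog posts. None of them explained it clearly, but it did solve the problem.
My operating system is Ubuntu 20.04. The fix requires changing the NFS server configuration file, which on Ubuntu is /etc/default/nfs-kernel-server.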

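1. Edit the NFS configuration file /etc/default/nfs-kernel-server. This is the default location on Ubuntu; it may differ on CentOS. The blog posts I consulted all seemed to use the CentOS location, /etc/sysconfig/nfs, so I note it here for reference.
# Append this line at the end of the file (-N 4 tells rpc.nfsd not to offer NFSv4)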
RPCNFSDARGS="-N 4"
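If this change has to be applied on several servers, appending the line by hand risks adding it twice. Below is a minimal sketch of an idempotent append; ensure_line is a hypothetical helper name, not part of any NFS tooling. Whether the distribution's service files actually read RPCNFSDARGS from this file is not verified here.

```shell
# Append a line to a file only if the exact line is not already present.
#   $1 = file path, $2 = line to add
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (from a root shell, not run here):
#   ensure_line /etc/default/nfs-kernel-server 'RPCNFSDARGS="-N 4"'
```

Running it a second time leaves the file unchanged, so it is safe to repeat.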

2. Then restart the NFS service

sudo systemctl restart nfs-kernel-server.service
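After the restart it may be worth checking whether the server really stopped offering NFSv4. On Linux the kernel NFS server exposes the enabled versions in /proc/fs/nfsd/versions. The sketch below assumes that file holds space-separated tokens such as "-2 +3 +4 +4.1 +4.2", where a leading + means enabled; the function name nfsd_v4_state is made up for this example.

```shell
# Read the contents of /proc/fs/nfsd/versions on stdin and report whether
# any NFSv4 (4, 4.0, 4.1, 4.2) token is marked as enabled with a leading "+".
nfsd_v4_state() {
    if tr ' ' '\n' | grep -q '^+4'; then
        echo enabled
    else
        echo disabled
    fi
}

# Usage on the server (not run here):
#   nfsd_v4_state < /proc/fs/nfsd/versions
```

If this still prints "enabled" after the restart, the RPCNFSDARGS setting was probably not picked up by the service.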

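3. On the client, unmount and then remount the share

# -z makes the unmount lazy, so it does not block on the hung mount
sudo fusermount -uz /xx-xx
sudo mount ... or sudo mount -a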
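Once the share is mounted again, the client can confirm which NFS version was negotiated by looking at the vers= option in /proc/mounts. This is a minimal sketch; nfs_mount_versions is a hypothetical helper name, and it assumes the usual /proc/mounts layout (device, mount point, type, options).

```shell
# Read /proc/mounts-style lines on stdin and print "<mountpoint> <version>"
# for every NFS mount, taking the version from the vers= mount option.
nfs_mount_versions() {
    awk '$3 ~ /^nfs4?$/ {
        vers = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^vers=/) vers = substr(opts[i], 6)
        print $2, vers
    }'
}

# Usage on the client (not run here):
#   nfs_mount_versions < /proc/mounts
```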

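After remounting, the problem was gone. I still have not worked out the root cause, but one command suggests it is related to the NFS protocol version in use: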

nfsstat -s

I compared two machines.
Faulty machine

Server rpc stats:
calls      badcalls   badfmt     badauth    badclnt
219566150   0          0          0          0       

Server nfs v3:
null             getattr          setattr          lookup           access           
3       100%     0         0%     0         0%     0         0%     0         0%     
readlink         read             write            create           mkdir            
0         0%     0         0%     0         0%     0         0%     0         0%     
symlink          mknod            remove           rmdir            rename           
0         0%     0         0%     0         0%     0         0%     0         0%     
link             readdir          readdirplus      fsstat           fsinfo           
0         0%     0         0%     0         0%     0         0%     0         0%     
pathconf         commit           
0         0%     0         0%     

Server nfs v4:
null             compound         
5         0%     219566814 99%     

Server nfs v4 operations:
op0-unused       op1-unused       op2-future       access           close            
0         0%     0         0%     0         0%     25619252  3%     23241372  3%     
commit           create           delegpurge       delegreturn      getattr          
0         0%     0         0%     0         0%     4646785   0%     89930392 11%     
getfh            link             lock             lockt            locku            
13494359  1%     0         0%     0         0%     0         0%     0         0%     
lookup           lookup_root      nverify          open             openattr         
13696471  1%     0         0%     0         0%     23319996  3%     0         0%     
open_conf        open_dgrd        putfh            putpubfh         putrootfh        
0         0%     0         0%     219479982 28%     0         0%     43        0%     
read             readdir          readlink         remove           rename           
124653549 16%     33664     0%     0         0%     0         0%     0         0%     
renew            restorefh        savefh           secinfo          setattr          
0         0%     0         0%     0         0%     0         0%     0         0%     
setcltid         setcltidconf     verify           write            rellockowner     
0         0%     0         0%     0         0%     0         0%     0         0%     
bc_ctl           bind_conn        exchange_id      create_ses       destroy_ses      
0         0%     37        0%     34        0%     57        0%     30        0%     
free_stateid     getdirdeleg      getdevinfo       getdevlist       layoutcommit     
0         0%     0         0%     0         0%     0         0%     0         0%     
layoutget        layoutreturn     secinfononam     sequence         set_ssv          
0         0%     0         0%     9         0%     219569577 28%     0         0%     
test_stateid     want_deleg       destroy_clid     reclaim_comp     allocate         
77867     0%     0         0%     4         0%     31        0%     0         0%     
copy             copy_notify      deallocate       ioadvise         layouterror      
0         0%     0         0%     0         0%     0         0%     0         0%     
layoutstats      offloadcancel    offloadstatus    readplus         seek             
0         0%     0         0%     0         0%     0         0%     0         0%     
write_same       
0         0% 

Healthy machine (no errors)

Server rpc stats:
calls      badcalls   badfmt     badauth    badclnt
219511732   0          0          0          0       

Server nfs v4:
null             compound         
0         0%     219512233100%     

Server nfs v4 operations:
op0-unused       op1-unused       op2-future       access           close            
0         0%     0         0%     0         0%     25553455  3%     23254275  3%     
commit           create           delegpurge       delegreturn      getattr          
0         0%     0         0%     0         0%     4649957   0%     89986873 11%     
getfh            link             lock             lockt            locku            
14525315  1%     0         0%     0         0%     0         0%     0         0%     
lookup           lookup_root      nverify          open             openattr         
14727437  1%     0         0%     0         0%     23251345  3%     0         0%     
open_conf        open_dgrd        putfh            putpubfh         putrootfh        
0         0%     0         0%     219506748 28%     0         0%     1         0%     
read             readdir          readlink         remove           rename           
124621979 16%     33706     0%     0         0%     0         0%     0         0%     
renew            restorefh        savefh           secinfo          setattr          
0         0%     0         0%     0         0%     0         0%     0         0%     
setcltid         setcltidconf     verify           write            rellockowner     
0         0%     0         0%     0         0%     0         0%     0         0%     
bc_ctl           bind_conn        exchange_id      create_ses       destroy_ses      
0         0%     0         0%     1         0%     2         0%     1         0%     
free_stateid     getdirdeleg      getdevinfo       getdevlist       layoutcommit     
0         0%     0         0%     0         0%     0         0%     0         0%     
layoutget        layoutreturn     secinfononam     sequence         set_ssv          
0         0%     0         0%     0         0%     219516707 28%     0         0%     
test_stateid     want_deleg       destroy_clid     reclaim_comp     allocate         
0         0%     0         0%     0         0%     1         0%     0         0%     
copy             copy_notify      deallocate       ioadvise         layouterror      
0         0%     0         0%     0         0%     0         0%     0         0%     
layoutstats      offloadcancel    offloadstatus    readplus         seek             
0         0%     0         0%     0         0%     0         0%     0         0%     
write_same       
0         0%

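The faulty machine clearly has an extra block of v3 statistics. After adding the configuration the output did not change, yet the problem was solved. If anyone knows the actual cause, feel free to discuss it in the comments.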
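Besides the v3 block, a few individual counters in the two outputs above differ noticeably: on the faulty machine test_stateid is 77867 against 0, and exchange_id, create_ses and destroy_ses are 34, 57 and 30 against 1, 2 and 1. That may point to clients repeatedly re-establishing their sessions, although nothing here confirms the cause. To compare such counters without reading the tables by eye, a small helper can pull a single value out of the nfsstat output. This is only a sketch: nfsstat_counter is a made-up name, it returns the first counter with that name (so names like getattr that exist in both the v3 and v4 blocks are ambiguous), and it assumes each value line is laid out as "count percent" pairs, which breaks when the columns run together as in 219512233100%.

```shell
# Read `nfsstat -s` output on stdin and print the count for one counter.
# The name is looked up in a header line; the count is then taken from the
# following line, where each counter occupies two fields: count and percent.
nfsstat_counter() {
    awk -v name="$1" '
        found { print $(2 * col - 1); exit }
        { for (i = 1; i <= NF; i++) if ($i == name) { col = i; found = 1 } }
    '
}

# Usage on each server (not run here):
#   nfsstat -s | nfsstat_counter test_stateid
```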

posted @ jasmine456