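NFS Shared Storage in Production: Optimization and Practice
Optimizing NFS shared storage for production:
1. Hardware: use SAS/SSD disks, buy several of them and build RAID 0 or RAID 10, and use good network cards.
2. NFS server-side tuning: add all_squash and async to the export options.
/backup/NFS 192.168.0.0/24(rw,async,all_squash)
These two options raise efficiency, but async costs reliability: writes the server has acknowledged but not yet flushed to disk are lost if it crashes.
3. Client-side mounting: rsize, wsize, noatime and nodiratime are the performance options; nosuid and noexec are the security options.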
mount -t nfs -o noatime,nodiratime,rsize=131072,wsize=131072 192.168.0.114:/backup/NFS /mnt
mount -t nfs -o nosuid,noexec,noatime,nodiratime,rsize=131072,wsize=131072 192.168.0.114:/backup/NFS /mnt
4. Kernel tuning:
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
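I. Important mount options (mount -o) for NFS in high-concurrency environments
a. async: write asynchronously. This option improves I/O performance but reduces data safety, so use it only where performance matters a great deal and data reliability does not. It is generally not recommended in production.
b. noatime: do not update inode access times on the filesystem. This saves I/O and is recommended.
c. nodiratime: do not update access times on directory inodes. In high-concurrency environments, setting it explicitly is recommended to improve performance.
d. noexec: do not allow programs on this filesystem to be executed directly (security option).
e. nosuid: ignore the set-user-ID and set-group-ID bits on this filesystem (security option).
f. rsize/wsize: the block size for reads (rsize) and writes (wsize). These values determine how much data the client and server buffer per transfer. On a LAN where both client and server have enough memory, they can be set larger, for example 32768 (bytes); a larger block size can improve NFS throughput. Do not set them too large, though: the upper limit is the largest transfer the network can actually carry.
View the options in effect on the client: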
grep mnt /proc/mounts
[root@oldboy ~]#grep mnt /proc/mounts
192.168.0.114:/backup/NFS /mnt nfs4 rw,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
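When checking many clients, it can help to pull a single option out of a /proc/mounts line instead of reading the whole option string. The following is a minimal sketch, not part of the original procedure; the function name mount_opt is hypothetical, and it assumes the standard /proc/mounts layout (device, mount point, type, comma-separated options).

```shell
# mount_opt MOUNTPOINT OPTION [MOUNTS_FILE]
# Print the value of OPTION for MOUNTPOINT from a /proc/mounts-style file.
# Returns non-zero if the option is not set on that mount point.
mount_opt() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                split(opts[i], kv, "=")
                if (kv[1] == opt) {
                    # Flag-style options such as "hard" carry no value,
                    # so print the option name itself for those.
                    val = (opts[i] ~ /=/) ? kv[2] : kv[1]
                    print val
                    found = 1
                }
            }
        }
        END { exit !found }
    ' "${3:-/proc/mounts}"
}
```

For the mount shown above, `mount_opt /mnt rsize` would print the rsize the client actually negotiated, which may differ from the value requested at mount time.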
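II. Production case (1): a filesystem becomes read-only
Solutions:
1. Reboot and see whether that fixes it (it does on many machines).
2. Repair the filesystem with fsck -y (run it against an unmounted filesystem).
3. If some partitions report errors during the repair and the problem is still there after a reboot,
inspect the partition layout: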
[root@localhost mobile]# more /etc/fstab
[root@localhost ~]# more /proc/mounts
[root@localhost ~]# mount
/dev/sda3 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (ro)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
Look for partitions mounted ro; if you find one, mount it again:
umount /dev/sda1
mount /dev/sda1 /boot
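If you get "device is busy", find out which process is keeping it busy:
fuser -m /boot    # lists the PIDs of the processes using this filesystem
fuser -mk /boot   # kills those processes outright
Then mount it again.
4. Remount directly with: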
[root@localhost ~]# mount -o rw,remount /boot
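Scanning the mount output by eye for ro is easy to get wrong on a host with many mounts. As a rough sketch (the function name ro_mounts is hypothetical, and it assumes the /proc/mounts layout shown by "more /proc/mounts" above), the read-only entries can be filtered out like this:

```shell
# ro_mounts [MOUNTS_FILE]
# Print "device mountpoint" for every entry whose option list contains
# the exact option "ro". Options that merely contain the letters, such
# as errors=remount-ro, are not matched.
ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $1, $2; break }
    }' "${1:-/proc/mounts}"
}
```

Called with no argument it reads /proc/mounts. Each line it prints is a candidate for the umount/mount or remount steps above.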
==================================================
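A simple, temporary fix when a Linux filesystem turns read-only after a reboot or for no apparent reason and the website becomes unreachable:
1. mount:
Use it to see which filesystem has gone read-only. Typical output: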
/dev/hda1 on / type ext3 (rw)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda5 on /home type ext3 (rw)
none on /dev/shm type tmpfs (rw)
/dev/hda2 on /usr/local type ext3 (rw)
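/dev/nb1 on /EarthView/RAW type ext3 (ro)    (has become read-only)
2. If you find ro, mount it again, or umount it and then mount it.
3. umount /dev/nb1
If you get "device is busy", find out which process is keeping it busy:
fuser -m /mnt/data    # lists the PIDs of the processes using the filesystem (substitute the affected mount point for /mnt/data)
fuser -mk /mnt/data   # kills those processes outright
Then mount it again.
4. Another way is to remount directly:
mount -o rw,remount /mnt/data
A more thorough procedure follows; choose what fits your situation:
The server's /var/log/messages reports: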
end_request: I/O error, dev sda, sector 122194293 Buffer I/O error on device sda1, logical block 446493 lost page write due to I/O error on sda1
The full procedure is shown below.
[root@php5 ~]# fdisk -lu    # Step 1: find the partition that contains the bad sector.
Disk /dev/sda: 73.4 GB, 73407868928 bytes
255 heads, 63 sectors/track, 8924 cylinders, total 143374744 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 63 4096574 2048256 83 Linux
/dev/sda2 4096575 75778604 35841015 83 Linux
/dev/sda3 75778605 129034079 26627737+ 83 Linux
/dev/sda4 129034080 143364059 7164990 5 Extended
/dev/sda5 129034143 139267484 5116671 83 Linux
/dev/sda6 139267548 143364059 2048256 82 Linux swap
[root@php5 ~]# tune2fs -l /dev/sda3 |grep "Block size"    # find the filesystem block size
Block size: 4096
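(122194293-75778605)*512/4096 = 5801961    # filesystem block number from the formula below (the debugfs session that follows was recorded with block 582391)
b = (int)((L-S)*512/B)    # L: failing sector, S: start sector of the partition, B: filesystem block size in bytes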
[root@php5 ~]# debugfs
debugfs 1.35 (28-Feb-2004)
debugfs: open /deb/sda3
/deb/sda3: No such file or directory while opening filesystem
debugfs: open /dev/sda3
debugfs: icheck 582391
Block Inode number
582391 277584
debugfs: ncheck 277584
Inode Pathname
277584 /users/inn.net.cn/data/upload/download/innshow004.rar
debugfs: quit
[root@php5 ~]#dd if=/dev/zero of=/dev/sda3 bs=4096 count=1 seek=582391    # once the file that owns this block is known, back it up, then forcibly overwrite the block with zeros
[root@php5 ~]# sync
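The block-number formula b = (int)((L-S)*512/B) used in this procedure is easy to mistype by hand. A minimal sketch in shell arithmetic follows; the function name fs_block is hypothetical, and it assumes 512-byte sectors (as reported by fdisk -lu above) and a shell with 64-bit integer arithmetic.

```shell
# fs_block SECTOR PART_START BLOCK_SIZE
# Convert a failing disk sector into the filesystem block number inside
# the partition that starts at PART_START (both in 512-byte sectors).
fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Sector from the kernel log, start of /dev/sda3, 4096-byte blocks:
fs_block 122194293 75778605 4096   # prints 5801961
```

The result is the block number to pass to icheck in debugfs and to seek= in dd, on the same partition whose start sector was used in the calculation.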
Production case (2): the system fails to boot after a bad edit to /etc/fstab
1. Enter maintenance (single-user) mode or rescue mode.
2. mount -o rw,remount /
3. Then correct /etc/fstab.
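Many fstab accidents are malformed lines, such as a missing field or a stray line break. The sketch below is only a rough shape check, assuming the usual fstab layout of four mandatory fields plus two optional ones; the function name check_fstab is hypothetical. It cannot tell whether a device or mount point actually exists, so it does not replace trying the entries with mount -a before rebooting.

```shell
# check_fstab [FILE]
# Report non-comment lines that do not have between 4 and 6
# whitespace-separated fields. Returns non-zero if any are found.
check_fstab() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4-6 fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "${1:-/etc/fstab}"
}
```

Running `check_fstab` after each edit, and again in rescue mode after the fix, gives a quick first signal before the next reboot.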
III. NFS client mount optimization
a. Security mount options:
mount -t nfs -o nosuid,noexec,nodev,rw 10.0.0.19:/data/bbs /mnt
[root@oldboy ~]#grep mnt /proc/mounts
192.168.0.114:/backup/NFS /mnt nfs4 rw,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
[root@oldboy ~]#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 7.6G 2.2G 5.1G 31% /
tmpfs 495M 0 495M 0% /dev/shm
/dev/sda1 190M 27M 153M 15% /boot
192.168.0.114:/backup/NFS
7.6G 4.7G 2.6G 65% /mnt
[root@oldboy ~]#umount /mnt
[root@oldboy ~]#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 7.6G 2.2G 5.1G 31% /
tmpfs 495M 0 495M 0% /dev/shm
/dev/sda1 190M 27M 153M 15% /boot
[root@oldboy ~]#history |grep mount
117 mount
645 mount /dev/sdb1 /mnt
649 umount /mnt
652 mount /dev/sdb2 /mnt
680 showmount -e 192.168.0.114
686 umount /mnt
688 mount -t nfs 192.168.0.114:/backup/NFS /mnt
724 echo "mount -t nfs 192.168.0.114:/backup/NFS /mnt" >>/etc/rc.local
727 mount
729 showmount -e 192.168.0.114
737 mount
739 grep mnt /proc/mounts
741 umount /mnt
743 history |grep mount
[root@oldboy ~]#mount -t nfs -o nosuid,noexec,nodev,rw 192.168.0.114:/backup/NFS /mnt
[root@oldboy ~]#grep mnt /proc/mounts
192.168.0.114:/backup/NFS /mnt nfs4 rw,nosuid,nodev,noexec,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
[root@oldboy ~]#ll
total 16
-rw-r--r-- 1 root root 292 May 12 22:16 a.log
drwxrwxr-x 7 1000 kl 4096 May 11 22:07 keepalived-1.2.7
-rw-r--r-- 1 root root 0 Jul 11 10:06 oldboy.log
drwxr-xr-x 3 root root 4096 Jul 5 20:58 server
drwxr-xr-x 4 root root 4096 May 11 22:07 tools
[root@oldboy ~]#cd server
[root@oldboy server]#ll
total 4
drwxr-xr-x 2 root root 4096 Jul 5 21:57 scripts
[root@oldboy server]#cd scripts
[root@oldboy scripts]#ll
total 8
-rw-r--r-- 1 root root 33 Jul 5 21:00 ping.sh
-rw-r--r-- 1 root root 160 Jul 5 21:57 tar.sh
[root@oldboy scripts]#pwd
/root/server/scripts
[root@oldboy scripts]#cp ping.sh /mnt
[root@oldboy scripts]#ll /mnt
total 8
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rw-r--r-- 1 nfsnobody nfsnobody 33 Jul 18 22:30 ping.sh
[root@oldboy scripts]#cd /mnt
[root@oldboy mnt]#./ping.sh
-bash: ./ping.sh: Permission denied
[root@oldboy mnt]#chmod +x ping.sh
[root@oldboy mnt]#ll
total 8
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwxr-xr-x 1 nfsnobody nfsnobody 33 Jul 18 22:30 ping.sh
[root@oldboy mnt]#./ping.sh
-bash: ./ping.sh: Permission denied
[root@oldboy mnt]#/mnt/ping.sh
-bash: /mnt/ping.sh: Permission denied
[root@oldboy ~]#chmod +x /mnt/ping.sh
[root@oldboy ~]#ll /mnt/ping.sh
-rwxr-xr-x 1 nfsnobody nfsnobody 33 Jul 18 22:30 /mnt/ping.sh
[root@oldboy ~]#sh /mnt/ping.sh
PING www.a.shifen.com (220.181.111.188) 640(668) bytes of data.
648 bytes from 220.181.111.188: icmp_seq=1 ttl=53 time=49.4 ms
648 bytes from 220.181.111.188: icmp_seq=2 ttl=53 time=45.9 ms
648 bytes from 220.181.111.188: icmp_seq=3 ttl=53 time=46.9 ms
--- www.a.shifen.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 6053ms
rtt min/avg/max/mdev = 45.912/47.431/49.432/1.487 ms
[root@oldboy ~]#cp /bin/cat /opt/
[root@oldboy ~]#/opt/cat /mnt/ping.sh
ping -c3 -s640 -i3 www.baidu.com
[root@oldboy ~]#cp /bin/rm /mnt/
[root@oldboy ~]#ll /mnt
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwxr-xr-x 1 nfsnobody nfsnobody 33 Jul 18 22:30 ping.sh
-rwxr-xr-x 1 nfsnobody nfsnobody 57440 Jul 20 20:59 rm
[root@oldboy ~]#chmod u+s /mnt/rm
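"To let ordinary users run certain privileged commands, SUID/SGID programs allow an ordinary user to run the program temporarily as root and to return to their own identity when it finishes."
chmod u+s sets the SUID bit on a program, so whoever runs it does so with the privileges of the file's owner (root, if root owns the file).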
[root@oldboy ~]#ll /mnt/rm
-rwsr-xr-x 1 nfsnobody nfsnobody 57440 Jul 20 20:59 /mnt/rm
[root@oldboy ~]#su - php001
[php001@oldboy ~]$ /mnt/rm /mnt/ping.sh
-bash: /mnt/rm: Permission denied
[root@backup ~]# cd /backup/NFS
[root@backup NFS]# ll
total 116
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwxr-xr-x 1 nfsnobody nfsnobody 33 Jul 18 22:30 ping.sh
-rwsr-xr-x 1 nfsnobody nfsnobody 57440 Jul 20 20:59 rm
[root@backup NFS]# chown -R root.root rm
[root@oldboy mnt]#ll
total 116
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwxr-xr-x 1 nfsnobody nfsnobody 33 Jul 18 22:30 ping.sh
-rwsr-xr-x 1 root root 57440 Jul 20 20:59 rm
[root@oldboy mnt]#./rm /mnt/ping.sh
[root@oldboy mnt]#ll
total 112
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwsr-xr-x 1 root root 57440 Jul 20 20:59 rm
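Summary:
1. nosuid and noexec also apply to shell and PHP scripts: a script on the mount cannot be executed directly (./ping.sh fails), although passing it to an interpreter (sh /mnt/ping.sh) still works.
2. They also apply to binary programs such as cat.
3. Options given with mount -o behave the same as options set in /etc/fstab, and network filesystems behave the same as local ones.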
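To illustrate the third point, the same option string passed to mount -o can be written as an /etc/fstab entry. This is a small sketch; the helper name nfs_fstab_line is hypothetical, and the trailing "0 0" (no dump, no fsck pass) is an assumption that is typical for network mounts.

```shell
# nfs_fstab_line SERVER_EXPORT MOUNTPOINT OPTIONS
# Print an /etc/fstab line equivalent to:
#   mount -t nfs -o OPTIONS SERVER_EXPORT MOUNTPOINT
nfs_fstab_line() {
    printf '%s %s nfs %s 0 0\n' "$1" "$2" "$3"
}

nfs_fstab_line 192.168.0.114:/backup/NFS /mnt nosuid,noexec,nodev,rw
# prints: 192.168.0.114:/backup/NFS /mnt nfs nosuid,noexec,nodev,rw 0 0
```

After the line is added to /etc/fstab, running grep mnt /proc/mounts as above should show the same nosuid, nodev and noexec flags as the command-line mount.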
b. Performance options:
[root@oldboy ~]#cat /proc/mounts
rootfs / rootfs rw 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
devtmpfs /dev devtmpfs rw,relatime,size=494980k,nr_inodes=123745,mode=755 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
/dev/sda3 / ext4 rw,relatime,barrier=1,data=ordered 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
/dev/sda1 /boot ext4 rw,relatime,barrier=1,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
192.168.0.114:/backup/NFS /mnt nfs4 rw,nosuid,nodev,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
NFS client performance tuning mainly comes down to the rsize and wsize values, as follows:
[root@oldboy ~]#umount /mnt
[root@oldboy ~]#mount -t nfs -o nosuid,noexec,rsize=1024,wsize=1024,rw 192.168.0.114:/backup/NFS /mnt
[root@oldboy ~]#grep mnt /proc/mounts
192.168.0.114:/backup/NFS /mnt nfs4 rw,nosuid,noexec,relatime,vers=4,rsize=1024,wsize=1024,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
[root@oldboy ~]#time dd if=/dev/zero of=/mnt/testfile bs=9k count=20000    # test filesystem performance (write speed for a single file)
20000+0 records in
20000+0 records out
184320000 bytes (184 MB) copied, 16.6013 s, 11.1 MB/s
real 0m16.802s
user 0m0.000s
sys 0m2.394s
[root@oldboy ~]#umount -lf /mnt    # lazy, forced unmount of the filesystem
[root@oldboy ~]#mount -t nfs -o nosuid,noexec,rw 192.168.0.114:/backup/NFS /mnt
[root@oldboy ~]#rm -f /mnt/testfile
[root@oldboy ~]#time dd if=/dev/zero of=/mnt/testfile bs=9k count=20000
20000+0 records in
20000+0 records out
184320000 bytes (184 MB) copied, 1.77673 s, 104 MB/s
real 0m1.793s
user 0m0.001s
sys 0m0.386s
[root@oldboy ~]#grep mnt /proc/mounts
192.168.0.114:/backup/NFS /mnt nfs4 rw,nosuid,noexec,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
[root@oldboy ~]#umount /mnt
[root@oldboy ~]#mount -t nfs -o nosuid,noexec,noatime,nodiratime,rw 192.168.0.114:/backup/NFS /mnt
[root@oldboy ~]#rm -f /mnt/testfile
[root@oldboy ~]#time dd if=/dev/zero of=/mnt/testfile bs=9k count=20000
20000+0 records in
20000+0 records out
184320000 bytes (184 MB) copied, 2.36344 s, 78.0 MB/s
real 0m2.369s
user 0m0.001s
sys 0m0.649s
[root@oldboy ~]#time for ((i=1;i<50000;i++));do cat /mnt/rm >/dev/null;done    # test read speed for a single file
real 0m56.576s
user 0m3.825s
sys 0m7.754s
[root@oldboy ~]#grep mnt /proc/mounts
192.168.0.114:/backup/NFS /mnt nfs4 rw,nosuid,noexec,noatime,nodiratime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.131,minorversion=0,local_lock=none,addr=192.168.0.114 0 0
[root@oldboy ~]#umount -lf /mnt    # lazy, forced unmount of the filesystem
[root@oldboy ~]#mount -t nfs -o noatime,nodiratime 192.168.0.114:/backup/NFS /mnt
[root@oldboy ~]#time for ((i=1;i<50000;i++));do cat /mnt/rm >/dev/null;done
real 0m51.925s
user 0m3.433s
sys 0m6.833s
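To compare the dd runs above on one scale, the byte count and elapsed time that dd reports can be turned into a throughput figure. This is a minimal sketch; the function name mb_per_sec is hypothetical, and it uses decimal megabytes (1 MB = 1,000,000 bytes), which is what dd reports.

```shell
# mb_per_sec BYTES SECONDS
# Print throughput in MB/s (decimal megabytes), one decimal place.
mb_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

mb_per_sec 184320000 16.6013   # rsize=1024,wsize=1024 run: prints 11.1
mb_per_sec 184320000 1.77673   # default 131072 run: prints 103.7
```

On these figures the default 131072-byte transfer size is roughly nine times faster than the 1024-byte setting for this single-file write, while the noatime,nodiratime runs differ by an amount small enough that repeated measurements would be needed to draw a conclusion.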
IV. Tuning suggested by the official NFS documentation:
a. Adjusting from the command line
[root@oldboy ~]#cat /proc/sys/net/core/rmem_max    # maximum size of the receive socket buffer
124928
[root@oldboy ~]#cat /proc/sys/net/core/rmem_default    # default size of the receive socket buffer
124928
[root@oldboy ~]#echo 8388608 > /proc/sys/net/core/rmem_default
[root@oldboy ~]#echo 16777216 > /proc/sys/net/core/rmem_max
[root@oldboy ~]#cat /proc/sys/net/core/rmem_default
8388608
[root@oldboy ~]#cat /proc/sys/net/core/rmem_max
16777216
[root@oldboy ~]#ll /mnt
total 180116
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwsr-xr-x 1 root root 57440 Jul 20 20:59 rm
-rw-r--r-- 1 nfsnobody nfsnobody 184320000 Jul 20 22:00 testfile
[root@oldboy ~]#rm -f /mnt/testfile
[root@oldboy ~]#ll /mnt
total 112
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rwsr-xr-x 1 root root 57440 Jul 20 20:59 rm
[root@oldboy ~]#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 7.6G 2.2G 5.1G 31% /
tmpfs 495M 0 495M 0% /dev/shm
/dev/sda1 190M 27M 153M 15% /boot
192.168.0.114:/backup/NFS
7.6G 4.7G 2.6G 65% /mnt
[root@oldboy ~]#ll
total 16
-rw-r--r-- 1 root root 292 May 12 22:16 a.log
drwxrwxr-x 7 1000 kl 4096 May 11 22:07 keepalived-1.2.7
-rw-r--r-- 1 root root 0 Jul 11 10:06 oldboy.log
drwxr-xr-x 3 root root 4096 Jul 5 20:58 server
drwxr-xr-x 4 root root 4096 May 11 22:07 tools
[root@oldboy ~]#cp oldboy.log /mnt
[root@oldboy ~]#ll /mnt
total 112
-rwxr-xr-x 1 nfsnobody nfsnobody 48568 Jul 20 20:50 cat
drwxrwxrwx 5 nfsnobody nfsnobody 4096 Apr 13 00:00 data
-rw-r--r-- 1 nfsnobody nfsnobody 0 Jul 20 22:41 oldboy.log
-rwsr-xr-x 1 root root 57440 Jul 20 20:59 rm
[root@oldboy ~]#time for ((i=1;i<50000;i++));do cat /mnt/oldboy.log >/dev/null;done
real 0m54.298s
user 0m3.931s
sys 0m7.291s
b. Kernel parameter tuning (persistent, via /etc/sysctl.conf)
[root@oldboy ~]#cat >>/etc/sysctl.conf<<EOF
> net.core.wmem_default = 8388608
> net.core.rmem_default = 8388608
> net.core.rmem_max = 16777216
> net.core.wmem_max = 16777216
> EOF
[root@oldboy ~]#tail /etc/sysctl.conf
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
[root@oldboy ~]#sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
[root@oldboy ~]#time for ((i=1;i<50000;i++));do cat /mnt/oldboy.log >/dev/null;done
real 0m51.015s
user 0m3.461s
sys 0m7.207s
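Appending to /etc/sysctl.conf with cat >> adds a second copy of every key each time the step is repeated. The sketch below shows one possible way to make the change idempotent, assuming keys are written as "key = value" at the start of a line; the function name set_sysctl_conf is hypothetical. It removes any existing assignment of the key and appends the new one.

```shell
# set_sysctl_conf FILE KEY VALUE
# Ensure FILE contains exactly one "KEY = VALUE" line.
# The body runs in a subshell so its variables stay local.
set_sysctl_conf() (
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || exit 1
    # Copy every line except existing assignments of KEY, then append.
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)
        split(line, kv, /[ \t]*=/)
        if (kv[1] != k) print
    }' "$file" > "$tmp" &&
        printf '%s = %s\n' "$key" "$value" >> "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

A cautious way to use it is to run it against a copy of /etc/sysctl.conf first, for example `set_sysctl_conf ./sysctl.conf.new net.core.rmem_max 16777216`, review the result, and only then install the file and load it with sysctl -p as shown above.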