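Kubernetes Cluster Kernel Parameter Tuning
Network parameters are applied at two levels, Node and Pod. Pod-level settings take effect through an init-container injected when the service is deployed. Node-level settings take effect in the running kernel either through the ops machine-initialization script or by editing the sysctl.conf configuration file and loading it.
1. Tune the server kernel parameters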
#!/bin/bash
# The node bottleneck is conntrack.
# The next 2 settings relax TCP window checking and state tracking.
sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1
sysctl -w net.netfilter.nf_conntrack_tcp_loose=1
# The next 3 settings enlarge the conntrack table and keep operations on it efficient.
sysctl -w net.netfilter.nf_conntrack_max=3200000
sysctl -w net.netfilter.nf_conntrack_buckets=1600512
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
# The pod bottleneck is the TCP stack.
sysctl -w net.ipv4.tcp_timestamps=1   # used together with tcp_tw_reuse
sysctl -w net.ipv4.tcp_tw_reuse=1     # client side only: allows reuse of TIME_WAIT ports (unlike tcp_tw_recycle, it is not affected by inconsistent timestamps behind NAT)
sysctl -w net.ipv4.ip_local_port_range="5120 65000"   # local port range
sysctl -w net.ipv4.tcp_fin_timeout=30   # shorten the FIN_WAIT_2 timeout so orphaned connections are released sooner
# The next 3 settings strengthen the handshake queues.
sysctl -w net.ipv4.tcp_max_syn_backlog=10240
sysctl -w net.core.somaxconn=10240
sysctl -w net.ipv4.tcp_syncookies=1
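Note: the script above changes the kernel parameters only temporarily, and the changes are lost when the server reboots. To make them permanent, edit the /etc/sysctl.conf configuration file. In addition, setting net.netfilter.nf_conntrack_buckets directly with sysctl can fail with an error (the sysctl is read-only on some kernels); in that case write the hash size through the module parameter instead: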
echo 1600512 > /sys/module/nf_conntrack/parameters/hashsize
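Since the node-level bottleneck is conntrack, it can help to check how full the table is before and after raising nf_conntrack_max. The following is a minimal sketch, assuming the standard /proc/sys/net/netfilter/nf_conntrack_count and nf_conntrack_max files are present (they only exist once the nf_conntrack module is loaded). The function names are hypothetical.

```shell
#!/bin/bash
# conntrack_usage_pct COUNT MAX
# Prints the integer percentage of the conntrack table that is in use.
conntrack_usage_pct() {
    local count=$1 max=$2
    if [ "$max" -le 0 ]; then
        echo "max must be positive" >&2
        return 1
    fi
    echo $(( count * 100 / max ))
}

# conntrack_report
# Reads the live counters and prints one summary line.
# Prints nothing if the conntrack files are not available.
conntrack_report() {
    local dir=/proc/sys/net/netfilter
    local count max
    [ -r "$dir/nf_conntrack_count" ] || return 0
    [ -r "$dir/nf_conntrack_max" ] || return 0
    count=$(cat "$dir/nf_conntrack_count")
    max=$(cat "$dir/nf_conntrack_max")
    echo "conntrack: $count/$max ($(conntrack_usage_pct "$count" "$max")%)"
}
```

Running conntrack_report on a node prints the current entry count against the configured maximum; a value that stays close to 100% suggests the table is still too small for the workload.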
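Script for tuning the kernel on all cluster nodes in bulk (this requires a control node that can SSH to the other nodes without an interactive login, and the sysctl.conf kernel configuration file must be in the same directory as the batch script):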
#!/bin/bash
# $1 is a host-list file; each line is "ip password username".
for row in $(awk '{printf("%s:%s:%s\n", $1, $2, $3)}' "$1")
do
    ip=$(echo "${row}" | awk -F ':' '{print $1}')
    passwd=$(echo "${row}" | awk -F ':' '{print $2}')
    username=$(echo "${row}" | awk -F ':' '{print $3}')   # parsed but not used below; ssh connects as root
    echo "$ip"
    # Copy sysctl.conf to the node, answering the host-key and password prompts.
/usr/bin/expect <<-EOF
spawn scp sysctl.conf $ip:/etc/sysctl.conf
expect {
"yes/no" { send "yes\r"; exp_continue }
"password: " { send "$passwd\r"; exp_continue }
}
EOF
    # Log in and reload the configuration.
/usr/bin/expect <<-EOF
spawn ssh root@$ip
expect "*#*"
send "sysctl -p\r"
expect "*root*]#*"
exit
EOF
done
sysctl.conf
net.ipv4.ip_forward = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
# tcp_tw_recycle was removed in Linux 4.12; drop this line on newer kernels
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_fin_timeout = 30
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
net.core.somaxconn = 10240
net.core.netdev_max_backlog = 65535
net.netfilter.nf_conntrack_tcp_be_liberal = 1
net.netfilter.nf_conntrack_tcp_loose = 1
net.netfilter.nf_conntrack_max = 3200000
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
net.ipv4.tcp_timestamps = 1
net.ipv4.ip_local_port_range = 5120 65000
net.ipv4.tcp_max_syn_backlog = 10240
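When sysctl -p loads a file it applies the lines in order, so if a key appears twice the later value is the one that stays in effect. A quick check for repeated keys can catch this before the file is pushed to every node. The sketch below is only an illustration; the function name find_duplicate_sysctl_keys is hypothetical.

```shell
#!/bin/bash
# find_duplicate_sysctl_keys FILE
# Prints every key that is assigned more than once in FILE, one per line.
# Blank lines and comment lines (starting with # or ;) are skipped.
find_duplicate_sysctl_keys() {
    awk -F '=' '
        /^[ \t]*([#;]|$)/ { next }           # skip comments and blank lines
        {
            key = $1
            gsub(/[ \t]/, "", key)            # "a.b = 1" and "a.b=1" name the same key
            if (++seen[key] == 2) print key   # report each key once, on its second occurrence
        }
    ' "$1"
}
```

For example, find_duplicate_sysctl_keys sysctl.conf prints nothing for a clean file and lists the offending keys otherwise.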
In addition, net.netfilter.nf_conntrack_buckets has to be changed on all nodes in bulk:
#!/bin/bash
# $1 is a host-list file; each line is "ip password username".
for row in $(awk '{printf("%s:%s:%s\n", $1, $2, $3)}' "$1")
do
    ip=$(echo "${row}" | awk -F ':' '{print $1}')
    passwd=$(echo "${row}" | awk -F ':' '{print $2}')
    username=$(echo "${row}" | awk -F ':' '{print $3}')   # parsed but not used below; ssh connects as root
    echo "$ip"
    # Set the conntrack hash size, then print the resulting bucket count to verify it.
/usr/bin/expect <<-EOF
spawn ssh root@$ip
expect "*#*"
send "echo 1600512 > /sys/module/nf_conntrack/parameters/hashsize\r"
expect "*root*]#*"
send "sysctl net.netfilter.nf_conntrack_buckets\r"
expect "*root*]#*"
exit
EOF
done
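If the control node really does have key-based (passwordless) SSH access to every node, the expect wrapper is not strictly needed. The following is a minimal sketch under that assumption. The host-list format (IP in the first column) matches the scripts above; the function names, the DRY_RUN switch, and the choice of the BatchMode and ConnectTimeout options are assumptions made for illustration.

```shell
#!/bin/bash
# push_sysctl HOSTS_FILE
# For each IP in the first column of HOSTS_FILE: copy ./sysctl.conf to the
# node, reload it, and set the conntrack hash size.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}
HASHSIZE=1600512
SSH_OPTS="-o BatchMode=yes -o ConnectTimeout=5"

# run CMD ARGS...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

push_sysctl() {
    local ip
    # awk skips blank lines and comment lines in the host list.
    for ip in $(awk 'NF && $1 !~ /^#/ {print $1}' "$1"); do
        # SSH_OPTS is left unquoted on purpose so it splits into separate options.
        run scp $SSH_OPTS sysctl.conf "root@$ip:/etc/sysctl.conf" || continue
        run ssh $SSH_OPTS "root@$ip" \
            "sysctl -p; echo $HASHSIZE > /sys/module/nf_conntrack/parameters/hashsize"
    done
}
```

Calling push_sysctl hosts.txt first with the default DRY_RUN=1 shows exactly what would be executed; setting DRY_RUN=0 runs the commands. BatchMode=yes makes ssh fail immediately instead of prompting for a password, so a node without key access is skipped rather than blocking the loop.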
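2. Disable transparent huge pages after each node boots; otherwise application performance suffers (this comes from operations experience running PHP on VMs):
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag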
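The kernel reports the active transparent huge page mode by wrapping it in square brackets, for example "always madvise [never]". Because the two echo commands do not survive a reboot, a check that can run at boot time is handy. The following is a minimal sketch; the function names are hypothetical, and how it is wired into a boot script or service unit is left open.

```shell
#!/bin/bash
# thp_active_mode LINE
# Extracts the bracketed (active) value from a THP sysfs line.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z+_]*\)\].*/\1/p'
}

# thp_is_disabled
# Succeeds only if both "enabled" and "defrag" are currently set to never.
thp_is_disabled() {
    local base=/sys/kernel/mm/transparent_hugepage
    local f
    for f in enabled defrag; do
        [ -r "$base/$f" ] || return 1
        [ "$(thp_active_mode "$(cat "$base/$f")")" = "never" ] || return 1
    done
}
```

A boot-time hook could then run something like: thp_is_disabled || echo "THP is still enabled on $(hostname)".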
3. Raise the semaphore limits in the init-container (mainly for use by php cat and nodejs cat):
sysctl -w kernel.sem="1034 32000 100 1000";
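The four numbers in kernel.sem are, in order, SEMMSL (maximum semaphores per set), SEMMNS (maximum semaphores system-wide), SEMOPM (maximum operations per semop call), and SEMMNI (maximum number of semaphore sets). A small helper that labels the fields makes a value like the one above easier to review. This is a sketch only, and the function name describe_sem is hypothetical.

```shell
#!/bin/bash
# describe_sem "SEMMSL SEMMNS SEMOPM SEMMNI"
# Prints the four kernel.sem fields with their names.
describe_sem() {
    # Word-split the single argument into four positional fields
    # (works for both space- and tab-separated input).
    set -- $1
    if [ "$#" -ne 4 ]; then
        echo "expected 4 fields, got $#" >&2
        return 1
    fi
    echo "SEMMSL=$1 SEMMNS=$2 SEMOPM=$3 SEMMNI=$4"
}
```

Inside the init-container, describe_sem "$(sysctl -n kernel.sem)" can be run after the sysctl -w call to confirm that the new limits are in place.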
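Because services call each other over short-lived connections, the kernel processing capacity of a single node is limited, so very high hardware specifications are not a good fit. A configuration of 16 cores and 64 GB of memory is recommended.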