Repost: Linux Kernel Parameter Tuning for Low-Latency Systems
To get good performance out of Easegress, some kernel parameters may need tuning. Here is a possible configuration for /etc/sysctl.conf and /etc/security/limits.conf.
/etc/security/limits.conf
The values need to be aligned with actual memory.
* soft nproc 2067554
* hard nproc 2067554
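As a quick check, the limits a process actually received can be read back from inside the process. The sketch below assumes a Linux host and uses Python's standard `resource` module; `describe_limit` is a helper name invented here for illustration. Entries in limits.conf are applied by pam_limits when a session starts, so a new login is usually needed before changed values show up.

```python
import resource


def describe_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# RLIMIT_NPROC is the per-user process/thread limit that the nproc entries set.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print(f"nproc soft={describe_limit(soft)} hard={describe_limit(hard)}")
```

If the printed values differ from the ones in limits.conf, the session was probably started before the change or did not go through PAM.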
/etc/sysctl.conf
The values need to be aligned with actual memory, CPU, and network.
-
Increase range of ephemeral ports that can be used
net.ipv4.ip_local_port_range = 1024 65535
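To see what the wider range buys, the number of usable ephemeral ports can be computed from the two bounds. This is a small arithmetic sketch; `ephemeral_port_count` is a hypothetical helper, and the comparison range `32768 60999` is the usual Linux default, given here as an assumption about the host.

```python
def ephemeral_port_count(range_text):
    """Count ports in a 'low high' range as written in ip_local_port_range."""
    low, high = (int(part) for part in range_text.split())
    if low > high:
        raise ValueError("lower bound exceeds upper bound")
    return high - low + 1  # both bounds are inclusive


print(ephemeral_port_count("32768 60999"))  # 28232
print(ephemeral_port_count("1024 65535"))   # 64512
```

The count bounds the number of concurrent outbound connections from one local address to a single remote address and port, which matters most for proxies that open many upstream connections.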
-
Increase the system-wide maximum number of open file handles
fs.file-max = 150000
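Whether the limit is close to being hit can be judged from /proc/sys/fs/file-nr, which holds three numbers: allocated handles, allocated but unused handles, and the maximum. The sketch below parses a sample line rather than the live file so that it runs anywhere; the sample values and the helper names are made up for illustration.

```python
def parse_file_nr(text):
    """Split the three whitespace-separated fields of /proc/sys/fs/file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def usage_percent(stats):
    """Share of the system-wide file handle limit currently allocated."""
    return 100.0 * stats["allocated"] / stats["max"]


# Hypothetical snapshot; on a real host read /proc/sys/fs/file-nr instead.
sample = "7500\t0\t150000\n"
stats = parse_file_nr(sample)
print(stats["max"])          # 150000
print(usage_percent(stats))  # 5.0
```

Note that fs.file-max is a system-wide ceiling; the per-process limit is the separate nofile setting in limits.conf.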
-
Enable TCP window scaling (enabled by default). Refer to Wikipedia: TCP_window_scale_option
net.ipv4.tcp_window_scaling = 1
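The effect of window scaling is easiest to see as arithmetic. The TCP header carries a 16-bit window, and the scale option shifts it left by up to 14 bits. The sketch below computes the largest advertisable window and the throughput ceiling it implies for a given round-trip time; the 100 ms RTT is an arbitrary example value, and the helper names are hypothetical.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit window field
MAX_SCALE_SHIFT = 14         # largest shift the window scale option allows


def max_window_bytes(scale_shift):
    """Largest receive window that can be advertised with a given shift."""
    if not 0 <= scale_shift <= MAX_SCALE_SHIFT:
        raise ValueError("scale shift must be between 0 and 14")
    return MAX_UNSCALED_WINDOW << scale_shift


def max_throughput_mbit(window_bytes, rtt_seconds):
    """Upper bound on throughput: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


print(max_window_bytes(0))   # 65535
print(max_window_bytes(14))  # 1073725440
print(round(max_throughput_mbit(max_window_bytes(0), 0.1), 2))  # 5.24
```

Without scaling, a single connection over a 100 ms path tops out at roughly 5 Mbit/s regardless of link speed, which is why the option is left enabled.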
-
Turn off SYN-flood protections
net.ipv4.tcp_syncookies = 0
-
Increase the number of packets that can be queued on the input side when a network interface receives packets faster than the kernel can process them
net.core.netdev_max_backlog = 3240000
-
Max number of "backlogged sockets" (connection requests that can be queued for any given listening socket)
net.core.somaxconn = 65535
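Raising somaxconn only helps if the application also asks for a large backlog, because listen(2) silently caps the requested value at net.core.somaxconn. The sketch below shows that relationship and opens a throwaway loopback listener with a large backlog; `effective_backlog` is a hypothetical helper, and 4096 is just an example of a lower system setting.

```python
import socket


def effective_backlog(requested, somaxconn):
    """The accept queue length the kernel will actually use."""
    return min(requested, somaxconn)


print(effective_backlog(65535, 4096))   # 4096
print(effective_backlog(65535, 65535))  # 65535

# Request a large accept queue on a temporary loopback listener.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
    server.listen(65535)           # capped by somaxconn if that is lower
    print("listening on port", server.getsockname()[1])
```

Many servers default to a small backlog, so the service's own listen backlog setting is worth checking alongside this sysctl.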
-
TCP memory tuning. Refer to Linux Administration
# Increase the default socket receive buffer size (rmem_default) and send buffer size (wmem_default)
# *** Maybe recommended only for high-RAM servers? ***
net.core.rmem_default=16777216
net.core.wmem_default=16777216
# Increase the max ancillary buffer size per socket (optmem_max), max socket receive buffer size (rmem_max), and max socket send buffer size (wmem_max)
# 16MB per socket - which sounds like a lot, but will virtually never consume that much
# rmem_max and wmem_max cap what an application can request with SO_RCVBUF and SO_SNDBUF;
# the buffer sizes that TCP autotuning picks are bounded by tcp_rmem and tcp_wmem
net.core.optmem_max=16777216
net.core.rmem_max=16777216
net.core.wmem_max=16777216
# Configure the Min, Pressure, Max values of tcp_mem (units are pages, not bytes)
# and the Min, Default, Max values of tcp_wmem and tcp_rmem (units are bytes)
# Useful mostly for very high-traffic websites that have a lot of RAM
# The *_max values above are already 16777216, so you may choose to
# comment out these three lines and keep the kernel defaults
net.ipv4.tcp_mem=16777216 16777216 16777216
net.ipv4.tcp_wmem=4096 87380 16777216
net.ipv4.tcp_rmem=4096 87380 16777216
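Because tcp_mem is counted in pages while tcp_rmem and tcp_wmem are counted in bytes, the same number means very different things in the two places. The sketch below converts both, assuming a 4 KiB page size (typical on x86-64, but an assumption here). `pages_for_fraction` is a hypothetical helper showing one way to derive a tcp_mem value from installed RAM; the 25% share is illustrative, not a recommendation from the original document.

```python
def pages_to_gib(pages, page_size=4096):
    """Convert a tcp_mem value (pages) to GiB for a given page size."""
    return pages * page_size / 2**30


def bytes_to_mib(size_bytes):
    """Convert a tcp_rmem or tcp_wmem value (bytes) to MiB."""
    return size_bytes / 2**20


def pages_for_fraction(total_ram_bytes, fraction, page_size=4096):
    """Pages corresponding to a share of RAM (hypothetical sizing helper)."""
    return int(total_ram_bytes * fraction) // page_size


print(pages_to_gib(16777216))                # 64.0
print(bytes_to_mib(16777216))                # 16.0
print(pages_for_fraction(16 * 2**30, 0.25))  # 1048576
```

With 4 KiB pages, a tcp_mem limit of 16777216 pages allows 64 GiB of TCP buffer memory, which is effectively no limit on most machines. This is why the values need to be aligned with the memory actually installed.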
-
Maximum number of connection requests that have been received but not yet acknowledged by the client (half-open connections) to remember before the kernel starts dropping them
net.ipv4.tcp_max_syn_backlog = 3240000
-
Only retry creating TCP connections 3 times. Minimize the time it takes for a connection attempt to fail
net.ipv4.tcp_syn_retries=3
net.ipv4.tcp_synack_retries=3
net.ipv4.tcp_orphan_retries=3
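How long a connection attempt takes to fail follows from the retry count. The sketch below assumes an initial retransmission timeout of 1 second that doubles after every unanswered SYN, which matches current Linux behaviour as far as the kernel documentation describes it; `syn_timeout_seconds` is a hypothetical helper.

```python
def syn_timeout_seconds(retries, initial_rto=1.0):
    """Time until connect() gives up: the first SYN plus `retries` resends,
    each waiting twice as long as the previous one."""
    return sum(initial_rto * 2**attempt for attempt in range(retries + 1))


print(syn_timeout_seconds(6))  # 127.0
print(syn_timeout_seconds(3))  # 15.0
```

Under these assumptions, a retry count of 6 gives up after about 127 seconds, while 3 retries gives up after about 15 seconds.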
-
How many retries TCP makes on data segments (default 15). Some guides suggest reducing this value
net.ipv4.tcp_retries2 = 8
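The practical meaning of tcp_retries2 is the time an established connection keeps retransmitting before it is declared dead. The sketch below estimates that time under the assumption that the retransmission timeout starts at the 200 ms minimum, doubles on every retry, and is capped at 120 seconds. Real connections start from a measured RTO, so treat the result as a lower bound; `retries2_timeout_seconds` is a hypothetical helper.

```python
def retries2_timeout_seconds(retries, rto_min=0.2, rto_max=120.0):
    """Hypothetical total wait before an established connection is dropped."""
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):      # first timeout plus one per retry
        total += rto
        rto = min(rto * 2, rto_max)   # exponential backoff with a ceiling
    return total


print(round(retries2_timeout_seconds(15), 1))  # 924.6
print(round(retries2_timeout_seconds(8), 1))   # 102.2
```

The figure for 15 retries agrees with the roughly 924.6 seconds quoted in the kernel's ip-sysctl documentation, and 8 retries brings the estimate down to a little over 100 seconds.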
-
Increase max number of sockets allowed in TIME_WAIT
net.ipv4.tcp_max_tw_buckets = 1440000
-
Keepalive Optimizations. By default, the keepalive routines wait for two hours (7200 secs) before sending the first keepalive probe, and then resend it every 75 seconds. If no ACK response is received for 9 consecutive times, the connection is marked as broken.
# We would decrease the default values for the tcp_keepalive_* params as follows.
# Defaults: tcp_keepalive_time 7200, tcp_keepalive_intvl 75, tcp_keepalive_probes 9
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 9
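The time needed to notice a dead peer is the idle time plus one interval per unanswered probe. The sketch below works that out for the default and the tuned values. It also shows the per-socket equivalents, since the sysctl values only apply to sockets that enable SO_KEEPALIVE. It assumes Linux, where Python exposes `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT`; the helper names are hypothetical.

```python
import socket


def dead_peer_detection_seconds(idle, interval, probes):
    """Idle time before the first probe plus the time spent probing."""
    return idle + interval * probes


print(dead_peer_detection_seconds(7200, 75, 9))  # 7875
print(dead_peer_detection_seconds(600, 10, 9))   # 690


def enable_keepalive(sock, idle, interval, probes):
    """Turn keepalive on for one socket and override the system defaults."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    enable_keepalive(sock, idle=600, interval=10, probes=9)
    print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE))  # 600
```

With the tuned values a broken connection is detected after about 11.5 minutes instead of more than two hours.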
-
The TCP FIN timeout specifies how long an orphaned connection (one no longer referenced by any application) stays in the FIN_WAIT_2 state before the kernel aborts it. The default is often 60 seconds, but can normally be safely reduced to 30 or even 15 seconds; the value used here, 7 seconds, is more aggressive still
https://www.linode.com/docs/web-servers/nginx/configure-nginx-for-optimized-performance
net.ipv4.tcp_fin_timeout = 7
-
Refer to this GitHub post
# Avoid falling back to slow start after a connection goes idle.
net.ipv4.tcp_slow_start_after_idle = 0
# Disable caching of TCP congestion state
net.ipv4.tcp_no_metrics_save = 1
-
If the listening service is too slow to accept new connections, reset them. The default is off (0), which means that if the overflow was caused by a burst, the connection will recover. Enable this option only if you are really sure that the listening daemon cannot be tuned to accept connections faster. Enabling this option can harm clients of your server. Refer to the GitHub post
net.ipv4.tcp_abort_on_overflow=0
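After editing /etc/sysctl.conf and loading it with `sysctl -p`, it can be useful to confirm that the running kernel holds the intended values. The sketch below is one way to do that, assuming the usual mapping from a sysctl key to a file under /proc/sys (dots become slashes). All function names are hypothetical, and `SAMPLE` contains only a few of the settings listed in this document.

```python
from pathlib import Path


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blanks and '#' or ';' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a key/value line
        settings[key.strip()] = " ".join(value.split())
    return settings


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.core.somaxconn to its /proc/sys file."""
    return Path(root) / key.replace(".", "/")


def read_live_value(key, root="/proc/sys"):
    """Return the running value with whitespace normalized, or None."""
    try:
        return " ".join(sysctl_path(key, root).read_text().split())
    except OSError:
        return None  # key missing on this kernel, or not readable


def find_mismatches(wanted, root="/proc/sys"):
    """Return {key: (wanted, live)} for every setting that differs."""
    mismatches = {}
    for key, value in wanted.items():
        live = read_live_value(key, root)
        if live != value:
            mismatches[key] = (value, live)
    return mismatches


SAMPLE = """
# A few of the settings from this document
net.core.somaxconn = 65535
net.ipv4.tcp_fin_timeout = 7
net.ipv4.ip_local_port_range = 1024 65535
"""

for key, (want, live) in find_mismatches(parse_sysctl_conf(SAMPLE)).items():
    print(f"{key}: wanted {want!r}, live {live!r}")
```

The script only reads values and changes nothing; a key reported with a live value of `None` does not exist on the running kernel or is not readable by the current user.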
This document is excerpted from
https://github.com/haoel/easegress/blob/main/doc/kernel-tuning.md
, written by Chen Hao (陈皓).
It is reposted here for reference in future system tuning.