Nginx Learning (2)

Passive Health Checks

When NGINX considers a server unavailable, it temporarily stops sending requests to the server until it is considered active again. The following parameters to the server directive configure the conditions under which NGINX considers a server unavailable:

  • max_fails – Sets the number of consecutive failed attempts after which NGINX marks the server as unavailable.
  • fail_timeout – Sets the time during which the number of failed attempts specified by the max_fails parameter must happen for the server to be considered unavailable, and also the length of time that NGINX considers the server unavailable after it is marked so.

The default values are 1 attempt and 10 seconds. So if a server fails to accept or respond to a single request, NGINX immediately considers the server unavailable for 10 seconds. The following example shows how to set these parameters:

upstream backend {
    server backend1.example.com;
    server backend2.example.com max_fails=3 fail_timeout=30s;
    server backend3.example.com max_fails=2;
}
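To make the defaults visible, the same group can be written with every value spelled out. This is only a sketch that reuses the host names from the example above; the explicit max_fails=1 and fail_timeout=10s values are the defaults stated in the text.

```nginx
upstream backend {
    # Equivalent to "server backend1.example.com;" with the defaults written out
    server backend1.example.com max_fails=1 fail_timeout=10s;
    # Marked unavailable for 30s after 3 failed attempts within 30s
    server backend2.example.com max_fails=3 fail_timeout=30s;
    # fail_timeout was omitted in the original line, so the 10s default applies
    server backend3.example.com max_fails=2 fail_timeout=10s;
}
```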

Active Health Checks

Server availability can also be monitored actively: NGINX Plus periodically sends special requests to each server and checks for a response that satisfies certain conditions.

To enable this type of health monitoring in your nginx.conf file, include the health_check directive in the location that passes requests to an upstream group. In addition, the upstream group must include the zone directive to define a shared‑memory zone where information about health status is stored:

http {
    upstream backend {
        zone backend 64k;
        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
        server backend4.example.com;
    }
    server {
        location / {
            proxy_pass http://backend;
            health_check;
        }
    }
}

This configuration defines the upstream group backend and a virtual server with a single location that passes all requests (represented by /) to backend.

The zone directive defines a memory zone that is shared among worker processes and is used to store the configuration of the server group. This enables the worker processes to use the same set of counters to keep track of responses from the upstream servers. The zone directive is also required for dynamic configuration of the upstream group.

The health_check directive without any parameters configures health monitoring with the default settings: every 5 seconds NGINX Plus sends a request for / to each server in the backend group. If any communication error or timeout occurs (or a proxied server responds with a status code other than 2xx or 3xx), the health check fails for that server. Any server that fails a health check is considered unhealthy, and NGINX Plus stops sending client requests to it until it once again passes a health check.
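As a rough illustration, the bare directive should behave like the following explicit form. It assumes the documented defaults of the health_check directive (a 5 second interval, 1 failure to mark a server unhealthy, 1 pass to mark it healthy again, and / as the request URI).

```nginx
location / {
    proxy_pass http://backend;
    # Same effect as a bare "health_check;" under the default settings
    health_check interval=5 fails=1 passes=1 uri=/;
}
```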

The default behavior can be overridden using the parameters to the health_check directive. In the example below, the interval parameter sets the time between health checks to 5 seconds (the same as the default, stated explicitly). The fails=3 parameter means a server is considered unhealthy after 3 consecutive failed health checks instead of the default 1. With the passes parameter, a server needs to pass 2 consecutive checks (rather than 1) to be considered healthy again.

To define the URI to request (here, /some/path) instead of the default /, include the uri parameter. The URI is appended to the server’s domain name or IP address as specified by the server directive in the upstream block. For example, for the first server in the backend group configured above, the health check is a request for http://backend1.example.com/some/path.

location / {
    proxy_pass http://backend;
    health_check interval=5 fails=3 passes=2 uri=/some/path;
}

Finally, you can set custom conditions that a response must satisfy for NGINX Plus to consider the server healthy. The conditions are specified in the match block, which is then referenced by the health_check directive’s match parameter.

http {
    # ...
    match server_ok {
        status 200-399;
        body !~ "maintenance mode";
    }
    
    server {
        # ...
        location / {
            proxy_pass http://backend;
            health_check match=server_ok;
        }
    }
}

Here the health check is passed if the status code in the response is in the range from 200 to 399, and the response body does not match the specified regular expression.

The match directive enables NGINX Plus to check the status, header fields, and the body of a response. Using this directive it is possible to verify whether the status code is in a specified range, the response includes a header, or the header or body matches a regular expression (in any combination). The match directive can contain one status condition, one body condition, and multiple header conditions. For the health check to succeed, the response must satisfy all of the conditions specified in the match block.

For example, the following match block requires that responses have status code 200, include the Content-Type header with the exact value text/html, and contain the text “Welcome to nginx!” in the body:

match welcome {
    status 200;
    header Content-Type = text/html;
    body ~ "Welcome to nginx!";
}

In the following example of using the exclamation point (!), the block matches responses where the status code is anything other than 301, 302, 303, and 307, and Refresh is not among the headers.

match not_redirect {
    status ! 301-303 307;
    header ! Refresh;
}
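The conditions can also be combined in one block and tied to a specific health check. The sketch below is illustrative only: the block name api_ok, the X-Maintenance header, and the header and body values are hypothetical and not taken from the original examples.

```nginx
http {
    # ...
    match api_ok {
        status 200-299;                             # the single status condition
        header Content-Type ~ "application/json";   # header value matches a regex
        header ! X-Maintenance;                     # hypothetical header must be absent
        body ~ "ok";                                # the single body condition
    }

    server {
        location / {
            proxy_pass http://backend;
            # All conditions in api_ok must hold for the check to pass
            health_check uri=/some/path match=api_ok;
        }
    }
}
```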

Health checks can also be enabled for the non‑HTTP protocols that NGINX Plus proxies: FastCGI, memcached, SCGI, and uwsgi.

Sharing Data with Multiple Worker Processes

If an upstream block does not include the zone directive, each worker process keeps its own copy of the server group configuration and maintains its own set of related counters. As a result, the server group configuration cannot be modified dynamically.

When the zone directive is included in an upstream block, the configuration of the upstream group is kept in a memory area shared among all worker processes. This scenario is dynamically configurable, because the worker processes access the same copy of the group configuration and utilize the same related counters.

The zone directive is mandatory for active health checks and dynamic reconfiguration of the upstream group. However, other features of upstream groups can benefit from the use of this directive as well.

For example, if the configuration of a group is not shared, each worker process maintains its own counter for failed attempts to pass a request to a server (set by the max_fails parameter). In this case, each request reaches only one worker process. When the worker process that is selected to process a request fails to transmit the request to a server, the other worker processes know nothing about it. While one worker process may consider a server unavailable, others might still send requests to it. For a server to be definitively considered unavailable, the number of failed attempts during the timeframe set by the fail_timeout parameter must equal max_fails multiplied by the number of worker processes. With the zone directive, by contrast, the counters are shared and the behavior is as expected.
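A small worked case may help. The sketch below assumes 4 worker processes and max_fails=3, both chosen purely for illustration; the events block and the rest of the configuration are omitted.

```nginx
worker_processes 4;    # assumed value for illustration

http {
    upstream backend {
        # Without the zone line, each of the 4 workers counts failures on its
        # own, so up to 3 x 4 = 12 failed attempts within 30s may be needed
        # before every worker stops sending requests to a failing server.
        # With it, the workers share one counter and 3 failures are enough.
        zone backend 64k;
        server backend1.example.com max_fails=3 fail_timeout=30s;
        server backend2.example.com max_fails=3 fail_timeout=30s;
    }
}
```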

Similarly, the Least Connections load‑balancing method might not work as expected without the zone directive, at least under low load. This method passes a request to the server with the smallest number of active connections. If the configuration of the group is not shared, each worker process uses its own counter for the number of connections and might send a request to the same server that another worker process just sent a request to. However, you can increase the number of requests to reduce this effect. Under high load, requests are distributed evenly among worker processes, and the Least Connections method works as expected.
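A minimal sketch of a Least Connections group that shares its connection counters could look like this; the zone name and size simply follow the earlier examples and are not a sizing recommendation.

```nginx
upstream backend {
    zone backend 64k;   # active-connection counters are shared by all workers
    least_conn;
    server backend1.example.com;
    server backend2.example.com;
}
```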

Setting the Zone Size

It is not possible to recommend an ideal memory‑zone size, because usage patterns vary widely. The required amount of memory is determined by which features (such as session persistence, health checks, or DNS re‑resolving) are enabled and how the upstream servers are identified.

Configuring HTTP Load Balancing Using DNS

The configuration of a server group can be modified at runtime using DNS.

For servers in an upstream group that are identified with a domain name in the server directive, NGINX Plus can monitor changes to the list of IP addresses in the corresponding DNS record, and automatically apply the changes to load balancing for the upstream group, without requiring a restart. This can be done by including the resolver directive in the http block along with the resolve parameter to the server directive:

http {
    resolver 10.0.0.1 valid=300s ipv6=off;
    resolver_timeout 10s;
    server {
        location / {
            proxy_pass http://backend;
        }
    }
    upstream backend {
        zone backend 32k;
        least_conn;
        # ...
        server backend1.example.com resolve;
        server backend2.example.com resolve;
    }
}

In the example, the resolve parameter to the server directive tells NGINX Plus to periodically re‑resolve the backend1.example.com and backend2.example.com domain names into IP addresses.

The resolver directive defines the IP address of the DNS server to which NGINX Plus sends requests (here, 10.0.0.1). By default, NGINX Plus re‑resolves DNS records at the frequency specified by time‑to‑live (TTL) in the record, but you can override the TTL value with the valid parameter; in the example it is 300 seconds, or 5 minutes.

The optional ipv6=off parameter means only IPv4 addresses are used for load balancing, though resolving of both IPv4 and IPv6 addresses is supported by default.
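For comparison, here is a sketch of the same setup with both optional resolver parameters left out, so that the default behavior described above applies. The resolver address and host names are the ones from the example.

```nginx
http {
    # No valid= parameter: the TTL of each DNS record decides when to re-resolve.
    # No ipv6=off: both IPv4 and IPv6 addresses are used for load balancing.
    resolver 10.0.0.1;

    upstream backend {
        zone backend 32k;
        server backend1.example.com resolve;
        server backend2.example.com resolve;
    }
}
```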

If a domain name resolves to several IP addresses, the addresses are saved to the upstream configuration and load balanced. In our example, the servers are load balanced according to the Least Connections load‑balancing method. If the list of IP addresses for a server has changed, NGINX Plus immediately starts load balancing across the new set of addresses.

 

posted @ 2018-06-17 12:47  geeklove