GRPC Health Checking Protocol Unavailable 14
https://github.com/grpc/grpc/blob/master/doc/health-checking.md
GRPC Health Checking Protocol
Health checks are used to probe whether the server is able to handle rpcs. The client-to-server health checking can happen from point to point or via some control system. A server may choose to reply “unhealthy” because it is not ready to take requests, it is shutting down or some other reason. The client can act accordingly if the response is not received within some time window or the response says unhealthy in it.
A GRPC service is used as the health checking mechanism for both simple client-to-server scenario and other control systems such as load-balancing. Being a high level service provides some benefits. Firstly, since it is a GRPC service itself, doing a health check is in the same format as a normal rpc. Secondly, it has rich semantics such as per-service health status. Thirdly, as a GRPC service, it is able reuse all the existing billing, quota infrastructure, etc, and thus the server has full control over the access of the health checking service.
Service Definition
The server should export a service defined in the following proto:
syntax = "proto3";
package grpc.health.v1;
message HealthCheckRequest {
string service = 1;
}
message HealthCheckResponse {
enum ServingStatus {
UNKNOWN = 0;
SERVING = 1;
NOT_SERVING = 2;
SERVICE_UNKNOWN = 3; // Used only by the Watch method.
}
ServingStatus status = 1;
}
service Health {
rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}
A client can query the server’s health status by calling the Check
method, and a deadline should be set on the rpc. The client can optionally set the service name it wants to query for health status. The suggested format of service name is package_names.ServiceName
, such as grpc.health.v1.Health
.
The server should register all the services manually and set the individual status, including an empty service name and its status. For each request received, if the service name can be found in the registry, a response must be sent back with an OK
status and the status field should be set to SERVING
or NOT_SERVING
accordingly. If the service name is not registered, the server returns a NOT_FOUND
GRPC status.
The server should use an empty string as the key for server's overall health status, so that a client not interested in a specific service can query the server's status with an empty request. The server can just do exact matching of the service name without support of any kind of wildcard matching. However, the service owner has the freedom to implement more complicated matching semantics that both the client and server agree upon.
A client can declare the server as unhealthy if the rpc is not finished after some amount of time. The client should be able to handle the case where server does not have the Health service.
A client can call the Watch
method to perform a streaming health-check. The server will immediately send back a message indicating the current serving status. It will then subsequently send a new message whenever the service's serving status changes.
grpc/grpc-go: The Go language implementation of gRPC. HTTP/2 based RPC https://github.com/grpc/grpc-go#the-rpc-failed-with-error-code--unavailable-desc--transport-is-closing
The RPC failed with error "code = Unavailable desc = transport is closing"
This error means the connection the RPC is using was closed, and there are many possible reasons, including:
- mis-configured transport credentials, connection failed on handshaking
- bytes disrupted, possibly by a proxy in between
- server shutdown
- Keepalive parameters caused connection shutdown, for example if you have configured your server to terminate connections regularly to trigger DNS lookups. If this is the case, you may want to increase your MaxConnectionAgeGrace, to allow longer RPC calls to finish.
It can be tricky to debug this because the error happens on the client side but the root cause of the connection being closed is on the server side. Turn on logging on both client and server, and see if there are any transport errors.
type ServerParameters ¶
type ServerParameters struct {
// MaxConnectionIdle is a duration for the amount of time after which an
// idle connection would be closed by sending a GoAway. Idleness duration is
// defined since the most recent time the number of outstanding RPCs became
// zero or the connection establishment.
MaxConnectionIdle time.Duration // The current default value is infinity.
// MaxConnectionAge is a duration for the maximum amount of time a
// connection may exist before it will be closed by sending a GoAway. A
// random jitter of +/-10% will be added to MaxConnectionAge to spread out
// connection storms.
MaxConnectionAge time.Duration // The current default value is infinity.
// MaxConnectionAgeGrace is an additive period after MaxConnectionAge after
// which the connection will be forcibly closed.
MaxConnectionAgeGrace time.Duration // The current default value is infinity.
// After a duration of this time if the server doesn't see any activity it
// pings the client to see if the transport is still alive.
// If set below 1s, a minimum value of 1s will be used instead.
Time time.Duration // The current default value is 2 hours.
// After having pinged for keepalive check, the server waits for a duration
// of Timeout and if no activity is seen even after that the connection is
// closed.
Timeout time.Duration // The current default value is 20 seconds.
}
ServerParameters is used to set keepalive and max-age parameters on the server-side.
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· winform 绘制太阳,地球,月球 运作规律
· 震惊!C++程序真的从main开始吗?99%的程序员都答错了
· AI与.NET技术实操系列(五):向量存储与相似性搜索在 .NET 中的实现
· 【硬核科普】Trae如何「偷看」你的代码?零基础破解AI编程运行原理
· 超详细:普通电脑也行Windows部署deepseek R1训练数据并当服务器共享给他人
2019-11-16 Fixed-Length Frames 谈谈网络编程中应用层(基于TCP/UDP)的协议设计
2019-11-16 h256
2019-11-16 Linux性能分析工具与图形化方法
2019-11-16 shared_ptr 引用计数
2018-11-16 upstream模块实现反向代理的功能
2018-11-16 epoll
2018-11-16 在nginx启动后,如果我们要操作nginx,要怎么做呢 别增加无谓的上下文切换 异步非阻塞的方式来处理请求 worker的个数为cpu的核数 红黑树