Don’t use Go’s default HTTP client (in production)

Writing Go programs that talk to services over HTTP is easy and fun. I’ve written numerous API client packages and I find it an enjoyable task. However, I have run into a pitfall that is easy to fall into and can crash your program very quickly: the default HTTP client.


TL;DR: Go’s http package doesn’t specify request timeouts by default, allowing services to hijack your goroutines. Always specify a custom http.Client when connecting to outside services.


The Problem by Example

Let’s say you want to talk to spacely-sprockets.com via their nice JSON REST API and view a list of available sprockets. In Go, you might do something like:

// error checking omitted for brevity
var sprockets SprocketsResponse
response, _ := http.Get("https://spacely-sprockets.com/api/sprockets")
buf, _ := ioutil.ReadAll(response.Body)
json.Unmarshal(buf, &sprockets)

You write your code (with proper error handling, please), compile, and run. Everything works great. Now, you take your API package and plug it into a web application. One page of your app shows a list of Spacely Sprockets inventory to users by making a call to the API.
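For reference, a version of the snippet above with the error handling filled in might look roughly like the sketch below. The fields of SprocketsResponse, the fetchSprockets helper, and the fake local server standing in for the real API are all assumptions made for illustration. Note that it still uses the default client, just as in the story.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// SprocketsResponse is a hypothetical shape for the API payload.
type SprocketsResponse struct {
	Sprockets []string `json:"sprockets"`
}

// fetchSprockets is a hypothetical helper that calls the API at url and
// decodes the JSON body, returning an error for each step that can fail.
func fetchSprockets(url string) (SprocketsResponse, error) {
	var sprockets SprocketsResponse

	response, err := http.Get(url) // still the default client: no timeout
	if err != nil {
		return sprockets, fmt.Errorf("requesting sprockets: %w", err)
	}
	defer response.Body.Close()

	if response.StatusCode != http.StatusOK {
		return sprockets, fmt.Errorf("unexpected status: %s", response.Status)
	}

	buf, err := io.ReadAll(response.Body)
	if err != nil {
		return sprockets, fmt.Errorf("reading body: %w", err)
	}
	if err := json.Unmarshal(buf, &sprockets); err != nil {
		return sprockets, fmt.Errorf("decoding body: %w", err)
	}
	return sprockets, nil
}

// fakeSprocketsServer stands in for the real API so the sketch runs offline.
func fakeSprocketsServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"sprockets":["small","large"]}`)
	}))
}

func main() {
	svr := fakeSprocketsServer()
	defer svr.Close()

	sprockets, err := fetchSprockets(svr.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sprockets.Sprockets)
}
```

Every failure that produces an error is handled here, which is exactly why the outage in the story is so puzzling: a request that never returns produces no error at all.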

Everything is going great until one day your app stops responding. You look in the logs, but there’s nothing to indicate a problem. You check your monitoring tools, but CPU, memory, and I/O all look reasonable leading up to the outage. You spin up a sandbox and it seems to work fine. What gives?

Frustrated, you check Twitter and notice a tweet from the Spacely Sprockets dev team saying that they experienced a brief outage, but that everything is now back to normal. You check their API status page and see that the outage began a few minutes before yours. That seems like an unlikely coincidence, but you can’t quite figure out how it’s related, since your API code handles errors gracefully. You’re still no closer to figuring out the issue.


The Go HTTP package

Go’s HTTP package uses a struct called Client to manage the internals of communicating over HTTP(S). Clients are concurrency-safe objects that contain configuration, manage TCP state, handle cookies, etc. When you use http.Get(url), you are using the http.DefaultClient, a package variable that defines the default configuration for a client. The declaration for this is:

var DefaultClient = &Client{}

Among other things, http.Client configures a timeout that short-circuits long-running connections. The default for this value is 0, which is interpreted as “no timeout”. This may be a sensible default for the package, but it is a nasty pitfall and the cause of our application falling over in the above example. As it turns out, Spacely Sprockets’ API outage caused connection attempts to hang (this doesn’t always happen, but it does in our example). They will continue to hang for as long as the malfunctioning server decides to wait. Because API calls were being made to serve user requests, this caused the goroutines serving user requests to hang as well. Once enough users hit the sprockets page, the app fell over, most likely due to resource limits being reached.

Here is a simple Go program that demonstrates the issue:

package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

func main() {
	// A test server whose handler sleeps for an hour before responding.
	svr := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(time.Hour)
	}))
	defer svr.Close()

	fmt.Println("making request")
	http.Get(svr.URL) // default client: blocks until the server answers
	fmt.Println("finished request")
}

When run, this program will make a request to a server that will sleep for an hour. Consequently, the program will wait for one hour and then exit.
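To make the "goroutines pile up" effect more concrete, here is a rough sketch that fires several concurrent requests at a server that refuses to answer and counts how many are still blocked after a short wait. The stuckRequests helper and the release channel are inventions for this illustration; the channel exists only so the demo can shut down cleanly instead of waiting an hour.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// stuckRequests (a hypothetical helper) starts n concurrent requests with the
// default client against a server that does not answer until released, waits
// for the given duration, and reports how many requests are still blocked.
func stuckRequests(n int, wait time.Duration) int {
	release := make(chan struct{})
	svr := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		<-release // simulate an upstream that accepts the request but never answers
	}))
	defer svr.Close()

	var finished int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := http.Get(svr.URL) // default client: no timeout
			if err == nil {
				resp.Body.Close()
			}
			atomic.AddInt64(&finished, 1)
		}()
	}

	time.Sleep(wait)
	stuck := n - int(atomic.LoadInt64(&finished))

	close(release) // let the server respond so the demo can exit
	wg.Wait()
	return stuck
}

func main() {
	fmt.Println("requests still blocked:", stuckRequests(20, 500*time.Millisecond))
}
```

In a real web application each of those blocked goroutines would be one that is supposed to be serving a user, and nothing in the default client will ever give up on its behalf.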


The Solution

The solution to this problem is to always define an http.Client with a sensible timeout for your use case. Here is an example:

var netClient = &http.Client{
Timeout: time.Second * 10,
}
response, _ := netClient.Get(url)

This sets a 10-second timeout on requests made to the endpoint. If the API server exceeds the timeout, Get() will return an error that net/http constructs internally along these lines:

&httpError{
err: err.Error() + " (Client.Timeout exceeded while awaiting headers)",
timeout: true,
}
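Because httpError is unexported, calling code cannot match on that type directly. One way to tell a timeout apart from other failures is to check whether the returned error satisfies net.Error and reports Timeout(). The sketch below assumes a deliberately short 100 ms timeout and a local test server that never answers; isTimeout and demoTimeout are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// isTimeout reports whether err (or an error it wraps) is a network timeout.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// demoTimeout calls a server that never answers, using a client with a short
// timeout, and reports whether the resulting error is a timeout.
func demoTimeout() bool {
	release := make(chan struct{})
	svr := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		<-release // hold the request open until the demo is over
	}))
	defer svr.Close()
	defer close(release) // runs before svr.Close, so the handler can return

	client := &http.Client{Timeout: 100 * time.Millisecond}
	resp, err := client.Get(svr.URL)
	if err == nil {
		resp.Body.Close()
	}
	return isTimeout(err)
}

func main() {
	fmt.Println("timed out:", demoTimeout())
}
```

With a check like this, the caller can decide to retry, serve cached data, or show a friendly error instead of hanging.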

If you need finer-grained control over the request lifecycle, you can additionally specify a custom http.Transport and net.Dialer. A Transport is a struct used by clients to manage the underlying TCP connections, and its Dialer is a struct that manages the establishment of each connection. The net/http package provides a default Transport (http.DefaultTransport) with its own Dialer settings as well. Here’s an example of using custom ones:

var netTransport = &http.Transport{
// DialContext replaces the deprecated Dial field.
DialContext: (&net.Dialer{
Timeout: 5 * time.Second,
}).DialContext,
TLSHandshakeTimeout: 5 * time.Second,
}
var netClient = &http.Client{
Timeout: time.Second * 10,
Transport: netTransport,
}
response, _ := netClient.Get(url)

This code will cap the TCP connect and TLS handshake timeouts, as well as establishing an end-to-end request timeout. There are other configuration options such as keep-alive timeouts you can play with if needed.


Conclusion

Go’s net and http packages are a well-thought-out, convenient base for communicating over HTTP(S). However, the lack of a default timeout for requests is an easy pitfall to fall into, because the package provides convenience methods like http.Get(url). Some languages (e.g. Java) have the same issue; others do not (Ruby, for example, has a default 60-second read timeout). Not setting a request timeout when contacting a remote service puts your application at the mercy of that service. A malfunctioning or malicious service can hang on to your connection forever, potentially starving your application.

posted @ 2017-11-06 12:04  jvava