Is HTTP GET always idempotent? What about counting page views?
Is HTTP GET always idempotent?
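While analyzing a packet capture today, I was mostly looking at POST requests, and it occurred to me: could a programmer, in order to hide an endpoint, put a modifying operation behind the handler for a GET request? On reflection, GET is supposed to be idempotent, so surely nobody would do that.
Then another question struck me: when a large website counts page views, isn't that exactly the backend adding 1 to a counter after receiving a GET request? Doesn't that break idempotency?
So I searched on Baidu and found almost nothing; there was too little written about it. Googling led me to the following question on StackOverflow, which matches my own confusion. Let's take a look.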
Problem description
I'm trying to understand REST. Under REST a GET must not trigger something transactional on the server (this is a definition everybody agrees upon, it is fundamental to REST).
So imagine you've got a website like stackoverflow.com (I say like so if I got the underlying details of SO wrong it doesn't change anything to my question), where every time someone reads a question, using a GET, there's also some display showing "This question has been read 256 times".
Now someone else reads that question. The counter now is at 257. The GET is transactional because the number of views got incremented and is now incremented again. The "number of views" is incremented in the DB, there's no arguing about that (for example on SO the number of times any question has been viewed is always displayed).
So, is a REST GET fundamentally incompatible with any kind of "number of views" like functionality in a website?
So should it want to be "RESTFUL", should the SO main page either stop displaying plain HTML links that are accessed using GETs or stop displaying the "this question has been viewed x times"?
Because incrementing a counter in a DB is transactional and hence "unrestful"?
EDIT just so that people Googling this can get some pointers:
From http://www.xfront.com/REST-Web-Services.html :
- All resources accessible via HTTP GET should be side-effect free. That is, the request should just return a representation of the resource. Invoking the resource should not result in modifying the resource.
Now to me if the representation contains the "number of views", it is part of the resource [and in SO the "number of views" a question has is a very important information] and accessing it definitely modifies the resource.
This is in sharp contrast with, say, a true RESTFUL HTTP GET like the one you can make on an Amazon S3 resource, where your GET is guaranteed not to modify the resource you get back.
But then I'm still very confused.
Answer
What matters is that from a client point of view GET is safe (has no side effects) by definition and that a client therefore can safely call GET any number of times without considering any side effect that might have.
What a server does is the server's responsibility. In the case of the view counter the server has to make the decision if it considers the update of the counter a side effect. Usually it won't because the counter is part of the semantic of the resource in the first place.
However, the server might decide NOT to increment the counter for certain requests, such as a GET by a crawler.
---Jan, From StackOverflow
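The answer's point can be sketched in a few lines of Python. This is only a toy illustration, not how StackOverflow actually works: the in-memory `views` dictionary, the `CRAWLER_MARKERS` list, and the `handle_get` function are all hypothetical names invented for this example. The client just asks for a representation; whether the counter moves is a decision the server makes on its own.

```python
# Hypothetical in-memory store; a real site would keep this in a database.
views = {"question-1": 256}

# Assumed, purely illustrative substrings used to spot crawlers.
CRAWLER_MARKERS = ("bot", "crawler", "spider")


def is_crawler(user_agent: str) -> bool:
    """Very rough crawler check based on the User-Agent string."""
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def handle_get(resource_id: str, user_agent: str) -> dict:
    """Return a representation of the resource.

    The client did not ask for any change. The server chooses, as its own
    bookkeeping, to count the view, and skips the count for crawlers.
    """
    if not is_crawler(user_agent):
        views[resource_id] += 1
    return {"id": resource_id, "views": views[resource_id]}
```

Calling `handle_get("question-1", "Mozilla/5.0")` bumps the counter, while calling it with a User-Agent such as `"Googlebot/2.1"` returns the same representation without touching it. In both cases the client can repeat the GET freely without having to account for any change it requested.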
The following passage is worth reading carefully:
Idempotency is important in building a fault-tolerant API. Suppose a client wants to update a resource through POST. Since POST is not an idempotent method, calling it multiple times can result in wrong updates. What would happen if you sent out the POST request to the server, but you got a timeout? Is the resource actually updated? Did the timeout happen while sending the request to the server, or while sending the response to the client? Can we safely retry again, or do we need to figure out first what has happened with the resource? By using idempotent methods, we do not have to answer this question, but we can safely resend the request until we actually get a response back from the server.
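To make the retry argument concrete, here is a minimal sketch with a toy server that applies every request but loses its first response, simulating the timeout described above. `FlakyServer`, `put_title`, `post_comment`, and `retry` are hypothetical names made up for this illustration; no real HTTP library is involved.

```python
class FlakyServer:
    """Toy server: every request is applied, but the first response is lost."""

    def __init__(self):
        self.title = "old"
        self.comments = []
        self._drop_next_response = True

    def _respond(self):
        # Simulate a timeout that happens after the server did its work.
        if self._drop_next_response:
            self._drop_next_response = False
            raise TimeoutError("response lost")
        return "ok"

    def put_title(self, title):
        """Idempotent, PUT-like: sets a value, so repeating it changes nothing."""
        self.title = title
        return self._respond()

    def post_comment(self, text):
        """Not idempotent, POST-like: appends, so repeating it duplicates data."""
        self.comments.append(text)
        return self._respond()


def retry(call, attempts=3):
    """Blindly resend the request until a response arrives."""
    for _ in range(attempts):
        try:
            return call()
        except TimeoutError:
            continue
    raise TimeoutError("gave up after %d attempts" % attempts)
```

With a fresh server, `retry(lambda: server.put_title("new"))` ends with the title set to `"new"` exactly as intended, even though the request was sent twice. With another fresh server, `retry(lambda: server.post_comment("hi"))` also returns `"ok"`, but the comment is now stored twice, which is the "wrong update" the passage warns about.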
Be careful when dealing with safe methods as well: if a seemingly safe method like GET will change a resource, it might be possible that any middleware client proxy systems between you and the server will cache this response. Another client who wants to change this resource through the same URL (like: http://example.org/api/article/1234/delete), will not call the server, but return the information directly from the cache. Non-safe (and non-idempotent) methods will never be cached by any middleware proxies.
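The caching hazard can be sketched the same way. The `Origin` and `CachingProxy` classes below are hypothetical, and the proxy is deliberately naive: it caches every GET response forever, whereas real HTTP caches follow headers such as Cache-Control. The sketch only shows why hiding a delete behind GET is risky.

```python
class Origin:
    """Toy origin server that (wrongly) deletes an article on a GET request."""

    def __init__(self):
        self.articles = {1234: "first article"}
        self.hits = 0  # how many requests actually reached the origin

    def get(self, url):
        self.hits += 1
        parts = url.rstrip("/").split("/")
        if parts[-1] == "delete":
            # Anti-pattern: a state change hidden behind a GET URL.
            article_id = int(parts[-2])
            existed = self.articles.pop(article_id, None) is not None
            return "deleted" if existed else "not found"
        return self.articles.get(int(parts[-1]), "not found")


class CachingProxy:
    """Naive intermediary: caches every GET response by URL, with no expiry."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, url):
        if url not in self.cache:
            self.cache[url] = self.origin.get(url)
        return self.cache[url]
```

Suppose one client requests `http://example.org/api/article/1234/delete` through the proxy, and the article is later recreated. A second client requesting the same URL receives the cached `"deleted"` response, but the request never reaches the origin, so the article is still there.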