Downloading from an HTTP Server using WinInet
Thu, 2004-02-26 11:45 — roger
The WinInet functions allow an application to interact with Gopher, FTP and HTTP servers. This article shows how to use the WinInet API to download from an HTTP server.
InternetOpen
The first thing the application needs to do is initialise WinInet for its own use:
#include <windows.h>
#include <wininet.h>   /* link with wininet.lib */

LPCTSTR lpszAgent = TEXT("WinInetGet/0.1");
HINTERNET hInternet = InternetOpen(lpszAgent,
    INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
The InternetOpen function allows the caller to specify the proxy to be used. Here we tell it to use whatever the user has already configured in Internet Explorer. It also requires the "user agent" string, which identifies the application. Our demo application is called WinInetGet, so that's what we pass, along with a version number.
When we're finished with the WinInet functions, we should remember to call InternetCloseHandle on each handle we opened.
InternetConnect
Having initialised the WinInet functions, the next thing we do is connect to a particular server:
LPCTSTR lpszServerName = TEXT("vague.home.differentpla.net");
INTERNET_PORT nServerPort = INTERNET_DEFAULT_HTTP_PORT;
LPCTSTR lpszUserName = NULL;
LPCTSTR lpszPassword = NULL;
DWORD dwConnectFlags = 0;
DWORD dwConnectContext = 0;
HINTERNET hConnect = InternetConnect(hInternet,
    lpszServerName, nServerPort,
    lpszUserName, lpszPassword,
    INTERNET_SERVICE_HTTP,
    dwConnectFlags, dwConnectContext);
You'll need to change the server name (since this one refers to my Linux test box at home). We're not passing a user name or password, nor are we passing any connection flags. The dwConnectContext parameter is an application-defined value that's passed to the callback function registered with InternetSetStatusCallback. Since we're not using InternetSetStatusCallback, we pass zero.
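The server name, port and object name in this article are hard-coded. If they start out as a single URL, something has to split it first. The following is a minimal sketch in portable C, assuming plain http:// URLs only (no user info, no IPv6 literals). UrlParts and split_http_url are hypothetical names invented for this example; WinInet also provides InternetCrackUrl for this job.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: the three values the WinInet calls need. */
typedef struct {
    char host[256];      /* for InternetConnect's lpszServerName */
    unsigned short port; /* for InternetConnect's nServerPort */
    char path[1024];     /* for HttpOpenRequest's lpszObjectName */
} UrlParts;

/* Splits "http://host[:port][/path]".
   Returns 1 on success, 0 if the URL is not in that form. */
int split_http_url(const char *url, UrlParts *out)
{
    static const char scheme[] = "http://";
    const size_t scheme_len = sizeof(scheme) - 1;
    const char *host, *host_end, *colon;
    size_t host_len;

    if (url == NULL || out == NULL || strncmp(url, scheme, scheme_len) != 0)
        return 0;

    host = url + scheme_len;
    host_end = host + strcspn(host, "/");   /* end of "host[:port]" */
    colon = memchr(host, ':', (size_t)(host_end - host));

    host_len = (size_t)((colon ? colon : host_end) - host);
    if (host_len == 0 || host_len >= sizeof(out->host))
        return 0;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    out->port = 80;   /* same value as INTERNET_DEFAULT_HTTP_PORT */
    if (colon != NULL) {
        char *end;
        long port = strtol(colon + 1, &end, 10);
        if (end == colon + 1 || end != host_end || port < 1 || port > 65535)
            return 0;
        out->port = (unsigned short)port;
    }

    /* An empty path means the root object, "/". */
    if (*host_end == '\0')
        strcpy(out->path, "/");
    else if (strlen(host_end) < sizeof(out->path))
        strcpy(out->path, host_end);
    else
        return 0;
    return 1;
}
```

In an ANSI build, out->host and out->port could then be passed to InternetConnect, and out->path to HttpOpenRequest as the object name.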
HttpOpenRequest
The application then needs to form an HTTP request. This is done with the HttpOpenRequest function:
LPCTSTR lpszVerb = TEXT("GET");
LPCTSTR lpszObjectName = TEXT("/");
LPCTSTR lpszVersion = NULL;          // Use default.
LPCTSTR lpszReferrer = NULL;         // No referrer.
LPCTSTR *lplpszAcceptTypes = NULL;   // Whatever the server wants to give us.
DWORD dwOpenRequestFlags = INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP |
    INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS |
    INTERNET_FLAG_KEEP_CONNECTION |
    INTERNET_FLAG_NO_AUTH |
    INTERNET_FLAG_NO_AUTO_REDIRECT |
    INTERNET_FLAG_NO_COOKIES |
    INTERNET_FLAG_NO_UI |
    INTERNET_FLAG_RELOAD;
DWORD dwOpenRequestContext = 0;
HINTERNET hRequest = HttpOpenRequest(hConnect, lpszVerb, lpszObjectName,
    lpszVersion, lpszReferrer, lplpszAcceptTypes,
    dwOpenRequestFlags, dwOpenRequestContext);
The HttpOpenRequest function doesn't actually communicate with the server at this point. We use it to create the HTTP request object, which we can then fill in before sending it.
The interesting thing (from the HTTP point of view) is that we specify "GET /" through the lpszVerb and lpszObjectName parameters. The flags are just a selection of the available flags that I thought looked interesting. We pass NULL as the referrer, since we didn't come here from another page. By passing NULL in the lplpszAcceptTypes parameter, we signal that we're not bothered about what we're given. Most servers will interpret this as "text/*". Again, we pass zero for the application-defined context value.
HttpSendRequest
Now that we've put together our HTTP request, we can send it to the server using the HttpSendRequest function:
BOOL bResult = HttpSendRequest(hRequest, NULL, 0, NULL, 0);
The last four parameters allow us to supply additional headers and any optional data. The optional data is generally only used in POST and PUT operations.
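To illustrate those last four parameters, here is a hedged sketch of what sending a form with POST might look like. The header and body values are made up for this example, send_form and form_body_length are hypothetical names, and the request handle is assumed to have been opened with the verb "POST" rather than "GET". The WinInet call is guarded so the rest compiles on any platform.

```c
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>
#endif

/* Hypothetical example values, not taken from the article. */
const char kFormHeaders[] =
    "Content-Type: application/x-www-form-urlencoded\r\n";
const char kFormBody[] = "name=WinInetGet&version=0.1";

/* Length of the optional data in bytes; the terminating NUL is not sent. */
unsigned long form_body_length(void)
{
    return (unsigned long)strlen(kFormBody);
}

#ifdef _WIN32
/* hRequest is assumed to come from HttpOpenRequest with the verb "POST". */
BOOL send_form(HINTERNET hRequest)
{
    /* Passing -1 as the header length tells WinInet that the header
       string is zero-terminated and it should work out the length. */
    return HttpSendRequestA(hRequest, kFormHeaders, (DWORD)-1,
                            (LPVOID)kFormBody, form_body_length());
}
#endif
```

The ANSI variant HttpSendRequestA is used here so that the narrow strings are correct regardless of how the project is built.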
HttpQueryInfo
After we've sent the request, and the response has come back, we ought to check the response headers. This is done with the HttpQueryInfo function:
DWORD dwInfoLevel = HTTP_QUERY_RAW_HEADERS_CRLF;
DWORD dwInfoBufferLength = 10;
BYTE *pInfoBuffer = (BYTE *)malloc(dwInfoBufferLength+1);
while (!HttpQueryInfo(hRequest, dwInfoLevel, pInfoBuffer, &dwInfoBufferLength, NULL))
{
    DWORD dwError = GetLastError();
    if (dwError == ERROR_INSUFFICIENT_BUFFER)
    {
        /* dwInfoBufferLength now holds the required size; try again. */
        free(pInfoBuffer);
        pInfoBuffer = (BYTE *)malloc(dwInfoBufferLength+1);
    }
    else
    {
        fprintf(stderr, "HttpQueryInfo failed, error = %lu (0x%lx)\n",
            dwError, dwError);
        dwInfoBufferLength = 0;   /* nothing valid to print */
        break;
    }
}
pInfoBuffer[dwInfoBufferLength] = '\0';
/* %s assumes an ANSI build; in a Unicode build the headers are wide characters. */
printf("%s", pInfoBuffer);
free(pInfoBuffer);
Like many other Windows API functions, HttpQueryInfo will tell you if you have not allocated enough buffer space for the result. This while loop is a good way to make sure that you get it right. Note that we add a null terminator, so we allow for this in our call to malloc. The MSDN documentation implies that the string will already be zero-terminated, but it's a little ambiguous.
Note that we're deliberately not allocating enough space. This is so that we can test the logic in the while loop. In a real application, we'd allocate a larger buffer, to avoid calling HttpQueryInfo multiple times.
The HttpQueryInfo function can return a specific header value from the HTTP response, e.g. pass HTTP_QUERY_DATE to retrieve the "Date:" header. For custom header values, the application can pass HTTP_QUERY_CUSTOM and pass the name of the header in the buffer. It will be overwritten with the header value. We pass HTTP_QUERY_RAW_HEADERS_CRLF, because we're not particularly interested in the actual content; this is just a demo application.
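A real application would usually want at least the status code before reading the body. WinInet can return it directly (HTTP_QUERY_STATUS_CODE is one of the info levels), but since the raw header text is already in hand, a small sketch of pulling it out of the status line might look like this. parse_status_code is a hypothetical helper name, and the code assumes the headers are narrow (ANSI) characters.

```c
#include <stdio.h>

/* Returns the status code from the first line of a raw header block such as
   "HTTP/1.1 200 OK\r\n...", or -1 if the status line is malformed. */
int parse_status_code(const char *raw_headers)
{
    int major, minor, status;

    if (raw_headers == NULL)
        return -1;
    /* The status line is "HTTP/<major>.<minor> <code> <reason>". */
    if (sscanf(raw_headers, "HTTP/%d.%d %d", &major, &minor, &status) != 3)
        return -1;
    if (status < 100 || status > 999)
        return -1;
    return status;
}
```

With the loop shown earlier, this could be called as parse_status_code((const char *)pInfoBuffer) once the null terminator has been added and before the buffer is freed.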
InternetReadFile
To retrieve the entity body from the HTTP response, we'll use a loop like this:
DWORD dwBytesAvailable;
while (InternetQueryDataAvailable(hRequest, &dwBytesAvailable, 0, 0))
{
    BYTE *pMessageBody = (BYTE *)malloc(dwBytesAvailable+1);
    DWORD dwBytesRead;
    BOOL bResult = InternetReadFile(hRequest, pMessageBody,
        dwBytesAvailable, &dwBytesRead);
    if (!bResult)
    {
        fprintf(stderr, "InternetReadFile failed, error = %lu (0x%lx)\n",
            GetLastError(), GetLastError());
        free(pMessageBody);
        break;
    }
    if (dwBytesRead == 0)
    {
        free(pMessageBody);
        break; // End of File.
    }
    pMessageBody[dwBytesRead] = '\0';
    printf("%s", pMessageBody);
    free(pMessageBody);
}
Note that the InternetQueryDataAvailable function blocks until data is available or an error occurs.
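The loop above prints each chunk as it arrives, which only suits text. If the whole body is needed in memory (or contains binary data), one option is to append each chunk to a growable buffer. The following is a minimal sketch of that idea; Body, body_append and body_free are hypothetical names and not part of WinInet.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer. Initialise with {0}. */
typedef struct {
    unsigned char *data;
    size_t length;    /* bytes stored */
    size_t capacity;  /* bytes allocated */
} Body;

/* Appends n bytes, doubling the allocation as needed.
   Returns 1 on success, 0 if memory runs out (the buffer is left intact). */
int body_append(Body *body, const void *chunk, size_t n)
{
    if (n == 0)
        return 1;
    if (n > body->capacity - body->length) {
        size_t new_capacity = body->capacity ? body->capacity : 4096;
        unsigned char *p;
        while (new_capacity - body->length < n)
            new_capacity *= 2;
        p = realloc(body->data, new_capacity);
        if (p == NULL)
            return 0;
        body->data = p;
        body->capacity = new_capacity;
    }
    memcpy(body->data + body->length, chunk, n);
    body->length += n;
    return 1;
}

/* Releases the storage and resets the buffer to its empty state. */
void body_free(Body *body)
{
    free(body->data);
    body->data = NULL;
    body->length = 0;
    body->capacity = 0;
}
```

Inside the read loop, the printf call would be replaced by something like body_append(&body, pMessageBody, dwBytesRead), with body_free called once the data has been used.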
Original article:
http://www.differentpla.net/content/2004/02/downloading-from-an-http-server-using-wininet