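I've been working through the book 自己动手写爬虫 ("Write Your Own Crawler") and, to my surprise, got stuck right at the start.
The book says the jar it uses is HttpClient, Apache's open-source HTTP client project,
so I went to the official site and downloaded version 4.3.
Then I typed in the code from the book: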
HttpClient httpclient = new HttpClient();
and was stuck as soon as I finished that line.
The error was: Cannot instantiate the type HttpClient
I googled it, and an answer on Stack Overflow said it should be written as
HttpClient httpclient = new DefaultHttpClient();
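instead, after first adding import org.apache.http.impl.client.DefaultHttpClient;
I tried it and, sure enough, that line compiled, but GetMethod and everything else after it still produced errors.
Only at the very end did I realize that the API changed starting with 4.x: HttpClient is now an interface, which is why it cannot be instantiated directly, and the book's older-style code (new HttpClient(), GetMethod and so on) no longer works.
One look at the official example shows what is going on: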
package spider;

import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class Spider {
    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {
            String url = "http://www.baidu.com";
            HttpGet httpGet = new HttpGet(url);
            System.out.println("executing request " + httpGet.getURI());
            ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
                public String handleResponse(final HttpResponse response)
                        throws ClientProtocolException, IOException {
                    int status = response.getStatusLine().getStatusCode();
                    if (status >= 200 && status < 300) {
                        HttpEntity entity = response.getEntity();
                        return entity != null ? EntityUtils.toString(entity) : null;
                    } else {
                        throw new ClientProtocolException("Unexpected response status: " + status);
                    }
                }
            };
            String responseBody = httpclient.execute(httpGet, responseHandler);
            System.out.println("-------------------------------------------");
            System.out.println(responseBody);
            System.out.println("-------------------------------------------");
        } finally {
            httpclient.close();
        }
    }
}
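For anyone else puzzled by "Cannot instantiate the type HttpClient", here is a rough sketch of why the compiler complains. It uses only the JDK, and every name in it (Client, DefaultClient, createDefault, InterfaceDemo) is a made-up stand-in I am assuming for illustration, not the real Apache types. The point is just the pattern: an interface cannot be created with new, so you need either a concrete implementation or a factory method that hands one back.

```java
// Hypothetical stand-ins only; these are NOT the real Apache HttpClient classes.
public class InterfaceDemo {

    // Plays the role of the HttpClient type in 4.x: an interface.
    interface Client {
        String execute(String url);
    }

    // Plays the role of a concrete implementation such as DefaultHttpClient.
    static class DefaultClient implements Client {
        @Override
        public String execute(String url) {
            // No network access here; just echo what would be requested.
            return "GET " + url;
        }
    }

    // Plays the role of a factory method such as HttpClients.createDefault().
    static Client createDefault() {
        return new DefaultClient();
    }

    public static void main(String[] args) {
        // Client broken = new Client();  // does not compile: Cannot instantiate the type Client
        Client viaClass = new DefaultClient();   // works: concrete class
        Client viaFactory = createDefault();     // works: factory returns an implementation
        System.out.println(viaClass.execute("http://www.baidu.com"));
        System.out.println(viaFactory.execute("http://www.baidu.com"));
    }
}
```

Uncommenting the broken line should reproduce the same kind of error I hit, which is why both the Stack Overflow fix and the official example go through a concrete class or a factory.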
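I gave my heart to the bright moon, but the moon shines on the ditch; the fallen blossoms long to follow the stream, but the stream has no heart for the blossoms.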