HttpClient (1) -- HelloWorld

1. Introduction

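  HttpClient started as a subproject of Apache Jakarta Commons and is now maintained under Apache HttpComponents. It is an efficient, feature-rich client-side toolkit for the HTTP protocol that is designed to keep up with current HTTP versions and recommendations. This article is based on version 4.5.2. Maven dependency: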

    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.5.2</version>
    </dependency>

2. HelloWorld Implementation

package com.xsjt.chap01;

import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class HelloWorld {
    
    /**
     * Fetches a web page with a GET request.
     *
     * @param args command-line arguments (unused)
     * @throws IOException if the connection fails or is aborted
     * @throws ClientProtocolException if an HTTP protocol error occurs
     */
    public static void main(String[] args) throws ClientProtocolException, IOException {
        // try-with-resources closes the client even if the request throws
        try (CloseableHttpClient httpClient = HttpClients.createDefault()) {
            // Create the GET request
            HttpGet httpGet = new HttpGet("http://www.cnblogs.com");  // or http://www.tuicool.com/
            // Execute the request; the response is closed automatically as well
            try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
                HttpEntity entity = response.getEntity();   // response body; null if there is none
                if (entity != null) {
                    // "UTF-8" is used only when the response declares no charset
                    String result = EntityUtils.toString(entity, "UTF-8");
                    System.out.println("Page content: " + result);
                }
            }
        }
    }
}

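 The code above fetches the page content directly. Some pages come back as garbled Chinese text; in that case the content has to be decoded with the page's own encoding, for example gb2312, rather than UTF-8. Note that the charset passed to EntityUtils.toString is only a default: it applies when the response's Content-Type header does not declare a charset.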
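 To make the encoding issue concrete, here is a minimal sketch that uses only the JDK, not HttpClient. The class CharsetDemo and its helper charsetFrom are hypothetical names introduced for illustration; the sketch also assumes the running JDK ships the GB2312 charset (if it does not, the helper falls back to UTF-8 and both lines print the same text). It reads the charset from a Content-Type header value and shows that the same bytes decode differently under UTF-8 and gb2312.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class; not part of HttpClient.
final class CharsetDemo {

    /**
     * Returns the charset named in a Content-Type header value such as
     * "text/html; charset=gb2312", or the fallback when none is usable.
     */
    public static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset pageCharset = charsetFrom("text/html; charset=gb2312", StandardCharsets.UTF_8);

        // Two Chinese characters, written as escapes so the source file encoding does not matter
        byte[] body = "\u4e2d\u6587".getBytes(pageCharset);

        // Decoding with the wrong charset yields garbled text
        System.out.println("As UTF-8:        " + new String(body, StandardCharsets.UTF_8));
        // Decoding with the page's own charset recovers the original characters
        System.out.println("As page charset: " + new String(body, pageCharset));
    }
}
```

 In the HelloWorld program, the equivalent fix is to pass the page's encoding to EntityUtils.toString in place of "UTF-8" when the server does not declare one.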

3. Crawler Tutorial

  https://www.kancloud.cn/johnnylee/crawler/  

4. HttpClient Learning Resources

  Open-source blog system: HttpClient

 

posted @ 2017-09-11 22:56  小葱拌豆腐~