60 Network Programming (Part 2): URL

Understanding URI, URL, and URN

For details, see: https://blog.51cto.com/xoyabc/1905492

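URI: Uniform Resource Identifier

URL: Uniform Resource Locator

URN: Uniform Resource Name

Their relationship: URL and URN are both subsets of URI. A URL identifies a resource by where it is located, while a URN identifies it by a persistent name.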
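The relationship can be checked in code with a minimal sketch like the one below. It assumes that "can be converted to a java.net.URL" is a reasonable stand-in for "is a locator"; the class name UriKinds, the helper isLocator, and the ISBN value are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;

public class UriKinds {
    /** Returns true if the URI can also be turned into a java.net.URL. */
    public static boolean isLocator(String text) {
        URI uri = URI.create(text);
        try {
            uri.toURL();      // succeeds only for schemes Java has a protocol handler for
            return true;
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false;     // e.g. "urn" has no handler: a valid URI, but not a URL
        }
    }

    public static void main(String[] args) {
        // Hypothetical identifiers, chosen only for illustration.
        System.out.println(isLocator("https://www.cnblogs.com/Scorpicat/"));
        System.out.println(isLocator("urn:isbn:0451450523"));
        System.out.println(URI.create("urn:isbn:0451450523").getScheme());
    }
}
```

Both strings parse as URIs, but only the https one can become a URL; the URN names a resource without saying where to fetch it.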

URL

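The class we use most often when learning Java network programming is URL.

A complete URL consists of a protocol, host, port, path, parameters (query string), and an anchor (fragment).

Test code: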

package _20191213;

import java.net.MalformedURLException;
import java.net.URL;

/**
 * URL test class
 * @author TEDU
 */
public class URLTest {
    public static void main(String[] args) throws MalformedURLException {
        URL url = new URL("https://www.cnblogs.com/Scorpicat/category/1596649.html");
        System.out.println(url.getProtocol());    // protocol (scheme)
        System.out.println(url.getFile());        // path plus query string, if any
        System.out.println(url.getAuthority());   // user info, host, and port, as written
        System.out.println(url.getDefaultPort()); // default port of the protocol
        System.out.println(url.getPort());        // explicit port, or -1 if none is given
        System.out.println(url.getQuery());       // query string, or null
        System.out.println(url.getHost());        // host name
        System.out.println(url.getRef());         // anchor (fragment), or null
        System.out.println(url.getUserInfo());    // user info, or null
    }
}

  

Output:

https
/Scorpicat/category/1596649.html
www.cnblogs.com
443
-1
null
www.cnblogs.com
null
null
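The address tested above has no explicit port, query string, or anchor, which is why getPort() returns -1 and getQuery() and getRef() return null. A small sketch with a hypothetical address that carries every component may make the mapping clearer; the class name UrlParts, the helper describe, and the example.com address are assumptions for illustration only.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlParts {
    /** Formats the components of a URL as "name=value" pairs, one per line. */
    public static String describe(String spec) throws MalformedURLException {
        URL url = new URL(spec);
        return "protocol=" + url.getProtocol() + "\n"
             + "host=" + url.getHost() + "\n"
             + "port=" + url.getPort() + "\n"
             + "path=" + url.getPath() + "\n"
             + "query=" + url.getQuery() + "\n"
             + "ref=" + url.getRef() + "\n"
             + "file=" + url.getFile();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Hypothetical address chosen only to exercise every component.
        System.out.println(describe(
            "http://www.example.com:8080/docs/index.html?lang=en&page=2#intro"));
    }
}
```

Note that getFile() returns the path together with the query string, while getPath() returns the path alone.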

Fetching a web page's data with URL and IO streams

Running the program creates a web.txt file containing the page data from the target address.

package _20191213;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DownloadAWebPage {
    public static void main(String[] args) throws IOException {
        // Target address
        URL url = new URL("https://gy.anjuke.com/?pi=navi-tencent-qq-mz");
        // Stream workflow: choose the source, choose the streams, read, close.
        // try-with-resources closes both streams even if an exception is thrown.
        try (BufferedReader br = new BufferedReader(
                 new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
             BufferedWriter bw = Files.newBufferedWriter(
                 Paths.get("web.txt"), StandardCharsets.UTF_8)) {
            String content;
            while ((content = br.readLine()) != null) {
                System.out.println(content); // echo each line to the console
                bw.write(content);           // and write it to web.txt
                bw.newLine();
            }
        }
    }
}
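When the page only needs to be saved and not printed line by line, the stream returned by openStream() can be copied straight to a file. The following is a minimal sketch of that variation, not the approach used above: the class name CopyFromUrl and the helper save are hypothetical, and a local temp file accessed through a file: URL stands in for a web page so the sketch runs without network access.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyFromUrl {
    /** Copies everything readable from the URL into the target file; returns the byte count. */
    public static long save(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // A local temp file stands in for a web page so the sketch runs offline.
        Path page = Files.createTempFile("page", ".html");
        Files.write(page, "<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        Path copy = Files.createTempFile("web", ".txt");

        long bytes = save(page.toUri().toURL(), copy);
        System.out.println(bytes + " bytes copied to " + copy);
    }
}
```

Because the bytes are copied unchanged, no character encoding has to be chosen; passing an https URL to save would work the same way.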

  

posted @ Scorpicat