COMP3322 notes P1 - Internet & WWW Basic
选这门课完全是为了推进我博客美化的大业!希望学完之后 update logs 里的一部分 issues 能自己亲手解决。
首先来到 Internet and WWW basic: 这些基本的 network 知识对接下来的 front-end framework 学习大有裨益。Internet, Web, DNS, HTTP 等「最熟悉的陌生人」在这一节得以祛魅。
从这门课开始我打算 develop 一个新的笔记模式。目前已有的笔记模式大概分为:
- Fragment 碎片型。Ex. ENGG1340/COMP2113.
- Essence 精华型。Ex. ENGG3357.
- Comprehensive 全面型。Ex. Algorithms I & II.
对于类似 COMP3322 这种概念较多较杂的课程,碎片型显得没有条理,精华型容易产生遗漏,全面型整理起来又太过冗长。于是我准备尝试一下结合课程 slides 的 revision-oriented 型:一切以 final 复习方便为宗旨!
{% note warning %}
This article is a self-administered course note.
It will NOT cover any exam or assignment related content.
{% endnote %}
What is the Internet?
The Internet is the backbone of the Web, the technical infrastructure that makes the Web possible. At its most basic, the Internet is a large network of computers which communicate all together.
Keywords. [见 slides 01-Internet-Basic]
- Internet is a technical infrastructure.
- hosts (主机)。
- IP routers (\(\sim n^2\) links to \(n\) links).
What is World Wide Web (WWW)?
Internet 的中文翻译是互联网;而 WWW (World Wide Web) 是万维网。在此之前我一直以为它们指的是同一种东西;然而互联网中的「网」强调的是 network;而万维网中的「网」指的是 web。这个问题挺值得玩味。
另外,WWW 的台译是全球资讯网,我认为该翻译虽没有万维网那么「雅」,但「信」,「达」却做得更好。
Keywords. [见 slides 01-Internet-Basic]
- hypertext links (超文本链接).
- uniform resource locator (URL): protocol + domain name + network path.
- uniquely identify a resource on the WWW.
- client-server communication: by means of HTTP protocol.
- publishing documents with HTML & CSS.
Client-Server Communication
Keywords. [见 slides 01-Internet-Basic]
- End-to-End and port numbers. [a]
- [IMPORTANT] Illustration of steps taken by client and server.
- IP address and IPV4 addressing. [b]
- localhost: this computer. IP:
127.0.0.1
.
[a]: 客户端同时运行着 Google Chrome 与 Firefox,而服务器端运行着 Nginx 与 MongoDB;通过分配不同的 port numbers,进程间可以分别建立联系 (Chrome \(\to^{80}\) Nginx, Firefox \(\to^{443}\) MongoDB)。客户端的进程向对应的端口发送请求;而服务器端的进程则监听特定的端口,并对请求进行回应。
[b]: IPV4 addressing scheme 的前/后缀。在 HKU 之外,148.7
这一前缀将作为 (HKU) physical network locator;而在 HKU 内部,147.8.175
这一前缀将作为 (HKU CS) physical network locator.
Domain Name System (DNS)
DNS maps human-readable domain names to binary addresses used by Internet Protocol.
Keywords. [见 slides 01-Internet-Basic]
- hierarchical domain names: read from right to left.
- nameless DNS root node connecting to all TLDs.
- top-level domains (TLD). e.g.,
com
,org
,hk
,cn
... - second-level domains (SLD). e.g.,
thisisxxz
... - 3rd, 4th...
- distributed hierarchical database.
- internal machines \(\to\) local DNS servers (provided by ISP) \(\to\) root name servers.
- [IMPORTANT] domain name address resolution process.
- DNS caching: browser cashes/local DNS(name) server cashes.
Hypertext Transfer Protocol (HTTP)
HTTP is an extensible protocol that is easy to use. The client-server structure, combined with the ability to add headers, allows HTTP to advance along with the extended capabilities of the Web.
Keywords. [见 slides 01-Internet-Basic]
- application-layer protocol: sent over TCP.
- stateless but not sessionless (e.g., cookies).
- HTTP/1.0. HTTP headers, status code info.
- HTTP/1.1 (still widely used). pipeline request, caching, content negotiation.
HTTP Messages
Keywords. [见 slides 01-Internet-Basic]
- HTTP request:
[request type] [URL] [HTTP version]
.- e.g.,
GET /https://i.cs.hku.hk/~atctam/view/Simple.html HTTP/1.1
.
- e.g.,
- HTTP response (status line):
[HTTP version] [status code] [status phrase]
.- e.g.,
HTTP/1.1 200 OK
.
- e.g.,
- HTTP headers. A series of key-value pairs
[case_insensitive_name]: [value]
.- provide additional info. 4 kinds: General/Entity/Request/Response.
- Message body.
- requests: do not have a body for
GET
,HEAD
..., have it forPOST
... - responses: carries the resource (HTML page) requested by clients.
- requests: do not have a body for
- Web requests. The browser makes subsequent requests for each resource (CSS, scripts, external pictures...) referenced in the HTML.
Caching & Validation
Keywords. [见 slides 01-Internet-Basic]
- web caching: private browser caches/proxy servers caches.
- advantages: (1) cache close to client \(\to\) reduce response time. (2) decrease network traffic to distant servers.
- stale:
Cache-Control
&Expires
headers. - validation: two response headers & a request header.
- web server responses with
Etag
andLast-Modified
headers. - browser/proxy server requests with
If-None-Match
and/orIf-Modified-Since
header. - If not modified, web server responses with
HTTP/1.1 304 Not Modified
.
- web server responses with
- cookies. server sends cookies to browser with the
Set-Cookie
header. browser stores it and the next request to the same server includes theCookie
header.- responses:
Set-Cookie: user_id=1678
. - (next) requests:
Cookie: user_id=1678
.
- responses:
Reference
{% note warning %}
This article is a self-administered course note.
References in the article are from corresponding course materials if not specified.
{% endnote %}
Reference website:
- How does the Internet work? - mdn web docs.
- How the Web works? - mdn web docs.
- What is a domain name? - mdn web docs.
- An overview of HTTP - mdn web docs.
- What's the difference between no-cache and max-age: 0? - stack overflow.
Course info:
Code: COMP3322, Lecturer: Dr.Anthony Tam.