A Hasty Introduction to Web Development
Definitions
Yeah, some of these might be silly, but let's do this!
What's the difference between the internet and the web (I'm really asking this). →
- the internet - global system of interconnected computer networks; a network of networks
- internet's underlying protocol for communication is TCP/IP
- TCP/IP dictates how data should be packetized, addressed, transmitted, routed and received
- the web - a collection of interconnected documents (web pages) and other resources (images, video, etc.), retrievable by url and connected by hyperlinks
- HTTP is the protocol used to allow documents and resources to be requested over a network
Other Services
The web is just one of many services available on the internet… what are some other services and protocols on the internet? →
- email (SMTP)
- chat (XMPP, OSCAR, IRC)
- file transfer (FTP)
- voice (SIP, Skype protocol)
- these are all examples of network protocols - ways of communicating over a network
Protocols
Hm. All this talk about protocols but … what exactly is a protocol? →
It's a bunch of rules and conventions for communication. Really. That's it.
For computers and communication between them, these rules may define:
- the format for exchanging messages
- a meaning (semantics) and syntax for these messages
- the process for synchronizing the communication
A Protocol Example
Eloquent JavaScript describes a simple chat protocol. For two computers to communicate with this protocol:
- one computer sends bits that represent the text, 'CHAT?', to another computer
- the other computer responds with 'OK!' to show that it accepts and understands the protocol
- from there, they can:
- proceed to send each other strings of text
- read the text sent by the other from the network
- display the received text
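The handshake above can be sketched in a few lines of Python. This is only an illustration, not the code from Eloquent JavaScript: the function names (`responder`, `initiator`, `chat_demo`) are made up, and `socket.socketpair()` stands in for a real network connection between two computers.

```python
import socket
import threading


def responder(conn: socket.socket, received: list[str]) -> None:
    """Wait for 'CHAT?', answer 'OK!', then collect one chat message."""
    with conn:
        if conn.recv(1024) != b"CHAT?":
            return  # not our protocol, so hang up
        conn.sendall(b"OK!")  # accept and acknowledge the protocol
        received.append(conn.recv(1024).decode("utf-8"))


def initiator(conn: socket.socket, text: str) -> bool:
    """Open with 'CHAT?', and only send text if the other side says 'OK!'."""
    with conn:
        conn.sendall(b"CHAT?")
        if conn.recv(1024) != b"OK!":
            return False
        conn.sendall(text.encode("utf-8"))
        return True


def chat_demo(text: str) -> tuple[bool, list[str]]:
    """Run both ends of the handshake over a connected pair of sockets."""
    a, b = socket.socketpair()  # assumption: stands in for two hosts
    received: list[str] = []
    worker = threading.Thread(target=responder, args=(b, received))
    worker.start()
    accepted = initiator(a, text)
    worker.join()
    return accepted, received
```

Calling `chat_demo("hello")` runs one full exchange: the greeting, the acknowledgement, and a single line of text.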
A Slightly Closer Look at TCP/IP
The previous slides described Application protocols … (chat, mail, specific applications). However, these protocols don't define how data/messages actually get from one computer to another in a networked environment →
- how does a message get translated from (for example) plain text to electronic signals… and how is it sent over the Internet, and translated back to plain text?
- welp! there are other protocols - a stack of protocols that describe how this communication works
- this stack of protocols is often referred to as the TCP/IP stack
- (mainly because TCP and IP are two of the major protocols involved)
TCP/IP Continued
The TCP/IP stack consists of 4 layers:
- Application Layer - application level protocols such as HTTP, SMTP, etc.
- Transport Layer - protocols involved in communication (connection establishment, flow-control) between applications (either on the same host/computer or on different hosts), such as TCP or UDP
- Network Layer - the protocol responsible for routing packets of data across network boundaries, directing data to a specific computer / host; this is IP, or Internet Protocol
- Physical (hardware) Layer / Link Layer - converts data to network signals and back (wi-fi, ethernet)
Sending a Message Over the Internet
Check out the diagram from this whitepaper on how the internet works (!). The whitepaper describes sending data from one host (computer) to another through the internet →
- messages start at the top of the stack and work downward
- each layer that the message passes through may break the message up into smaller chunks of more manageable data called packets
- packets go through the Application Layer and continue to the Transport Layer where each packet is assigned a port number (loosely speaking a number that specifies which program on the destination computer needs to receive the message)
- packets then proceed to the Network Layer, where each packet receives its destination IP address (number that identifies a computer on the network)
Sending a Message Over the Internet Continued
Starting from the hardware layer of this diagram, our message continues its journey! →
- with a port number and an IP address, the hardware layer turns packets of data into electronic signals and transmits them
- these packets eventually arrive at the other host (often going through intermediary routers in the process), and work their way back up the stack
- as the packets go upwards through the stack, all routing data that the sending computer's stack added (such as IP address and port number) is stripped from the packets
- when the data reaches the top of the stack, the packets have been re-assembled into their original form
Again, all of this comes from this whitepaper. Although it's nearly a couple of decades old, the networking aspects are still very relevant.
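As a rough sketch of the journey described above, the following Python models the "down the stack, up the stack" idea. It is a toy model under stated assumptions, not how a real TCP/IP implementation works: the `Packet` class, the function names, the chunk size, and the sequence number used for re-assembly are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dest_ip: str    # added on the way down, at the network layer
    dest_port: int  # added on the way down, at the transport layer
    seq: int        # assumption: position of this chunk in the message
    payload: bytes  # a slice of the original message


def send_down_stack(message: bytes, dest_ip: str, dest_port: int,
                    chunk_size: int = 8) -> list[Packet]:
    """Break a message into chunks and label each with port and IP address."""
    chunks = [message[i:i + chunk_size]
              for i in range(0, len(message), chunk_size)]
    return [Packet(dest_ip, dest_port, seq, chunk)
            for seq, chunk in enumerate(chunks)]


def receive_up_stack(packets: list[Packet]) -> bytes:
    """Strip the routing data and re-assemble the original message."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)
```

For example, `receive_up_stack(send_down_stack(b"hello, internet!", "10.0.0.2", 80))` gives back the original bytes, even if the list of packets is shuffled in between.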
It All Starts With a URL
Each document or resource on the web is retrievable by a name, a URL (Uniform Resource Locator). What are the parts of a URL? →
- scheme/protocol - http (er, browsers accept scheme-less URLs too)
- domain or actual ip address - pizzaforyou.com
- port (optional) - 80 (default if http)
- path - /search
- query_string (optional) - ?type=vegan
- fragment_id (optional) - #top_result
scheme://domain:port/path?query_string#fragment_id
http://pizzaforyou.com:80/search?type=vegan#top_result
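To see these parts pulled out of the example URL, here is a small sketch using Python's standard `urllib.parse` module. The helper name `url_parts` is made up for this example; `urlsplit` and `parse_qs` are the standard library calls doing the work.

```python
from urllib.parse import parse_qs, urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the parts listed above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,                    # None if the URL omits it
        "path": parts.path,
        "query_string": parse_qs(parts.query),  # values come back as lists
        "fragment_id": parts.fragment,
    }


example = url_parts("http://pizzaforyou.com:80/search?type=vegan#top_result")
```

Note that the parser drops the `?` and `#` separators, so the query string and fragment come back without them.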
Domains and IP Addresses
Each machine connected to the Internet gets a unique IP address.
We can map domains to IP addresses through DNS (Domain Name System).
- both IP Addresses and domains are acceptable in a URL.
- on OSX, Linux (and windows), there's a file that allows you to map names to ip addresses (before using dns)
- typically /etc/hosts or hosts.txt
- localhost maps to 127.0.0.1, which essentially is your computer
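The "check the hosts file first, then DNS" idea can be sketched like this. The helper names (`parse_hosts`, `resolve`) and the sample entries are made up for illustration, and a real operating system's lookup order is configurable, so treat this as a simplified model. `socket.gethostbyname` is the standard library call that asks the system resolver.

```python
import socket


def parse_hosts(text: str) -> dict[str, str]:
    """Turn hosts-file text (IP address, then one or more names) into a dict."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # first entry for a name wins
    return mapping


def resolve(name: str, hosts: dict[str, str]) -> str:
    """Use the hosts mapping if possible, otherwise ask the system resolver."""
    if name in hosts:
        return hosts[name]
    return socket.gethostbyname(name)  # may go out to DNS


# Hypothetical hosts-file contents; the second entry is invented.
SAMPLE_HOSTS = """\
127.0.0.1     localhost
192.168.1.20  pizza.test pizza   # made-up local name
"""

hosts = parse_hosts(SAMPLE_HOSTS)
```

With this mapping, `resolve("localhost", hosts)` returns `"127.0.0.1"` without any DNS lookup.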
HTTP
To retrieve documents on the web, we use HTTP (Hyper Text Transfer Protocol). The computer/application asking for the document is the client or user-agent, and the computer responding to requests for documents is the server.
- generally, the server is going to be some sort of web server, like Apache or Nginx
- the client (or the user-agent) is usually some sort of browser, like Chrome or Safari (there are clients other than browsers too)
HTTP is a request-response protocol, a very basic text-based (at least for version 1.1) communication method between computers:
- the client sends a request for some data
- the server responds to the request
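To make the request-response cycle concrete, here is a minimal sketch that runs a tiny server on your own machine and sends it a hand-written text request. The class and function names (`HelloHandler`, `fetch_raw`, `http_demo`) and the response body are made up for this example; Python's built-in `http.server` stands in for a real web server like Apache or Nginx.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A toy server: answers every GET with the same plain-text body."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_raw(host: str, port: int, path: str = "/") -> str:
    """Act as the client: send a text request, read the whole text response."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: response is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


def http_demo() -> str:
    """Start the server on a free local port, make one request, shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_raw("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Printing the result of `http_demo()` shows the raw response as text: a status line, some headers, a blank line, and then the body.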