Computer Systems: A Programmer's Perspective (CSAPP) ProxyLab

ProxyLab

Start date: 22.2.27

Operating system: Ubuntu 20.04

Link: CS:APP3e

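Before you start

Lab environment bugs

  • Running ./driver.sh in the proxylab-handout directory fails with the following bugs:

    • Bug 1: net-tools is not installed

      line 117: netstat: command not found
      ...
      

      This happens because the net-tools package is not installed. Install it with the following command (you will be asked for your password):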

      sudo apt install net-tools
      
    • Bug 2: Part II and Part III cannot be graded

      *** Concurrency ***
      Starting tiny on port 25178
      Starting proxy on port 26265
      Starting the blocking NOP server on port 12673
      Timeout waiting for the server to grab the port reserved for it
      Terminated
      

      This is a Python environment problem. Change the first line of nop-server.py from #!/usr/bin/python to #!/usr/bin/python3.

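  • Bugs when testing with curl
    The port number comes from ./port-for-user.pl. The port it assigned was 55200, which I gave to tiny web; for convenience I added 1 by hand and gave 55201 to the proxy.
    First open three terminals: one for the tiny web server (start it from the proxylab-handout/tiny directory), one for the proxy, and one for running curl.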

    /* boot tiny */
    ./tiny 55200
      
    /* boot proxy */
    ./proxy 55201
      
    /* command: curl */
    curl -v --proxy http://localhost:55201/ http://localhost:55200/home.html
    
    • The ports for proxy and tiny are wrong, the three terminals were not started separately, or the Firefox network proxy is misconfigured

      * Trying 127.0.0.1:55201...
      * TCP_NODELAY set
      * connect to 127.0.0.1 port 55201 failed: Connection refused
      * Failed to connect to localhost port 55201: Connection refused
      * Closing connection 0
      curl: (7) Failed to connect to localhost port 55201: Connection refused
      
      host:55201/ http://localhoast:55200/home.html
      *   Trying 127.0.0.1:55201...
      * TCP_NODELAY set
      * Connected to localhost (127.0.0.1) port 55201 (#0)
      > GET http://localhoast:55200/home.html HTTP/1.1
      > Host: localhoast:55200
      > User-Agent: curl/7.68.0
      > Accept: */*
      > Proxy-Connection: Keep-Alive
      > 
      * Empty reply from server
      * Connection #0 to host localhost left intact
      curl: (52) Empty reply from server
      duile@ubuntu:~/Desktop/csapp_lab/prox
      
    • The URI (web address) for tiny web was typed incorrectly

      Accepted connection from (localhost 48746)
      getaddrinfo failed (localhoast:55200): Temporary failure in name resolution
      Open_clientfd error: Resource temporarily unavailable
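
The "getaddrinfo failed" line in the log above is the resolver rejecting the mistyped hostname. As a rough illustration, the check can be reproduced outside the proxy with a few lines of C. This is only a sketch: `can_resolve` is a hypothetical helper name, not something from the lab handout, and it only asks whether a host and port resolve, without opening a connection.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of csapp.c): returns 1 if host:port
 * resolves to at least one stream-socket address, 0 otherwise. */
int can_resolve(const char *host, const char *port)
{
    assert(host != NULL && port != NULL);

    struct addrinfo hints, *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP, as used by HTTP */

    int rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0) {
        /* gai_strerror turns the getaddrinfo error code into text */
        fprintf(stderr, "getaddrinfo failed (%s:%s): %s\n",
                host, port, gai_strerror(rc));
        return 0;
    }
    freeaddrinfo(res);
    return 1;
}
```

Calling `can_resolve("localhoast", "55200")` before starting a long test run is one way to catch a typo like the one in the log early; the exact error text depends on the resolver configuration of the machine.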
      

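Firefox network proxy configuration

  • The port number comes from ./port-for-user.pl. The assigned port was 55200, which I gave to tiny web; for convenience I added 1 and gave 55201 to the proxy
  • As shown in the figure (Firefox => Settings => Network Settings)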

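Background you need first

  • sprintf(), sscanf()

  • fprintf()

  • strstr(), strcmp()

  • The RIO package

  • telnet, curl

    • telnet 127.0.0.1 4500 can be used to check whether a connection to that port succeeds
      (4500 is a free port obtained from the script ./free-port.sh)
  • int Open_clientfd(char *hostname, char *port)

    • In the Open family of functions declared in csapp.h, port is a string (char *), not an int
  • \r\n, which must be appended to every line of new_request_hdr

    • Carriage return (CR, \r) first, then line feed (LF, \n)
  • In HTTP, the default server port is 80

  • Shutting down the server or the proxy

    • ctrl + c
  • The difference between a URI and a URL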
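
To make the sprintf()/sscanf() and \r\n items above concrete, here is a minimal sketch of parsing a request line and formatting a new one. The helper names `parse_request_line` and `format_request_line` are made up for this example, and it uses snprintf(), the bounded variant of sprintf(), so that an over-long path is reported instead of overflowing the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split "METHOD URI VERSION\r\n" into three
 * caller-supplied buffers of at least 64 bytes each.
 * Returns 1 if all three fields were found, 0 otherwise. */
int parse_request_line(const char *line, char *method, char *uri, char *version)
{
    assert(line && method && uri && version);
    /* %63s stops at whitespace and never writes more than 64 bytes */
    return sscanf(line, "%63s %63s %63s", method, uri, version) == 3;
}

/* Hypothetical helper: write "GET <path> HTTP/1.0\r\n" into out.
 * Nothing is printed; the characters go into the buffer.
 * Returns the number of characters written, or -1 if out is too small. */
int format_request_line(char *out, size_t outlen, const char *path)
{
    assert(out && path);
    int n = snprintf(out, outlen, "GET %s HTTP/1.0\r\n", path);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

Because %s stops at whitespace, the trailing \r\n of the request line never ends up in the version field, which is why the proxy has to append \r\n itself when it builds the new request.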

Reference links

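Part I

  • Task: implement a simple proxy that only needs to handle sequential requests

  • First understand what a proxy is. It has two main jobs (which is also what doit has to implement)

    • Facing a request from the client, the proxy acts as a server: it accepts the request, repackages it, and forwards it to the server
    • Facing a response from the server, the proxy acts as a client: it receives the response and forwards it unchanged to the client
  • Being confused at first is normal. The code for this part is mainly adapted from tiny.c, so you need to understand tiny.c before writing it

    • Data is passed through fds (file descriptors), using the main functions of the RIO package
    • Note that sprintf() does not print anything; it writes the formatted characters into the buffer that its first argument points to
  • To make things easier to follow, I call the actual client real_client and the actual server real_server in the code

  • Here is what the two helper functions do

    • parse_uri

      • Purpose: extract the hostname, port, and path from the uri
        normal uri => http://hostname:port/path

      • eg uri => http://www.cmu.edu:450/hub/index.html

        • hostname => www.cmu.edu

        • port => 450; if the uri has no port, the default is 80 (the default HTTP server port)

        • path => /hub/index.html

      • hostname, port, and path each have two cases to handle, six cases in total

        • hostname: whether http:// is present
        • port: whether a port is present
        • path: whether a path is present
    • build_new_request_hdr

      • Purpose: repackage the old request from real_client into a new request and forward it to real_server
      • Read the hdr of the old request first and build the hdr of the new request from it
  • curl test results
    (For the GET request, readers can test for themselves: GET http://www.cmu.edu/hub/index.html HTTP/1.1)

  • Reference code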

#include <stdio.h>
#include "csapp.h"

/* Recommended max cache and object sizes */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400

/* You won't lose style points for including this long line in your code */
static const char *user_agent_hdr = "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3\r\n";
static const char *conn_hdr = "Connection: close\r\n";
static const char *proxy_hdr = "Proxy-Connection: close\r\n";

void doit(int client_fd);
void clienterror(int fd, char *cause, char *errnum, 
		 char *shortmsg, char *longmsg);
void parse_uri(char *uri, char *hostname, char *path, int *port);
void build_new_request_hdr(rio_t *rio_packet, char *new_request, char *hostname, char *port);
void *thread(void *varge_ptr);

/* boot proxy as server get connfd from client*/
int main(int argc, char **argv) 
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;
    pthread_t tid;

    /* Check command line args */
    if (argc != 2) {
		fprintf(stderr, "usage: %s <port>\n", argv[0]);
		exit(1);
    }

    listenfd = Open_listenfd(argv[1]);
    while (1) {
		clientlen = sizeof(clientaddr);
		connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen); 
    	Getnameinfo((SA *) &clientaddr, clientlen, hostname, MAXLINE, 
                    port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);
        doit(connfd);
		Close(connfd);                                                                                     
    }
}

/*
 * doit - handle one HTTP request/response transaction
 */
void doit(int client_fd) 
{
    int real_server_fd;
    char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
    rio_t real_client, real_server;
    char hostname[MAXLINE], path[MAXLINE];
    int port;

    /* Read request line and headers */
    Rio_readinitb(&real_client, client_fd);
    if (!Rio_readlineb(&real_client, buf, MAXLINE))  	 
        return;
    sscanf(buf, "%s %s %s", method, uri, version);       
    if (strcasecmp(method, "GET")) {                     
        clienterror(client_fd, method, "501", "Not Implemented",
                    "Tiny does not implement this method");
        return;
    }                                                    
    
    /* prepare: parse uri and build new request */
    parse_uri(uri, hostname, path, &port);
    char port_str[10];
    sprintf(port_str, "%d", port); /* convert port from int to string */
    /* lowercase open_clientfd returns a negative value on failure;
       the capitalized wrapper would terminate the whole proxy instead */
    real_server_fd = open_clientfd(hostname, port_str);
    if (real_server_fd < 0) {
        printf("connection failed\n");
        return;
    }
    Rio_readinitb(&real_server, real_server_fd);
    
    char new_request[MAXLINE];
    /* an empty path means the root object "/" */
    sprintf(new_request, "GET %s HTTP/1.0\r\n", path[0] ? path : "/");
    build_new_request_hdr(&real_client, new_request, hostname, port_str);

    /* proxy, acting as client, sends the request to the web server */
    Rio_writen(real_server_fd, new_request, strlen(new_request));
    
    /* then proxy, acting as server, relays the response to the real client */
    int char_nums;
    while((char_nums = Rio_readlineb(&real_server, buf, MAXLINE)))
        Rio_writen(client_fd, buf, char_nums);
    Close(real_server_fd); /* avoid leaking one descriptor per request */
}



/*
 * clienterror - returns an error message to the client
 */
void clienterror(int fd, char *cause, char *errnum, 
		 char *shortmsg, char *longmsg) 
{
    char buf[MAXLINE], body[MAXBUF];

    /* Build the HTTP response body */
    sprintf(body, "<html><title>Tiny Error</title>");
    sprintf(body, "%s<body bgcolor=""ffffff"">\r\n", body);
    sprintf(body, "%s%s: %s\r\n", body, errnum, shortmsg);
    sprintf(body, "%s<p>%s: %s\r\n", body, longmsg, cause);
    sprintf(body, "%s<hr><em>The Tiny Web server</em>\r\n", body);

    /* Print the HTTP response */
    sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-type: text/html\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-length: %d\r\n\r\n", (int)strlen(body));
    Rio_writen(fd, buf, strlen(buf));
    Rio_writen(fd, body, strlen(body));
}

/*
 * parse_uri - parse uri to get hostname, port, path from real client
 */
void parse_uri(char *uri, char *hostname, char *path, int *port) {
    *port = 80; /* default port */
    char* ptr_hostname = strstr(uri,"//");
    /* normal uri => http://hostname:port/path */
    /* eg. uri => http://www.cmu.edu:8080/hub/index.html */
    if (ptr_hostname) 
        /* hostname_eg1. uri => http://hostname... */
        ptr_hostname += 2; 
    else
        /* hostname_eg2. uri => hostname... <= NOT "http://"*/
        ptr_hostname = uri; 
    
    char* ptr_port = strstr(ptr_hostname, ":"); 
    /* port_eg1. uri => ...hostname:port... */
    if (ptr_port) {
        *ptr_port = '\0'; /* c-style: the end of string(hostname) is '\0' */
        strncpy(hostname, ptr_hostname, MAXLINE);

        /* change default port to current port */
        /* sscanf leaves path untouched when nothing follows the port, so clear it first */
        path[0] = '\0';
        sscanf(ptr_port + 1,"%d%s", port, path); 
    } 
    /* port_eg2. uri => ...hostname... <= NOT ":port"*/
    else {
        char* ptr_path = strstr(ptr_hostname,"/");
        /* path_eg1. uri => .../path */
        if (ptr_path) {
            *ptr_path = '\0';
            strncpy(hostname, ptr_hostname, MAXLINE);
            *ptr_path = '/';
            strncpy(path, ptr_path, MAXLINE);
            return;                               
        }
        /* path_eg2. uri => ... <= NOT "/path"*/
        strncpy(hostname, ptr_hostname, MAXLINE);
        strcpy(path,"");
    }
    return;
}

/*
 * build_new_request_hdr - get old request_hdr then build new request_hdr
 */
void build_new_request_hdr(rio_t *real_client, char *new_request, char *hostname, char *port){
    char temp_buf[MAXLINE];

    /* read the old request headers, dropping the ones the proxy rewrites */
    while(Rio_readlineb(real_client, temp_buf, MAXLINE) > 0){
        if (strcmp(temp_buf, "\r\n") == 0) break; /* a blank line ends the headers */

        /* match the header name at the start of the line, ignoring case */
        if (strncasecmp(temp_buf, "Host:", 5) == 0) continue;
        if (strncasecmp(temp_buf, "User-Agent:", 11) == 0) continue;
        if (strncasecmp(temp_buf, "Connection:", 11) == 0) continue;
        if (strncasecmp(temp_buf, "Proxy-Connection:", 17) == 0) continue;

        strcat(new_request, temp_buf); /* append in place instead of sprintf-ing the buffer into itself */
    }

    /* build new request_hdr */
    sprintf(new_request + strlen(new_request), "Host: %s:%s\r\n", hostname, port);
    sprintf(new_request + strlen(new_request), "%s%s%s\r\n", user_agent_hdr, conn_hdr, proxy_hdr);
}
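
The six parse_uri cases described earlier (with or without http://, with or without a port, with or without a path) can also be handled without writing into the input string. The following is a sketch under stated assumptions, not the author's method: `split_uri` is a hypothetical name, it takes explicit buffer sizes, and it deliberately returns "/" when the path is missing, whereas the listing above stores an empty string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split [http://]host[:port][/path] into its parts.
 * Returns 0 on success, -1 if host or path does not fit its buffer. */
int split_uri(const char *uri, char *host, size_t hostlen,
              char *path, size_t pathlen, int *port)
{
    assert(uri && host && path && port);

    const char *p = strstr(uri, "//");
    p = p ? p + 2 : uri;                 /* case 1/2: skip "http://" if present */

    size_t host_end = strcspn(p, ":/");  /* host stops at ':' or '/' or the end */
    if (host_end >= hostlen)
        return -1;
    memcpy(host, p, host_end);
    host[host_end] = '\0';
    p += host_end;

    *port = 80;                          /* case 3: no port => HTTP default */
    if (*p == ':') {                     /* case 4: explicit port */
        *port = atoi(p + 1);             /* atoi stops at the first non-digit */
        p += 1 + strspn(p + 1, "0123456789");
    }

    if (*p == '\0')                      /* case 5: no path => root object */
        p = "/";
    if (strlen(p) >= pathlen)            /* case 6: copy the path as given */
        return -1;
    strcpy(path, p);
    return 0;
}
```

Keeping the input const means the caller can still use the original URL afterwards, for example as a cache key in Part III.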

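Part II

  • Implement concurrency, so that two or more clients can send requests at the same time

  • Follow the textbook's material on thread-based concurrency; the code is almost a direct copy, but note two points (both mentioned in the book)

    • Allocate a separate block of memory for each peer thread's connected descriptor, which prevents a race between the main thread and the peer threads
    • Each peer thread should detach itself first and then free the block it was handed, so that its resources are reclaimed when it terminates
  • Textbook material (English edition): implementing concurrency

  • Reference code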

#include <stdio.h>
#include "csapp.h"

/* Recommended max cache and object sizes */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400

/* You won't lose style points for including this long line in your code */
static const char *user_agent_hdr = "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3\r\n";
static const char *conn_hdr = "Connection: close\r\n";
static const char *proxy_hdr = "Proxy-Connection: close\r\n";

void doit(int client_fd);
void clienterror(int fd, char *cause, char *errnum, 
		 char *shortmsg, char *longmsg);
void parse_uri(char *uri, char *hostname, char *path, int *port);
void build_new_request_hdr(rio_t *rio_packet, char *new_request, char *hostname, char *port);
void *thread(void *varge_ptr);

/* boot proxy as server get connfd from client*/
int main(int argc, char **argv) 
{
    int listenfd, *connfd_ptr;
    char hostname[MAXLINE], port[MAXLINE];
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;
    pthread_t tid;

    /* Check command line args */
    if (argc != 2) {
		fprintf(stderr, "usage: %s <port>\n", argv[0]);
		exit(1);
    }

    listenfd = Open_listenfd(argv[1]);
    while (1) {
		clientlen = sizeof(clientaddr);
        connfd_ptr = Malloc(sizeof(int)); /* alloc memory of each thread to avoid race */
		*connfd_ptr = Accept(listenfd, (SA *)&clientaddr, &clientlen); 
    	Getnameinfo((SA *) &clientaddr, clientlen, hostname, MAXLINE, 
                    port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);
		Pthread_create(&tid, NULL, thread, connfd_ptr);                                                                                      
    }
}

/*
 * Thread routine
 */
void *thread(void *varge_ptr){
    int connfd = *((int *)varge_ptr);
    Pthread_detach(pthread_self());
    doit(connfd);
    Free(varge_ptr);
    Close(connfd);
    return NULL;
}

/*
 * doit - handle one HTTP request/response transaction
 */
void doit(int client_fd) 
{
    int real_server_fd;
    char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
    rio_t real_client, real_server;
    char hostname[MAXLINE], path[MAXLINE];
    int port;

    /* Read request line and headers */
    Rio_readinitb(&real_client, client_fd);
    if (!Rio_readlineb(&real_client, buf, MAXLINE))  	 
        return;
    sscanf(buf, "%s %s %s", method, uri, version);       
    if (strcasecmp(method, "GET")) {                     
        clienterror(client_fd, method, "501", "Not Implemented",
                    "Tiny does not implement this method");
        return;
    }                                                    
    
    /* prepare: parse uri and build new request */
    parse_uri(uri, hostname, path, &port);
    char port_str[10];
    sprintf(port_str, "%d", port); /* convert port from int to string */
    /* lowercase open_clientfd returns a negative value on failure;
       the capitalized wrapper would terminate the whole proxy instead */
    real_server_fd = open_clientfd(hostname, port_str);
    if (real_server_fd < 0) {
        printf("connection failed\n");
        return;
    }
    Rio_readinitb(&real_server, real_server_fd);
    
    char new_request[MAXLINE];
    /* an empty path means the root object "/" */
    sprintf(new_request, "GET %s HTTP/1.0\r\n", path[0] ? path : "/");
    build_new_request_hdr(&real_client, new_request, hostname, port_str);

    /* proxy, acting as client, sends the request to the web server */
    Rio_writen(real_server_fd, new_request, strlen(new_request));
    
    /* then proxy, acting as server, relays the response to the real client */
    int char_nums;
    while((char_nums = Rio_readlineb(&real_server, buf, MAXLINE)))
        Rio_writen(client_fd, buf, char_nums);
    Close(real_server_fd); /* avoid leaking one descriptor per request */
}



/*
 * clienterror - returns an error message to the client
 */
void clienterror(int fd, char *cause, char *errnum, 
		 char *shortmsg, char *longmsg) 
{
    char buf[MAXLINE], body[MAXBUF];

    /* Build the HTTP response body */
    sprintf(body, "<html><title>Tiny Error</title>");
    sprintf(body, "%s<body bgcolor=""ffffff"">\r\n", body);
    sprintf(body, "%s%s: %s\r\n", body, errnum, shortmsg);
    sprintf(body, "%s<p>%s: %s\r\n", body, longmsg, cause);
    sprintf(body, "%s<hr><em>The Tiny Web server</em>\r\n", body);

    /* Print the HTTP response */
    sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-type: text/html\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-length: %d\r\n\r\n", (int)strlen(body));
    Rio_writen(fd, buf, strlen(buf));
    Rio_writen(fd, body, strlen(body));
}

/*
 * parse_uri - parse uri to get hostname, port, path from real client
 */
void parse_uri(char *uri, char *hostname, char *path, int *port) {
    *port = 80; /* default port */
    char* ptr_hostname = strstr(uri,"//");
    /* normal uri => http://hostname:port/path */
    /* eg. uri => http://www.cmu.edu:8080/hub/index.html */
    if (ptr_hostname) 
        /* hostname_eg1. uri => http://hostname... */
        ptr_hostname += 2; 
    else
        /* hostname_eg2. uri => hostname... <= NOT "http://"*/
        ptr_hostname = uri; 
    
    char* ptr_port = strstr(ptr_hostname, ":"); 
    /* port_eg1. uri => ...hostname:port... */
    if (ptr_port) {
        *ptr_port = '\0'; /* c-style: the end of string(hostname) is '\0' */
        strncpy(hostname, ptr_hostname, MAXLINE);

        /* change default port to current port */
        /* sscanf leaves path untouched when nothing follows the port, so clear it first */
        path[0] = '\0';
        sscanf(ptr_port + 1,"%d%s", port, path); 
    } 
    /* port_eg2. uri => ...hostname... <= NOT ":port"*/
    else {
        char* ptr_path = strstr(ptr_hostname,"/");
        /* path_eg1. uri => .../path */
        if (ptr_path) {
            *ptr_path = '\0';
            strncpy(hostname, ptr_hostname, MAXLINE);
            *ptr_path = '/';
            strncpy(path, ptr_path, MAXLINE);
            return;                               
        }
        /* path_eg2. uri => ... <= NOT "/path"*/
        strncpy(hostname, ptr_hostname, MAXLINE);
        strcpy(path,"");
    }
    return;
}

/*
 * build_new_request_hdr - get old request_hdr then build new request_hdr
 */
void build_new_request_hdr(rio_t *real_client, char *new_request, char *hostname, char *port){
    char temp_buf[MAXLINE];

    /* read the old request headers, dropping the ones the proxy rewrites */
    while(Rio_readlineb(real_client, temp_buf, MAXLINE) > 0){
        if (strcmp(temp_buf, "\r\n") == 0) break; /* a blank line ends the headers */

        /* match the header name at the start of the line, ignoring case */
        if (strncasecmp(temp_buf, "Host:", 5) == 0) continue;
        if (strncasecmp(temp_buf, "User-Agent:", 11) == 0) continue;
        if (strncasecmp(temp_buf, "Connection:", 11) == 0) continue;
        if (strncasecmp(temp_buf, "Proxy-Connection:", 17) == 0) continue;

        strcat(new_request, temp_buf); /* append in place instead of sprintf-ing the buffer into itself */
    }

    /* build new request_hdr */
    sprintf(new_request + strlen(new_request), "Host: %s:%s\r\n", hostname, port);
    sprintf(new_request + strlen(new_request), "%s%s%s\r\n", user_agent_hdr, conn_hdr, proxy_hdr);
}
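
The reason for giving every peer thread its own heap block can be shown without any networking. The sketch below is an illustration only: `worker`, `run_workers`, `results`, and `NWORKERS` are names invented for this example, and the threads are joined rather than detached so that the results can be checked afterwards. The proxy itself detaches its threads because nobody waits for them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NWORKERS 8

static int results[NWORKERS]; /* slot i is written only by the thread given id i */

static void *worker(void *arg)
{
    int id = *(int *)arg; /* copy the value out of the private block */
    free(arg);            /* the block belongs to this thread, so it frees it */
    assert(id >= 0 && id < NWORKERS);
    results[id] = id * id;
    return NULL;          /* a void * thread routine must return a pointer */
}

/* Start NWORKERS threads, each with its own malloc'ed argument.
 * Returns 0 on success, -1 if allocation or thread creation failed. */
int run_workers(void)
{
    pthread_t tids[NWORKERS];
    int started = 0, status = 0;

    for (int i = 0; i < NWORKERS; i++) {
        int *arg = malloc(sizeof *arg); /* one block per thread, never reused */
        if (arg == NULL) { status = -1; break; }
        *arg = i;
        if (pthread_create(&tids[i], NULL, worker, arg) != 0) {
            free(arg);
            status = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return status;
}
```

If the loop passed `&i` instead of a fresh block, a thread could read the variable after the main thread had already advanced it, which is the same race the proxy would have if it passed the address of a single connfd variable.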

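Part III

  • Purpose: add a cache to the proxy that stores recently used objects

    • If the requested object is already in the cache, forward it to the client straight from the cache
    • If the requested object is not in the cache, forward the response to the client and, at the same time, store the response in the cache (it may or may not need to be split into individual objects)
  • Note that an object is one part of a page, not the whole web page

    • Structure of an object

      /* structure of one object (also one cache block) */
      typedef struct {
          char *url;
          char *content;
          int *cnt; /* LRU: the count of use */
          int *is_used; /* equals 0 => obj can't be used; equals 1 => obj can be used */
      }object;
      
    • The value of is_used (0 or 1) means different things to the reader and the writer

      • *(cache[i].is_used) == 1
        • For reader(): the obj exists, can be used, and can be sent to the real client
        • For writer(): the obj is already in the cache; if its cnt is the smallest, it can be replaced by a new obj
      • *(cache[i].is_used) == 0
        • For reader(): the obj does not exist, cannot be used, and cannot be sent to the real client
        • For writer(): the slot has no content yet, it is an empty shell in the cache, and the new obj can be inserted directly
  • The LRU eviction policy used here follows the usual idea

    • That is, evict the least recently used object
    • The conventional way of counting:
      cnt = 0 (or the smallest cnt) marks an object that is rarely used; the largest cnt marks one that is used often
    • The unconventional way of counting:
      cnt = 0 (or the smallest cnt) marks an object that is used often; the largest cnt marks one that is rarely used
    • My code uses the conventional counting (strictly speaking, a use count only approximates LRU, since it records how often an object was used rather than how recently)
  • The cache uses the readers-writers model with reader priority

    • There can be many readers at once, which allows concurrency, but only one writer
    • Starvation can occur:
      • Definition of starvation: a thread is blocked indefinitely and cannot make progress
      • Reader priority can cause it: readers keep arriving, so a writer waits indefinitely
        For Part III this means the cache keeps being read but can never be written (updated)
    • I followed the textbook code (a lock for the reader count and a lock for the writer)
      • When the first reader arrives, it locks the writer mutex so that no writer can enter; when the last reader leaves, it unlocks the writer mutex so that a waiting writer can proceed. This is what gives readers priority in the readers-writers model
      • Note that the proxy is multithreaded, so there can be many readers, while only one writer is allowed at a time
      • Textbook material (English edition): reader priority in the readers-writers model
  • I set the number of objects in the cache to 10

    • #define NUMBERS_OBJECT 10
    • Because MAX_OBJECT_SIZE * 10 = 1024000, which is roughly MAX_CACHE_SIZE = 1049000
  • Reference code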

    #include <stdio.h>
    #include "csapp.h"
    
    /* Recommended max cache and object sizes */
    #define MAX_CACHE_SIZE 1049000
    #define MAX_OBJECT_SIZE 102400
    /* numbers of object from a cache */
    #define NUMBERS_OBJECT 10
    
    /* You won't lose style points for including this long line in your code */
    static const char *user_agent_hdr = "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3\r\n";
    static const char *conn_hdr = "Connection: close\r\n";
    static const char *proxy_hdr = "Proxy-Connection: close\r\n";
    
    /* structure of one object (also one cache block) */
    typedef struct {
        char *url;
        char *content;
        int *cnt; /* LRU: the count of use */
        int *is_used; /* equals 0 => obj can't be used; equals 1 => obj can be used */
    }object;
    
    /* Global variables */
    static object *cache;
    static int readcnt; /* count of readers currently in the cache */
    static sem_t readcnt_mutex, writer_mutex; /* readcnt_mutex protects readcnt; writer_mutex protects the cache */
    
    /* helper functions */
    void doit(int client_fd);
    void clienterror(int fd, char *cause, char *errnum, 
                     char *shortmsg, char *longmsg);
    void parse_uri(char *uri, char *hostname, char *path, int *port);
    void print_and_build_hdr(rio_t *rio_packet, char *new_request, char *hostname, char *port);
    void *thread(void *varge_ptr);
    void init_cache(void);
    static void init_mutex(void);
    int reader(int fd, char* url);
    void writer(char* buf, char* url);
    
    /* boot proxy as server get connfd from client*/
    int main(int argc, char **argv) 
    {   
        init_cache();
        int listenfd, *connfd_ptr;
        char hostname[MAXLINE], port[MAXLINE];
        socklen_t clientlen;
        struct sockaddr_storage clientaddr;
        pthread_t tid;
    
        /* Check command line args */
        if (argc != 2) {
            fprintf(stderr, "usage: %s <port>\n", argv[0]);
            exit(1);
        }
    
        listenfd = Open_listenfd(argv[1]);
        while (1) {
            clientlen = sizeof(clientaddr);
            connfd_ptr = Malloc(sizeof(int)); /* alloc memory of each thread to avoid race */
            *connfd_ptr = Accept(listenfd, (SA *)&clientaddr, &clientlen); 
            Getnameinfo((SA *) &clientaddr, clientlen, hostname, MAXLINE, 
                        port, MAXLINE, 0);
            printf("Accepted connection from (%s, %s)\n", hostname, port);
            Pthread_create(&tid, NULL, thread, connfd_ptr);
        }
        return 0;
    }
    
    /*
     * Thread routine
     */
    void *thread(void *varge_ptr){
        int connfd = *((int *)varge_ptr);
        Pthread_detach(pthread_self());
        doit(connfd);
        Free(varge_ptr);
        Close(connfd);
        return NULL;
    }
    
    /*
     * doit - handle one HTTP request/response transaction
     */
    void doit(int client_fd) 
    {
        int real_server_fd;
        char buf[MAXLINE], method[MAXLINE], url[MAXLINE], version[MAXLINE];
        char uri[MAXLINE], obj_buf[MAX_OBJECT_SIZE]; /* obj_buf must hold a whole object */
        rio_t real_client, real_server;
        char hostname[MAXLINE], path[MAXLINE];
        int port;
    
        /* Read request line and headers from real client */
        Rio_readinitb(&real_client, client_fd);
        if (!Rio_readlineb(&real_client, buf, MAXLINE))
            return;
        sscanf(buf, "%s %s %s", method, uri, version);
        strcpy(url, uri); /* keep an unmodified copy as the cache key; parse_uri edits uri */
        if (strcasecmp(method, "GET")) {
            clienterror(client_fd, method, "501", "Not Implemented",
                        "Proxy does not implement this method");
            return;
        }
    
        /* if the requested object is in the cache, reader() has already sent it */
        if(reader(client_fd, url)){
            fprintf(stdout, "%s from cache\n", url);
            return;
        }
    
        /* prepare: parse uri and build new request */
        parse_uri(uri, hostname, path, &port);
        char port_str[10];
        sprintf(port_str, "%d", port); /* convert port from int to string */
        /* lowercase open_clientfd returns a negative value on failure;
           the capitalized wrapper would terminate the whole proxy instead */
        real_server_fd = open_clientfd(hostname, port_str);
        if (real_server_fd < 0) {
            printf("connection failed\n");
            return;
        }
        Rio_readinitb(&real_server, real_server_fd);
        
        char new_request[MAXLINE];
        /* an empty path means the root object "/" */
        sprintf(new_request, "GET %s HTTP/1.0\r\n", path[0] ? path : "/");
        print_and_build_hdr(&real_client, new_request, hostname, port_str);
    
        /* proxy, acting as client, sends the request to the web server */
        Rio_writen(real_server_fd, new_request, strlen(new_request));
        
        /* then proxy, acting as server, relays the response to the real client */
        int char_nums;
        int obj_size = 0;   /* bytes copied into obj_buf */
        int total_size = 0; /* bytes in the whole response */
        while((char_nums = Rio_readlineb(&real_server, buf, MAXLINE))){
            Rio_writen(client_fd, buf, char_nums);
            total_size += char_nums;
    
            /* prepare to write the object to the cache; keep one byte for the final '\0' */
            if(total_size < MAX_OBJECT_SIZE){
                memcpy(obj_buf + obj_size, buf, char_nums);
                obj_size += char_nums;
            }
        }
    
        /* cache only complete responses; writer() copies with strcpy, so terminate the buffer */
        if(total_size < MAX_OBJECT_SIZE){
            obj_buf[obj_size] = '\0';
            writer(obj_buf, url);
        }
    
        Close(real_server_fd);
    }
    
    
    
    /*
     * clienterror - returns an error message to the client
     */
    void clienterror(int fd, char *cause, char *errnum, 
                     char *shortmsg, char *longmsg) 
    {
        char buf[MAXLINE], body[MAXBUF];
    
        /* Build the HTTP response body */
        sprintf(body, "<html><title>Tiny Error</title>");
        sprintf(body, "%s<body bgcolor=""ffffff"">\r\n", body);
        sprintf(body, "%s%s: %s\r\n", body, errnum, shortmsg);
        sprintf(body, "%s<p>%s: %s\r\n", body, longmsg, cause);
        sprintf(body, "%s<hr><em>The Tiny Web server</em>\r\n", body);
    
        /* Print the HTTP response */
        sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
        Rio_writen(fd, buf, strlen(buf));
        sprintf(buf, "Content-type: text/html\r\n");
        Rio_writen(fd, buf, strlen(buf));
        sprintf(buf, "Content-length: %d\r\n\r\n", (int)strlen(body));
        Rio_writen(fd, buf, strlen(buf));
        Rio_writen(fd, body, strlen(body));
    }
    
    /*
     * parse_uri - parse uri to get hostname, port, path from real client
     */
    void parse_uri(char *uri, char *hostname, char *path, int *port) {
        *port = 80; /* default port */
        char* ptr_hostname = strstr(uri,"//");
        /* normal uri => http://hostname:port/path */
        /* eg. uri => http://www.cmu.edu:8080/hub/index.html */
        if (ptr_hostname) 
            /* hostname_eg1. uri => http://hostname... */
            ptr_hostname += 2; 
        else
            /* hostname_eg2. uri => hostname... <= NOT "http://"*/
            ptr_hostname = uri; 
        
        char* ptr_port = strstr(ptr_hostname, ":"); 
        /* port_eg1. uri => ...hostname:port... */
        if (ptr_port) {
            *ptr_port = '\0'; /* c-style: the end of string(hostname) is '\0' */
            strncpy(hostname, ptr_hostname, MAXLINE);
    
            /* change default port to current port */
            /* sscanf leaves path untouched when nothing follows the port, so clear it first */
            path[0] = '\0';
            sscanf(ptr_port + 1,"%d%s", port, path); 
        } 
        /* port_eg2. uri => ...hostname... <= NOT ":port"*/
        else {
            char* ptr_path = strstr(ptr_hostname,"/");
            /* path_eg1. uri => .../path */
            if (ptr_path) {
                *ptr_path = '\0';
                strncpy(hostname, ptr_hostname, MAXLINE);
                *ptr_path = '/';
                strncpy(path, ptr_path, MAXLINE);
                return;
            }
            /* path_eg2. uri => ... <= NOT "/path"*/
            strncpy(hostname, ptr_hostname, MAXLINE);
            strcpy(path,"");
        }
        return;
    }
    
    /*
     * print_and_build_hdr - read old request_hdr then build new request_hdr
     */
    void print_and_build_hdr(rio_t *real_client, char *new_request, char *hostname, char *port){
        char temp_buf[MAXLINE];
    
        /* read the old request headers, dropping the ones the proxy rewrites */
        while(Rio_readlineb(real_client, temp_buf, MAXLINE) > 0){
            if (strcmp(temp_buf, "\r\n") == 0) break; /* a blank line ends the headers */
    
            /* match the header name at the start of the line, ignoring case */
            if (strncasecmp(temp_buf, "Host:", 5) == 0) continue;
            if (strncasecmp(temp_buf, "User-Agent:", 11) == 0) continue;
            if (strncasecmp(temp_buf, "Connection:", 11) == 0) continue;
            if (strncasecmp(temp_buf, "Proxy-Connection:", 17) == 0) continue;
    
            strcat(new_request, temp_buf); /* append in place instead of sprintf-ing the buffer into itself */
        }
    
        /* build new request_hdr */
        sprintf(new_request + strlen(new_request), "Host: %s:%s\r\n", hostname, port);
        sprintf(new_request + strlen(new_request), "%s%s%s\r\n", user_agent_hdr, conn_hdr, proxy_hdr);
    }
    
    /*
     * initialize the cache
     */
    void init_cache(void){
        init_mutex();
        readcnt = 0; /* assign the global; declaring a new int here would only shadow it */
        
        /* cache is an array of NUMBERS_OBJECT objects */
        cache = (object*)Malloc(sizeof(object) * NUMBERS_OBJECT);
        for(int i = 0; i < NUMBERS_OBJECT; i++){
            cache[i].url = (char*)Malloc(sizeof(char) * MAXLINE);
            cache[i].content = (char*)Malloc(sizeof(char) * MAX_OBJECT_SIZE);
            cache[i].cnt = (int*)Malloc(sizeof(int));
            cache[i].is_used = (int*)Malloc(sizeof(int));
            *(cache[i].cnt) = 0;
            *(cache[i].is_used) = 0;
        }
    }
    
    /*
     * initialize the mutex
     */
    static void init_mutex(void){
        Sem_init(&readcnt_mutex, 0, 1);
        Sem_init(&writer_mutex, 0, 1);
    }
    
    /*
     * reader - read from cache to real client
     */
    int reader(int fd, char* url){
        int from_cache = 0; /* equals 0 => obj not from cache; equals 1 => obj from cache */
    
        P(&readcnt_mutex);
        readcnt++;
        if(readcnt == 1) /* first in: keep writers out */
            P(&writer_mutex);
        V(&readcnt_mutex);
    
        /* obj is in the cache: write its content to the fd of the real client */
        for(int i = 0; i < NUMBERS_OBJECT; i++){
            /* dereference is_used; the pointer itself is never NULL */
            if(*(cache[i].is_used) && (strcmp(url, cache[i].url) == 0)){
                from_cache = 1;
                /* content was stored with strcpy, so send only its real length */
                Rio_writen(fd, cache[i].content, strlen(cache[i].content));
                /* cnt is shared by concurrent readers, so update it under readcnt_mutex */
                P(&readcnt_mutex);
                (*(cache[i].cnt))++; /* parentheses: increment the count, not the pointer */
                V(&readcnt_mutex);
                break;
            }
        }
    
        P(&readcnt_mutex);
        readcnt--;
        if(readcnt == 0) /* last out: let writers in */
            V(&writer_mutex);
        V(&readcnt_mutex);
    
        return from_cache;
    }
    
    /*
     * writer - write from real server to cache
     */
    void writer(char* buf, char* url){
        P(&writer_mutex);
    
        /* find an empty slot to fill, or else the slot with the smallest cnt to evict */
        int insert_or_evict_i = 0;
        int min_cnt = *(cache[0].cnt);
        for(int i = 0; i < NUMBERS_OBJECT; i++){
            if(*(cache[i].is_used) == 0){ /* insert */
                insert_or_evict_i = i;
                break;
            }
            if(*(cache[i].cnt) < min_cnt){ /* evict */
                insert_or_evict_i = i;
                min_cnt = *(cache[i].cnt);
            }
        }
        strcpy(cache[insert_or_evict_i].url, url);
        strcpy(cache[insert_or_evict_i].content, buf);
        *(cache[insert_or_evict_i].cnt) = 0;
        *(cache[insert_or_evict_i].is_used) = 1;
    
        V(&writer_mutex); /* written once, so return instead of looping */
    }
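
The listing above picks a victim by use count. For comparison, here is a minimal sketch of eviction by recency, which is what "least recently used" means literally: every access stamps the slot with a logical clock, and the slot with the oldest stamp is evicted. It is not the author's implementation. `slot_t`, `lru_lookup`, `lru_insert`, `NSLOTS`, and `URL_MAX` are names invented for this example, only URLs are stored, and locking is left out on purpose.

```c
#include <assert.h>
#include <string.h>

#define NSLOTS 3
#define URL_MAX 128

typedef struct {
    char url[URL_MAX];
    unsigned long last_used; /* value of the logical clock at the last access */
    int valid;               /* 0 => empty slot, 1 => holds an entry */
} slot_t;

static slot_t slots[NSLOTS];
static unsigned long lru_clock; /* incremented on every access */

/* Returns the slot index holding url, or -1 if it is not cached. */
int lru_lookup(const char *url)
{
    assert(url != NULL);
    for (int i = 0; i < NSLOTS; i++) {
        if (slots[i].valid && strcmp(slots[i].url, url) == 0) {
            slots[i].last_used = ++lru_clock; /* a hit makes it the newest */
            return i;
        }
    }
    return -1;
}

/* Stores url in an empty slot, or evicts the least recently used one.
 * Returns the index of the slot that was written. */
int lru_insert(const char *url)
{
    assert(url != NULL);
    int victim = 0;
    for (int i = 0; i < NSLOTS; i++) {
        if (!slots[i].valid) { victim = i; break; } /* prefer an empty slot */
        if (slots[i].last_used < slots[victim].last_used)
            victim = i;                             /* oldest stamp so far */
    }
    strncpy(slots[victim].url, url, URL_MAX - 1);
    slots[victim].url[URL_MAX - 1] = '\0';
    slots[victim].last_used = ++lru_clock;
    slots[victim].valid = 1;
    return victim;
}
```

In a multithreaded proxy a lookup also writes (it updates the stamp), so it would need its own lock or the writer lock; that trade-off is one reason the lab handout only asks for an approximation of LRU.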
    

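Summary

  • Completion date: 22.3.13
  • File descriptors are really handy for passing data around
  • I often assumed I knew what a function did. If you have not seen a function before, look it up and make sure you understand it, for example sprintf()
  • I stopped for quite a while in the middle to work on the xv6 labs, and also rested for two or three days (playing galgames), but I finished in the end!
  • This wraps up my journey through the CSAPP labs. They were often painful and also very rewarding, and I now have a rough understanding of how computers work. For the xv6 labs I still need to improve my debugging skills a lot
  • Lately I have been listening to Letter by SHE'S, 《花鳥風月》 by Sekai no Owari, and 《人生马拉松》 by Eason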
posted @ 2022-03-13 20:47  duile