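A multithreaded web server example (C, Linux, with a detailed explanation of how a web server works)
Reposted from: http://www.cppblog.com/cuijixin/archive/2008/07/02/55112.html
System: Fedora Core 5
Compiler: g++
Features: lets a browser fetch html, htm, jpg, jpeg, gif, png and css files from the server over HTTP, in other words view web pages that contain jpg, jpeg, gif and similar files. That is the web~
Copy the code onto a Linux machine, compile and run it as described at the end, and you will see a simple multithreaded server at work.
How it works:
When you type a URL into the browser and press Enter, the browser sends a message to the corresponding port on the corresponding host. If the protocol is HTTP (the protocol that ordinary web pages are transferred with), the message is an HTTP request message. Here is an example: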
GET /index.html HTTP/1.1
Host: 127.0.0.1:8848
User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.8.0.1) Gecko/20060313 Fedora/1.5.0.1-9 Firefox/1.5.0.1 pango-text
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: gb2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
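If we print the received data on the server side, we see exactly what the browser sent. You can of course also capture these messages with a packet sniffer such as Ethereal. Plenty of material online explains what each header means, and a quick Google search will turn it up. Here we only look at the first line, the request line: GET /index.html HTTP/1.1.
GET means the client wants to fetch a file from the server, and /index.html is the path of that file, relative to the directory the server program runs in. My server program is in /home/mio/program/webserver1707/, so the absolute path of index.html on the server is /home/mio/program/webserver1707/index.html. If the request were GET /admin/login.html HTTP/1.1, the file on the server would be /home/mio/program/webserver1707/admin/login.html. HTTP/1.1 means that the HTTP protocol version is 1.1.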
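The way the request line maps to a file on disk can be sketched in a few lines of C. This is only an illustration under stated assumptions: parse_request_line, resolve_path, struct request_line and the buffer sizes are hypothetical names and values chosen for this example, not part of the server listed later. The sketch splits the line into its three fields and then joins the path onto a document root, refusing any path that contains "..".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three fields of a request line.
 * The sizes are illustrative; the sscanf widths below match them. */
struct request_line {
    char method[16];
    char path[256];
    char version[16];
};

/* Splits e.g. "GET /index.html HTTP/1.1" into method, path and version.
 * Returns 0 on success, -1 if the line does not contain three fields. */
int parse_request_line(const char *line, struct request_line *out)
{
    if (sscanf(line, "%15s %255s %15s",
               out->method, out->path, out->version) != 3)
        return -1;
    return 0;
}

/* Joins docroot and the request path into out.
 * Returns 0 on success, -1 if the path is unsafe or does not fit. */
int resolve_path(const char *docroot, const char *path,
                 char *out, size_t outsize)
{
    int n;

    /* refuse anything that could climb out of docroot */
    if (path[0] != '/' || strstr(path, "..") != NULL)
        return -1;
    n = snprintf(out, outsize, "%s%s", docroot, path);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    return 0;
}
```

Given the request line GET /admin/login.html HTTP/1.1 and the document root /home/mio/program/webserver1707, the two helpers yield the method GET and the file path /home/mio/program/webserver1707/admin/login.html, matching the description above.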
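Once started, the server listens on port 8848. (Ports 0-1023 are the well-known ports assigned and controlled by IANA, so avoid them and pick a larger number. I first tried 1234 and it did not work for me, so a bigger number is the safer choice, something like 5460~ :) ) When a client request arrives, the server establishes a connection with the client and receives the request message it sends. If we print these messages, they look much like the request message above.
Next, the server decides what to send back to the client (the browser) based on the request line it received (GET /index.html HTTP/1.1). Here the browser wants index.html, an HTML text file, so we look it up at the corresponding path (/home/mio/program/webserver1707/index.html). We should not send it straight away, though: first we have to tell the client that what follows is an HTML file, so that the browser can prepare for it. How do we tell the browser? With another message, called the response message. It consists of three parts: the status line, the header lines and the entity body. The status line is a single line. There are no blank lines between it and the header lines, or between one header line and the next, but one blank line separates the header lines from the entity body; it marks the point where the data the browser asked for begins. Here is a response message captured with Ethereal: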
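To make the layout of a response message concrete, here is a minimal sketch in C that formats the part preceding the entity body. The function name build_response_header and its parameters are assumptions made for this example; the server listed later writes its header differently. Each line ends with CRLF, and the final empty line is the blank line that separates the header lines from the body.

```c
#include <stdio.h>

/* Hypothetical helper: writes the status line, two header lines and the
 * blank separator line into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
int build_response_header(char *buf, size_t bufsize,
                          int status, const char *reason,
                          const char *content_type, long content_length)
{
    int n = snprintf(buf, bufsize,
                     "HTTP/1.1 %d %s\r\n"       /* status line            */
                     "Content-Type: %s\r\n"     /* header line            */
                     "Content-Length: %ld\r\n"  /* header line            */
                     "\r\n",                    /* blank line before body */
                     status, reason, content_type, content_length);

    if (n < 0 || (size_t)n >= bufsize)
        return -1;
    return n;
}
```

A caller would send the returned number of bytes from buf first and the file contents immediately afterwards.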
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Server: GWS/2.1
Content-Length: 1851
Date: Sat, 14 Oct 2006 11:33:39 GMT

<html><head><meta http-equiv="content-type" content="text/html; charset=UTF-8"><title>Google</title><style><!--
body,td,a,p,.h{font-family:arial,sans-serif}
.h{font-size:20px}
.q{color:#00c}
--></style>
<script>
<!--
function sf(){document.f.q.focus();}
function clk(url,oi,cad,ct,cd,sg){if(document.images){var e = window.encodeURIComponent ? encodeURIComponent : escape;var u="";var oi_param="";var cad_param="";if (url) u="&url="+e(url.replace(/#.*/,"")).replace(/\+/g,"%2B");if (oi) oi_param="&oi="+e(oi);if (cad) cad_param="&cad="+e(cad);new Image().src="/url?sa=T"+oi_param+cad_param+"&ct="+e(ct)+"&cd="+e(cd)+u+"&ei=E8swRYIOkpKwAvzZ8JkB"+sg;}return true;}
// -->
</script>
</head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onLoad=sf() topmargin=3 marginheight=3><center><div align=right nowrap style="padding-bottom:4px" width=100%><font size=-1><b>manioster@gmail.com</b> | <a href="/url?sa=p&pref=ig&pval=3&q=http://www.google.com/ig%3Fhl%3Dzh-CN&sig=__1eXNMn0jGllmJ57x74DzjVvy6Vk=" onmousedown="return clk('/url?sa=p&pref=ig&pval=3&q=http://www.google.com/ig%3Fhl%3Dzh-CN&sig=__1eXNMn0jGllmJ57x74DzjVvy6Vk=','promos','hppphou:zh-cn_all','pro','1','&sig2=zclmOmtQiZPPuTCMWUJMZA')">个性化主页</a> | <a href="https://www.google.com/accounts/ManageAccount">我的帐户</a> | <a href="http://www.google.com/accounts/Logout?continue=http://www.google.com/intl/zh-CN/">退出</a></font></div><img src="/intl/zh-CN_ALL/images/logo.gif" width=286 height=110 alt="Google"><br><br>
<form action=/search name=f><script><!--
function qs(el) {if (window.RegExp && window.encodeURIComponent) {var ue=el.href;var qe=encodeURIComponent(document.f.q.value);if(ue.indexOf("q=")!=-1){el.href=ue.replace(new RegExp("q=[^&$]*"),"q="+qe);}else{el.href=ue+"&q="+qe;}}return 1;}
// -->
..........
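Everything above the first blank line is the "description"; below it is the HTML code. With the description the browser knows what it is receiving, and once it has the data it interprets the HTML tags as elements and lays them out on the page. The browser is quite clever: when it sees an <img src=..> tag, it sends another request message to the server on its own, asking for the image file. That request message looks like: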
....
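The server then finds the .jpg image, prepends the "description" and sends it to the browser, which displays it at the corresponding position. Tags that pull in css, js and so on are handled the same way.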
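Choosing the right "description" for each file comes down to mapping the file extension to a Content-Type value. The sketch below shows one table-driven way to do that in C. The names mime_table and content_type_for are hypothetical, the table only covers the file types this article mentions, and falling back to text/plain is an assumption of the example.

```c
#include <string.h>

/* Hypothetical lookup table: file extension -> Content-Type value. */
struct mime_entry {
    const char *ext;
    const char *type;
};

static const struct mime_entry mime_table[] = {
    { "html", "text/html"  },
    { "htm",  "text/html"  },
    { "css",  "text/css"   },
    { "gif",  "image/gif"  },
    { "jpg",  "image/jpeg" },
    { "jpeg", "image/jpeg" },
    { "png",  "image/png"  },
};

/* Returns the Content-Type for path, or "text/plain" when the last path
 * component has no extension or the extension is not in the table. */
const char *content_type_for(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *dot = strrchr(path, '.');
    size_t i;

    /* a dot that appears before the last '/' belongs to a directory name */
    if (dot == NULL || (slash != NULL && dot < slash))
        return "text/plain";
    for (i = 0; i < sizeof mime_table / sizeof mime_table[0]; i++) {
        if (strcmp(dot + 1, mime_table[i].ext) == 0)
            return mime_table[i].type;
    }
    return "text/plain";
}
```

Supporting another file type then only requires one more row in the table.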
Repeat this and a complete web page appears before our eyes.
Server-side code:
/*****************************************************************
mymultiwebserver.c
system: Red Hat Linux Fedora Core 5
environment: g++
compile command: g++ -g -o mymultiwebserver mymultiwebserver.c -lpthread
date: 10/15/2006
By Manio
*****************************************************************/
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <netinet/in.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#define PORT 8848
#define BACKLOG 5
#define MAXDATASIZE 1000
#define DEBUG 1
void process_cli(int connectfd, sockaddr_in client);
int sendobj(int connectfd,char* serverfilepath);
int IsDIR(char* fpath);
int fileordirExist(char* fpath);
char* getextname(char*);
int writehead(FILE* cfp, char* extname);
void* start_routine(void* arg);
void msg404(int connectfd);
struct ARG {
int connfd;
sockaddr_in client;
};
int main()
{
    int listenfd, connectfd;
    pthread_t thread;           //id of thread
    ARG *arg;                   //pass this var to the thread
    struct sockaddr_in server;  //server's address info
    struct sockaddr_in client;  //client's
    socklen_t sin_size;

    //create tcp socket
#ifdef DEBUG
    printf("socket....\n");
#endif
    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
        perror("creating socket failed.");
        exit(1);
    }

    //allow quick restarts on the same port; the option value is a boolean flag
    int opt = 1;
    setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(PORT);
    server.sin_addr.s_addr = htonl(INADDR_ANY);

    printf("bind....\n");
    if(bind(listenfd,(struct sockaddr *)&server,sizeof(server)) == -1) {
        perror("bind error.");
        exit(1);
    }

    printf("listen....\n");
    if(listen(listenfd,BACKLOG) == -1) {
        perror("listen() error ");
        exit(1);
    }

    while(1)
    {
        //accept() using main thread
        printf("accepting....\n");
        sin_size = sizeof(struct sockaddr_in);  //accept() overwrites it, so reset every time
        if((connectfd = accept(listenfd,
                               (struct sockaddr *)&client,
                               &sin_size)) == -1) {
            perror("accept() error");
            continue;   //no connection, so do not start a thread
        }

        arg = new ARG;
        arg->connfd = connectfd;
        memcpy(&arg->client, &client, sizeof(client));

        //invoke start_routine to handle this connection in its own thread
#ifdef DEBUG
        printf("thread_creating....");
#endif
        if(pthread_create(&thread, NULL, start_routine, (void*)arg)){
            fprintf(stderr, "pthread_create() error\n");
            exit(1);
        }
        //nobody joins the worker, so let it release its resources on exit
        pthread_detach(thread);
    }
    close(listenfd);
    return 0;
}
//handle the request of the client
void process_cli(int connectfd, sockaddr_in client)
{
    char requestline[MAXDATASIZE], filepath[MAXDATASIZE], cmd[MAXDATASIZE];
    char indexpath[MAXDATASIZE + 16];
    char *extname;
    FILE *fp;

    fp = fdopen(connectfd,"r");
    if(fp == NULL){
        close(connectfd);
        return;
    }
#ifdef DEBUG
    printf("the host is:%s\n",inet_ntoa(client.sin_addr) );
#endif
    //read the request line and split it into command and path;
    //the field widths keep sscanf inside cmd and filepath
    strcpy(filepath,"./");
    if(fgets(requestline,MAXDATASIZE,fp) == NULL ||
       sscanf(requestline,"%15s %900s",cmd,filepath+2) != 2){
        fclose(fp);     //also closes connectfd
        return;
    }
#ifdef DEBUG
    printf(" THE REQUEST IS :%s\n",requestline);
#endif
    extname = getextname(filepath);
#ifdef DEBUG
    printf("cmd:%s\nfilepath:%s\nextname:%s\n",cmd,filepath,extname ? extname : "");
    printf("string comparing\n:::::::::::::start:::::::::::::::\n");
#endif
    if(strcmp(cmd,"GET") == 0){
        //the command is get
#ifdef DEBUG
        printf("cmd(%s)==GET\n",cmd);
#endif
        //refuse ".." so a request cannot climb out of the server directory,
        //then check whether the path is a file, a dir, or does not exist
        if(strstr(filepath,"..") == NULL && fileordirExist(filepath)){
            if(IsDIR(filepath)){
                //is a dir: look for index.htm, then index.html, inside it
#ifdef DEBUG
                printf("%s is a DIR\n",filepath);
#endif
                snprintf(indexpath,sizeof(indexpath),"%s/index.htm",filepath);
                if(!fileordirExist(indexpath))
                    snprintf(indexpath,sizeof(indexpath),"%s/index.html",filepath);
                if(fileordirExist(indexpath)){
                    sendobj(connectfd,indexpath);
                }else{
                    msg404(connectfd);
                }
            }else{
                //is a file
#ifdef DEBUG
                printf("%s is a file",filepath);
#endif
                sendobj(connectfd,filepath);
            }
        }else{
#ifdef DEBUG
            printf("404\n");
#endif
            msg404(connectfd);
        }
    }else{
#ifdef DEBUG
        printf("cmd(%s)!=GET\n",cmd);
#endif
    }
#ifdef DEBUG
    printf(":::::::::::::end:::::::::::::::\n");
#endif
    fclose(fp);     //releases the stream and closes connectfd
}
//send the 404 error message to the client
void msg404(int connectfd)
{
    const char* msg;
    //status line, one header line, a blank line, then the body
    msg = "HTTP/1.0 404 Not Found\r\nContent-Type: text/plain\r\n\r\n404 not found by Manio";
    send(connectfd,msg,strlen(msg),0);
}
//does the filepath exist (as a file or a directory)
int fileordirExist(char* fpath)
{
    struct stat filestat;
    return ( stat(fpath,&filestat) != -1);
}
// is the filepath a directory
int IsDIR(char* fpath)
{
#ifdef DEBUG
    printf("IN IsDIR\n");
#endif
    struct stat filestat;
    return ( stat(fpath,&filestat) != -1 && S_ISDIR(filestat.st_mode));
}
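//send the data of the file which the client wants
int sendobj(int connectfd,char* serverfilepath)
{
    FILE *sfp, *cfp;
    int c;
    int fd;

    sfp = fopen(serverfilepath,"rb");   //binary mode: images are not text
    if(sfp == NULL)
        return -1;
    //wrap a duplicate of the socket so that fclose() below
    //does not close connectfd itself
    fd = dup(connectfd);
    if(fd == -1 || (cfp = fdopen(fd,"w")) == NULL){
        if(fd != -1) close(fd);
        fclose(sfp);
        return -1;
    }
    writehead(cfp,getextname(serverfilepath));
    while( (c = getc(sfp)) != EOF) putc(c,cfp);
    fclose(cfp);    //flushes the buffered data to the client
    fclose(sfp);
    return 0;
}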
//write the response header to the client
int writehead(FILE* cfp, char* extname)
{
    //a path without '.' yields NULL from getextname
    const char* ext = (extname != NULL) ? extname : "";
#ifdef DEBUG
    printf("INWRITEHEAD:::::::extname is %s:::::::\n",ext);
#endif
    const char* content = "text/plain";
    if( strcmp(ext,"html") == 0 || strcmp(ext,"htm") == 0)
        content = "text/html";
    else if ( strcmp(ext,"css") == 0 )
        content = "text/css";
    else if ( strcmp(ext,"gif") == 0 )
        content = "image/gif";
    else if ( strcmp(ext,"jpeg") == 0 || strcmp(ext,"jpg") == 0)
        content = "image/jpeg";
    else if ( strcmp(ext,"png") == 0)
        content = "image/png";
#ifdef DEBUG
    printf("HTTP/1.1 200 OK\n");
    printf("Content-Type: %s\n",content);
#endif
    //each line ends with CRLF; the empty line separates header and body
    fprintf(cfp,"HTTP/1.1 200 OK\r\n");
    fprintf(cfp,"Content-Type: %s\r\n\r\n",content);
    return 0;
}
//get the extension of the file name (NULL if the path contains no '.')
char* getextname(char* filepath)
{
    char* p;
    if(( p = strrchr(filepath,'.')) != NULL)
        return p+1;
    return NULL;
}
//invoked by pthread_create
void* start_routine(void* arg)
{
    ARG *info;
    info = (ARG *)arg;
    //handle client's request
    process_cli(info->connfd, info->client);
    //delete through the typed pointer: deleting a void* is undefined
    delete info;
    pthread_exit(NULL);
}
How to run:
Open a console in FC5 and proceed as follows:
[root@localhost webserver1707]# ls
admin header img index.htm~
chinaunix.html header~ index.htm mymultiwebserver.c
[root@localhost webserver1707]# g++ -g -o mymultiwebserver mymultiwebserver.c -lpthread
mymultiwebserver.c: In function 'void* start_routine(void*)':
mymultiwebserver.c:253: warning: deleting 'void*' is undefined
[root@localhost webserver1707]# ./mymultiwebserver
socket....
bind....
listen....
accepting....
thread_creating....accepting....
the host is:127.0.0.1
THE REQUEST IS :GET / HTTP/1.1
cmd:GET
filepath:.//
extname://
string comparing
:::::::::::::start:::::::::::::::
cmd(GET)==GET
IN IsDIR
.// is a DIR
INWRITEHEAD:::::::extname is htm:::::::
HTTP/1.1 200 OK
Content-Type: text/html
thread_creating....accepting....
:::::::::::::end:::::::::::::::
thread_creating....accepting....
the host is:127.0.0.1
THE REQUEST IS :GET /img/sb.jpg HTTP/1.1
cmd:GET
filepath:.//img/sb.jpg
extname:jpg
string comparing
:::::::::::::start:::::::::::::::
cmd(GET)==GET
IN IsDIR
.//img/sb.jpg is a fileINWRITEHEAD:::::::extname is jpg:::::::
HTTP/1.1 200 OK
Content-Type: image/jpeg
:::::::::::::end:::::::::::::::
the host is:127.0.0.1
THE REQUEST IS :GET /img/gcc.png HTTP/1.1
cmd:GET
filepath:.//img/gcc.png
extname:png
string comparing
:::::::::::::start:::::::::::::::
cmd(GET)==GET
IN IsDIR
.//img/gcc.png is a fileINWRITEHEAD:::::::extname is png:::::::
HTTP/1.1 200 OK
Content-Type: image/png
:::::::::::::end:::::::::::::::
Put an index.htm file in the directory containing this program, open a browser and enter http://127.0.0.1:8848/ in the address bar, and you will see the page~