A First Look at WSGI

Up in the middle of the night reading technical documentation instead of sleeping. What kind of dedication is this~

OK. Most of this article comes from reading http://wsgi.readthedocs.org/en/latest/. Let's start digging into WSGI.

WSGI stands for Web Server Gateway Interface. It is a Python standard that defines how a Python program communicates with a web server. This article is organized into the sections below.

What is WSGI

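WSGI is a Python standard that defines the interface between a Python application and a web server. It is not a module, not a program, and not a server. Put simply, if an application is written to the WSGI specification and a web server is written to the WSGI specification, then that application can run on that server.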

What a WSGI server does is very simple: it hands the request from the client (usually a browser) to the WSGI application, then returns the response produced by the WSGI application to the client. That is all.

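WSGI applications can be stacked like building blocks. For example, WSGI program A sits on top of the WSGI server, WSGI program B sits on top of A, and WSGI program C sits on top of B. In theory the stack can grow without limit. The programs in the middle, such as A and B, are WSGI middleware. Because they sit in the middle, they have to implement the WSGI interface towards both the layer above and the layer below.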
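The stacking described above can be sketched as follows. This is a minimal illustration, not code from the referenced tutorial: the names `hello_app` and `UppercaseMiddleware` are hypothetical, and a production middleware would also need to call `close()` on the wrapped iterable when it has one.

```python
def hello_app(environ, start_response):
    """Innermost WSGI application (hypothetical example)."""
    body = b'hello'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


class UppercaseMiddleware:
    """Hypothetical middleware: upper-cases whatever the wrapped app returns."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Towards the wrapped layer the middleware plays the server's role:
        # it calls the application with environ and start_response.
        chunks = self.app(environ, start_response)
        # Towards the caller it plays the application's role:
        # it returns an iterable of byte strings.
        return [chunk.upper() for chunk in chunks]


# Layers can be stacked; each one only sees the WSGI interface.
stacked = UppercaseMiddleware(UppercaseMiddleware(hello_app))
```

Because `stacked` is itself a WSGI callable, it can be passed to any WSGI server in place of `hello_app`.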

Application interface

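The interface of a WSGI application must be a callable object: a function, a class, or an instance that implements the __call__ method. (My guess was that any instance with a __call__ method can be called. I checked, and that is indeed the case.)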

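  • The callable must accept the following two positional arguments:
    • A dictionary containing CGI-like variables
    • A callback function supplied by the WSGI server, which the application uses to send the HTTP status code/message and the HTTP headers to the server
  • The callable must return the response body as strings wrapped in an iterable (byte strings on Python 3)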
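To illustrate the "instance with __call__" case, here is a small sketch that satisfies the rules listed above. The class name `HelloApp` and its `greeting` parameter are made up for this example.

```python
class HelloApp:
    """Hypothetical WSGI application implemented as a class instance."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, environ, start_response):
        # Same contract as a plain function: two positional arguments,
        # headers sent through start_response, body returned in an iterable.
        body = self.greeting.encode('utf-8')
        start_response('200 OK',
                       [('Content-Type', 'text/plain; charset=utf-8'),
                        ('Content-Length', str(len(body)))])
        return [body]


application = HelloApp('hi there')
print(callable(application))  # True, because the class defines __call__
```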

Here is a code example:

# This is our application object. It could have any name,
# except when using mod_wsgi where it must be "application"
def application( # It accepts two arguments:
      # environ points to a dictionary containing CGI like environment variables
      # which is filled by the server for each received request from the client
      environ,
      # start_response is a callback function supplied by the server
      # which will be used to send the HTTP status and headers to the server
      start_response):

   # build the response body possibly using the environ dictionary
   # (encoded to bytes, which is what WSGI expects on Python 3)
   response_body = ('The request method was %s'
                    % environ['REQUEST_METHOD']).encode('utf-8')

   # HTTP response code and message
   status = '200 OK'

   # These are HTTP headers expected by the client.
   # They must be wrapped as a list of tupled pairs:
   # [(Header name, Header value)].
   response_headers = [('Content-Type', 'text/plain'),
                       ('Content-Length', str(len(response_body)))]

   # Send them to the server using the supplied function
   start_response(status, response_headers)

   # Return the response body.
   # Notice it is wrapped in a list although it could be any iterable.
   return [response_body]

This code cannot be served yet because we do not have a WSGI server. The next section covers that.
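Even without a server, the callable can be exercised by hand, since a server does little more than build `environ`, supply `start_response`, and consume the returned iterable. The sketch below assumes Python 3 and uses `wsgiref.util.setup_testing_defaults` from the standard library; `demo_app` and `call_app` are hypothetical names.

```python
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    body = ('The request method was %s'
            % environ['REQUEST_METHOD']).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def call_app(app):
    """Play the server's role: build environ, collect status, headers, body."""
    environ = {}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD='GET' and friends
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(demo_app)
print(status, body)
```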

Environment dictionary

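The environment dictionary contains a set of CGI-like variables. After receiving a request from the client, the WSGI server fills in this dictionary based on that request. The script below prints the whole dictionary: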

#! /usr/bin/env python

# Our tutorial's WSGI server
from wsgiref.simple_server import make_server

def application(environ, start_response):

   # Sorting and stringifying the environment key, value pairs
   response_body = ['%s: %s' % (key, value)
                    for key, value in sorted(environ.items())]
   # Join the lines and encode to bytes (required on Python 3)
   response_body = '\n'.join(response_body).encode('utf-8')

   status = '200 OK'
   response_headers = [('Content-Type', 'text/plain'),
                  ('Content-Length', str(len(response_body)))]
   start_response(status, response_headers)

   return [response_body]

# Instantiate the WSGI server.
# It will receive the request, pass it to the application
# and send the application's response to the client
httpd = make_server(
   'localhost', # The host name.
   8051, # A port number where to wait for the request.
   application # Our application object name, in this case a function.
   )

# Wait for a single request, serve it and quit.
httpd.handle_request()

 

Response Iterable

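If return [response_body] in the application above is replaced with return response_body, the response becomes noticeably slower. The server treats the string itself as the iterable and sends it to the client one character at a time. (On Python 3 the body is bytes, and iterating over bytes yields integers, which wsgiref rejects with an error.) So always put response_body inside an iterable such as a list. Also, if the response body consists of several strings, Content-Length is the sum of their lengths.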
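Both points can be seen in a short sketch, assuming Python 3 where bodies are bytes. The application name `chunked_app` and its content are made up for illustration.

```python
def chunked_app(environ, start_response):
    """Hypothetical app whose body is split across several byte strings."""
    chunks = [b'first line\n', b'second line\n']
    # Content-Length covers the whole body: the sum of all chunk lengths.
    total = sum(len(chunk) for chunk in chunks)
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(total))])
    return chunks


body = b'abc'
# Wrapped in a list: the server sees a single chunk.
print(len(list([body])))  # 1
# Returned bare: the server would iterate over it element by element.
print(len(list(body)))    # 3
```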

Parsing the Request - Get

Suppose the application above is accessed with a URL like this:

http://localhost:8051/?age=10&hobbies=software&hobbies=tunning

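In that case REQUEST_METHOD and QUERY_STRING in the environ dictionary will be GET and age=10&hobbies=software&hobbies=tunning. Note that hobbies appears twice. That is normal; a submitted form may contain checkboxes, for example. The parse_qs function makes parsing the query string easy (it lived in the cgi module on Python 2 and is urllib.parse.parse_qs on Python 3). It returns a dictionary whose keys are names such as age and hobbies and whose values are lists; the value for hobbies is ['software', 'tunning'].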
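As a quick check of the shape parse_qs returns, here is a sketch that parses the query string from the URL above outside of any server. It assumes Python 3, where the function is imported from urllib.parse.

```python
from urllib.parse import parse_qs

query = 'age=10&hobbies=software&hobbies=tunning'
d = parse_qs(query)
print(d)  # {'age': ['10'], 'hobbies': ['software', 'tunning']}

# Every value is a list, even for keys that appear only once.
age = d.get('age', [''])[0]
hobbies = d.get('hobbies', [])
```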

Run the code below and visit the URL above to see the parsed query string in the response:

#!/usr/bin/env python

from wsgiref.simple_server import make_server
from urllib.parse import parse_qs  # cgi.parse_qs was removed in Python 3.8
from html import escape  # replaces cgi.escape

def application(environ, start_response):

   # Returns a dictionary containing lists as values.
   d = parse_qs(environ['QUERY_STRING'])

   # In this idiom you must issue a list containing a default value.
   age = d.get('age', [''])[0] # Returns the first age value.
   hobbies = d.get('hobbies', []) # Returns a list of hobbies.

   # Always escape user input to avoid script injection
   age = escape(age)
   hobbies = [escape(hobby) for hobby in hobbies]

   response_body = 'age is '+age+' hobbies is '+' '.join(hobbies)
   # WSGI bodies are bytes on Python 3, so encode before returning.
   response_body = response_body.encode('utf-8')

   status = '200 OK'

   # Now content type is text/html
   response_headers = [('Content-Type', 'text/html'),
                  ('Content-Length', str(len(response_body)))]
   start_response(status, response_headers)

   return [response_body]

httpd = make_server('localhost', 8051, application)
# Now it is serve_forever() instead of handle_request().
# In Windows you can kill it in the Task Manager (python.exe).
# In Linux a Ctrl-C will do it.
httpd.serve_forever()

Parsing the Request - Post

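If the request is a POST, the query string is carried in the HTTP request body rather than in the URL. The WSGI server places a file-like object under the wsgi.input key of the environ dictionary, and that object holds the request body. The server also stores the length of the body under the CONTENT_LENGTH key. The code below parses a POST request: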

#!/usr/bin/env python

from wsgiref.simple_server import make_server
from urllib.parse import parse_qs  # cgi.parse_qs was removed in Python 3.8
from html import escape  # replaces cgi.escape


def application(environ, start_response):

   # the environment variable CONTENT_LENGTH may be empty or missing
   try:
      request_body_size = int(environ.get('CONTENT_LENGTH', 0))
   except (ValueError):
      request_body_size = 0

   # When the method is POST the query string will be sent
   # in the HTTP request body which is passed by the WSGI server
   # in the file like wsgi.input environment variable.
   # read() returns bytes, so decode before parsing.
   request_body = environ['wsgi.input'].read(request_body_size)
   d = parse_qs(request_body.decode('utf-8'))

   age = d.get('age', [''])[0] # Returns the first age value.
   hobbies = d.get('hobbies', []) # Returns a list of hobbies.

   # Always escape user input to avoid script injection
   age = escape(age)
   hobbies = [escape(hobby) for hobby in hobbies]

   # Build one string (age + hobbies would fail: str + list),
   # then encode it because WSGI bodies are bytes on Python 3.
   response_body = 'age is '+age+' hobbies is '+' '.join(hobbies)
   response_body = response_body.encode('utf-8')

   status = '200 OK'

   response_headers = [('Content-Type', 'text/html'),
                  ('Content-Length', str(len(response_body)))]
   start_response(status, response_headers)

   return [response_body]

httpd = make_server('localhost', 8051, application)
httpd.serve_forever()
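The body-reading step can be tried without a browser or a running server by building the relevant environ keys by hand. This is a sketch under assumptions: `read_form` and `fake_environ` are hypothetical names, the payload reuses the example values from the GET section, and Python 3 is assumed.

```python
import io
from urllib.parse import parse_qs


def read_form(environ):
    """Read and parse a URL-encoded POST body from a WSGI environ."""
    try:
        size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        size = 0  # CONTENT_LENGTH may be an empty string
    raw = environ['wsgi.input'].read(size)  # bytes
    return parse_qs(raw.decode('utf-8'))


# Build by hand what a server would provide for a POST request.
payload = b'age=10&hobbies=software&hobbies=tunning'
fake_environ = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),
}
form = read_form(fake_environ)
print(form)
```

Using io.BytesIO for wsgi.input mirrors the file-like object a real server supplies, so the same function works unchanged inside an application.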

 

posted on 2014-09-07 00:35  kramer
