Flask Source Code Analysis: Contexts

1 Contexts (application context and request context)

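Every piece of code depends on external variables. Only trivial functions such as Add have none. Once a piece of code relies on external variables, it is incomplete and cannot run on its own; to run it, you have to supply a value for each of those variables. That set of values is called the context.

In flask, for example, a view function needs to know about the request it is handling (URL, parameters, method, and so on) as well as application-level information (such as the database initialized by the application) in order to run correctly.

The most straightforward approach is to wrap this information in an object and pass it to the view function as an argument. But then every view function would have to declare that parameter, even when it never uses it.

flask instead exposes this information as something resembling global variables: a view function that needs it can simply write from flask import request. Unlike ordinary globals, however, these objects must be dynamic, because in a multi-threaded or multi-coroutine program each thread or coroutine has to get its own object without interfering with the others.

How is this achieved? If you are familiar with multithreading in python, you will know a very similar concept, threading.local, which lets each thread see only its own data when accessing a variable. Conceptually the principle is simple: the object holds a dictionary that maps a thread id to that thread's data, and reading the object dynamically looks up the data for the current thread id. flask's context implementation works in a similar way, as explained in detail below.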
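To make the threading.local behaviour concrete, here is a minimal sketch using only the standard library. The names local_data, results and worker are made up for this illustration and do not come from flask.

```python
import threading

# One module-level object, but each thread sees its own attribute values.
local_data = threading.local()
results = {}  # hypothetical collector, keyed by worker name


def worker(name):
    # Each thread sets and then reads `value` on the shared object.
    local_data.value = name
    results[name] = local_data.value


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set `value`, so the attribute does not exist for it.
main_has_value = hasattr(local_data, "value")
```

Each worker reads back exactly what it wrote, and the main thread sees no value attribute at all, even though all three threads use the same local_data object.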

flask has two kinds of context: the application context and the request context. The context-related code is defined in globals.py, and the file is very short:

def _lookup_req_object(name):
    """
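    request and session are each created by passing LocalProxy a partial function
    with name bound in advance (for example name='request'); using the proxy calls _lookup_req_object.
    top is the ctx object; getattr on it returns the request of the current thread or coroutine. session works the same way.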
    """
    top = _request_ctx_stack.top
    if top is None:
        raise RuntimeError(_request_ctx_err_msg)
    return getattr(top, name)


def _lookup_app_object(name):
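    # top is the app_ctx object; getattr returns the g of the current thread or coroutine.
    # g was added later, so this helper was written separately instead of modifying _find_app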
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return getattr(top, name)


def _find_app():
    # top is the app_ctx object; top.app is returned directly (no getattr by name) as the current_app of the current thread or coroutine
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return top.app


# context locals
_request_ctx_stack = LocalStack()
_app_ctx_stack = LocalStack()
current_app = LocalProxy(_find_app)
request = LocalProxy(partial(_lookup_req_object, 'request'))
session = LocalProxy(partial(_lookup_req_object, 'session'))
g = LocalProxy(partial(_lookup_app_object, 'g'))

flask provides two contexts: the application context and the request context. The application context gives rise to two variables, current_app and g, while the request context gives rise to request and session.
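The partial-based lookup in globals.py can be imitated without flask. The sketch below is a simplification under stated assumptions: FakeContext, stack, lookup, get_request and get_session are hypothetical names, and a plain list stands in for _request_ctx_stack.

```python
from functools import partial


class FakeContext:
    """Hypothetical stand-in for a request context object."""

    def __init__(self, request, session):
        self.request = request
        self.session = session


stack = []  # simplified stand-in for _request_ctx_stack


def lookup(name):
    # Same shape as _lookup_req_object: fail if nothing has been pushed.
    if not stack:
        raise RuntimeError("working outside of request context")
    return getattr(stack[-1], name)


# partial binds the attribute name now; the lookup itself runs later.
get_request = partial(lookup, "request")
get_session = partial(lookup, "session")

stack.append(FakeContext(request="GET /", session={"user": "alice"}))
current_request = get_request()
current_session = get_session()
stack.pop()
```

After the final pop, calling get_request() raises RuntimeError, which mirrors the "outside of request context" error that flask raises when request is used with nothing on the stack.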

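The implementation relies on two things: LocalStack and LocalProxy. Together they let us fetch the contents of both contexts dynamically, so that in a concurrent program each view function sees its own context and nothing gets mixed up.

LocalStack and LocalProxy are both provided by werkzeug and defined in local.py. Before analyzing these two classes, let us first look at another basic class in that file, Local. Local achieves an effect similar to threading.local: isolation of global variables across threads or coroutines. Here is its code: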

"""
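greenlet's getcurrent identifies the current coroutine (greenlet); it is imported under the name get_ident.
If greenlet is installed, get_ident identifies the current coroutine; otherwise it returns the thread id.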
"""
try:
    from greenlet import getcurrent as get_ident
except ImportError:
    try:
        from thread import get_ident
    except ImportError:
        from _thread import get_ident

class Local(object):
    __slots__ = ('__storage__', '__ident_func__')

    def __init__(self):
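        # Data is stored in __storage__; all later access operates on this attribute
        # __storage__ is created in __init__, so every Local instance gets its own dictionary
        # _request_ctx_stack and _app_ctx_stack are two LocalStack instances, each with its own storage, so identical thread ids do not collide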
        object.__setattr__(self, '__storage__', {})
        object.__setattr__(self, '__ident_func__', get_ident)

    def __call__(self, proxy):
        """Create a proxy for a name."""
        return LocalProxy(self, proxy)

    # Clear all data stored for the current thread/coroutine
    def __release_local__(self):
        self.__storage__.pop(self.__ident_func__(), None)

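    # The next three methods implement attribute get, set and delete.
    # Each one calls `self.__ident_func__` to get the id of the current thread or coroutine, then accesses the matching inner dictionary.
    # Reading or deleting a missing attribute raises AttributeError.
    # To the caller it looks like ordinary attribute access; the dictionary and the thread/coroutine switching stay hidden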
    def __getattr__(self, name):
        try:
            return self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        ident = self.__ident_func__()
        storage = self.__storage__
        try:
            storage[ident][name] = value
        except KeyError:
            storage[ident] = {name: value}

    def __delattr__(self, name):
        try:
            del self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

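As you can see, all data inside a Local object is stored in the __storage__ attribute, which is a nested dictionary: map[ident]map[key]value. The key of the outer dictionary is the identity of the thread or coroutine, and its value is another dictionary holding the user-defined key-value pairs. Accessing an attribute on the instance thus becomes a lookup in the inner dictionary, with the outer key selected automatically. __ident_func__ is either greenlet's getcurrent or the thread module's get_ident, and identifies the thread or coroutine in which the current code is running.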
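The nested-dictionary layout is easy to inspect with a cut-down version of the class. This is a minimal sketch, not werkzeug's code: MiniLocal and _storage are hypothetical names, and only thread ids (no greenlets) are used as the outer key.

```python
import threading


class MiniLocal:
    """Simplified sketch of Local, keyed by thread id only."""

    def __init__(self):
        # Bypass our own __setattr__ so _storage lives on the instance itself.
        object.__setattr__(self, "_storage", {})

    def __getattr__(self, name):
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Outer key: current thread id. Inner dict: user attributes.
        self._storage.setdefault(threading.get_ident(), {})[name] = value


local = MiniLocal()
local.user = "main"


def worker():
    local.user = "worker"  # lands in a different inner dictionary


t = threading.Thread(target=worker)
t.start()
t.join()

storage = object.__getattribute__(local, "_storage")
```

After the worker has run, storage holds two inner dictionaries, one per thread id, and local.user in the main thread is still "main". Note that the worker's entry is never removed automatically in this sketch; clearing it is the job that __release_local__ performs in the real class.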

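Beyond these basic operations, Local also implements __release_local__, which clears (destroys) the data (state) of the current thread or coroutine, and __call__, which creates a LocalProxy object. LocalProxy is covered below.

With Local understood, let us return to the other two classes.

LocalStack is a stack built on top of Local. Where Local provides attribute access isolated per thread or coroutine, LocalStack provides isolated stack access. Its implementation is shown below; it provides the push and pop methods and the top property.

__release_local__ clears the stack data of the current thread or coroutine, and __call__ returns a proxy object for the top element of the current thread's or coroutine's stack.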

class LocalStack(object):
    """This class works similar to a :class:`Local` but keeps a stack
    of objects instead. """

    def __init__(self):
        # Local() is werkzeug's local object, which supports both threads and coroutines
        self._local = Local()

    def __release_local__(self):
        self._local.__release_local__()

    def __call__(self):
        def _lookup():
            rv = self.top
            if rv is None:
                raise RuntimeError('object unbound')
            return rv
        return LocalProxy(_lookup)

    # push, pop and top implement the stack operations;
    # the stack data is stored in the self._local.stack attribute
    # obj is the ctx object passed in by RequestContext
    def push(self, obj):
        """Pushes a new item to the stack"""
        # Check whether the local object already has a stack attribute
        rv = getattr(self._local, 'stack', None)
        if rv is None:
            # Create the stack list; equivalent to self._local.stack = [] followed by rv = self._local.stack
            self._local.stack = rv = []
        # Append ctx to the local object's stack list; given how Local works, the storage looks like {thread_id: {'stack': [ctx]}}
        rv.append(obj)
        return rv

    def pop(self):
        """Removes the topmost item from the stack, will return the
        old value or `None` if the stack was already empty.
        """
        stack = getattr(self._local, 'stack', None)
        if stack is None:
            return None
        elif len(stack) == 1:
            release_local(self._local)
            return stack[-1]
        else:
            return stack.pop()

    @property
    def top(self):
        """The topmost item on the stack.  If the stack is empty,
        `None` is returned.
        """
        try:
            # Return the last element of the list, i.e. the ctx or app_ctx object
            return self._local.stack[-1]
        except (AttributeError, IndexError):
            return None
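The per-thread isolation of the stack can be checked with a small stand-in. This is a sketch under simplifying assumptions: MiniLocalStack is a hypothetical class backed by the standard library's threading.local instead of werkzeug's Local, and its pop omits the release step.

```python
import threading


class MiniLocalStack:
    """Simplified sketch of LocalStack, backed by threading.local."""

    def __init__(self):
        self._local = threading.local()

    def push(self, obj):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            self._local.stack = stack = []
        stack.append(obj)
        return stack

    def pop(self):
        stack = getattr(self._local, "stack", None)
        if not stack:
            return None
        return stack.pop()

    @property
    def top(self):
        stack = getattr(self._local, "stack", None)
        return stack[-1] if stack else None


ctx_stack = MiniLocalStack()
ctx_stack.push("ctx-main")
seen_in_worker = []


def worker():
    # A new thread starts with an empty stack of its own.
    seen_in_worker.append(ctx_stack.top)
    ctx_stack.push("ctx-worker")
    seen_in_worker.append(ctx_stack.top)


t = threading.Thread(target=worker)
t.start()
t.join()
```

The worker first sees None as its top, then its own "ctx-worker", while the main thread's top remains "ctx-main" throughout.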

Let us now revisit the core code of the flask framework and the flow it runs for each request:

def wsgi_app(self, environ, start_response):
    '''
    1 request_context calls the RequestContext class, passing app and environ, and returns an instance
    ctx is that RequestContext object; it holds the current request (the wrapped request object), the session, and the app
    '''
    ctx = self.request_context(environ)
    error = None
    try:
        try:
            '''
            2 ctx.push() is RequestContext.push: it puts ctx into the stack list of the local object, keyed by thread id,
            creates a session object, and places the ctx and app_ctx objects on their respective stacks
            '''
            ctx.push()
            # Dispatch the route, run the view function, and return the result
            response = self.full_dispatch_request()
        except Exception as e:
            # Error handling; by default the InternalServerError handler runs and the client sees a 500 response
            error = e
            response = self.handle_exception(e)
        except:  # noqa: B001
            error = sys.exc_info()[1]
            raise
        # Call the response object as a WSGI application and return its result
        return response(environ, start_response)
    finally:
        if self.should_ignore_error(error):
            error = None
        # Whether or not the request succeeded, ctx is finally removed from the local object
        ctx.auto_pop(error)

Step 2 of the flow above, ctx.push(), runs as follows:

# The push method of the RequestContext class:

def push(self):
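    # 2.1 _request_ctx_stack.top is the top property of the global LocalStack; it returns the last ctx in _local.stack, or None if the lookup fails (that is, top = None)
    # top is a single ctx object taken from the stack list, not a list itself
    top = _request_ctx_stack.top
    if top is not None and top.preserved:
        # 2.2 top._preserved_exc is None here, so this is equivalent to top.pop(None)
        top.pop(top._preserved_exc)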
    
    # Before we push the request context we have to ensure that there
    # is an application context.
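    # flask originally had only the ctx object; the app_ctx part was added in a later version, so the logic is slightly tangled
    # 2.3 app_ctx follows the same flow as ctx
    app_ctx = _app_ctx_stack.top
    if app_ctx is None or app_ctx.app != self.app:  # normally there is no app_ctx yet, since it is cleared after every request
        # Create an AppContext object, app_ctx
        app_ctx = self.app.app_context()
        # As with ctx, push app_ctx onto the _app_ctx_stack list
        app_ctx.push()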
        self._implicit_app_ctx_stack.append(app_ctx)
    else:
        self._implicit_app_ctx_stack.append(None)
    if hasattr(sys, "exc_clear"):
        sys.exc_clear()
        
    # 2.4 Pass ctx to LocalStack.push, which puts it on the _request_ctx_stack list
    _request_ctx_stack.push(self)
    # 2.5 Create a session object
    if self.session is None:
        session_interface = self.app.session_interface
        self.session = session_interface.open_session(self.app, self.request)
        # This guards against custom session interfaces whose open_session returns None, which would otherwise cause errors later
        if self.session is None:
            self.session = session_interface.make_null_session(self.app)

    if self.url_adapter is not None:
        self.match_request()
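The interplay between the two stacks in push can be sketched with plain lists. This is a hypothetical simplification: RequestCtxSketch, app_stack and req_stack are made-up names, apps are represented by strings, and the pop method is an assumption about the matching cleanup, since the listing above shows only push.

```python
app_stack = []  # stand-in for _app_ctx_stack
req_stack = []  # stand-in for _request_ctx_stack


class RequestCtxSketch:
    """Hypothetical, stripped-down request context."""

    def __init__(self, app):
        self.app = app
        self._implicit_app = []  # remembers whether push() added an app entry

    def push(self):
        top_app = app_stack[-1] if app_stack else None
        if top_app is None or top_app != self.app:
            # No matching app context yet: create one implicitly.
            app_stack.append(self.app)
            self._implicit_app.append(self.app)
        else:
            self._implicit_app.append(None)
        req_stack.append(self)

    def pop(self):
        req_stack.pop()
        # Only remove the app entry if this request context added it.
        if self._implicit_app.pop() is not None:
            app_stack.pop()


# Case 1: no app context exists, so push() adds one and pop() removes it.
ctx = RequestCtxSketch("app")
ctx.push()
during_implicit = list(app_stack)
ctx.pop()
after_implicit = list(app_stack)

# Case 2: the app context was pushed explicitly, so it survives the request.
app_stack.append("app")
ctx2 = RequestCtxSketch("app")
ctx2.push()
ctx2.pop()
after_explicit = list(app_stack)
```

Recording None when an app context already exists is what lets the cleanup tell an implicitly created app context apart from one the caller pushed on purpose.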

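LocalProxy is a proxy that forwards every operation performed on it to an underlying object. Its constructor accepts either a Local instance together with a name, or a callable; in the callable case, calling it must return the object to proxy, and all subsequent attribute operations are forwarded to whatever the callable returns.

The globals current_app, request, session and g are all LocalProxy objects; through this forwarding they reach the objects held by the application context and request context of the current thread or coroutine.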

class LocalProxy(object):
    """Acts as a proxy for a werkzeug local.
    Forwards all operations to a proxied object. """
    __slots__ = ('__local', '__dict__', '__name__')

    def __init__(self, local, name=None):
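        # __init__ goes through object.__setattr__; the effect is self._LocalProxy__local = local
        # local is what was passed to LocalProxy(...): a Local object, or a callable that returns the object for the current thread or coroutine
        object.__setattr__(self, '_LocalProxy__local', local)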
        object.__setattr__(self, '__name__', name)

    def _get_current_object(self):
        """Return the current object."""
        if not hasattr(self.__local, '__release_local__'):
            # __local is name-mangled to _LocalProxy__local, the attribute set in __init__;
            # calling it returns the object for the current thread or coroutine
            return self.__local()
        try:
            return getattr(self.__local, self.__name__)
        except AttributeError:
            raise RuntimeError('no object bound to %s' % self.__name__)
    
    # The magic methods are overridden (only a few are shown here)
    @property
    def __dict__(self):
        try:
            return self._get_current_object().__dict__
        except RuntimeError:
            raise AttributeError('__dict__')

    def __getattr__(self, name):
        if name == '__members__':
            return dir(self._get_current_object())
        return getattr(self._get_current_object(), name)

    def __setitem__(self, key, value):
        self._get_current_object()[key] = value

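The key to the implementation is that the Local instance or callable passed in is stored in the __local attribute. LocalProxy overrides the magic methods, and whenever one of them is triggered it calls _get_current_object(), which returns the object for the current thread or coroutine, looked up from ctx or app_ctx. So when a view function runs print(request), the overridden __str__ calls _get_current_object() and prints the request object of the current thread or coroutine.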
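The forwarding idea fits in a few lines. The sketch below is not werkzeug's implementation: MiniProxy, _lookup, stack and current are hypothetical names, and only a handful of operations are forwarded.

```python
class MiniProxy:
    """Simplified sketch of a proxy that resolves its target on every access."""

    def __init__(self, lookup):
        # Bypass our own __setattr__ so _lookup is stored on the proxy itself.
        object.__setattr__(self, "_lookup", lookup)

    def __getattr__(self, name):
        return getattr(self._lookup(), name)

    def __setattr__(self, name, value):
        setattr(self._lookup(), name, value)

    def __str__(self):
        return str(self._lookup())

    def __getitem__(self, key):
        return self._lookup()[key]


stack = []  # stand-in for a context stack
current = MiniProxy(lambda: stack[-1])

stack.append({"path": "/a"})
first = current["path"]
stack.append({"path": "/b"})
second = current["path"]
text = str(current)
keys = list(current.keys())
```

The same current object yields "/a" and then "/b" because the target is looked up again on every operation. Python looks up special methods such as __str__ and __getitem__ on the type rather than through __getattr__, which is why a proxy has to define each of them explicitly.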

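At this point the context implementation is fairly clear. Whenever a request arrives, push first creates the two context objects the current thread or coroutine needs, the request context and the application context, and stores them on isolated stacks, so that the view function can read this information directly from the stacks; after pushing, the session information is also set up. pop does the reverse: it removes the request context and the application context from their stacks and does some cleanup.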
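The push, dispatch, pop lifecycle can be summarized as a try/finally pattern. This is a minimal sketch, not flask's code: handle, ok_view, bad_view, stack and events are hypothetical names, and a string stands in for the request context.

```python
stack = []   # stand-in for the request context stack
events = []  # records what happened, for illustration only


def handle(request, view):
    stack.append(request)  # push before dispatching
    try:
        return view()
    except Exception as exc:
        events.append("error")
        return "500: %s" % exc
    finally:
        stack.pop()  # always pop, whether the view succeeded or not


def ok_view():
    # The view reads the current request from the stack, not from an argument.
    return "hello " + stack[-1]


def bad_view():
    raise ValueError("boom")


ok = handle("alice", ok_view)
bad = handle("bob", bad_view)
```

Both calls leave the stack empty afterwards, which is the property that the finally block in wsgi_app guarantees through ctx.auto_pop.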

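2 Why use LocalProxy

Why use LocalProxy at all? Would anything go wrong if we skipped it and accessed the objects in LocalStack directly?

First, to be clear, Local and LocalStack provide data isolation between threads/coroutines. The reason for using LocalStack rather than Local directly is that flask wants to allow multiple apps and multiple requests during testing or development. LocalProxy was introduced for the same reason.

Take current_app = LocalProxy(_find_app) as an example. Every time current_app is used, it calls _find_app and then operates on the result.

How would current_app = _find_app() differ? The difference is that once imported, current_app would never change again. With multiple apps this leads to errors, for example: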

from flask import current_app

app = create_app()
admin_app = create_admin_app()

def do_something():
    with app.app_context():
        work_on(current_app)
        with admin_app.app_context():
            work_on(current_app)

Here the apps are nested, and each with block needs to operate on its own app. Without LocalProxy this cannot be done.
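The nesting behaviour can be reproduced without flask. In this sketch app_context, current_app and app_stack are hypothetical stand-ins built on contextlib, and strings represent the two apps.

```python
from contextlib import contextmanager

app_stack = []  # stand-in for the application context stack


@contextmanager
def app_context(app):
    app_stack.append(app)  # push on entering the with block
    try:
        yield app
    finally:
        app_stack.pop()  # pop on leaving it


def current_app():
    # Resolved at call time, so it always reflects the innermost context.
    return app_stack[-1]


seen = []
with app_context("app"):
    seen.append(current_app())
    with app_context("admin_app"):
        seen.append(current_app())
    seen.append(current_app())
```

The lookup returns "app", then "admin_app" inside the inner block, then "app" again once the inner block exits, which a value bound once could not do.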

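The same holds for request. But this situation is really quite rare; is it worth all this effort and added complexity?

There is in fact a bigger problem, which this example also reveals. We know that current_app is dynamic, because elements are pushed onto and popped from the stack behind it. Initially the stack is necessarily empty; data is pushed only when with app.app_context() runs. Without LocalProxy forwarding, current_app would be evaluated when from flask import current_app executes at the top of the file. Nothing has been pushed at that moment, so there would be nothing to bind current_app to, and it would be unusable in later calls.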
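The difference between looking up once and looking up on every use shows in a few lines. This is an illustrative sketch: find_top, eager and lazy are hypothetical names, and find_top returns None on an empty stack instead of raising.

```python
stack = []  # stand-in for the application context stack


def find_top():
    return stack[-1] if stack else None


eager = find_top()  # evaluated once, while the stack is still empty
lazy = find_top     # evaluated each time it is called

stack.append("app")

eager_after = eager    # still the value captured before the push
lazy_after = lazy()    # sees the app that was pushed later
```

The eager value stays None forever, while the lazy lookup returns "app". LocalProxy gives the lazy behaviour while still letting the code use current_app like an ordinary object.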

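So why is LocalProxy needed? In one sentence: the context data lives on a stack and changes dynamically, and accessing it in any way other than dynamically leads to incorrect data access.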

posted @ 2022-10-09 16:20 不会钓鱼的猫
