Python's weakref module
weakref
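A reference created by assignment is a strong reference. As long as a strong reference exists, the garbage collector cannot reclaim the object.
In some scenarios we do not want a particular reference to be the only thing that keeps memory from being reclaimed.
Caches and mappings are typical examples: both may hold large in-memory objects.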
https://docs.python.org/3.5/library/weakref.html
The weakref module allows the Python programmer to create weak references to objects. In the following, the term referent means the object which is referred to by a weak reference.
A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else. However, until the object is actually destroyed the weak reference may return the object even if there are no strong references to it.
A primary use for weak references is to implement caches or mappings holding large objects, where it’s desired that a large object not be kept alive solely because it appears in a cache or mapping.
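An ordinary reference increments the object's reference count and so prevents garbage collection. A weak reference still gives access to the object, but does not affect garbage collection.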
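The difference in reference counting can be observed directly. The following is a minimal sketch that assumes CPython, where sys.getrefcount reports the strong reference count; the class name Payload is hypothetical and stands in for any large object.

```python
import sys
import weakref


class Payload:
    """Stand-in for a large object (hypothetical name)."""


obj = Payload()
base = sys.getrefcount(obj)          # count while only `obj` is bound

strong = obj                         # a second strong reference
after_strong = sys.getrefcount(obj)

weak = weakref.ref(obj)              # a weak reference
after_weak = sys.getrefcount(obj)

print('added by strong reference:', after_strong - base)
print('added by weak reference:', after_weak - after_strong)
print('weak references to obj:', weakref.getweakrefcount(obj))
```

On CPython the strong reference adds one to the count and the weak reference adds nothing, while weakref.getweakrefcount tracks the weak reference separately.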
https://pymotw.com/3/weakref/index.html#module-weakref
Purpose: Refer to an “expensive” object, but allow its memory to be reclaimed by the garbage collector if there are no other non-weak references.
The weakref module supports weak references to objects. A normal reference increments the reference count on the object and prevents it from being garbage collected. This outcome is not always desirable, especially when a circular reference might be present or when a cache of objects should be deleted when memory is needed. A weak reference is a handle to an object that does not keep it from being cleaned up automatically.
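The circular-reference case mentioned above can be illustrated with a parent/child structure. This is only a sketch: the Node class and its attribute names are hypothetical, and the immediate cleanup after del relies on CPython's reference counting (gc.collect() is called so other implementations have a chance to behave the same way).

```python
import gc
import weakref


class Node:
    """Tree node whose back-pointer to the parent is weak (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        self.children = []       # strong references: parent owns its children
        self._parent = None      # weakref.ref to the parent, or None

    @property
    def parent(self):
        # Dereference the weak reference; None if unset or already collected.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)   # no strong cycle is created


root = Node('root')
leaf = Node('leaf')
root.add_child(leaf)
parent_name_before = leaf.parent.name
print('parent before:', parent_name_before)

del root                 # the only strong reference to the parent
gc.collect()
print('parent after:', leaf.parent)
```

Because the child only holds a weak reference to its parent, dropping the last strong reference to the parent is enough for it to be reclaimed, and the child then sees None.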
demo
```python
# weakref_ref.py
import weakref


class ExpensiveObject:

    def __del__(self):
        print('(Deleting {})'.format(self))


obj = ExpensiveObject()
r = weakref.ref(obj)

print('obj:', obj)
print('ref:', r)
print('r():', r())

print('deleting obj')
del obj
print('r():', r())
```
In this case, since obj is deleted before the second call to the reference, the ref returns None.

```
$ python3 weakref_ref.py

obj: <__main__.ExpensiveObject object at 0x1007b1a58>
ref: <weakref at 0x1007a92c8; to 'ExpensiveObject' at 0x1007b1a58>
r(): <__main__.ExpensiveObject object at 0x1007b1a58>
deleting obj
(Deleting <__main__.ExpensiveObject object at 0x1007b1a58>)
r(): None
```
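Since calling the reference may return None at any time after the referent is gone, code that uses a weak reference usually dereferences it once into a local variable and checks the result before using it. The sketch below shows that pattern; the helper name describe is hypothetical, and the immediate cleanup after del assumes CPython.

```python
import weakref


class ExpensiveObject:
    pass


def describe(ref):
    """Report whether the referent of a weak reference is still alive."""
    target = ref()               # take a temporary strong reference once
    if target is None:
        return 'referent is gone'
    # `target` keeps the object alive for the rest of this function
    return 'referent is alive: {}'.format(type(target).__name__)


obj = ExpensiveObject()
r = weakref.ref(obj)

first = describe(r)
del obj
second = describe(r)

print(first)
print(second)
```

Testing r() and then calling r() a second time would leave a window in which the object could disappear between the two calls; binding the result to a local name avoids that.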
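Proxy demo
A weak reference is itself a reference object: it must be called to obtain the original object.
A proxy, by contrast, can be used directly as if it were the original object.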
It is sometimes more convenient to use a proxy, rather than a weak reference. Proxies can be used as though they were the original object, and do not need to be called before the object is accessible. As a consequence, they can be passed to a library that does not know it is receiving a reference instead of the real object.
```python
import weakref


class ExpensiveObject:

    def __init__(self, name):
        self.name = name

    def __del__(self):
        print('(Deleting {})'.format(self))


obj = ExpensiveObject('My Object')
r = weakref.ref(obj)
p = weakref.proxy(obj)

print('via obj:', obj.name)
print('via ref:', r().name)
print('via proxy:', p.name)
del obj
print('via proxy:', p.name)
```
If the proxy is accessed after the referent object is removed, a ReferenceError exception is raised.

```
$ python3 weakref_proxy.py

via obj: My Object
via ref: My Object
via proxy: My Object
(Deleting <__main__.ExpensiveObject object at 0x1007aa7b8>)
Traceback (most recent call last):
  File "weakref_proxy.py", line 30, in <module>
    print('via proxy:', p.name)
ReferenceError: weakly-referenced object no longer exists
```
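Code that holds a proxy for longer than the referent may live can guard each access by catching ReferenceError. The following is a minimal sketch of that idea; the helper safe_name and its default value are hypothetical, and the immediate cleanup after del assumes CPython.

```python
import weakref


class ExpensiveObject:

    def __init__(self, name):
        self.name = name


def safe_name(proxy, default='<gone>'):
    """Return proxy.name, or a default if the referent no longer exists."""
    try:
        return proxy.name
    except ReferenceError:
        return default


obj = ExpensiveObject('My Object')
p = weakref.proxy(obj)

before = safe_name(p)
del obj
after = safe_name(p)

print('before:', before)
print('after:', after)
```

This keeps the convenience of a proxy at the call site while making the "referent is gone" case explicit, much as checking for None does with weakref.ref.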
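WeakValueDictionary as a cache
When a whole group of objects needs to be cached, creating a weak reference for each one individually is tedious.
WeakValueDictionary handles this case well.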
```python
# weakref_valuedict.py
import gc
from pprint import pprint
import weakref

gc.set_debug(gc.DEBUG_UNCOLLECTABLE)


class ExpensiveObject:

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return 'ExpensiveObject({})'.format(self.name)

    def __del__(self):
        print(' (Deleting {})'.format(self))


def demo(cache_factory):
    # hold objects so any weak references
    # are not removed immediately
    all_refs = {}
    # create the cache using the factory
    print('CACHE TYPE:', cache_factory)
    cache = cache_factory()
    for name in ['one', 'two', 'three']:
        o = ExpensiveObject(name)
        cache[name] = o
        all_refs[name] = o
        del o  # decref

    print(' all_refs =', end=' ')
    pprint(all_refs)
    print('\n Before, cache contains:', list(cache.keys()))
    for name, value in cache.items():
        print(' {} = {}'.format(name, value))
        del value  # decref

    # remove all references to the objects except the cache
    print('\n Cleanup:')
    del all_refs
    gc.collect()

    print('\n After, cache contains:', list(cache.keys()))
    for name, value in cache.items():
        print(' {} = {}'.format(name, value))
    print(' demo returning')
    return


demo(dict)
print()

demo(weakref.WeakValueDictionary)
```
Any loop variables that refer to the values being cached must be cleared explicitly so the reference count of the object is decremented. Otherwise, the garbage collector will not remove the objects and they will remain in the cache. Similarly, the all_refs variable is used to hold references to prevent them from being garbage collected prematurely.

```
$ python3 weakref_valuedict.py

CACHE TYPE: <class 'dict'>
 all_refs = {'one': ExpensiveObject(one),
 'three': ExpensiveObject(three),
 'two': ExpensiveObject(two)}

 Before, cache contains: ['one', 'three', 'two']
 one = ExpensiveObject(one)
 three = ExpensiveObject(three)
 two = ExpensiveObject(two)

 Cleanup:

 After, cache contains: ['one', 'three', 'two']
 one = ExpensiveObject(one)
 three = ExpensiveObject(three)
 two = ExpensiveObject(two)
 demo returning
 (Deleting ExpensiveObject(one))
 (Deleting ExpensiveObject(three))
 (Deleting ExpensiveObject(two))

CACHE TYPE: <class 'weakref.WeakValueDictionary'>
 all_refs = {'one': ExpensiveObject(one),
 'three': ExpensiveObject(three),
 'two': ExpensiveObject(two)}

 Before, cache contains: ['one', 'three', 'two']
 one = ExpensiveObject(one)
 three = ExpensiveObject(three)
 two = ExpensiveObject(two)

 Cleanup:
 (Deleting ExpensiveObject(one))
 (Deleting ExpensiveObject(three))
 (Deleting ExpensiveObject(two))

 After, cache contains: []
 demo returning
```

The WeakKeyDictionary works similarly but uses weak references for the keys instead of the values in the dictionary.
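A common use of WeakKeyDictionary is to attach extra data to objects without keeping those objects alive. The sketch below illustrates this under stated assumptions: the Session class and the metadata contents are hypothetical, and the entry disappears promptly because of CPython's reference counting (gc.collect() is called as a precaution for other implementations).

```python
import gc
import weakref


class Session:
    """Hypothetical key type; instances of ordinary classes are hashable
    and can be weakly referenced by default."""

    def __init__(self, user):
        self.user = user


# Keys are held weakly, values are held strongly.
metadata = weakref.WeakKeyDictionary()

a = Session('alice')
b = Session('bob')
metadata[a] = {'hits': 3}
metadata[b] = {'hits': 7}
count_before = len(metadata)

del a                    # drop the only strong reference to the first key
gc.collect()

count_after = len(metadata)
remaining = [s.user for s in metadata.keys()]

print('entries before:', count_before)
print('entries after:', count_after)
print('remaining users:', remaining)
```

Once the key object is reclaimed, its entry is removed from the dictionary together with the associated value, so the side table does not grow with objects that are no longer in use.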