urllib3中学到的LRU算法
介绍
urllib3._collections.py::RecentlyUserContainer类,是一个线程安全的Dict类容器,用来维护一定数量(maxsize)的Key-Value映射, 当数量超出,会删除最近最少使用
的key。该类实际用在PoolManager中存储Session中一定连接数量的连接池,当连接数超出时,会自动删除最早保存的连接,并且查找连接的时间复杂度是O(1)。
原理
其原理是通过有序字典(OrderedDict)保存Key的插入顺序,但Key的数量超出阈值时,删除最早插入的Key
from collections import OrderedDict
from threading import RLock
class RecentlyUserContainer:
def __init__(self, maxsize):
self._maxsize = maxsize
self._container = OrderedDict()
self.lock = RLock()
def __getitem__(self, key):
with self.lock:
# 注意 这里先取出再插入的目的是更新该Key的顺序,即它最近被访问过
item = self._container.pop(key)
self._container[key] = item
return item
def __setitem__(self, key, value):
with self.lock:
self._container[key] = value
if len(self._container) > self._maxsize:
# 驱逐出最早插入的Key
key, _ = self._container.popitem(last=False)
print(f"{key} is evicted")
使用
>>> pool = RecentlyUserContainer(3)
>>> pool["https://www.1.com"] = "conn1"
>>> pool["https://www.2.com"] = "conn2"
>>> pool["https://www.3.com"] = "conn3"
>>> pool["https://www.4.com"] = "conn4"
www.1.com is evicted
>>> pool._container
OrderedDict([('https://www.2.com', 'conn2'),
('https://www.3.com', 'conn3'),
('https://www.4.com', 'conn4')])