PEP 3106 -- Revamping dict.keys(), .values() and .items()
2017-12-12 11:21 很大很老实
1. Abstract
This PEP proposes to change the .keys(), .values() and .items() methods of the built-in dict type to return a set-like or unordered container object whose contents are derived from the underlying dictionary rather than a list which is a copy of the keys, etc.; and to remove the .iterkeys(), .itervalues() and .iteritems() methods.
The approach is inspired by that taken in the Java Collections Framework [1].
2. Introduction
It has long been the plan to change the .keys(), .values() and .items() methods of the built-in dict type to return a more lightweight object than a list, and to get rid of .iterkeys(), .itervalues() and .iteritems(). The idea is that code that currently (in 2.x) reads:
for k, v in d.iteritems(): ...
should be rewritten as:
for k, v in d.items(): ...
(and similar for .itervalues() and .iterkeys(), except the latter is redundant since we can write that loop as for k in d.)
Code that currently reads:
a = d.keys() # assume we really want a list here
(etc.) should be rewritten as
a = list(d.keys())
There are (at least) two ways to accomplish this. The original plan was to simply let .keys(), .values() and .items() return an iterator, i.e. exactly what iterkeys(), itervalues() and iteritems() return in Python 2.x. However, the Java Collections Framework [1] suggests that a better solution is possible: the methods return objects with set behavior (for .keys() and .items()) or multiset (== bag) behavior (for .values()) that do not contain copies of the keys, values or items, but rather reference the underlying dict and pull their values out of the dict as needed.
The advantage of this approach is that one can still write code like this:
a = d.items()
for k, v in a: ...
# And later, again:
for k, v in a: ...
Effectively, iter(d.keys()) (etc.) in Python 3.0 will do what d.iterkeys() (etc.) does in Python 2.x; but in most contexts we don't have to write the iter() call because it is implied by a for-loop.
Note: a list is iterable, but it is not an iterator; iter() has to be called on it (a for-loop does this implicitly) to obtain one. The same holds for the objects returned by .keys(), .values() and .items(). See:
https://stackoverflow.com/questions/45458631/how-python-built-in-function-iter-convert-a-python-list-to-an-iterator
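A short Python 3 sketch of the behavior described above: the object returned by d.items() can be looped over more than once, it reflects later changes to the dict, and it is iterable without being an iterator itself. The dict d and its contents are made up for illustration.

```python
d = {"a": 1, "b": 2}
items = d.items()                 # a view onto d, not a copy

first_pass = sorted(k for k, v in items)
second_pass = sorted(k for k, v in items)   # the view can be iterated again

d["c"] = 3                        # the existing view sees the new entry
after_insert = sorted(items)      # [('a', 1), ('b', 2), ('c', 3)]

# iter() turns the view into an iterator; this is what a for-loop does
# implicitly, and what d.iterkeys() returned in Python 2.x.
it = iter(d.keys())
first_key = next(it)

# The view itself is not an iterator, so next() rejects it.
try:
    next(d.keys())
    view_is_iterator = True
except TypeError:
    view_is_iterator = False
```

An iterator obtained this way is exhausted after one pass, while the view can hand out a fresh iterator every time it is looped over.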
The objects returned by the .keys() and .items() methods behave like sets. The object returned by the .values() method behaves like a much simpler unordered collection -- it cannot be a set because duplicate values are possible.
Because of the set behavior, it will be possible to check whether two dicts have the same keys by simply testing:
if a.keys() == b.keys(): ...
and similarly for .items().
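As a small illustration in Python 3 (the dicts a, b, c and dup are invented for this example), comparing key views ignores the values, while comparing item views takes them into account. The last lines show why .values() cannot be a set.

```python
a = {"x": 1, "y": 2}
b = {"y": 20, "x": 10}   # same keys as a, different values
c = {"y": 2, "x": 1}     # same keys and values as a

same_keys = a.keys() == b.keys()          # True: only the keys are compared
same_items_ab = a.items() == b.items()    # False: the values differ
same_items_ac = a.items() == c.items()    # True: same (key, value) pairs

# Values may repeat, so the values view keeps every occurrence.
dup = {"p": 0, "q": 0}
values_as_list = list(dup.values())       # [0, 0]
```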
These operations are thread-safe only to the extent that using them in a thread-unsafe way may cause an exception but will not cause corruption of the internal representation.
As in Python 2.x, mutating a dict while iterating over it using an iterator has an undefined effect and will in most cases raise a RuntimeError exception. (This is similar to the guarantees made by the Java Collections Framework.)
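The following sketch shows the common case in CPython, where adding a key during iteration is detected and reported as a RuntimeError. The text above only promises this "in most cases", so the exception should not be relied on as a guarantee. The dicts d and d2 are made up for illustration.

```python
d = {"a": 1, "b": 2, "c": 3}

try:
    for k in d:
        d[k.upper()] = 0      # adding a key changes the dict's size mid-loop
except RuntimeError as exc:
    error = exc
else:
    error = None

# A safe pattern: iterate over a snapshot (a real list) while mutating.
d2 = {"a": 1, "b": 2, "c": 3}
for k in list(d2):
    if d2[k] % 2:             # drop entries with odd values
        del d2[k]
```

Taking the snapshot with list() is the same idiom the introduction recommends for code that really needs a list.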
The objects returned by .keys() and .items() are fully interoperable with instances of the built-in set and frozenset types; for example:
set(d.keys()) == d.keys()
is guaranteed to be True (except when d is being modified simultaneously by another thread).
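A brief Python 3 sketch of this interoperability, using an invented dict d: key views compare against sets and frozensets, and the binary operators return new, real set objects.

```python
d = {"a": 1, "b": 2, "c": 3}
keys = d.keys()

equal_to_set = set(keys) == keys          # True, as stated above
common = keys & {"b", "c", "z"}           # {'b', 'c'}
combined = keys | {"z"}                   # {'a', 'b', 'c', 'z'}
without_a = keys - {"a"}                  # {'b', 'c'}
is_subset = keys <= frozenset("abcd")     # True

result_type = type(common)                # a plain set, not a view
```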
3. Specification
I'm using pseudo-code to specify the semantics:
class dict:

    # Omitting all other dict methods for brevity.
    # The .iterkeys(), .itervalues() and .iteritems() methods
    # will be removed.

    def keys(self):
        return d_keys(self)

    def items(self):
        return d_items(self)

    def values(self):
        return d_values(self)
class d_keys:

    def __init__(self, d):
        self.__d = d

    def __len__(self):
        return len(self.__d)

    def __contains__(self, key):
        return key in self.__d

    def __iter__(self):
        for key in self.__d:
            yield key

    # The following operations should be implemented to be
    # compatible with sets; this can be done by exploiting
    # the above primitive operations:
    #
    #   <, <=, ==, !=, >=, > (returning a bool)
    #   &, |, ^, - (returning a new, real set object)
    #
    # as well as their method counterparts (.union(), etc.).
    #
    # To specify the semantics, we can specify x == y as:
    #
    #   set(x) == set(y)   if both x and y are d_keys instances
    #   set(x) == y        if x is a d_keys instance
    #   x == set(y)        if y is a d_keys instance
    #
    # and so on for all other operations.
class d_items:

    def __init__(self, d):
        self.__d = d

    def __len__(self):
        return len(self.__d)

    def __contains__(self, item):
        # Tuple parameters such as "(key, value)" are a syntax error
        # in Python 3, so the pair is unpacked explicitly.
        key, value = item
        return key in self.__d and self.__d[key] == value

    def __iter__(self):
        for key in self.__d:
            yield key, self.__d[key]

    # As well as the set operations mentioned for d_keys above.
    # However the specifications suggested there will not work if
    # the values aren't hashable. Fortunately, the operations can
    # still be implemented efficiently. For example, this is how
    # intersection can be specified:

    def __and__(self, other):
        if isinstance(other, (set, frozenset, d_keys)):
            result = set()
            for item in other:
                if item in self:
                    result.add(item)
            return result
        if not isinstance(other, d_items):
            return NotImplemented
        d = {}
        # Iterate over the smaller of the two views.
        if len(other) < len(self):
            self, other = other, self
        for item in self:
            if item in other:
                key, value = item
                d[key] = value
        return d.items()

    # And here is equality:

    def __eq__(self, other):
        if isinstance(other, (set, frozenset, d_keys)):
            if len(self) != len(other):
                return False
            for item in other:
                if item not in self:
                    return False
            return True
        if not isinstance(other, d_items):
            return NotImplemented
        # XXX We could also just compare the underlying dicts...
        if len(self) != len(other):
            return False
        for item in self:
            if item not in other:
                return False
        return True

    def __ne__(self, other):
        # XXX Perhaps object.__ne__() should be defined this way.
        result = self.__eq__(other)
        if result is not NotImplemented:
            result = not result
        return result
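The point about unhashable values can be tried out with Python 3's built-in item views: membership and equality only look up the key and compare the value, so nothing needs to hash the value. The helper items_intersection is hypothetical; it follows the __and__ pseudo-code above but returns a plain dict rather than an items view, purely to keep the sketch short.

```python
a = {"k": [1, 2], "m": [3]}      # list values are not hashable
b = {"k": [1, 2], "m": [3]}
c = {"k": [1, 2], "m": [4]}

has_pair = ("k", [1, 2]) in a.items()   # True: key lookup plus value ==
eq_ab = a.items() == b.items()          # True
eq_ac = a.items() == c.items()          # False: the value under "m" differs


def items_intersection(x, y):
    """Hypothetical helper: pairs present in both dicts, values unhashed."""
    if len(y) < len(x):
        x, y = y, x                     # iterate over the smaller dict
    return {k: v for k, v in x.items() if (k, v) in y.items()}


shared = items_intersection(a, c)       # {'k': [1, 2]}
```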
class d_values:

    def __init__(self, d):
        self.__d = d

    def __len__(self):
        return len(self.__d)

    def __contains__(self, value):
        # This is slow, and it's what "x in y" uses as a fallback
        # if __contains__ is not defined; but I'd rather make it
        # explicit that it is supported.
        for v in self:
            if v == value:
                return True
        return False

    def __iter__(self):
        for key in self.__d:
            yield self.__d[key]

    def __eq__(self, other):
        if not isinstance(other, d_values):
            return NotImplemented
        if len(self) != len(other):
            return False
        # XXX Sometimes this could be optimized, but these are the
        # semantics: we can't depend on the values to be hashable
        # or comparable.
        olist = list(other)
        for x in self:
            try:
                olist.remove(x)
            except ValueError:
                return False
        assert olist == []
        return True

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is not NotImplemented:
            result = not result
        return result
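The multiset comparison specified for d_values can be sketched as a standalone function. values_equal is a hypothetical helper, not part of any library. It is useful in practice because the Python 3 documentation states that an equality comparison between two dict.values() views always returns False, so the semantics above are not what the built-in views provide.

```python
def values_equal(x, y):
    """Hypothetical helper: same values with the same multiplicities.

    Only == is used, so the values need not be hashable or sortable.
    """
    remaining = list(y)
    if len(x) != len(remaining):
        return False
    for v in x:
        try:
            remaining.remove(v)     # removes the first element equal to v
        except ValueError:
            return False
    return not remaining


a = {"p": [1], "q": [2], "r": [1]}
b = {"x": [2], "y": [1], "z": [1]}   # same values as a, other keys and order
c = {"x": [1], "y": [2], "z": [2]}   # same length, different multiplicities

same_ab = values_equal(a.values(), b.values())   # True
same_ac = values_equal(a.values(), c.values())   # False
builtin_eq = a.values() == a.values()            # False with built-in views
```

Because each remove() is a linear scan, the comparison is quadratic in the number of values, which matches the remark in the pseudo-code that it could sometimes be optimized.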