Python: Iterables, Iterators, and Generators
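1. Summary
(1)
(a) iterable: a capability
An object that can return its members one at a time, as in for i in a_list, without indexing into it by position.
Examples include all sequence types (list, str, tuple), dict, file, and instances of any class that defines __iter__() or __getitem__().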
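(b) iterator: the concrete implementation
An object representing a stream of data. Repeated calls to the iterator's next() method (__next__() in Python 3) return successive items in the stream. When no more data is available, a StopIteration exception is raised.
(c) How the two relate: iterator = iter(iterable)
When using an iterable, you usually do not need to call iter() or handle the iterator object yourself. The for statement does that for you, creating a temporary unnamed variable that holds the iterator for the duration of the loop.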
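What the for statement does behind the scenes can be approximated by hand. The sketch below assumes Python 3 (the built-in next() instead of the .next() method), and for_each is a hypothetical helper name used only for illustration.

```python
def for_each(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)        # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)    # step 2: pull the next item from the stream
        except StopIteration:        # step 3: the stream is exhausted, stop looping
            break
        action(item)


seen = []
for_each("abc", seen.append)
print(seen)
```

The real for statement also supports break, continue, and an else clause, which this sketch leaves out.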
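(2) The generator: an iterator that "generates" a series of values with some algorithm and hands them out one at a time through a for loop or next().
(a) Simple algorithms: a list comprehension returns the complete list at once, whereas a generator expression returns an iterator known as a generator.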
In [84]: [i**2 for i in range(10) if i % 2 == 0]
Out[84]: [0, 4, 16, 36, 64]

In [85]: (i**2 for i in range(10) if i % 2 == 0)
Out[85]: <generator object <genexpr> at 0x000000000A284CA8>

In [86]: [i**2 for i in range(10) if i % 2 == 0]
Out[86]: [0, 4, 16, 36, 64]

In [87]: gen = (i**2 for i in range(10) if i % 2 == 0)

In [88]: gen
Out[88]: <generator object <genexpr> at 0x000000000A3AE510>

In [89]: gen.next()
Out[89]: 0

In [90]: gen.next()
Out[90]: 4

In [91]: for i in gen:
    ...:     print i
    ...:
16
36
64

In [92]: gen.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-92-b2c61ce5e131> in <module>()
----> 1 gen.next()

StopIteration:
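The session above was recorded under Python 2. As a rough Python 3 equivalent, the sketch below uses the built-in next(), whose optional second argument is returned instead of raising StopIteration once the generator is exhausted.

```python
gen = (i**2 for i in range(10) if i % 2 == 0)

first = next(gen)         # first value produced by the generator
second = next(gen)        # second value
rest = list(gen)          # consumes everything that is left
after = next(gen, None)   # exhausted: the default is returned, nothing is raised

print(first, second, rest, after)
```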
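(b) More complex algorithms: an ordinary function that contains yield becomes a "generator function", and calling it returns an iterator known as a generator.
Each yield temporarily suspends execution and saves the current execution state (including local variables and pending try statements). When the generator resumes, it continues from where it was suspended instead of starting from the top as an ordinary function call would.
Following Liao Xuefeng's generator tutorial: in the Fibonacci sequence, every number except the first and second is the sum of the two before it: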
In [96]: def fib(max):
    ...:     n, a, b = 0, 0, 1
    ...:     while n < max:
    ...:         print b
    ...:         a, b = b, a + b
    ...:         n = n + 1
    ...:

In [97]: fib(6)
1
1
2
3
5
8

In [98]: def fib(max):
    ...:     n, a, b = 0, 0, 1
    ...:     while n < max:
    ...:         yield b  # print changed to yield
    ...:         a, b = b, a + b
    ...:         n = n + 1
    ...:

In [99]: gen = fib(6)

In [100]: gen?
Type:        generator
String form: <generator object fib at 0x000000000A417F30>
Docstring:   <no docstring>

In [101]: gen = fib(6)

In [102]: gen.next()
Out[102]: 1

In [103]: for i in gen:
     ...:     print i
     ...:
1
2
3
5
8

In [104]: gen.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-104-b2c61ce5e131> in <module>()
----> 1 gen.next()

StopIteration:
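To make the suspend-and-resume behavior visible, the following sketch (Python 3) records events in a list. countdown and log are hypothetical names chosen for this illustration. Calling the generator function runs none of its body, and each next() runs only up to the following yield.

```python
log = []


def countdown(n):
    log.append("started")                 # runs on the first next(), not at call time
    while n > 0:
        yield n                           # suspend here; the local n is preserved
        log.append(f"resumed after {n}")  # execution continues here on the next resume
        n -= 1
    log.append("finished")


gen = countdown(2)
log.append("created")      # the body of countdown has not started yet
log.append(next(gen))      # runs up to the first yield
log.append(next(gen))      # resumes after the first yield, stops at the second
leftover = list(gen)       # resumes once more; the loop ends and the generator finishes

print(log)
```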
(c) Checking whether an object is iterable, and whether it is a generator

In [131]: from collections import Iterable

In [132]: isinstance(fib(6), Iterable)
Out[132]: True

In [133]: import types

In [134]: isinstance(fib(6), types.GeneratorType)
Out[134]: True
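In Python 3 the abstract base classes live in collections.abc; importing Iterable from collections directly no longer works on recent versions. A sketch of the same checks under that assumption, with fib redefined so the block is self-contained:

```python
import types
from collections.abc import Iterable, Iterator


def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n += 1


checks = {
    "list is Iterable": isinstance([1, 2, 3], Iterable),
    "list is Iterator": isinstance([1, 2, 3], Iterator),          # a list has no __next__
    "iter(list) is Iterator": isinstance(iter([1, 2, 3]), Iterator),
    "fib(6) is Iterator": isinstance(fib(6), Iterator),
    "fib(6) is generator": isinstance(fib(6), types.GeneratorType),
    "fib is generator": isinstance(fib, types.GeneratorType),     # fib itself is a function
}

for name, result in checks.items():
    print(name, result)
```

Note that isinstance(obj, Iterable) does not detect classes that are iterable only through __getitem__(); calling iter(obj) is the reliable test.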
(d) Compare this with the class below, which becomes iterable by defining __iter__(self) and next(self). Simply changing print b to yield b keeps the function concise while giving the same iterable behavior.

In [106]: class Fib(object):
     ...:
     ...:     def __init__(self, max):
     ...:         self.max = max
     ...:         self.n, self.a, self.b = 0, 0, 1
     ...:
     ...:     def __iter__(self):
     ...:         return self
     ...:
     ...:     def next(self):
     ...:         if self.n < self.max:
     ...:             r = self.b
     ...:             self.a, self.b = self.b, self.a + self.b
     ...:             self.n = self.n + 1
     ...:             return r
     ...:         raise StopIteration()
     ...:

In [107]: gen = Fib(6)

In [108]: gen
Out[108]: <__main__.Fib at 0xa3cd160>

In [109]: gen?
Type:        Fib
String form: <__main__.Fib object at 0x000000000A3CD160>
Docstring:   <no docstring>

In [110]: gen.next()
Out[110]: 1

In [111]: for i in gen:
     ...:     print i
     ...:
1
2
3
5
8

In [112]: gen.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-112-b2c61ce5e131> in <module>()
----> 1 gen.next()

<ipython-input-106-0ae2acae18e3> in next(self)
     14             self.n = self.n + 1
     15             return r
---> 16         raise StopIteration()
     17

StopIteration:
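For comparison, here is a sketch of the same class under Python 3, where the method must be named __next__() rather than next(), placed next to the generator function. Both produce the same values.

```python
class Fib:
    """Class-based iterator: all state is kept in instance attributes."""

    def __init__(self, max):
        self.max = max
        self.n, self.a, self.b = 0, 0, 1

    def __iter__(self):
        return self                   # an iterator returns itself

    def __next__(self):               # spelled next() in Python 2
        if self.n >= self.max:
            raise StopIteration
        r = self.b
        self.a, self.b = self.b, self.a + self.b
        self.n += 1
        return r


def fib(max):
    """Generator function: the same state lives in local variables."""
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n += 1


class_values = list(Fib(6))
generator_values = list(fib(6))
print(class_values, generator_values)
```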
2. References
Iterables vs. Iterators vs. Generators
3. Official documentation
https://docs.python.org/2/glossary.html#term-generator
http://python.usyiyi.cn/documents/python_278/glossary.html
http://python.usyiyi.cn/documents/python_352/glossary.html
sequence
An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a len() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and unicode. Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary immutable keys rather than integers.
Note: the Python 3 glossary lists bytes in place of unicode.
iterable
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict and file and objects of any classes you define with an __iter__() or __getitem__() method.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …).
When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.
Iterables work in a for loop and anywhere else a sequence is expected, such as zip() and map():
# zip([iterable, …])
In [7]: zip([1,2,3],[4,5,6])
Out[7]: [(1, 4), (2, 5), (3, 6)]

# map(function, iterable, …)
In [11]: map(lambda x,y:x+y,[1,2,3],[4,5,6])
Out[11]: [5, 7, 9]
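Liao Xuefeng on iteration: in Python, iteration is done with for ... in, whereas in many languages such as C or Java, iterating over a list is done by index.
When an iterable is passed to the built-in function iter(), an iterator for that object is returned. This iterator is good for a single pass over the values.
When using an iterable, you usually do not need to call iter() or handle the iterator object yourself. The for statement does that for you, creating a temporary unnamed variable that holds the iterator for the duration of the loop.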
In [33]: iter('abc')
Out[33]: <iterator at 0xa387588>

In [34]: iter([1,2,3])
Out[34]: <listiterator at 0xa387da0>

In [35]: iter((1,2,3))
Out[35]: <tupleiterator at 0xa387fd0>

In [36]: d={'a':1,'b':2}

In [37]: iter(d)
Out[37]: <dictionary-keyiterator at 0xa33def8>

In [38]: d.items()
Out[38]: [('a', 1), ('b', 2)]

In [39]: d.iteritems()
Out[39]: <dictionary-itemiterator at 0x3e96458>

In [40]: d.iterkeys()
Out[40]: <dictionary-keyiterator at 0xa3a24f8>

In [41]: d.itervalues()
Out[41]: <dictionary-valueiterator at 0xa3750e8>
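The glossary entry also counts classes that define only __getitem__() as iterable. A minimal sketch of that case in Python 3, using a hypothetical Squares class: iter() falls back to calling __getitem__() with 0, 1, 2, ... until IndexError is raised.

```python
class Squares:
    """Hypothetical class with __getitem__ only: no __iter__, no __len__."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if index >= self.count:
            raise IndexError(index)   # signals the end of the data to iter()
        return index * index


squares = Squares(4)
defines_iter = hasattr(squares, "__iter__")   # the class really has no __iter__
values = list(iter(squares))                  # iter() still succeeds
looped = [v for v in squares]                 # and so does a for loop

print(defines_iter, values, looped)
```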
iterator
An object representing a stream of data. Repeated calls to the iterator’s next() method return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its next() method just raise StopIteration again.
Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes.
A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
More information can be found in Iterator Types.
Note: in Python 3 the method is named __next__().
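The difference between a container and an iterator under repeated passes can be checked directly. A short Python 3 sketch:

```python
numbers = [1, 2, 3]

# A container hands out a fresh iterator for every pass.
first_pass = [x for x in numbers]
second_pass = [x for x in numbers]

# An iterator returns itself from iter(), so a second pass finds it exhausted.
it = iter(numbers)
returns_itself = iter(it) is it
iterator_first = list(it)
iterator_second = list(it)    # looks like an empty container

print(first_pass, second_pass, returns_itself, iterator_first, iterator_second)
```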
https://docs.python.org/2/library/stdtypes.html#typeiter
http://python.usyiyi.cn/documents/python_278/library/stdtypes.html#typeiter
http://python.usyiyi.cn/documents/python_352/library/stdtypes.html#typeiter
5.5. Iterator Types
New in version 2.2.
Python supports a concept of iteration over containers. This is implemented using two distinct methods; these are used to allow user-defined classes to support iteration. Sequences, described below in more detail, always support the iteration methods.
One method needs to be defined for container objects to provide iteration support:
container.__iter__()
    Return an iterator object. The object is required to support the iterator protocol described below. If a container supports different types of iteration, additional methods can be provided to specifically request iterators for those iteration types. (An example of an object supporting multiple forms of iteration would be a tree structure which supports both breadth-first and depth-first traversal.) This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.
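The tree mentioned in the parenthesis can be sketched as follows (Python 3). Tree, depth_first, and breadth_first are hypothetical names, and the traversal methods are written as generators purely for brevity. __iter__() supplies the default iteration, while the extra method requests the other traversal order explicitly.

```python
from collections import deque


class Tree:
    """Hypothetical n-ary tree node used only for this illustration."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    def __iter__(self):
        # Default iteration order: depth-first, pre-order.
        return self.depth_first()

    def depth_first(self):
        yield self.value
        for child in self.children:
            yield from child.depth_first()

    def breadth_first(self):
        queue = deque([self])
        while queue:
            node = queue.popleft()
            yield node.value
            queue.extend(node.children)


tree = Tree(1, [Tree(2, [Tree(4), Tree(5)]), Tree(3, [Tree(6)])])
dfs_order = list(tree)                   # uses __iter__()
bfs_order = list(tree.breadth_first())   # explicitly requested traversal

print(dfs_order, bfs_order)
```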
The iterator objects themselves are required to support the following two methods, which together form the iterator protocol:
iterator.__iter__()
    Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements. This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.
iterator.next()
    Return the next item from the container. If there are no further items, raise the StopIteration exception. This method corresponds to the tp_iternext slot of the type structure for Python objects in the Python/C API.
Python defines several iterator objects to support iteration over general and specific sequence types, dictionaries, and other more specialized forms. The specific types are not important beyond their implementation of the iterator protocol.
The intention of the protocol is that once an iterator’s next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken. (This constraint was added in Python 2.3; in Python 2.2, various iterators are broken according to this rule.)
5.5.1. Generator Types
Python’s generators provide a convenient way to implement the iterator protocol. If a container object’s __iter__() method is implemented as a generator, it will automatically return an iterator object (technically, a generator object) supplying the __iter__() and next() methods. More information about generators can be found in the documentation for the yield expression.
Note: in Python 3 the method is named __next__().
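A minimal sketch of a container whose __iter__() is written as a generator (Python 3; Countdown is a hypothetical class). Because every call to __iter__() creates a new generator object, the container can be traversed repeatedly without writing a separate iterator class.

```python
class Countdown:
    """Hypothetical container whose __iter__ is implemented as a generator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1


countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)     # a fresh generator is created for each pass

iterator = iter(countdown)
supports_protocol = hasattr(iterator, "__iter__") and hasattr(iterator, "__next__")

print(first_pass, second_pass, supports_protocol)
```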
https://docs.python.org/2/reference/expressions.html#yieldexpr
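5.2.10. Yield expressions
Defining a function with yield makes it a "generator function". Calling it returns an iterator known as a "generator", which provides .next() and raises StopIteration when exhausted.
yield_atom       ::=  "(" yield_expression ")"
yield_expression ::=  "yield" [expression_list]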
New in version 2.5.
The yield expression is only used when defining a generator function, and can only be used in the body of a function definition. Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.
When a generator function is called, it returns an iterator known as a generator. That generator then controls the execution of a generator function. The execution starts when one of the generator’s methods is called. At that time, the execution proceeds to the first yield expression, where it is suspended again, returning the value of expression_list to generator’s caller. By suspended we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack. When the execution is resumed by calling one of the generator’s methods, the function can proceed exactly as if the yield expression was just another external call. The value of the yield expression after resuming depends on the method which resumed the execution.
All of this makes generator functions quite similar to coroutines; they yield multiple times, they have more than one entry point and their execution can be suspended. The only difference is that a generator function cannot control where should the execution continue after it yields; the control is always transferred to the generator’s caller.
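That the value of the yield expression depends on how the generator was resumed can be seen in a small Python 3 sketch. recorder and received are hypothetical names. Resuming with next() makes the yield expression evaluate to None, while send(x) makes it evaluate to x.

```python
received = []


def recorder():
    while True:
        value = yield len(received)   # the caller sees how many values are stored so far
        received.append(value)        # value is whatever the resuming call supplied


gen = recorder()
a = next(gen)         # starts execution and stops at the first yield
b = next(gen)         # resumes: the yield expression evaluates to None
c = gen.send("hi")    # resumes: the yield expression evaluates to "hi"
gen.close()

print(a, b, c, received)
```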
5.2.10.1. Generator-iterator methods
This subsection describes the methods of a generator iterator. They can be used to control the execution of a generator function.
Note that calling any of the generator methods below when the generator is already executing raises a ValueError exception. This happens when a method of the generator is invoked while its body is still running, for example from inside the generator itself or from another thread.
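One way to trigger this ValueError is to have a generator try to advance itself. The sketch below assumes Python 3 and uses a hypothetical generator named reentrant. The name gen is looked up only when the body runs, at which point it refers to the generator that is currently executing.

```python
def reentrant():
    # By the time this line runs, `gen` is this very generator, which is
    # still executing, so asking it for its next value is a re-entrant call.
    yield next(gen)


gen = reentrant()
try:
    next(gen)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```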
generator.next()
    Starts the execution of a generator function or resumes it at the last executed yield expression. When a generator function is resumed with a next() method, the current yield expression always evaluates to None. The execution then continues to the next yield expression, where the generator is suspended again, and the value of the expression_list is returned to next()’s caller. If the generator exits without yielding another value, a StopIteration exception is raised.
generator.send(value)
    Resumes the execution and “sends” a value into the generator function. The value argument becomes the result of the current yield expression. The send() method returns the next value yielded by the generator, or raises StopIteration if the generator exits without yielding another value. When send() is called to start the generator, it must be called with None as the argument, because there is no yield expression that could receive the value.
generator.throw(type[, value[, traceback]])
    Raises an exception of type type at the point where generator was paused, and returns the next value yielded by the generator function. If the generator exits without yielding another value, a StopIteration exception is raised. If the generator function does not catch the passed-in exception, or raises a different exception, then that exception propagates to the caller.
generator.close()
    Raises a GeneratorExit at the point where the generator function was paused. If the generator function then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit.
Here is a simple example that demonstrates the behavior of generators and generator functions:

>>> def echo(value=None):
...     print "Execution starts when 'next()' is called for the first time."
...     try:
...         while True:
...             try:
...                 value = (yield value)
...             except Exception, e:
...                 value = e
...     finally:
...         print "Don't forget to clean up when 'close()' is called."
...
>>> generator = echo(1)
>>> print generator.next()
Execution starts when 'next()' is called for the first time.
1
>>> print generator.next()
None
>>> print generator.send(2)
2
>>> generator.throw(TypeError, "spam")
TypeError('spam',)
>>> generator.close()
Don't forget to clean up when 'close()' is called.
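The example above is Python 2 syntax. A possible Python 3 port is sketched below, under two assumptions: the print statements are replaced by appends to a hypothetical events list so the order of events can be inspected, and throw() is given an exception instance, since the three-argument form is deprecated in recent Python 3 releases.

```python
events = []


def echo(value=None):
    events.append("started")           # runs when next() is called for the first time
    try:
        while True:
            try:
                value = (yield value)
            except Exception as e:     # Python 3 spelling of `except Exception, e`
                value = e
    finally:
        events.append("cleaned up")    # runs when close() is called


generator = echo(1)
r1 = next(generator)                      # the initial value is yielded back
r2 = next(generator)                      # resumed by next(): yield evaluates to None
r3 = generator.send(2)                    # resumed by send(2): yield evaluates to 2
r4 = generator.throw(TypeError("spam"))   # the exception is caught and yielded back
generator.close()

print(r1, r2, r3, repr(r4), events)
```

GeneratorExit derives from BaseException rather than Exception, so the inner except clause does not swallow it and close() can finish after the finally block runs.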
https://docs.python.org/2/glossary.html#term-generator
generator (generator function)
A function which returns an iterator. It looks like a normal function except that it contains yield statements for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function. Each yield temporarily suspends processing, remembering the location execution state (including local variables and pending try-statements). When the generator resumes, it picks-up where it left-off (in contrast to functions which start fresh on every invocation).
The term usually refers to a generator function, but in some contexts it may refer to a generator iterator. Where the intended meaning is not clear, using the full terms avoids ambiguity.
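generator iterator
https://docs.python.org/3/glossary.html#term-generator-iterator
An object created by a generator function.
Each yield temporarily suspends processing, remembering the location execution state (including local variables and pending try-statements). When the generator iterator resumes, it picks up where it left off (in contrast to functions which start fresh on every invocation).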
generator expression
An expression that returns an iterator. It looks like a normal expression followed by a for expression defining a loop variable, range, and an optional if expression. The combined expression generates values for an enclosing function:

In [3]: [i*i for i in range(10)]
Out[3]: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [4]: (i*i for i in range(10))
Out[4]: <generator object <genexpr> at 0x000000000A284F78>

In [5]: sum(i*i for i in range(10))
Out[5]: 285
https://docs.python.org/2/glossary.html#term-list-comprehension
list comprehension
A compact way to process all or part of the elements in a sequence and return a list with the results. result = ["0x%02x" % x for x in range(256) if x % 2 == 0] generates a list of strings containing even hex numbers (0x..) in the range from 0 to 255. The if clause is optional. If omitted, all elements in range(256) are processed.
The Python 3 version of this entry writes the same example as result = ['{:#04x}'.format(x) for x in range(256) if x % 2 == 0].
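The two spellings of the glossary example can be run side by side to confirm that they build the same list. A short Python 3 sketch, where old_style and new_style are names chosen here for illustration:

```python
# %-formatting, as in the Python 2 glossary entry
old_style = ["0x%02x" % x for x in range(256) if x % 2 == 0]

# str.format(): "#" adds the 0x prefix, "04" zero-pads to a total width of 4
new_style = ["{:#04x}".format(x) for x in range(256) if x % 2 == 0]

print(old_style[:3], old_style[-1], len(old_style))
```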