Python recipe (10): Processing a File Paragraph by Paragraph
Where is the code?
Example Source Code [http://www.cnblogs.com/tomsheep/]
'''
Created on 2010-5-22

@author: lk
'''


class Paragraphs:
    def __init__(self, fileobj, seperator='\n'):
        self.seq = fileobj.readlines()
        self.line_num = 0
        self.para_num = 0
        # make sure the separator ends with a newline so it matches a whole line
        if seperator[-1:] != '\n':
            seperator += '\n'
        self.seperator = seperator

    def __getitem__(self, index):
        if index != self.para_num:
            raise TypeError('only sequential access supported')
        # get the first line of current paragraph
        self.para_num += 1
        while 1:
            # an IndexError here means no lines are left; it ends the for loop
            line = self.seq[self.line_num]
            self.line_num += 1
            if line != self.seperator:
                break
        result = [line]
        # get the rest
        while 1:
            # line = self.seq[self.line_num]  # tag1:
            try:
                line = self.seq[self.line_num]
            except IndexError:
                break
            self.line_num += 1
            if line == self.seperator:
                break
            result.append(line)
        return ''.join(result)


if __name__ == '__main__':
    text = Paragraphs(open("test.txt"))
    for para in text:
        print(para)
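The code above is adapted from Python Cookbook, recipe 4-9.
Overview:
Process a file paragraph by paragraph. A custom Paragraphs class implements the container-behavior method __getitem__.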
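For comparison, the same paragraph-by-paragraph idea can also be sketched with a generator instead of __getitem__. This is not the Cookbook's method or the one used above. It is only a minimal sketch, assuming Python 3 and a blank line as the default separator. The function name iter_paragraphs is hypothetical, and io.StringIO stands in for a real file.

```python
import io


def iter_paragraphs(fileobj, separator='\n'):
    """Yield paragraphs (as strings) from fileobj, split on separator lines.

    Hypothetical helper for illustration; not part of the original recipe.
    """
    if not separator.endswith('\n'):
        separator += '\n'
    current = []
    for line in fileobj:
        if line == separator:
            # a separator line closes the paragraph collected so far
            if current:
                yield ''.join(current)
                current = []
        else:
            current.append(line)
    if current:
        # the last paragraph may not be followed by a separator
        yield ''.join(current)


if __name__ == '__main__':
    sample = io.StringIO('first line\nsecond line\n\n\nthird line\n')
    for para in iter_paragraphs(sample):
        print(repr(para))
```

A generator reads the file lazily rather than calling readlines() up front, and it needs no special handling for the end of the input.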
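Code notes:
1. The __getitem__ method gives a custom type container behavior, so that it can be accessed as x[key].
2. When I first wrote the code, I did not use try…except at tag1. Oddly, running it did not raise an IndexError; it simply printed one paragraph fewer. After thinking for a while, I guessed that the IndexError was being caught somewhere outside __getitem__. I tested this by manually raising IndexError inside __getitem__, and indeed it was not reported (whereas a ValueError was). So the caller of __getitem__ catches IndexError: when a class defines __getitem__ but no __iter__, the for loop calls __getitem__ with 0, 1, 2, ... and treats IndexError as the end of the sequence. Without the try…except, the IndexError raised while reading the last paragraph ended the loop before that paragraph was returned.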
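The behavior in note 2 can be reproduced in isolation. The following is a minimal sketch, assuming Python 3. The names Countdown, Broken and collect are hypothetical, invented only for this illustration. It shows x[key] access through __getitem__, a for loop that ends quietly on IndexError, and a ValueError that propagates to the caller.

```python
class Countdown:
    """Hypothetical container: defines __getitem__ only, no __iter__."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, index):
        # list indexing raises IndexError once index runs past the end
        return self.items[index]


class Broken:
    """Hypothetical container whose __getitem__ raises ValueError instead."""

    def __getitem__(self, index):
        if index >= 2:
            raise ValueError('not treated as end of iteration')
        return index


def collect(container):
    """Iterate with a plain for loop and return the items it produced."""
    seen = []
    for item in container:
        seen.append(item)
    return seen


if __name__ == '__main__':
    c = Countdown(['a', 'b', 'c'])
    print(c[1])        # x[key] access goes through __getitem__
    print(collect(c))  # the IndexError at index 3 silently ends the loop
    try:
        collect(Broken())
    except ValueError as exc:
        print('ValueError propagated:', exc)
```

Since the for loop swallows IndexError, an accidental IndexError inside __getitem__ looks like a normal end of data, which is why the missing try…except showed up as a lost paragraph rather than a traceback.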