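ch3: File Handling and Exceptions
How do you read data from a file?
Python's basic input mechanism is line-based;
The standard "open-process-close" code in Python:
the_file=open('full file name')
#process the data in the file
the_file.close()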
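The three steps can be tried on a throwaway file. This is only a minimal sketch: the temporary directory, the file name example.txt, its contents, and the lines_seen list are all assumptions made up for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the sketch can run anywhere.
# The file name and its contents are invented for this example.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, 'example.txt')
out = open(file_path, 'w')
out.write('first line\nsecond line\n')
out.close()

the_file = open(file_path)                      # open
lines_seen = []
for each_line in the_file:                      # process, one line at a time
    lines_seen.append(each_line.rstrip('\n'))   # drop the trailing newline
the_file.close()                                # close

print(lines_seen)
```

Iterating over the file object hands back one line per pass, which is what "line-based" means in practice.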
Use IDLE to get a feel for Python's file input mechanism;
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os    #import os from the standard library
>>> os.getcwd()    #show the current working directory
'C:\\Python35-32'
>>> os.chdir('D:\workspace\headfirstpython\chapter3')    #change the working directory to the folder that holds the data file
>>> os.getcwd()    #confirm the working directory is now the right one
'D:\\workspace\\headfirstpython\\chapter3'
>>> ##open the data file, read the first two lines, and print them to the screen
>>> data=open("sketch.txt')
SyntaxError: EOL while scanning string literal
>>> #the error points at the end of the string: the quotes were not used as a matching pair
>>> data=open('sketch.txt')
>>> print(data.readline(), end='')    #use the readline() method to get one line from the file
Man: Is this the right room for an argument?
>>> print(data.readline(), end='')    #run it again to get the second line
Other Man: I've told you once.
>>> #####now "rewind" to the start of the file, then use a for loop to process every line
>>> data.seek(0)    #seek() returns to the start of the file; tell() reports the current position in the file
0
>>> for each_line in data:
        print(each_line,end='')

Man: Is this the right room for an argument?
Other Man: I've told you once.
Man: No you haven't!
Other Man: Yes I have.
Man: When?
Other Man: Just now.
Man: No you didn't!
Other Man: Yes I did!
Man: You didn't!
Other Man: I'm telling you, I did!
Man: You did not!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man: Just the five minutes. Thank you.
Other Man: Anyway, I did.
Man: You most certainly did not!
Other Man: Now let's get one thing quite clear: I most definitely told you!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh look, this isn't an argument!
(pause)
Other Man: Yes it is!
Man: No it isn't!
(pause)
Man: It's just contradiction!
Other Man: No it isn't!
Man: It IS!
Other Man: It is NOT!
Man: You just contradicted me!
Other Man: No I didn't!
Man: You DID!
Other Man: No no no!
Man: You did just then!
Other Man: Nonsense!
Man: (exasperated) Oh, this is futile!!
(pause)
Other Man: No it isn't!
Man: Yes it is!
>>> data.close()
>>>
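Summary of the functions used:
os.getcwd() #show the current working directory
os.chdir(r'D:\workspace\headfirstpython\chapter3') #change the current working directory (the r prefix keeps the backslashes literal)
data=open('sketch.txt') #open the file
data.readline() #get one line of data from the file
data.seek(0) #use the seek() method to return to the start of the file
data.close() #close the file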
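The calls summarized above can be exercised without sketch.txt. The sketch below is illustrative only: it writes its own two-line file (the name two_lines.txt and the variable names are assumptions), and it also calls tell() to show the position before and after seek(0).

```python
import os
import tempfile

# Build a two-line stand-in for the data file (name and contents are invented).
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, 'two_lines.txt')
out = open(file_path, 'w')
out.write('Man: When?\nOther Man: Just now.\n')
out.close()

data = open(file_path)
first = data.readline()               # reads up to and including the newline
position_after_first = data.tell()    # somewhere past the first line
data.seek(0)                          # rewind to the start of the file
position_after_seek = data.tell()     # back at 0
first_again = data.readline()         # the first line is read a second time
data.close()

print(repr(first), position_after_seek, repr(first_again))
```

Note that readline() keeps the trailing newline, which is why the sessions here pass end='' to print().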
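A closer look at the data:
It follows a specific format: "role: line spoken"
The split() method can extract the parts we need from each line;
The split() method returns a list of strings, which is then assigned to a list of target identifiers. This is called "multiple assignment";
(role, line_spoken)=each_line.split(':')
This splits the line at ":" and assigns the two parts to role and line_spoken respectively;
Note: split() returns a list, but the target identifiers are enclosed in parentheses, not square brackets.
Python has two kinds of list: one that can be changed (enclosed in square brackets), and one that cannot be changed once created (enclosed in parentheses). The second kind is usually called a "tuple" and can be thought of as a "constant list".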
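A short sketch of both ideas, using a sample line taken from the data file. The names parts, targets, and tuple_changed are assumptions introduced only for this illustration.

```python
each_line = 'Man: When?'

parts = each_line.split(':')         # split() returns a list of strings
(role, line_spoken) = parts          # multiple assignment unpacks the list

# A list (square brackets) can be changed after it is created...
parts[0] = 'Other Man'

# ...but a tuple (parentheses) cannot.
targets = ('role', 'line_spoken')
try:
    targets[0] = 'actor'
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(role, '|', line_spoken, '|', parts, '|', tuple_changed)
```

The text after the colon keeps its leading space, because split() cuts exactly at the separator.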
>>> data=open('sketch.txt')
>>> for each_line in data:
        (role, line_spoken)=each_line.split(':')
        print(role,end='')
        print(' said: ',end='')
        print(line_spoken,end='')

Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Traceback (most recent call last):
  File "<pyshell#29>", line 2, in <module>
    (role, line_spoken)=each_line.split(':')
ValueError: too many values to unpack (expected 2)
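What the error says: the line that follows "Man said: You most certainly did not!" produces too many values to unpack.
Analysis: the line "Other Man: Now let's get one thing quite clear: I most definitely told you!" contains two colons instead of one. The code does not tell split() what to do with the second colon, so the line splits into three parts and cannot be assigned to two identifiers;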
>>> help(each_line.split)
Help on built-in function split:

split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings

    Return a list of the words in S, using sep as the
    delimiter string. If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.
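split() has an optional parameter, maxsplit, which controls how many splits are made. Setting it to 1 breaks the line into at most two parts, which removes the effect of any extra colons in the line;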
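The difference is easy to see on the problem line itself. A minimal sketch; all_parts and two_parts are names chosen here for illustration.

```python
line = "Other Man: Now let's get one thing quite clear: I most definitely told you!"

all_parts = line.split(':')       # splits at every colon: three parts
two_parts = line.split(':', 1)    # maxsplit=1: one split, so two parts

(role, line_spoken) = two_parts   # unpacking now succeeds

print(len(all_parts), len(two_parts))
print(role)
print(line_spoken)
```

With maxsplit set to 1, the second colon stays inside line_spoken as ordinary text.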
>>> data=open('sketch.txt')
>>> for each_line in data:
        (role, line_spoken)=each_line.split(':',1)
        print(role,end='')
        print(' said: ',end='')
        print(line_spoken,end='')

Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
Traceback (most recent call last):
  File "<pyshell#39>", line 2, in <module>
    (role, line_spoken)=each_line.split(':',1)
ValueError: not enough values to unpack (expected 2, got 1)
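Other Man said: Now let's get one thing quite clear: I most definitely told you!
This line now prints to the screen successfully, but a new error appears: the "(pause)" line contains no colon at all, so it does not match the format we expect.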
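What goes wrong with "(pause)" can be reproduced in isolation. A small sketch; the variable names are assumptions, and the exact wording of the error message may differ between Python versions.

```python
line = '(pause)\n'

parts = line.split(':', 1)    # no colon, so the list has a single item

try:
    (role, line_spoken) = parts    # two targets, one value
    message = None
except ValueError as err:
    message = str(err)             # the text shown in the traceback

print(parts)
print(message)
```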
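The surprises keep piling up, and there are two very different ways to respond:
- Keep adding extra logic to deal with each exceptional case;
- Let the error happen, watch for it, and then recover (in some way) from the runtime error;
1. Add extra logic: use the string find() method
The find() method looks for a substring inside a string. If the substring cannot be found, find() returns -1; if it is found, find() returns the index of the substring within the original string.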
>>> each_line='I tell you, today is sunday!'
>>> each_line.find(':')
-1
>>> each_line='I tell you: today is sunday!'
>>> each_line.find(':')
10
>>>
>>> data=open('sketch.txt')
>>> for each_line in data:
        if not each_line.find(':')==-1
                (role, line_spoken)=each_line.split(':',1)
                print(role,end='')
                print(' said: ',end='')
                print(line_spoken,end='')

SyntaxError: invalid syntax
>>> for each_line in data:
        if not each_line.find(':')==-1:
                (role, line_spoken)=each_line.split(':',1)
                print(role,end='')
                print(' said: ',end='')
                print(line_spoken,end='')

Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
Other Man said: Yes it is!
Man said: No it isn't!
Man said: It's just contradiction!
Other Man said: No it isn't!
Man said: It IS!
Other Man said: It is NOT!
Man said: You just contradicted me!
Other Man said: No I didn't!
Man said: You DID!
Other Man said: No no no!
Man said: You did just then!
Other Man said: Nonsense!
Man said: (exasperated) Oh, this is futile!!
Other Man said: No it isn't!
Man said: Yes it is!
>>> data.close()
>>>
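The program now works, but it is fragile: if the file format changes, this code may break, the condition has to be revised, and the code grows more and more complex;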
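The find() guard can be tried on a few lines held in memory rather than in the data file. This is a sketch under assumptions: sample_lines and formatted are names made up here, and the results are collected in a list instead of being printed piece by piece.

```python
# Three lines in the style of the data file; the middle one has no colon.
sample_lines = [
    "Man: Oh look, this isn't an argument!\n",
    "(pause)\n",
    "Other Man: Yes it is!\n",
]

formatted = []
for each_line in sample_lines:
    # Only split lines that actually contain a colon.
    if each_line.find(':') != -1:
        (role, line_spoken) = each_line.split(':', 1)
        formatted.append(role + ' said:' + line_spoken.rstrip('\n'))

for entry in formatted:
    print(entry)
```

The guard works only because the one known problem is a missing colon; every new kind of malformed line would need another condition.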
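So we can instead use Python's exception-handling mechanism, which lets the error happen, watches for it, and then gives you a chance to recover:
2. Let the error happen, watch for it, and then recover (in some way) from the runtime error;
try:
    #your code (which might cause a runtime error)
except:
    #error-recovery code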
>>> data.close()
>>> data=open('sketch.txt')
>>> for each_line in data:
        try:
                (role, line_spoken)=each_line.split(':',1)
                print(role,end='')
                print(' said: ',end='')
                print(line_spoken,end='')
        except:
                pass

Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
Other Man said: Yes it is!
Man said: No it isn't!
Man said: It's just contradiction!
Other Man said: No it isn't!
Man said: It IS!
Other Man said: It is NOT!
Man said: You just contradicted me!
Other Man said: No I didn't!
Man said: You DID!
Other Man said: No no no!
Man said: You did just then!
Other Man said: Nonsense!
Man said: (exasperated) Oh, this is futile!!
Other Man said: No it isn't!
Man said: Yes it is!
>>> data.close()
>>>
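The pass statement: think of it as an empty (null) statement. Here it is used to ignore the exception that was caught, so that the program carries on with the next line;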
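The same try/except/pass pattern can be run on a few in-memory lines. A sketch only: sample_lines and formatted are assumed names, and the bare except mirrors the session above, which means it catches every runtime error and not just the failed unpacking.

```python
sample_lines = [
    "Man: Oh look, this isn't an argument!\n",
    "(pause)\n",
    "Other Man: Yes it is!\n",
]

formatted = []
for each_line in sample_lines:
    try:
        # Fails on "(pause)" because there is only one value to unpack.
        (role, line_spoken) = each_line.split(':', 1)
        formatted.append(role + ' said:' + line_spoken.rstrip('\n'))
    except:
        pass    # ignore the problem line and move on to the next one

for entry in formatted:
    print(entry)
```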
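Handling a missing file: delete or rename 'sketch.txt'
If the data file has been deleted, the program crashes with an IOError (on Python 3.3 and later, IOError is an alias of OSError, and the traceback names its subclass FileNotFoundError);
Solution 1: add more error-checking code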
import os

if os.path.exists('sketch.txt'):
    data=open('sketch.txt')
    for each_line in data:
        if not each_line.find(':')==-1:
            (role, line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
    data.close()
else:
    print('文件缺失')    #the message means "file missing"
In the IDLE edit window, press F5 to run:
>>>
========= RESTART: D:\workspace\headfirstpython\chapter3\filemiss.py =========
文件缺失
>>>
As expected.
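The check itself can be tried in isolation. In this sketch the helper describe(), the temporary directory, and both file names are hypothetical, introduced only so that one path exists and the other does not.

```python
import os
import tempfile

def describe(path):
    """Return 'found' if path exists, otherwise 'missing' (hypothetical helper)."""
    if os.path.exists(path):
        return 'found'
    else:
        return 'missing'

tmp_dir = tempfile.mkdtemp()

# One file that really exists...
present = os.path.join(tmp_dir, 'sketch.txt')
handle = open(present, 'w')
handle.close()

# ...and one path that was never created.
absent = os.path.join(tmp_dir, 'no_such_file.txt')

status_present = describe(present)
status_absent = describe(absent)
print(status_present, status_absent)
```

One caveat of checking first: the file could still disappear between the os.path.exists() call and the open() call.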
Solution 2: add another level of exception handling
try:
    data=open('sketch.txt')
    for each_line in data:
        try:
            (role, line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
        except:
            pass
    data.close()
except:
    print('文件缺失')    #the message means "file missing"
Press F5 to run:
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>>
==== RESTART: D:\workspace\headfirstpython\chapter3\filemiss_tryexcept.py ====
文件缺失
>>>
As expected.
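The two levels of handling can be wrapped in a function and tried on both a real file and a missing one. This is a sketch under assumptions: read_roles(), the temporary paths, and the sample contents are invented here, and the function returns the roles it found (or None) rather than printing.

```python
import os
import tempfile

def read_roles(path):
    """Return the list of roles found in path, or None if it cannot be opened."""
    roles = []
    try:
        data = open(path)
        for each_line in data:
            try:
                (role, line_spoken) = each_line.split(':', 1)
                roles.append(role)
            except:
                pass              # inner level: skip lines that do not fit
        data.close()
    except:
        return None               # outer level: the file could not be opened
    return roles

tmp_dir = tempfile.mkdtemp()
present = os.path.join(tmp_dir, 'sketch.txt')
out = open(present, 'w')
out.write('Man: When?\n(pause)\nOther Man: Just now.\n')
out.close()
absent = os.path.join(tmp_dir, 'no_such_file.txt')

roles_found = read_roles(present)
roles_missing = read_roles(absent)
print(roles_found, roles_missing)
```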
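As the number of errors to consider grows, the "add extra logic" approach grows in complexity with it, until the checks may end up obscuring what the program is actually for.
The exception-handling approach does not have this problem: Python's exception-handling mechanism lets us concentrate on what the code really needs to do, without worrying in advance about everything that could go wrong;
Using try makes the code easier to read, easier to write, and easier to fix!
Concentrate on what your code needs to do!!!
However, this exception-handling code is too generic: a bare except catches every runtime error, including ones we never anticipated. except should be used in a less generic way;
Specifying particular exceptions: name the error type on the except line.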
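Why "too generic" matters can be shown with a deliberate bug. In this sketch the two functions and the misspelled name rol are hypothetical: the bare except silently swallows the resulting NameError, while except ValueError lets it surface.

```python
sample_lines = ['Man: When?\n', '(pause)\n']

def roles_bare(lines):
    roles = []
    for each_line in lines:
        try:
            (role, line_spoken) = each_line.split(':', 1)
            roles.append(rol)     # deliberate typo: raises NameError
        except:
            pass                  # hides the typo along with the bad lines
    return roles

def roles_specific(lines):
    roles = []
    for each_line in lines:
        try:
            (role, line_spoken) = each_line.split(':', 1)
            roles.append(rol)     # same deliberate typo
        except ValueError:
            pass                  # only the failed unpacking is ignored
    return roles

hidden = roles_bare(sample_lines)     # quietly returns an empty list

try:
    roles_specific(sample_lines)
    surfaced = False
except NameError:
    surfaced = True                   # the bug is reported instead of hidden

print(hidden, surfaced)
```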
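try:
    data=open('sketch.txt')
    for each_line in data:
        try:
            (role, line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
        except ValueError:    #only the failed unpacking is ignored
            pass
    data.close()
except IOError:    #only a file that cannot be opened is reported
    print('文件缺失')    #the message means "file missing"
Press F5 to run: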
>>>
==== RESTART: D:\workspace\headfirstpython\chapter3\filemiss_tryexcept.py ====
文件缺失
>>>
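On Python 3.3 and later, the exception raised for a missing file is FileNotFoundError, and except IOError still catches it because IOError is an alias of OSError, its parent class. The sketch below checks this; the temporary path and the variable names are assumptions for illustration.

```python
import os
import tempfile

# A path inside a fresh temporary directory that was never created.
missing_path = os.path.join(tempfile.mkdtemp(), 'sketch.txt')

try:
    data = open(missing_path)
    data.close()
    caught = None
except IOError as err:
    caught = type(err).__name__       # name of the concrete exception class

ioerror_is_oserror = IOError is OSError
is_subclass = issubclass(FileNotFoundError, OSError)

print(caught, ioerror_is_oserror, is_subclass)
```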