The Road to Python, Part 15: File Operations
- Opening a file
- Operating on a file
  - Reading a file
  - Writing a file
  - ...
- Closing a file
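Opening a file
Use the open() function. It returns a file object, and every later operation on the file is performed through that object.
    f = open(filename, mode, encoding)
On disk a file is nothing more than a sequence of bytes (0s and 1s). When a file is opened in ordinary (text) mode, Python decodes those bytes into strings using an encoding. The encoding argument may be omitted, but in the Python 3.5 discussed here the fallback is not unconditionally UTF-8: it is the platform's preferred encoding, locale.getpreferredencoding(False), which is UTF-8 on most Linux and macOS systems and often a legacy code page on Windows. Passing encoding='utf-8' explicitly is the safer habit.
When a file is opened in bytes mode, the decoding step is skipped and every operation works on raw bytes, so you have to convert between bytes and strings yourself.
In most cases the ordinary modes r/w/a are all you need.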
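To make the difference between text mode and bytes mode concrete, here is a minimal sketch. The scratch file name demo.txt and the temporary directory are assumptions made only for this illustration; they are not part of the original post.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: we hand over a str and Python encodes it with the named encoding.
with open(path, "w", encoding="utf-8") as f:
    f.write("哈哈")

# Bytes mode: nothing is decoded, read() returns the raw bytes.
with open(path, "rb") as f:
    raw = f.read()

# Text mode again: the same bytes are decoded back into a str.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

print(type(raw), len(raw))    # bytes object, 6 bytes (3 per character in UTF-8)
print(type(text), len(text))  # str object, 2 characters
print(raw.decode("utf-8") == text)
```

The same two characters occupy six bytes on disk, which is why the character count in text mode and the byte count in bytes mode differ.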
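- Ordinary (text) modes (every operation works with str)
  - r: read-only (the default); the file must already exist
  - w: write-only, not readable; creates the file if it does not exist, empties it if it does
  - x: write-only, not readable; creates the file if it does not exist, raises FileExistsError if it does
  - a: append, not readable; creates the file if it does not exist, otherwise keeps the content and only appends to it
- Bytes modes (every operation works with bytes; you will need these for sockets)
  - rb: like r, but with bytes
  - wb: like w, but with bytes
  - xb: like x, but with bytes
  - ab: like a, but with bytes
- + modes (f.tell() returns the pointer position, f.seek() moves it)
  - r+: read and write; the most commonly used of all the modes. The file must exist and is not emptied. The pointer starts at 0, so reading starts from the beginning, and a write issued right after opening overwrites from the beginning. Once everything has been read, the pointer is at the end and a write appends. To write anywhere else, call seek() first; the write then continues from that position.
  - w+: write and read; the file is emptied first. After a write the pointer sits at the end, so you have to seek back before you can read what was written.
  - x+: write and read; raises an error if the file already exists, otherwise identical to w+
  - a+: write and read; the pointer is already moved to the end when the file is opened (this is what appending really means)
- Note: "read and write" (r+) is not the same as "write and read" (w+): r+ keeps the existing content, w+ wipes it.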
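The differences between w, a and x listed above can be checked with a short sketch. The file name modes.txt and the variable names are assumptions chosen for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# w: creates the file, or empties it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")
with open(path, "w", encoding="utf-8") as f:
    f.write("second\n")  # "first" has been wiped by the second open

# a: keeps what is there and writes after it.
with open(path, "a", encoding="utf-8") as f:
    f.write("third\n")

# x: refuses to touch a file that already exists.
try:
    open(path, "x", encoding="utf-8")
    x_failed = False
except FileExistsError:
    x_failed = True

with open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(repr(content))
print(x_failed)
```

The x mode is handy when overwriting an existing file by accident would be costly, since it fails loudly instead of emptying the file.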
    # r: read-only mode
    f = open('hello.txt', 'r', encoding='utf-8')
    data = f.read()
    f.close()
    print(type(data))
    print(data)
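------------------------------------
data is of type str. If encoding is left out, Python falls back to locale.getpreferredencoding(False), which is often, but not always, UTF-8.
When the file is opened in binary mode (rb), the encoding argument is not needed; passing it in binary mode raises a ValueError.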
    # rb mode
    f = open('hello.txt', 'rb')
    data = f.read()
    f.close()
    s = str(data, encoding='utf-8')  # equivalent to data.decode('utf-8')
    print(data)
    print(s)

    # wb mode
    f = open('hello.txt', 'wb')
    f.write(bytes('哈哈', encoding='utf-8'))  # equivalent to '哈哈'.encode('utf-8')
    f.close()
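When a file is opened in a + mode:
f.read() accepts an optional number: the number of characters to read in text mode, or the number of bytes in b mode. For example, data = f.read(3) reads three characters and moves the pointer forward accordingly.
The pointer can always be moved before reading. Whether moving it affects a write depends on the mode: in r+ and w+ a write lands at the position set by seek(), whereas in a and a+ every write goes to the end of the file no matter where the pointer was moved (this is the behavior on most platforms).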
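The following sketch illustrates read(n), tell() and seek() in r+ and a+ mode. The file name pointer.txt is an assumption, and the a+ result shown relies on the operating system honoring append mode, which is the case on the common platforms. The sketch calls seek() explicitly before each write, because mixing reads and writes in text mode without seeking can place data somewhere unexpected.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "pointer.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("abcdef")

# r+: the pointer starts at 0 and the content is kept.
with open(path, "r+", encoding="utf-8") as f:
    start = f.tell()        # position right after opening
    head = f.read(3)        # three characters: 'abc'
    f.seek(0)               # move explicitly before writing
    f.write("XY")           # overwrites 'ab' in place
    f.seek(0)
    after_rplus = f.read()

# a+: the seek before the write is ignored, the data still lands at the end.
with open(path, "a+", encoding="utf-8") as f:
    f.seek(0)
    f.write("!")
    f.seek(0)               # seek back in order to read from the start
    after_aplus = f.read()

print(start, head, after_rplus, after_aplus)
```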
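To sum up: on disk a file is stored as bytes (that is, in binary form). When you work with a file in ordinary mode, Python first decodes those bytes into strings, using the encoding you pass or, if you pass none, the platform's preferred encoding. When you work in bytes mode, Python skips that conversion. Ordinary mode is the usual choice, because working with characters is simply easier.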
File operations:
    class TextIOWrapper(_TextIOBase):
        """
        Character and line based layer over a BufferedIOBase object, buffer.

        encoding gives the name of the encoding that the stream will be
        decoded or encoded with. It defaults to locale.getpreferredencoding(False).

        errors determines the strictness of encoding and decoding (see
        help(codecs.Codec) or the documentation for codecs.register) and
        defaults to "strict".

        newline controls how line endings are handled. It can be None, '',
        '\n', '\r', and '\r\n'.  It works as follows:

        * On input, if newline is None, universal newlines mode is
          enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
          these are translated into '\n' before being returned to the
          caller. If it is '', universal newline mode is enabled, but line
          endings are returned to the caller untranslated. If it has any of
          the other legal values, input lines are only terminated by the given
          string, and the line ending is returned to the caller untranslated.

        * On output, if newline is None, any '\n' characters written are
          translated to the system default line separator, os.linesep. If
          newline is '' or '\n', no translation takes place. If newline is any
          of the other legal values, any '\n' characters written are translated
          to the given string.

        If line_buffering is True, a call to flush is implied when a call to
        write contains a newline character.
        """
        def close(self, *args, **kwargs): # real signature unknown
            """Close the file; the object can no longer be read or written afterwards."""
            pass

        def fileno(self, *args, **kwargs): # real signature unknown
            """Return the file descriptor."""
            pass

        def flush(self, *args, **kwargs): # real signature unknown
            """
            Flush the internal buffer. After write() the data first sits in a
            buffer and is not necessarily in the file yet; flush() pushes it out.
            close() flushes for you, but if the program dies without closing or
            flushing, the buffered data may be lost.
            """
            pass

        def isatty(self, *args, **kwargs): # real signature unknown
            """Whether the file is connected to a tty (terminal) device."""
            pass

        def read(self, *args, **kwargs): # real signature unknown
            """Read up to the given number of characters; read everything if no number is given."""
            pass

        def readable(self, *args, **kwargs): # real signature unknown
            """Whether the file can be read."""
            pass

        def readline(self, *args, **kwargs): # real signature unknown
            """Read a single line."""
            pass

        def seek(self, *args, **kwargs): # real signature unknown
            """Move the file pointer to the given position."""
            pass

        def seekable(self, *args, **kwargs): # real signature unknown
            """Whether the pointer can be moved."""
            pass

        def tell(self, *args, **kwargs): # real signature unknown
            """Return the current pointer position."""
            pass

        def truncate(self, *args, **kwargs): # real signature unknown
            """Truncate the file, keeping only the data before the given position."""
            pass

        def writable(self, *args, **kwargs): # real signature unknown
            """Whether the file can be written."""
            pass

        def write(self, *args, **kwargs): # real signature unknown
            """Write content."""
            pass

        def __getstate__(self, *args, **kwargs): # real signature unknown
            pass

        def __init__(self, *args, **kwargs): # real signature unknown
            pass

        @staticmethod # known case of __new__
        def __new__(*args, **kwargs): # real signature unknown
            """ Create and return a new object.  See help(type) for accurate signature. """
            pass

        def __next__(self, *args, **kwargs): # real signature unknown
            """ Implement next(self). """
            pass

        def __repr__(self, *args, **kwargs): # real signature unknown
            """ Return repr(self). """
            pass

        buffer = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        closed = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        encoding = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        errors = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        line_buffering = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        name = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        newlines = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        _CHUNK_SIZE = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
        _finalizing = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
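A few of the methods listed above are easier to understand when seen in action. This is a small sketch, assuming a scratch file named methods.txt; the size measured before flush() is typically 0 because a short write is still sitting in the buffer, but that detail depends on buffer sizes and is not guaranteed.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "methods.txt")

with open(path, "w+", encoding="utf-8") as f:
    can_read, can_write = f.readable(), f.writable()  # both True for w+

    f.write("line one\nline two\n")
    size_before_flush = os.path.getsize(path)  # usually 0: data is still buffered
    f.flush()
    size_after_flush = os.path.getsize(path)   # the bytes have reached the file

    f.seek(0)             # move the pointer back to the start
    first = f.readline()  # reads up to and including the first newline

    f.seek(0)
    f.truncate(4)         # keep only the first 4 bytes of the file

closed_after = f.closed   # the with block has closed the file

with open(path, "r", encoding="utf-8") as f:
    remaining = f.read()

print(can_read, can_write, repr(first), repr(remaining), closed_after)
```

Note that truncate() counts in bytes, so with multi-byte characters the cut position should come from tell() rather than from counting characters.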
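Beyond these methods there is one more very handy feature: after f = open(...), the object f is itself iterable, and looping over it yields the file one line at a time.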
    for line in f:
        print(line)
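One detail worth knowing when looping over a file: each line comes back with its trailing newline, so print(line) produces an extra blank line between the lines. A minimal sketch, assuming a scratch file named lines.txt, that strips the newline while iterating:

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\ngamma\n")

lines = []
f = open(path, "r", encoding="utf-8")
for line in f:                       # one line per iteration, newline included
    lines.append(line.rstrip("\n"))  # drop the trailing newline
f.close()

print(lines)
```

An alternative is print(line, end=''), which keeps the newline from the file and suppresses the one that print() adds.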
The with keyword:
with open(...) as f: closes the file automatically when the block ends, so the explicit f.close() can be dropped.
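The sketch below checks that a with block really closes the file, both on a normal exit and when an exception is raised inside the block. The file name with_demo.txt and the simulated error are assumptions made for the illustration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# Normal exit: the file is closed as soon as the block ends.
with open(path, "w", encoding="utf-8") as f:
    f.write("hello")
closed_normally = f.closed

# Exit through an exception: the file is closed all the same.
try:
    with open(path, "r", encoding="utf-8") as g:
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = g.closed

print(closed_normally, closed_after_error)
```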
Since Python 2.7, a single with statement can open two (or more) files at once:
    with open(...) as f1, open(...) as f2:
        pass
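This is very useful, for example when a large file has to be copied:
    with open('source_file', 'r') as obj1, open('new_file', 'w') as obj2:
        for line in obj1:
            obj2.write(line)
Copying this way uses little memory, because only one line is held in memory at a time.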
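Line-by-line copying assumes the file is text with reasonably short lines. For binary files, or files with no line breaks at all, a variant that copies fixed-size chunks in bytes mode keeps memory bounded in the same way. The helper copy_in_chunks, its chunk_size default and the file names are hypothetical, introduced only for this sketch.

```python
import os
import tempfile


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Hypothetical helper: copy src to dst, at most chunk_size bytes at a time."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)  # returns b'' at end of file
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied


# Build a scratch source file and copy it.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.bin")
dst = os.path.join(workdir, "copy.bin")
with open(src, "wb") as f:
    f.write(b"x" * 200000)

n = copy_in_chunks(src, dst)
print(n, os.path.getsize(dst))
```

For everyday use the standard library's shutil.copyfile does the same job; the loop is shown only to make the memory behavior visible.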
Three things help to ease the weariness of life: hope, sleep, and a smile. --- Kant