JackLi07

Python File Operations

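Basic file operations fall into reading, writing, modifying and a few other operations. They are carried out in the following modes:

1. Read mode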

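f = open(file='file.txt', mode='r', encoding='utf-8')
f.read()      # pass a size to limit how much is read; the next read() continues from the cursor position
f.readline()  # reads one line; the result keeps its trailing "\n"
f.close()
# encoding: read the file with the same encoding it was saved in. (If the encoding is unknown, the third-party module chardet can try to detect it.)
# mode: changing r to rb opens the file in binary mode (the encoding argument must then be omitted)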

f.read() reads the entire file into a single string. To process the file line by line, use a for loop:

for line in f:
    print(line)
# each line already ends with '\n' and print adds another one, so the output is double-spaced
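
The reading calls above can be tried out with a small sketch like the following. The file contents and the temporary path are made up for the demonstration; only open(), read(), readline() and iteration come from the text.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory for this demo.
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('first\nsecond\nthird\n')

with open(path, mode='r', encoding='utf-8') as f:
    head = f.read(3)      # reads 3 characters from the start
    rest = f.readline()   # continues from the cursor to the end of the line
    # Iterating yields the remaining lines; strip the trailing newline of each.
    lines = [line.rstrip('\n') for line in f]

print(repr(head), repr(rest), lines)
```

Using `with` closes the file automatically, so no explicit f.close() is needed.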

2. Write mode

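f = open(file='file.txt', mode='w', encoding='utf-8')
f.write('A')
f.close()
# the argument must be a single string; use placeholders (string formatting) to write several values
# in w mode the existing file contents are cleared first, then the file is rewritten by the code
# in w mode, if the file does not exist, it is created automatically under the given name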
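
A minimal sketch of the three points above (placeholders, truncation on open, automatic creation). The path and the values 'Alice' and 30 are illustrative assumptions.

```python
import os
import tempfile

# Hypothetical path inside a fresh temporary directory, so the file does not exist yet.
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
existed_before = os.path.exists(path)

# w mode creates the missing file; placeholders build the single string to write.
with open(path, mode='w', encoding='utf-8') as f:
    f.write('%s is %d years old\n' % ('Alice', 30))

# Opening in w mode again clears the previous contents before writing.
with open(path, mode='w', encoding='utf-8') as f:
    f.write('replaced')

with open(path, mode='r', encoding='utf-8') as f:
    content = f.read()

print(existed_before, repr(content))
```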

3. Append mode (a)

f = open(file='file.txt', mode='a')
f.write('A')
f.close()
# A is always appended at the end of the file
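
As a quick check of the append behaviour, the sketch below opens the same (hypothetical, temporary) file twice in a mode and confirms that earlier contents survive.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file

# Unlike w mode, a mode keeps what is already there and adds to the end.
for text in ('A\n', 'B\n'):
    with open(path, mode='a', encoding='utf-8') as f:
        f.write(text)

with open(path, mode='r', encoding='utf-8') as f:
    appended = f.read()

print(repr(appended))
```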

4. Mixed modes

a. Read-write mode (r+)

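f = open(file='file.txt', mode='r+')
f.read()
f.write('A')
# read() works like a cursor: after reading, the cursor sits at the end of the file, so data already read is not read again and the write lands at the end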
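
The cursor behaviour of r+ can be sketched as follows, assuming a small throwaway file containing 'hello'. Note that r+ does not create the file, so it must exist first.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file
with open(path, mode='w', encoding='utf-8') as f:
    f.write('hello')

with open(path, mode='r+', encoding='utf-8') as f:
    first = f.read()     # reads everything; the cursor is now at the end
    second = f.read()    # nothing left to read from this position
    f.write(' world')    # so this write is added after the old contents
    f.seek(0)            # move the cursor back to the start
    f.write('J')         # this write overwrites the first character instead

with open(path, mode='r', encoding='utf-8') as f:
    final = f.read()

print(repr(first), repr(second), repr(final))
```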

b. Write-read mode (w+)

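f = open(file='file.txt', mode='w+')
f.read()
f.write('A')
f.seek(0)
f.read()
# w+ clears the file on open, so the first read() returns nothing; before the second read(), move the cursor back to the start with seek(0), and it then returns what write() wrote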
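
A small sketch of the same sequence, with the results captured so the effect of each step is visible. The pre-existing contents 'old contents' are an assumption made for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file
with open(path, mode='w', encoding='utf-8') as f:
    f.write('old contents')

with open(path, mode='w+', encoding='utf-8') as f:
    first = f.read()      # the file was cleared on open, so nothing comes back
    f.write('new text')
    f.seek(0)             # rewind; otherwise read() starts after the written text
    second = f.read()

print(repr(first), repr(second))
```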

c. Append-read mode (a+)

f = open(file='file.txt', mode='a+')
f.seek(0)  # a file opened in a+ mode starts with the cursor at the end
f.read()
f.seek(0)
f.write('A')  # in a+ mode, writes always go to the end of the file, wherever the cursor is
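
The claim that a+ always writes at the end, even after seek(0), can be checked with a sketch like this one (the file and its contents 'abc' are made up):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file
with open(path, mode='w', encoding='utf-8') as f:
    f.write('abc')

with open(path, mode='a+', encoding='utf-8') as f:
    f.seek(0)            # the cursor starts at the end, so rewind before reading
    before = f.read()
    f.seek(0)            # rewinding does not change where writes go
    f.write('XYZ')
    f.seek(0)
    after = f.read()

print(repr(before), repr(after))
```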

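File operations

Every file operation takes place in one of the modes above. With reading, writing and appending covered, the remaining methods are introduced below through the source stub of the file class (the stub describes the Python 2 file type, so next() and xreadlines() are not available as methods on Python 3 file objects):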

class file(object):

    def close(self):  # real signature unknown; restored from __doc__
        # Close the file
        """close() -> None or (perhaps) an integer.  Close the file.

        Sets data attribute .closed to True.  A closed file cannot be used for
        further I/O operations.  close() may be called more than once without
        error.  Some kinds of file objects (for example, opened by popen())
        may return an exit status upon closing.
        """

    def fileno(self):  # real signature unknown; restored from __doc__
        # File descriptor
        """fileno() -> integer "file descriptor".

        This is needed for lower-level file interfaces, such as os.read(). """
        return 0

    def flush(self):  # real signature unknown; restored from __doc__
        # Flush the file's internal buffer
        """ flush() -> None.  Flush the internal I/O buffer. """
        pass

    def isatty(self):  # real signature unknown; restored from __doc__
        # Check whether the file is connected to a tty device
        """ isatty() -> true or false.  True if the file is connected to a tty device. """
        return False

    def next(self):  # real signature unknown; restored from __doc__
        # Get the next line of data; raises an error if there is none
        """ x.next() -> the next value, or raise StopIteration """
        pass

    def read(self, size=None):  # real signature unknown; restored from __doc__
        # Read the given number of bytes
        """read([size]) -> read at most size bytes, returned as a string.

        If the size argument is negative or omitted, read until EOF is reached.
        Notice that when in non-blocking mode, less data than what was requested
        may be returned, even if no size parameter was given."""
        pass

    def readinto(self):  # real signature unknown; restored from __doc__
        # Read into a buffer; do not use, it is going to be dropped
        """ readinto() -> Undocumented.  Don't use this; it may go away. """
        pass

    def readline(self, size=None):  # real signature unknown; restored from __doc__
        # Read just one line of data
        """readline([size]) -> next line from the file, as a string.

        Retain newline.  A non-negative size argument limits the maximum
        number of bytes to return (an incomplete line may be returned then).
        Return an empty string at EOF. """
        pass

    def readlines(self, size=None):  # real signature unknown; restored from __doc__
        # Read all data and return it as a list split on line breaks
        """readlines([size]) -> list of strings, each a line from the file.

        Call readline() repeatedly and return a list of the lines so read.
        The optional size argument, if given, is an approximate bound on the
        total number of bytes in the lines returned. """
        return []

    def seek(self, offset, whence=None):  # real signature unknown; restored from __doc__
        # Set the position of the file pointer
        """seek(offset[, whence]) -> None.  Move to new file position.

        Argument offset is a byte count.  Optional argument whence defaults to
        0 (offset from start of file, offset should be >= 0); other values are 1
        (move relative to current position, positive or negative), and 2 (move
        relative to end of file, usually negative, although many platforms allow
        seeking beyond the end of a file).  If the file is opened in text mode,
        only offsets returned by tell() are legal.  Use of other offsets causes
        undefined behavior.
        Note that not all file objects are seekable. """
        pass

    def tell(self):  # real signature unknown; restored from __doc__
        # Get the current pointer position
        """ tell() -> current file position, an integer (may be a long integer). """
        pass

    def truncate(self, size=None):  # real signature unknown; restored from __doc__
        # Truncate the data, keeping only what comes before the given position
        """ truncate([size]) -> None.  Truncate the file to at most size bytes.

        Size defaults to the current file position, as returned by tell()."""
        pass

    def write(self, p_str):  # real signature unknown; restored from __doc__
        # Write content
        """write(str) -> None.  Write string str to file.

        Note that due to buffering, flush() or close() may be needed before
        the file on disk reflects the data written."""
        pass

    def writelines(self, sequence_of_strings):  # real signature unknown; restored from __doc__
        # Write a list of strings to the file
        """writelines(sequence_of_strings) -> None.  Write the strings to the file.

        Note that newlines are not added.  The sequence can be any iterable object
        producing strings. This is equivalent to calling write() for each string. """
        pass

    def xreadlines(self):  # real signature unknown; restored from __doc__
        # Can be used to read the file line by line instead of all at once
        """xreadlines() -> returns self.

        For backward compatibility. File objects now include the performance
        optimizations previously implemented in the xreadlines module. """
        pass
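
To make two of the docstrings above concrete, here is a small sketch written for Python 3: writelines() adds no newlines of its own, and readlines() keeps the newline on each line. The built-in next() is used in place of the stub's next() method; the file and its contents are assumptions for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file

with open(path, mode='w', encoding='utf-8') as f:
    # Each string is written as-is; the last one has no newline, so none is stored.
    f.writelines(['one\n', 'two\n', 'three'])

with open(path, mode='r', encoding='utf-8') as f:
    first = next(f)             # Python 3 spelling of "get the next line"
    remaining = f.readlines()   # the rest of the file as a list of lines

print(repr(first), remaining)
```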

Common operations:

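1. fileno: returns the file descriptor (the file's index in the kernel); used for I/O multiplexing
2. flush: forces data held in the in-memory buffer to be written out to the file on disk
3. readable: checks whether the file can be read
4. readline: reads a single line, stopping at \r or \n
5. tell: returns the pointer (cursor) position, in bytes
6. seek:

  With one argument, moves the cursor to the given position (the number is a byte count)  # whereas in read() on a text-mode file, the number is a character count

  With two arguments:

    seek(0, 1): leaves the cursor at the current position

    seek(0, 0): moves the cursor to the start of the file

    seek(0, 2): moves the cursor to the end of the file

7. truncate: cuts the file off at the cursor and deletes everything after it  # if truncate() is given a value, the file is cut to that many bytes counted from the start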
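
The difference between character counts in read() and byte counts in seek(), tell() and truncate() shows up as soon as the file holds non-ASCII text. The sketch below assumes a UTF-8 file containing '兰州abc' (each Chinese character takes 3 bytes) and uses binary mode for the byte-offset calls so the offsets are unambiguous.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file
with open(path, mode='w', encoding='utf-8') as f:
    f.write('兰州abc')   # 2 characters of 3 bytes each, plus 3 ASCII bytes

with open(path, mode='r', encoding='utf-8') as f:
    can_read = f.readable()
    fd_is_int = isinstance(f.fileno(), int)   # the descriptor is a plain integer
    two_chars = f.read(2)                     # text mode: 2 characters

with open(path, mode='rb') as f:
    f.seek(3)                            # byte offset 3: start of the second character
    pos = f.tell()
    second = f.read(3).decode('utf-8')   # binary mode: 3 bytes, decoded by hand
    f.seek(0, 2)                         # jump to the end of the file
    size = f.tell()                      # so tell() gives the size in bytes

with open(path, mode='r+b') as f:
    f.seek(6)
    f.truncate()                         # cut at the cursor: bytes 6 onward are dropped

with open(path, mode='r', encoding='utf-8') as f:
    after_truncate = f.read()

print(two_chars, pos, second, size, after_truncate)
```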

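*Modifying files

Because of the way files are stored, changing the contents of a file takes some care. The approach is as follows:

First use seek() to move the cursor to the position to be changed, then write() the new content. Note, however, that this can only overwrite existing content; it cannot insert. To insert, open two files and use readline() to copy line by line, adding the new content while saving everything to the new file.

As an example, the code below changes everyone listed under 兰州 (Lanzhou) to 北京 (Beijing) in a contacts file (联系方式.txt):

Original file contents: [screenshot not preserved]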
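
A minimal sketch of the overwrite-only behaviour, using binary mode so that seek(6) is a plain byte offset. The contents b'Hello World' are an assumption for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')  # hypothetical demo file
with open(path, mode='wb') as f:
    f.write(b'Hello World')

with open(path, mode='r+b') as f:
    f.seek(6)         # position the cursor on the 'W'
    f.write(b'Py')    # replaces 'Wo'; nothing is shifted to make room

with open(path, mode='rb') as f:
    overwritten = f.read()

print(overwritten)
```

The file keeps its original length: the two written bytes took the place of two existing ones.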

# Modify using extra disk space (write the result to a new file)
name = '联系方式'
new_name = '%s_new' % name

f = open('%s.txt' % name, 'r', encoding='utf-8')
f_new = open('%s.txt' % new_name, 'w', encoding='utf-8')

old_str = '兰州'
new_str = '北京'

for line in f:  # loop over each line and check whether it contains text to replace
    if old_str in line:
        line = line.replace(old_str, new_str)
    f_new.write(line)

f.close()
f_new.close()

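This approach scans the file line by line (replacing where needed) and writes each line to the new file (联系方式_new.txt) as soon as it has been processed, until the whole file is done.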
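
If the goal is to end up with the original file name holding the new contents, one possible follow-up is to swap the finished copy over the original with os.replace(). This step is not part of the original code; the function name replace_in_file, the '.new' suffix and the sample rows are all hypothetical.

```python
import os
import tempfile

def replace_in_file(path, old_str, new_str):
    """Rewrite path with old_str replaced by new_str, one line at a time."""
    new_path = path + '.new'   # hypothetical name for the temporary copy
    with open(path, 'r', encoding='utf-8') as f, \
         open(new_path, 'w', encoding='utf-8') as f_new:
        for line in f:
            f_new.write(line.replace(old_str, new_str))
    # Both files are closed here; now put the new copy in place of the original.
    os.replace(new_path, path)

# Usage on a throwaway file with made-up contact rows.
path = os.path.join(tempfile.mkdtemp(), 'contacts.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('Alice 兰州\nBob 上海\n')

replace_in_file(path, '兰州', '北京')

with open(path, 'r', encoding='utf-8') as f:
    result = f.read()

print(result)
```

Only one line is held in memory at a time, at the cost of temporarily needing disk space for a second copy of the file.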

Result: [screenshot not preserved]

The other approach is to hold the whole content in memory and write it back once the changes are made:

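# Modify using memory (load the whole file, then write it back)
old_str = '兰州'
new_str = '北京'

f = open('联系方式.txt', 'r+', encoding='utf-8')
data = f.read()  # keep the entire file contents in data as a single string
data = data.replace(old_str, new_str)
f.seek(0)  # after replacing, move the cursor to the start of the file to overwrite the old contents
f.write(data)
f.close()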

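Result: [screenshot not preserved]

Compared with the first method, this one has two problems:

1. When old_str and new_str differ in length, the end of the file can be corrupted: write() overwrites the file in place, so if the new content is shorter than the old, the leftover tail of the old content remains after it. This follows from how files are stored on disk.

2. When the file is very large, reading all of it into memory slows the machine down and can even exhaust memory.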
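
The first problem can be reproduced, and avoided, with a call to truncate() after the write. The sketch below is one way to do it, not the original author's code; the helper names and the sample text 'AAAA BBBB' are assumptions.

```python
import os
import tempfile

def replace_in_place(path, old_str, new_str, truncate=True):
    """Load the whole file, replace, and write it back over the old contents."""
    with open(path, 'r+', encoding='utf-8') as f:
        data = f.read().replace(old_str, new_str)
        f.seek(0)
        f.write(data)
        if truncate:
            f.truncate()   # cut the file at the cursor, dropping any old tail

def demo(truncate):
    # Hypothetical throwaway file where the replacement makes the text shorter.
    path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('AAAA BBBB')
    replace_in_place(path, 'AAAA', 'C', truncate)
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()

without_truncate = demo(False)   # the old tail 'BBB' is left behind
with_truncate = demo(True)       # the file ends where the new data ends

print(repr(without_truncate), repr(with_truncate))
```

The second problem is unchanged by this: the whole file is still read into memory, so the line-by-line method remains the safer choice for large files.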



posted @ 2018-03-22 14:14  JackLi07