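Newlines when reading and writing files in Python
The system's newline and path separator
The os module exposes the newline and path separator of the current system.
Windows
>>> import os
>>> os.linesep
'\r\n'
>>> os.sep
'\\'
Linux
>>> import os
>>> os.linesep  # newline
'\n'
>>> os.sep  # path separator
'/'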
The newline parameter of open()
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
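Reading a file
- newline=None (default): universal newlines mode is enabled. Lines may end in \n, \r, or \r\n, and each of these is translated to \n before the text is returned.
- newline='': universal newlines mode still decides where lines end, but nothing is translated; line endings are returned exactly as stored.
- newline='\n', '\r', or '\r\n' (the other legal values): only the given string terminates a line, and line endings are returned untranslated.
In short, when reading with the default newline, every line ending arrives as \n, whatever the file actually contains.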
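The three reading modes can be compared on any platform by writing the sample file in binary mode, so that nothing is translated on the way in. This is a minimal sketch; the helper names read_with and readlines_with and the temporary file are illustrative assumptions, not part of the original examples.

```python
import os
import tempfile


def read_with(path, newline):
    """Return the whole file as text, decoded with the given newline argument."""
    with open(path, 'r', newline=newline, encoding='ascii') as f:
        return f.read()


def readlines_with(path, newline):
    """Return the file's lines, split according to the given newline argument."""
    with open(path, 'r', newline=newline, encoding='ascii') as f:
        return f.readlines()


# Write raw bytes containing all three line-ending styles.
fd, sample_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as f:
    f.write(b'a\rb\nc\r\nd')

translated = read_with(sample_path, None)        # 'a\nb\nc\nd'
raw = read_with(sample_path, '')                 # 'a\rb\nc\r\nd'
crlf_lines = readlines_with(sample_path, '\r\n') # ['a\rb\nc\r\n', 'd']
os.remove(sample_path)

print(repr(translated))
print(repr(raw))
print(crlf_lines)
```

With newline='\r\n', the lone \r and the lone \n stay inside the first line because only \r\n counts as a terminator.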
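Writing a file
- newline=None (default): every \n written is translated to the system default line separator (os.linesep). On Windows that separator is \r\n, so a string that already contains \r\n has its \n translated again and ends up on disk as \r\r\n.
- newline='' or newline='\n': no translation takes place.
- newline='\r' or '\r\n' (the other legal values): every \n written is translated to the given string.
In short, when writing with the default newline, \n is translated to os.linesep. To keep the output consistent, remove whatever line ending a string carries with rstrip('\r\n') (plain rstrip() also removes trailing spaces and tabs), then append a single \n.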
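The writing rules can be checked by writing in text mode and then reading the bytes back in binary mode. A minimal sketch follows; written_bytes is a hypothetical helper introduced only for this illustration.

```python
import os
import tempfile


def written_bytes(text, newline):
    """Write text with the given newline argument and return the raw bytes on disk."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    os.close(fd)
    try:
        with open(path, 'w', newline=newline, encoding='ascii') as f:
            f.write(text)
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)


print(written_bytes('a\nb', ''))      # b'a\nb'   (no translation)
print(written_bytes('a\nb', '\n'))    # b'a\nb'   (no translation)
print(written_bytes('a\nb', '\r\n'))  # b'a\r\nb' (\n becomes \r\n)
print(written_bytes('a\nb', '\r'))    # b'a\rb'   (\n becomes \r)

# The default depends on the platform: \n becomes os.linesep.
default_bytes = written_bytes('a\nb', None)
print(default_bytes == b'a' + os.linesep.encode('ascii') + b'b')  # True
```

Only the last call gives platform-dependent bytes; the explicit newline values behave the same on Windows and Linux.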
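Example 1: line break typed in a text editor, then a raw read and a translated read in Python
In a text editor on Windows, enter line1 and line2 into test.txt, pressing Enter between them. Then read the file back:
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\r\nline2'
'line1\nline2'
The result is as expected.
On write:
Pressing Enter in the editor inserts \r\n.
On read:
newline='': no translation, so the raw 'line1\r\nline2' is returned.
newline=None: \r\n is translated to \n.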
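Example 2: write \n with translation in Python (on Windows), then a raw read and a translated read
with open('test.txt', 'w') as f:
    f.write('line1\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\r\nline2'
'line1\nline2'
The result is as expected.
On write:
newline=None: \n is translated to the system default line separator, here \r\n.
On read:
newline='': \r\n is not translated and comes back as stored.
newline=None: \r\n is translated to \n.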
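Example 3: write \n without translation in Python, then a raw read and a translated read
with open('test.txt', 'w', newline='') as f:
    f.write('line1\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\nline2'
'line1\nline2'
The result is as expected.
On write:
newline='': no translation, so 'line1\nline2' is written as is.
On read:
newline='': no translation, so 'line1\nline2' is returned as is.
newline=None: \r\n would be translated to \n, but the file contains no \r\n, so the \n comes back unchanged.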
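Stripping whitespace from the ends of a string
Whitespace here means \n, \r, \t, spaces, and so on.
The string methods strip(), lstrip(), and rstrip()
str.strip removes whitespace from both ends of a string.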
>>> help(str.strip)
Help on method_descriptor:
strip(...)
S.strip([chars]) -> str
Return a copy of the string S with leading and trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
str.lstrip removes whitespace from the start of a string.
>>> help(str.lstrip)
Help on method_descriptor:
lstrip(...)
S.lstrip([chars]) -> str
Return a copy of the string S with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.
str.rstrip removes whitespace from the end of a string.
>>> help(str.rstrip)
Help on method_descriptor:
rstrip(...)
S.rstrip([chars]) -> str
Return a copy of the string S with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
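One common use of rstrip() with an explicit chars argument is to normalize line endings: drop whatever \r and \n characters a line ends with, then append a single \n. The sketch below assumes a hypothetical helper named normalize_line; passing '\r\n' as chars keeps trailing spaces and tabs, which a bare rstrip() would also remove.

```python
def normalize_line(line):
    """Drop any trailing \\r and \\n characters and end the line with a single \\n."""
    return line.rstrip('\r\n') + '\n'


samples = ['line1\r\n', 'line2\r', 'line3\n', 'line4', '  padded \t\r\n']
normalized = [normalize_line(s) for s in samples]
print(normalized)
# ['line1\n', 'line2\n', 'line3\n', 'line4\n', '  padded \t\n']
```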
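Extension: copying files between Linux and Windows
Suppose unix.txt was created on Linux, so its lines end in \n. Copy it to Windows and open it there: Windows looks for \r\n, finds none, and therefore sees no line breaks at all, so the whole file is displayed on a single line. On Windows 10, text editors such as Notepad now recognize \n as a line break as well.
Likewise, suppose dos.txt was created on Windows, so its lines end in \r\n. Copy it to Linux: Linux treats each \n as the line break and everything else as ordinary characters, so the \r becomes part of the line content. If the file is an executable script, this causes an error when it runs.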
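When a file has to move between the two systems, the line endings can be converted on the raw bytes. The following is a minimal sketch, assuming two hypothetical helpers, dos_to_unix and unix_to_dos; it works on bytes so that open()'s own newline translation never gets involved.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Convert Windows line endings (\\r\\n) to Unix line endings (\\n)."""
    return data.replace(b'\r\n', b'\n')


def unix_to_dos(data: bytes) -> bytes:
    """Convert Unix line endings (\\n) to Windows line endings (\\r\\n)."""
    # Normalize first so that existing \r\n pairs do not become \r\r\n.
    return data.replace(b'\r\n', b'\n').replace(b'\n', b'\r\n')


script = b'#!/bin/sh\r\necho hi\r\n'
print(dos_to_unix(script))           # b'#!/bin/sh\necho hi\n'
print(unix_to_dos(b'a\nb\r\nc'))     # b'a\r\nb\r\nc'
```

To apply it to a file, read the file with mode 'rb', convert, and write the result with mode 'wb'.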
On Windows 7, only \r\n is recognized as a line break
>>> with open('test.txt','w',newline='') as f:
...     f.write('line1\rline2\nline3\r\nline4')
...
24
>>> with open('test.txt','r',newline='') as f:
...     f.read()
...
'line1\rline2\nline3\r\nline4'
On Windows 10, \r, \n, and \r\n are all recognized as line breaks
>>> b'\r'.hex()
'0d'
>>> b'\n'.hex()
'0a'
As shown above, \r is 0d in hexadecimal and \n is 0a.
Example 1: \r
with open('test.txt', 'w', newline='') as f:
    f.write('line1\rline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\rline2'
'line1\nline2'
\r is treated as a line break.
Example 2: \n
with open('test.txt', 'w', newline='') as f:
    f.write('line1\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\nline2'
'line1\nline2'
\n is treated as a line break.
Example 3: \r\n
with open('test.txt', 'w', newline='') as f:
    f.write('line1\r\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\r\nline2'
'line1\nline2'
Example 4: \r\r\n
with open('test.txt', 'w', newline='') as f:
    f.write('line1\r\r\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\r\r\nline2'
'line1\n\nline2'
Both the lone \r and the \r\n are recognized as line breaks, so the translated read contains two \n.
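Example 5: \r, newline=None
On write, \n is translated to the system default line separator (os.linesep).
The string here contains no \n, so nothing is translated.
with open('test.txt', 'w') as f:
    f.write('line1\rline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\rline2'  # raw read, no translation
'line1\nline2'  # translated read: \r becomes \n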
Example 6: \n, newline=None
On write, \n is translated to the system default line separator (os.linesep).
with open('test.txt', 'w') as f:
    f.write('line1\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\r\nline2'  # raw read, no translation: the \n was translated to \r\n on write
'line1\nline2'  # translated read: \r\n becomes \n
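Example 7: \r\n, newline=None
On write, \n is translated to the system default line separator (os.linesep).
with open('test.txt', 'w') as f:
    f.write('line1\r\nline2')
with open('test.txt', 'r', newline='') as f:
    print(repr(f.read()))
with open('test.txt', 'r') as f:
    print(repr(f.read()))
'line1\r\r\nline2'  # raw read, no translation: on write only the \n became \r\n, and the existing \r was left alone, giving \r\r\n
'line1\n\nline2'  # translated read: the lone \r becomes \n and the \r\n becomes \n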
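The \r\r\n result comes from Windows' default translation, so it cannot be reproduced directly on Linux. One way to see it on any platform is to wrap an in-memory buffer in io.TextIOWrapper with newline='\r\n', which applies the same write-side translation that newline=None applies on Windows. This is a sketch under that assumption; encode_text is a hypothetical helper.

```python
import io


def encode_text(text, newline):
    """Return the bytes a text-mode file would hold for the given newline setting."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding='ascii', newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # release the buffer so the wrapper does not close it
    return data


print(encode_text('line1\r\nline2', '\r\n'))  # b'line1\r\r\nline2'
print(encode_text('line1\r\nline2', ''))      # b'line1\r\nline2'
print(encode_text('line1\nline2', '\r\n'))    # b'line1\r\nline2'
```

The second call suggests a fix: when the text already contains \r\n, writing with newline='' leaves it untouched.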
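When printing in the Python shell, only \n produces a line break; a lone \r does not (how \r is rendered depends on the shell or terminal)
In Word, \r, \n, and \r\n are all recognized as line breaks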
>>> print('line1\rline2')
line1 line2
>>> print('line1\nline2')
line1
line2
>>> print('line1\r\nline2')
line1
line2
>>> print('line1\r\r\nline2')
line1
line2
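Printing is only one view of how Python treats these characters. As a side note, str.splitlines() treats \r, \n, and \r\n all as line boundaries, while split('\n') splits on \n only. A minimal sketch:

```python
text = 'line1\rline2\nline3\r\nline4'

# splitlines() recognizes \r, \n, and \r\n as line boundaries.
lines = text.splitlines()
print(lines)    # ['line1', 'line2', 'line3', 'line4']

# keepends=True keeps each line's original terminator.
kept = text.splitlines(keepends=True)
print(kept)     # ['line1\r', 'line2\n', 'line3\r\n', 'line4']

# split('\n') only knows about \n, so \r stays inside the pieces.
only_lf = text.split('\n')
print(only_lf)  # ['line1\rline2', 'line3\r', 'line4']
```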