tuple

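Contents of this section

List and tuple operations
String operations
Dictionary operations
Set operations
File operations
Character encoding and transcoding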

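1. List and tuple operations

Lists are one of the data types we will use most often. A list makes it easy to store and modify a collection of values.

Defining a list: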

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
print(names)


# Output:
['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']

Access list elements by index. Indexes start at 0.

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
print(names[3])  # access an element by index, counting from 0
print(names[0])
print(names[4])

# Output:
鹿晗
周杰伦
李易峰

Slicing: taking several elements

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']

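print(names[0:3]) # indexes 0 to 2: the start index is included, the end index is excluded
>>>['周杰伦', '陈伟霆', '陈奕迅']
print(names[:3]) # when starting from the beginning the 0 can be omitted; same result as the previous line
>>>['周杰伦', '陈伟霆', '陈奕迅']
print(names[0:-1]) # from index 0 up to, but not including, index -1 (-1 is the first element counting from the right, i.e. the last element)
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗']
print(names[0:]) # to include the last element, omit the end index; writing -1 would exclude it
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']
print(names[0:-1:2]) # step 2: take every second element
>>>['周杰伦', '陈奕迅']
print(names[1::3]) # step 3: take every third element, starting at index 1
>>>['陈伟霆', '李易峰']
print(names[0::4]) # step 4: take every fourth element
>>>['周杰伦', '李易峰']
print(names[::4]) # same as the previous line
>>>['周杰伦', '李易峰']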
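
As a small additional sketch (not part of the original post), the same start:stop:step rules also cover negative start indexes and negative steps. The variable names last_two, reversed_copy and every_other are made up for illustration.

```python
names = ['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']

last_two = names[-2:]        # the last two elements
reversed_copy = names[::-1]  # a reversed copy; names itself is unchanged
every_other = names[::2]     # indexes 0, 2 and 4

print(last_two)
print(reversed_copy)
print(every_other)
```

A slice always builds a new list, so none of these lines modifies names.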

Append

Use the append method to add a new value to the end of the list.

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
names.append('海绵宝宝')
print(names)
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰', '海绵宝宝']
names.append('皮皮虾') 
print(names)
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰', '海绵宝宝', '皮皮虾']

Insert

Use the insert method to place a value before the given index; insert(0, x) puts x at the front.

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
names.insert(0,'我要在最前面')
print(names)
>>>['我要在最前面', '周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']
names.insert(2,'强行从前面插入')
print(names) 
>>>['我要在最前面', '周杰伦', '强行从前面插入', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']

Modify

variable_name[index] = new_value

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
names[2] = '该换人了'
print(names)

>>>['周杰伦', '陈伟霆', '该换人了', '鹿晗', '李易峰']

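Delete

Several ways:

1. del variable_name[index] removes the element at that index

2. remove(value) removes the first element equal to that value

3. pop() removes the last element and returns it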

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
del names[2]   # delete the element at this index
print(names)
>>>['周杰伦', '陈伟霆', '鹿晗', '李易峰']
names.remove('鹿晗')   # delete the given value
print(names)
>>>['周杰伦', '陈伟霆', '李易峰']
names.pop()   # delete the last element
print(names) 
>>>['周杰伦', '陈伟霆']

4. To delete the element at a given position, use pop(i), where i is the index:

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
names.pop(0)
print(names)

Extend

Use the extend method to add several elements at once. They are appended to the end of the list, and duplicate values are allowed.

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
b = [1,2,3]
names.extend(b)
print(names)
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰', 1, 2, 3]

c = ['周杰伦',4,5]
names.extend(c)
print(names)
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰', 1, 2, 3, '周杰伦', 4, 5]

Copy

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰']
names2 = names.copy()   # copy names and assign the copy to names2
print(names2)
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']

names[2] = '陈奕迅拜拜'
print(names)
print(names2)   # names changed, but names2 did not
>>>['周杰伦', '陈伟霆', '陈奕迅拜拜', '鹿晗', '李易峰']
>>>['周杰伦', '陈伟霆', '陈奕迅', '鹿晗', '李易峰']

names2 = names.copy()   # names2 only picks up the change after copying again
print(names2)
>>>['周杰伦', '陈伟霆', '陈奕迅拜拜', '鹿晗', '李易峰']
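
One point the example above does not show: list.copy() makes a shallow copy, so nested lists are still shared between the original and the copy. The sketch below uses a made-up nested list, teams, and the standard copy.deepcopy function to contrast the two behaviours.

```python
import copy

teams = [['周杰伦', '陈奕迅'], ['鹿晗']]   # hypothetical nested list
shallow = teams.copy()          # new outer list, but the same inner lists
deep = copy.deepcopy(teams)     # fully independent copy

teams[0].append('李易峰')       # mutate one of the inner lists

print(shallow[0])   # the shallow copy sees the change
print(deep[0])      # the deep copy does not
```

For a flat list of strings or numbers, as in the example above, copy() is enough.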

Count

Count how many times a value occurs in the list

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰','周杰伦']

num = names.count('周杰伦')
print(num)

>>>2

Use the len() function to get the number of elements in a list:

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰','周杰伦']
s = len(names)
print(s)

Sort

1. Use the sort method to sort in ascending order

>>> a = [2,5,7,2,1,4,3,]
>>> a.sort()
>>> a
[1, 2, 2, 3, 4, 5, 7]

2. Use reverse to reverse the current order of the elements in place (this flips the list; it does not sort it)

num = [1,2,3,8,5,7,4]
num.reverse()
print(num)

>>>[4, 7, 5, 8, 3, 2, 1]
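
Since reverse() only flips the list, a descending sort needs sort(reverse=True) or the built-in sorted(). A short sketch with made-up data:

```python
nums = [2, 5, 7, 2, 1, 4, 3]

descending = sorted(nums, reverse=True)  # returns a new list; nums is untouched
nums.sort(reverse=True)                  # sorts nums in place

words = ['pear', 'fig', 'banana']
by_length = sorted(words, key=len)       # key picks the value to compare by

print(descending)
print(by_length)
```

sorted() is the better choice when the original order must be kept.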

Get an index

names = ['周杰伦','陈伟霆','陈奕迅','鹿晗','李易峰','周杰伦']
a = names.index('周杰伦')
print(a)

>>>0  # only the index of the first match is returned

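Tuples

A tuple is much like a list: it also stores a sequence of values. The difference is that once created it cannot be modified, so it is sometimes called a read-only list.

fruit = ('apple','banana','orange')
print(fruit)

>>>('apple', 'banana', 'orange')

It has only two methods, count and index.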
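
A brief sketch of both methods and of what "cannot be modified" means in practice. The extra 'apple' and the names error_message and single are additions for illustration.

```python
fruit = ('apple', 'banana', 'orange', 'apple')

print(fruit.count('apple'))    # how many times 'apple' occurs
print(fruit.index('orange'))   # index of the first 'orange'

try:
    fruit[0] = 'pear'          # tuples do not support item assignment
except TypeError as exc:
    error_message = str(exc)
    print('TypeError:', error_message)

single = ('apple',)            # a one-element tuple needs the trailing comma
```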

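Exercise

Program: shopping cart

Requirements:

On startup, ask the user for their salary, then print the product list
Let the user buy products by product number
After the user picks a product, check whether the balance is sufficient; if so, deduct the price, otherwise warn the user
The user can quit at any time; on exit, print the purchased products and the remaining balance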

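2. String operations

Property: strings are immutable

name.capitalize()  capitalizes the first character and lowercases the rest
name.casefold()   converts to lowercase for caseless comparison
name.center(50,"-")  returns '---------------------Alex Li----------------------'
name.count('lex')  counts the occurrences of 'lex'
name.encode()  encodes the string to bytes
name.endswith("Li")  checks whether the string ends with "Li"
 "Alex\tLi".expandtabs(10)  returns 'Alex      Li'; each \t is expanded with spaces up to the next tab stop (every 10 columns here)
 name.find('A')  searches for 'A'; returns its index if found, otherwise -1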

format :
    >>> msg = "my name is {}, and age is {}"
    >>> msg.format("alex",22)
    'my name is alex, and age is 22'
    >>> msg = "my name is {1}, and age is {0}"
    >>> msg.format("alex",22)
    'my name is 22, and age is alex'
    >>> msg = "my name is {name}, and age is {age}"
    >>> msg.format(age=22,name="ale")
    'my name is ale, and age is 22'
format_map
    >>> msg.format_map({'name':'alex','age':22})
    'my name is alex, and age is 22'


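msg.index('a')  returns the index of the first 'a' in the string; raises ValueError if it is not found
'9aA'.isalnum()   True

'9'.isdigit()  whether every character is a digit
name.isnumeric()
name.isprintable()
name.isspace()
name.istitle()
name.isupper()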
 "|".join(['alex','jack','rain'])
'alex|jack|rain'


maketrans
    >>> intab = "aeiou"  #This is the string having actual characters. 
    >>> outtab = "12345" #This is the string having corresponding mapping character
    >>> trantab = str.maketrans(intab, outtab)
    >>> 
    >>> text = "this is string example....wow!!!"  # named text rather than str, so the built-in str is not shadowed
    >>> text.translate(trantab)
    'th3s 3s str3ng 2x1mpl2....w4w!!!'

 msg.partition('is')   returns ('my name ', 'is', ' {name}, and age is {age}')

 >>> "alex li, chinese name is lijie".replace("li","LI",1)
     'alex LI, chinese name is lijie'

 msg.swapcase()  swaps upper and lower case


 >>> msg.zfill(40)
'00000my name is {name}, and age is {age}'



>>> n4 = "Hello 2orld"
>>> n4.ljust(40,"-")
'Hello 2orld-----------------------------'
>>> n4.rjust(40,"-")
'-----------------------------Hello 2orld'


>>> b="ddefdsdff_哈哈" 
>>> b.isidentifier() # checks whether the string is a valid identifier, i.e. whether it follows the variable naming rules
True
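
The method list above never shows the assignment of name. The sketch below assumes name = "Alex Li", which matches the center() output shown, and checks a few of the methods along with the immutability property.

```python
name = "Alex Li"   # assumed value; the original list does not show it

print(name.capitalize())   # first character upper-cased, the rest lower-cased
print(name.count('lex'))
print(name.find('z'))      # -1, because 'z' does not occur

upper_name = name.upper()  # string methods return a new string
try:
    name[0] = 'a'          # strings do not support item assignment
except TypeError:
    print('strings are immutable; name is still', name)
```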

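3. Dictionary operations

dict

Python has built-in support for dictionaries: dict, short for dictionary, known as a map in some other languages. It stores key-value pairs and offers very fast lookup. Using one is like using a paper dictionary at school: you look up a word by stroke count or initial letter to find the page with its details.

Why is dict lookup so fast? Because it works on the same principle as looking something up in a dictionary. Suppose a dictionary contains 10,000 Chinese characters and we want to find one. One approach is to page through from the first page until we find it. That is how searching a list works, and the bigger the list, the slower the search. The second approach is to look the character up in the index (for example the radical index) to get its page number, then turn directly to that page. Whichever character we look for, this is fast and does not slow down as the dictionary grows. dict uses the second approach. Given a key such as 'Michael' that maps to a score of 95, dict computes the "page number" directly from the key, that is, the location where the value 95 is stored, and fetches it from there, so lookup is very fast. As you might guess, this means that when a pair is stored, the location of the value has to be computed from the key, so that the value can later be retrieved directly from the key.

Syntax: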

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

print(info)

>>>{'ID0001': '周杰伦', 'ID0002': '李荣浩', 'ID0003': '张信哲'}

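Properties of a dictionary:

Since Python 3.7 a dict preserves insertion order; in earlier versions the order was not guaranteed
Keys must be unique, so duplicates are removed automatically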

Add

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

info['ID0004'] = '李宇春'
print(info)

>>>{'ID0001': '周杰伦', 'ID0002': '李荣浩', 'ID0003': '张信哲', 'ID0004': '李宇春'}

Modify

Note: assigning to a key that already exists overwrites the previous value

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

info['ID0003'] = '李宇春'
print(info)

>>>{'ID0001': '周杰伦', 'ID0002': '李荣浩', 'ID0003': '李宇春'}

Delete

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}
info.pop('ID0001')   # the standard way to delete; returns the removed value
print(info)

>>>{'ID0002': '李荣浩', 'ID0003': '张信哲'}

del info['ID0002']   # another way to delete
print(info)

>>>{'ID0003': '张信哲'}


info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

info.popitem()  # removes the most recently inserted pair (Python 3.7+); in older versions the pair was arbitrary
print(info)
>>>{'ID0001': '周杰伦', 'ID0002': '李荣浩'}

Lookup

1. key in dictionary_name

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

a = 'ID1001' in info
print(a)

>>>False

# The lines above were run as a script in PyCharm; in the IDLE shell the value of the expression is echoed directly

>>> info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}
>>> 'ID0001' in info  # the standard test: True if the key exists, otherwise False
True
>>>

2. The get method

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

a = info.get('ID0001')   # returns the value when the key exists
print(a)

>>>周杰伦

b = info.get('ID0004')   # returns None when the key does not exist
print(b)

>>>None

3. Direct indexing, which is not recommended when the key may be missing

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

a =  info['ID0001']
print(a)

>>>周杰伦

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}

a =  info['ID0004']   # this form raises an error when the key does not exist
print(a)

>>> a =  info['ID0004']
KeyError: 'ID0004'
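
The three lookup styles can be combined as in the sketch below. The default value 'unknown' and the variable names are made up; the second argument to get() is part of the standard dict API.

```python
info = {'ID0001': '周杰伦', 'ID0002': '李荣浩', 'ID0003': '张信哲'}

# get() accepts a second argument that is returned when the key is missing
missing = info.get('ID0004', 'unknown')

# Direct indexing is reasonable when a missing key really is an error
try:
    value = info['ID0004']
except KeyError as exc:
    value = None
    missing_key = exc.args[0]

print(missing, value, missing_key)
```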

food = {'fruit':{'apple':['美味','有营养的','但是有点贵'],
                 'bananer':['很甜','热带水果','很便宜'],
                 'orange':['我最爱的','补充维生素C','跟香蕉一样便宜'],},
        'snacks':{'Potato chips':['黄瓜味最好吃','热量高','屌丝不能天天吃'],
                  'bread':['香香甜甜','可以充饥','吃多了会长胖的'],
                  'candy':['非常甜','补充糖分','吃多了会蛀牙',]}
        }
print(food)
>>>{'fruit': {'apple': ['美味', '有营养的', '但是有点贵'], 'bananer': ['很甜', '热带水果', '很便宜'], 'orange': ['我最爱的', '补充维生素C', '跟香蕉一样便宜']}, 'snacks': {'Potato chips': ['黄瓜味最好吃', '热量高', '屌丝不能天天吃'], 'bread': ['香香甜甜', '可以充饥', '吃多了会长胖的'], 'candy': ['非常甜', '补充糖分', '吃多了会蛀牙']}}

food['fruit']['apple'][1] = '我不喜欢吃'  # replace one element of the nested list
print(food)
>>>{'fruit': {'apple': ['美味', '我不喜欢吃', '但是有点贵'], 'bananer': ['很甜', '热带水果', '很便宜'], 'orange': ['我最爱的', '补充维生素C', '跟香蕉一样便宜']}, 'snacks': {'Potato chips': ['黄瓜味最好吃', '热量高', '屌丝不能天天吃'], 'bread': ['香香甜甜', '可以充饥', '吃多了会长胖的'], 'candy': ['非常甜', '补充糖分', '吃多了会蛀牙']}}

print(food['fruit']['apple'])  # print the selected list
>>>['美味', '我不喜欢吃', '但是有点贵']

The values method: get the values in the dictionary

info = {'ID01':'张杰',
        'ID02':'谢娜',
        'ID03':'皮皮虾'
        }
a = info.values()
print(a)

>>>dict_values(['张杰', '谢娜', '皮皮虾'])

The keys method: get the keys in the dictionary

info = {'ID01':'张杰',
        'ID02':'谢娜',
        'ID03':'皮皮虾'
        }
a = info.keys()
print(a)

>>>dict_keys(['ID01', 'ID02', 'ID03'])

The setdefault method adds a key-value pair only if the key is not already present. Existing keys are not overwritten, so setdefault cannot modify an existing entry

info = {'ID01':'张杰',
        'ID02':'谢娜',
        'ID03':'皮皮虾'
        }

info.setdefault('ID04','海绵宝宝')
print(info)
>>>{'ID01': '张杰', 'ID02': '谢娜', 'ID03': '皮皮虾', 'ID04': '海绵宝宝'}

info.setdefault('ID01','谢娜老公')  # adding a key that already exists has no effect
print(info)

>>>{'ID01': '张杰', 'ID02': '谢娜', 'ID03': '皮皮虾', 'ID04': '海绵宝宝'}

The update method: a.update(b) merges dictionary b into dictionary a; for keys present in both, the value from b wins

info = {'ID01':'张杰',
        'ID02':'谢娜',
        'ID03':'皮皮虾'
        }
b = {1:2,
     3:4,
     'ID01':'谢娜老公'}
info.update(b)
print(info)

>>>{'ID01': '谢娜老公', 'ID02': '谢娜', 'ID03': '皮皮虾', 1: 2, 3: 4}

The items method returns a view of the dictionary's (key, value) pairs, which can be converted to a list with list()

info = {'ID01':'张杰',
        'ID02':'谢娜',
        'ID03':'皮皮虾'
        }

a = info.items()
print(a)

>>>dict_items([('ID01', '张杰'), ('ID02', '谢娜'), ('ID03', '皮皮虾')])

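dict.fromkeys builds a dictionary from a list of keys. Every key shares the same value object, which is a pitfall when that value is mutable, so use it sparingly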

a = dict.fromkeys([1,2,3,4],'a')
print(a)

>>>{1: 'a', 2: 'a', 3: 'a', 4: 'a'}
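
The pitfall with dict.fromkeys appears when the default value is mutable: every key then refers to the same object. The sketch below, with made-up names shared and independent, shows the effect and a dict comprehension as one way around it.

```python
shared = dict.fromkeys([1, 2, 3], [])
shared[1].append('x')          # mutates the one list that all keys share
print(shared)

# A dict comprehension creates a separate list for each key
independent = {key: [] for key in [1, 2, 3]}
independent[1].append('x')
print(independent)
```

With an immutable default such as 'a' or 0 the sharing is harmless.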

Looping over a dict

info = {'ID0001':'周杰伦',
        'ID0002':'李荣浩',
        'ID0003':'张信哲',
}
for i in info:
        print(i,info[i])

for k,v in info.items():        # both loops print the same result
        print(k,v)

>>>ID0001 周杰伦
ID0002 李荣浩
ID0003 张信哲
ID0001 周杰伦
ID0002 李荣浩
ID0003 张信哲

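Note that before Python 3.7 the order in which a dict stores its keys was not guaranteed to match the order in which they were inserted. Since Python 3.7 insertion order is preserved.

Compared with list, dict has the following characteristics:

Lookup and insertion are very fast and do not slow down as keys are added;
It needs more memory.

A list is the opposite:

Lookup time, and the time to insert at an arbitrary position, grows with the number of elements;
It takes little space and wastes little memory.

So dict is a way of trading space for time.

dict can be used wherever fast lookup is needed and is almost everywhere in Python code. Using it correctly matters, and the first rule to remember is that a dict key must be an immutable object.

This is because dict computes the storage location of a value from its key. If computing the location for the same key gave a different result each time, the dict would be thrown into complete disorder. The algorithm that computes a location from a key is called a hash algorithm.

To keep the hash correct, the object used as a key must not change. In Python, strings, integers and similar types are immutable, so they are safe to use as keys. A list is mutable, so it cannot be used as a key.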
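
A short sketch of that rule. The scores dictionary and its keys are made up for illustration; a tuple of immutable values works as a key, while a list raises TypeError.

```python
scores = {}
scores[('Michael', 'math')] = 95       # a tuple of immutable values is hashable

try:
    scores[['Michael', 'math']] = 95   # a list is mutable, hence unhashable
except TypeError as exc:
    error_message = str(exc)
    print('TypeError:', error_message)

print(scores)
```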

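Exercise

Program: three-level menu

Requirements:

Print a three-level menu of province, city and county
Allow returning to the previous level
Allow quitting the program at any time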

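4. Set operations (set)

A set is an unordered collection of unique items. Its main uses are:

Deduplication: converting a list to a set removes duplicates automatically
Relationship testing: computing the intersection, difference, union and so on between two groups of data

Common operations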

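s = set([3,5,9,10])      # create a set of numbers

t = set("Hello")         # create a set of unique characters


a = t | s          # union of t and s

b = t & s          # intersection of t and s

c = t - s          # difference (items in t but not in s)

d = t ^ s          # symmetric difference (items in t or in s, but not in both)

Basic operations:

t.add('x')            # add one item

s.update([10,37,42])  # add several items to s

Use remove() to delete one item; it raises KeyError if the item is not in the set:

t.remove('H')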
  
  
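len(s)
the number of elements in the set

x in s
tests whether x is a member of s

x not in s
tests whether x is not a member of s

s.issubset(t)
s <= t
tests whether every element of s is in t

s.issuperset(t)
s >= t
tests whether every element of t is in s

s.union(t)
s | t
returns a new set with every element from s and t

s.intersection(t)
s & t
returns a new set with the elements common to s and t

s.difference(t)
s - t
returns a new set with the elements that are in s but not in t

s.symmetric_difference(t)
s ^ t
returns a new set with the elements that are in exactly one of s and t

s.copy()
returns a shallow copy of the set s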
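
To make the operators concrete, here is a small sketch with two made-up sets of numbers. It also shows discard(), the standard set method that, unlike remove(), does not raise when the item is absent.

```python
s = {3, 5, 9, 10}
t = {5, 10, 42}

union = s | t            # everything in either set
common = s & t           # items in both
only_in_s = s - t        # items in s but not in t
in_exactly_one = s ^ t   # items in one set but not the other

unique = sorted(set([1, 2, 2, 3, 3, 3]))   # deduplicate a list

t.discard('missing')     # no error when the item is absent
try:
    t.remove('missing')  # remove() raises KeyError instead
except KeyError:
    remove_failed = True

print(union, common, only_in_s, in_exactly_one, unique)
```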

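5. File operations

Workflow for working with a file

Open the file and assign the resulting file object (handle) to a variable
Operate on the file through that handle
Close the file

Given the following file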

Somehow, it seems the love I knew was always the most destructive kind
不知为何，我经历的爱情总是最具毁灭性的的那种
Yesterday when I was young
昨日当我年少轻狂
The taste of life was sweet
生命的滋味是甜的
As rain upon my tongue
就如舌尖上的雨露
I teased at life as if it were a foolish game
我戏弄生命 视其为愚蠢的游戏
The way the evening breeze
就如夜晚的微风
May tease the candle flame
逗弄蜡烛的火苗
The thousand dreams I dreamed
我曾千万次梦见
The splendid things I planned
那些我计划的绚丽蓝图
I always built to last on weak and shifting sand
但我总是将之建筑在易逝的流沙上
I lived by night and shunned the naked light of day
我夜夜笙歌 逃避白昼赤裸的阳光
And only now I see how the time ran away
事到如今我才看清岁月是如何匆匆流逝
Yesterday when I was young
昨日当我年少轻狂
So many lovely songs were waiting to be sung
有那么多甜美的曲儿等我歌唱
So many wild pleasures lay in store for me
有那么多肆意的快乐等我享受
And so much pain my eyes refused to see
还有那么多痛苦 我的双眼却视而不见
I ran so fast that time and youth at last ran out
我飞快地奔走 最终时光与青春消逝殆尽
I never stopped to think what life was all about
我从未停下脚步去思考生命的意义
And every conversation that I can now recall
如今回想起的所有对话
Concerned itself with me and nothing else at all
除了和我相关的 什么都记不得了
The game of love I played with arrogance and pride
我用自负和傲慢玩着爱情的游戏
And every flame I lit too quickly, quickly died
所有我点燃的火焰都熄灭得太快
The friends I made all somehow seemed to slip away
所有我交的朋友似乎都不知不觉地离开了
And only now I'm left alone to end the play, yeah
只剩我一个人在台上来结束这场闹剧
Oh, yesterday when I was young
噢 昨日当我年少轻狂
So many, many songs were waiting to be sung
有那么那么多甜美的曲儿等我歌唱
So many wild pleasures lay in store for me
有那么多肆意的快乐等我享受
And so much pain my eyes refused to see
还有那么多痛苦 我的双眼却视而不见
There are so many songs in me that won't be sung
我有太多歌曲永远不会被唱起
I feel the bitter taste of tears upon my tongue
我尝到了舌尖泪水的苦涩滋味
The time has come for me to pay for yesterday
终于到了付出代价的时间 为了昨日
When I was young
当我年少轻狂

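Basic operations

f = open('lyrics', encoding='utf-8') # open the file; the encoding is stated explicitly because the file contains Chinese text (UTF-8 is assumed here)
first_line = f.readline() # read one line
print('first line:',first_line)
print('separator'.center(50,'-'))
data = f.read() # read everything that remains; avoid this for large files
print(data) # print the rest of the file

f.close() # close the file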
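
The open/use/close workflow is usually written with a with statement, which closes the file even if an error occurs. The sketch below first writes a small stand-in for the lyrics file into a temporary directory so that it runs on its own; the path and the three sample lines are assumptions for illustration.

```python
import os
import tempfile

# Create a small stand-in for the lyrics file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), 'lyrics')
with open(path, 'w', encoding='utf-8') as f:
    f.write('Yesterday when I was young\n昨日当我年少轻狂\nThe taste of life was sweet\n')

with open(path, encoding='utf-8') as f:
    first_line = f.readline()
    # Iterating over the file reads it line by line, which suits large files
    rest = [line.rstrip('\n') for line in f]

print('first line:', first_line.rstrip('\n'))
print(rest)
print(f.closed)   # the with block has already closed the file
```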

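File open modes:

r: read-only (the default).
w: write-only. [not readable; created if missing; existing content is truncated]
a: append. [not readable; created if missing; writes always go to the end]

"+" opens the file for both reading and writing

r+: read and write. [readable; writable; the file must already exist and is not truncated]
w+: write and read. [readable; writable; the file is truncated first, or created if missing]
a+: append and read. [readable; writes always go to the end; created if missing]

"U" (universal newlines) converted \r, \n and \r\n to \n when reading (used together with r or r+). It was deprecated in Python 3 and removed in Python 3.11; text mode already translates newlines by default

"b" opens the file in binary mode, so reads return bytes and writes take bytes, with no decoding and no newline translation (for example when uploading an ISO image over FTP). In Python 3 this matters on every platform, not only on Windows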

Other methods

    def close(self): # real signature unknown; restored from __doc__
        """
        Close the file.
        
        A closed file cannot be used for further I/O operations.  close() may be
        called more than once without error.
        """
        pass

    def fileno(self, *args, **kwargs): # real signature unknown
        """ Return the underlying file descriptor (an integer). """
        pass

    def isatty(self, *args, **kwargs): # real signature unknown
        """ True if the file is connected to a TTY device. """
        pass

    def read(self, size=-1): # known case of _io.FileIO.read
        """
        Note: a single call is not guaranteed to return everything requested
        Read at most size bytes, returned as bytes.
        
        Only makes one system call, so less data may be returned than requested.
        In non-blocking mode, returns None if no data is available.
        Return an empty bytes object at EOF.
        """
        return ""

    def readable(self, *args, **kwargs): # real signature unknown
        """ True if file was opened in a read mode. """
        pass

    def readall(self, *args, **kwargs): # real signature unknown
        """
        Read all data from the file, returned as bytes.
        
        In non-blocking mode, returns as much as is immediately available,
        or None if no data is available.  Return an empty bytes object at EOF.
        """
        pass

    def readinto(self): # real signature unknown; restored from __doc__
        """ Same as RawIOBase.readinto(). """
        pass # reads bytes into a pre-allocated writable buffer; rarely needed in everyday code

    def seek(self, *args, **kwargs): # real signature unknown
        """
        Move to new file position and return the file position.
        
        Argument offset is a byte count.  Optional argument whence defaults to
        SEEK_SET or 0 (offset from start of file, offset should be >= 0); other values
        are SEEK_CUR or 1 (move relative to current position, positive or negative),
        and SEEK_END or 2 (move relative to end of file, usually negative, although
        many platforms allow seeking beyond the end of a file).
        
        Note that not all file objects are seekable.
        """
        pass

    def seekable(self, *args, **kwargs): # real signature unknown
        """ True if file supports random-access. """
        pass

    def tell(self, *args, **kwargs): # real signature unknown
        """
        Current file position.
        
        Can raise OSError for non seekable files.
        """
        pass

    def truncate(self, *args, **kwargs): # real signature unknown
        """
        Truncate the file to at most size bytes and return the truncated size.
        
        Size defaults to the current file position, as returned by tell().
        The current file position is changed to the value of size.
        """
        pass

    def writable(self, *args, **kwargs): # real signature unknown
        """ True if file was opened in a write mode. """
        pass

    def write(self, *args, **kwargs): # real signature unknown
        """
        Write bytes b to file, return number written.
        
        Only makes one system call, so not all of the data may be written.
        The number of bytes actually written is returned.  In non-blocking mode,
        returns None if the write would block.
        """
        pass
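
A sketch of seek, tell and truncate on a small binary file, where offsets are plain byte counts. The file name digits.bin and its contents are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'digits.bin')

with open(path, 'w+b') as f:       # binary read/write; truncates or creates
    f.write(b'0123456789')
    f.seek(2)                      # absolute offset from the start of the file
    position = f.tell()
    chunk = f.read(3)
    f.seek(-2, os.SEEK_END)        # two bytes before the end
    tail = f.read()
    new_size = f.truncate(4)       # keep only the first four bytes
    f.seek(0)
    remaining = f.read()

print(position, chunk, tail, new_size, remaining)
```

In text mode, tell() returns an opaque position and seek() should only be given 0 or a value that tell() returned, so binary mode is the simpler place to experiment with offsets.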

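6. Character encoding and transcoding

Further reading:

http://www.cnblogs.com/yuanchenqi/articles/5956943.html

http://www.diveintopython3.net/strings.html

Things to know:

1. In Python 2 the default encoding is ASCII. In Python 3 every str is Unicode, and sys.getdefaultencoding() reports utf-8

2. Unicode has several encodings: UTF-32 (always 4 bytes per character), UTF-16 (2 or 4 bytes) and UTF-8 (1 to 4 bytes). UTF-16 is widely used as an in-memory representation, while files are usually stored as UTF-8 because it saves space

3. In Python 3, encode converts a str to bytes while encoding it, and decode converts bytes back to str while decoding

(The diagram that accompanied this section in the original post applies only to Python 2.)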

In Python 2

import sys
print(sys.getdefaultencoding())


msg = "我爱北京天安门"
msg_gb2312 = msg.decode("utf-8").encode("gb2312")
gb2312_to_gbk = msg_gb2312.decode("gbk").encode("gbk")

print(msg)
print(msg_gb2312)
print(gb2312_to_gbk)

In Python 3

import sys
print(sys.getdefaultencoding())


msg = "我爱北京天安门"
#msg_gb2312 = msg.decode("utf-8").encode("gb2312")
msg_gb2312 = msg.encode("gb2312") # str is already Unicode, so no decode step is needed first
gb2312_to_unicode = msg_gb2312.decode("gb2312")
gb2312_to_utf8 = msg_gb2312.decode("gb2312").encode("utf-8")

print(msg)
print(msg_gb2312)
print(gb2312_to_unicode)
print(gb2312_to_utf8)
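
To see what encode and decode do to the data in Python 3, the sketch below compares the byte lengths of the same string in two encodings and shows what happens when bytes are decoded with the wrong codec. The variable names are made up.

```python
msg = "我爱北京天安门"

utf8_bytes = msg.encode("utf-8")   # 3 bytes per character for this text
gbk_bytes = msg.encode("gbk")      # 2 bytes per character for this text

print(len(msg), len(utf8_bytes), len(gbk_bytes))

round_trip = gbk_bytes.decode("gbk")   # decoding with the same codec restores the str

try:
    gbk_bytes.decode("utf-8")          # wrong codec: these bytes are not valid UTF-8
except UnicodeDecodeError:
    decode_failed = True
    print("GBK bytes cannot be decoded as UTF-8")
```

Transcoding from one byte encoding to another therefore always goes through str: decode with the source codec, then encode with the target codec.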

Posted 2017-06-17 16:06 by 皮皮虾的海绵宝宝