Python Data Types: Sequence Types

Sequence: an ordered collection of objects indexed by non-negative integers. All sequences support iteration.

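This post covers three sequence types: strings, lists, and tuples.
A string is a sequence of characters.
Lists and tuples are sequences (ordered collections) of arbitrary Python objects.
Strings and tuples are immutable sequences; lists support inserting, deleting, and replacing elements.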
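
The difference between mutable and immutable sequences can be checked directly. The following is a minimal sketch; the helper name try_assign is made up for this illustration and is not part of Python.

```python
def try_assign(seq):
    """Return True if seq[0] = 'x' succeeds, False if it raises TypeError."""
    try:
        seq[0] = 'x'
        return True
    except TypeError:
        return False

ls = ['a', 'b', 'c']
str_ok = try_assign('abc')              # str is immutable
tuple_ok = try_assign(('a', 'b', 'c'))  # tuple is immutable
list_ok = try_assign(ls)                # list is mutable

print(str_ok, tuple_ok, list_ok)  # False False True
print(ls)                         # ['x', 'b', 'c']
```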

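Operations common to all sequence types:

1. Indexing: s[i]. The index i may be negative, which counts from the end of the sequence.
2. Slicing: s[i:j]. A slice produces a new object and leaves the original unchanged.
3. Extended slicing: s[i:j:stride], where stride is the step value.

Examples: obj[1:], obj[-2:-1], obj[0:6:2]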
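
As a small sketch of these three operations, the sample string 'abcdefgh' (chosen only for this illustration) gives the following results:

```python
obj = 'abcdefgh'

first = obj[0]         # 'a'
last = obj[-1]         # 'h', negative indexes count from the end
tail = obj[1:]         # 'bcdefgh'
one_item = obj[-2:-1]  # 'g'
stepped = obj[0:6:2]   # 'ace', every second item of obj[0:6]
backwards = obj[::-1]  # 'hgfedcba', a negative stride walks backwards

# Slicing a list gives a new list; changing it leaves the original untouched.
ls = [1, 2, 3]
ls_copy = ls[:]
ls_copy.append(4)
print(ls, ls_copy)     # [1, 2, 3] [1, 2, 3, 4]
```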

Strings: str

Common operations:

Indexing
Slicing
Stripping whitespace: obj.strip()
Splitting: obj.split()
Length: len(obj), obj.__len__()
Finding an index: obj.index(), obj.find()
Alignment and padding: obj.center(), obj.ljust(), obj.rjust()

...

__len__(self, /)
    Return len(self).

 |  capitalize(...)    ''' Capitalize the first letter '''
 |      S.capitalize() -> str
 |      
 |      Return a capitalized version of S, i.e. make the first character
 |      have upper case and the rest lower case.

>>> 'test string'.capitalize()
'Test string'

 |  center(...) 
 |      S.center(width[, fillchar]) -> str

 |      Return S centered in a string of length width. Padding is
 |      done using the specified fill character (default is a space)

>>> print(8*'#')
########
>>> 'test'.center(20,'*')
'********test********'

 |  ljust(...)
 |      S.ljust(width[, fillchar]) -> str   

>>> 'test '.ljust(10,'<')
'test <<<<<'

 |  rjust(...)
 |      S.rjust(width[, fillchar]) -> str

>>> ' test'.rjust(10,'>')
'>>>>> test'


 |  count(...)  ''' Count the occurrences of a substring '''
 |      S.count(sub[, start[, end]]) -> int

>>> 'test string'.count('s')
2
>>> 'test string'.count('s',1,4)
1

 |  encode(...)   ''' Encode using the given character encoding '''
 |      S.encode(encoding='utf-8', errors='strict') -> bytes
 |      
 |      Encode S using the codec registered for encoding. Default encoding is 'utf-8'.

>>> '中文'.encode('gbk')
b'\xd6\xd0\xce\xc4'
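''' A Python 3 str is Unicode text, so it can be encoded to gbk directly; converting utf-8 bytes to gbk goes through str (Unicode) first '''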
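
The round trip between encodings can be sketched as follows. This is only an illustration; it assumes the gbk codec is available, as it is in standard CPython builds.

```python
text = '中文'                      # a Python 3 str holds Unicode text
utf8_bytes = text.encode('utf-8')  # b'\xe4\xb8\xad\xe6\x96\x87'
gbk_bytes = text.encode('gbk')     # b'\xd6\xd0\xce\xc4'

# utf-8 bytes -> str (Unicode) -> gbk bytes
converted = utf8_bytes.decode('utf-8').encode('gbk')
print(converted == gbk_bytes)      # True
```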


 |  endswith(...)    ''' Check whether the string ends with the given suffix '''
 |      S.endswith(suffix[, start[, end]]) -> bool
 |      
 |      Return True if S ends with the specified suffix, False otherwise.

>>> 'test string'.endswith('t',1,4)
True

 |  startswith(...)
 |      S.startswith(prefix[, start[, end]]) -> bool
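
A short sketch of startswith(), mirroring the endswith() example; the sample values are chosen only for illustration:

```python
s = 'test string'

starts_test = s.startswith('test')      # True
starts_st_at_2 = s.startswith('st', 2)  # True, the check begins at s[2:]
starts_any = s.startswith(('x', 'te'))  # True, a tuple of prefixes is accepted
starts_upper = s.startswith('Test')     # False, the comparison is case-sensitive
```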

 |  expandtabs(...)  ''' Convert tabs to spaces '''
 |      S.expandtabs(tabsize=8) -> str

>>> 'te\tst'.expandtabs()
'te      st'


 |  format(...)
 |      S.format(*args, **kwargs) -> str

>>> info='my name is {0}, sex {1}'
>>> info.format('Jack','male')
'my name is Jack, sex male'

>>> info='my name is {Name}, sex {Sex}'
>>> info.format(Name='Lucy',Sex='female')
'my name is Lucy, sex female'


 |  find(...)    ''' Return the index of the substring; if it occurs more than once, only the first index is returned; return -1 if it is not found '''
 |      S.find(sub[, start[, end]]) -> int
 |      
 |      Return the lowest index in S where substring sub is found,
 |      such that sub is contained within S[start:end]. 
 |      Return -1 on failure.

>>> 'test string'.find('t',2,6)
3

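 |  index(...)    ''' Return the index of the substring; if it occurs more than once, only the first index is returned; raise ValueError if it is not found '''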
 |      S.index(sub[, start[, end]]) -> int
 |      
 |      Like S.find() but raise ValueError when the substring is not found.

>>> 'test string'.index('a')
ValueError: substring not found
>>> 'test string'.find('a')
-1

 |  rfind(...)
 |      S.rfind(sub[, start[, end]]) -> int

 |  rindex(...)
 |      S.rindex(sub[, start[, end]]) -> int
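
rfind() and rindex() search from the right and otherwise behave like find() and index(). A minimal sketch with the same sample string:

```python
s = 'test string'

first_t = s.find('t')   # 0, lowest index
last_t = s.rfind('t')   # 6, highest index
last_s = s.rindex('s')  # 5
missing = s.rfind('a')  # -1 when the substring is absent

try:
    s.rindex('a')
    raised = False
except ValueError:      # rindex() raises instead of returning -1
    raised = True
```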


 |  isalnum(...)
 |      S.isalnum() -> bool

 |  isalpha(...)
 |      S.isalpha() -> bool

 |  isdecimal(...)
 |      S.isdecimal() -> bool

 |  isdigit(...)
 |      S.isdigit() -> bool
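
The four checks above differ in small ways. The sketch below uses sample values chosen for illustration; the superscript two ('²') is one character that isdigit() accepts but isdecimal() rejects.

```python
alnum = 'abc123'.isalnum()         # True, letters and digits only
alnum_space = 'abc 123'.isalnum()  # False, the space is neither
alpha = 'abc123'.isalpha()         # False, contains digits
decimal = '123'.isdecimal()        # True
digit = '123'.isdigit()            # True
sup_digit = '²'.isdigit()          # True
sup_decimal = '²'.isdecimal()      # False
empty = ''.isdigit()               # False, an empty string never passes
```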

 |  islower(...)
 |      S.islower() -> bool
 |      Return True if all cased characters in S are lowercase and there is
 |      at least one cased character in S, False otherwise.

 |  isupper(...)
 |      S.isupper() -> bool   
 |      Return True if all cased characters in S are uppercase and there is
 |      at least one cased character in S, False otherwise.

>>> 'TEST STRING'.isupper()
True

 |  upper(...)
 |      S.upper() -> str
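
A brief sketch of upper() and lower() together with the islower()/isupper() checks. Characters without case (digits, spaces) are ignored by the checks, but at least one cased character is required.

```python
upper = 'Test String'.upper()          # 'TEST STRING'
lower = 'Test String'.lower()          # 'test string'
is_lower = 'test string 1'.islower()   # True, the digit and spaces are ignored
no_cased_lower = '123'.islower()       # False, no cased character at all
no_cased_upper = '123'.isupper()       # False, for the same reason
```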

 |  istitle(...)   ''' Check whether every word starts with an uppercase letter '''
 |      S.istitle() -> bool
 |      
 |      Return True if S is a titlecased string and there is at least one
 |      character in S

>>> 'Test String'.istitle()
True

 |  title(...)
 |      S.title() -> str
 |      
 |      Return a titlecased version of S

>>> 'test string'.title()
'Test String'

 |  join(...)   ''' Join the elements of an iterable, using the string as the separator '''
 |      S.join(iterable) -> str
 |      
 |      Return a string which is the concatenation of the strings in the  iterable.  The separator between elements is S.

>>> ls=['a','b','c']
>>> '-'.join(ls)                                                                                                                 
'a-b-c'


 |  split(...)  ''' Split the string into a list; if no separator is given, runs of whitespace are the separator '''
 |      S.split(sep=None, maxsplit=-1) -> list of strings
 |      
 |      Return a list of the words in S, using sep as the
 |      delimiter string. If sep is not specified, any whitespace string
 |      is a separator.

>>> 'test string'.split()
['test', 'string']
>>> 'test=string'.split('=')
['test', 'string']
>>> 'this is test string'.split(maxsplit=2)
['this', 'is', 'test string']

 |  rsplit(...)   ''' Split from right to left '''
 |      S.rsplit(sep=None, maxsplit=-1) -> list of strings
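
The direction only matters when maxsplit is given. A minimal sketch, using a made-up file name:

```python
name = 'archive.tar.gz'

from_left = name.split('.', 1)    # ['archive', 'tar.gz']
from_right = name.rsplit('.', 1)  # ['archive.tar', 'gz']

# Without maxsplit both methods return the same list.
same = name.split('.') == name.rsplit('.')
print(same)                       # True
```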

 |  strip(...)    ''' Remove leading and trailing characters; whitespace by default '''
 |      S.strip([chars]) -> str
 |      
 |      Return a copy of the string S with leading and trailing
 |      whitespace removed.
 |      If chars is given and not None, remove characters in chars instead.

>>> ' test string '.strip()
'test string'
>>> '### test string ###'.strip('#')
' test string '
>>> '### test string ###'.lstrip('#')
' test string ###'
>>> '### test string ###'.rstrip('#')
'### test string '
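
One point worth illustrating: the chars argument is treated as a set of characters, not as a prefix or suffix. A small sketch with a sample host name:

```python
url = 'www.example.com'

# Every leading/trailing character found in 'cmow.' is removed,
# stopping at the first character that is not in the set.
stripped = url.strip('cmow.')    # 'example'
left_only = url.lstrip('w.')     # 'example.com'
right_only = url.rstrip('com.')  # 'www.example'
```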

 |  partition(...)
 |      S.partition(sep) -> (head, sep, tail)
 |      
 |      Search for the separator sep in S, and return the part before it, the separator itself, and the part after it.  If the separator is not
 |      found, return S and two empty strings.
>>> '### test ***'.partition('test')
('### ', 'test', ' ***')

 |  rpartition(...)
 |      S.rpartition(sep) -> (head, sep, tail)
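
partition() splits at the first occurrence of the separator and rpartition() at the last. A minimal sketch, including the not-found case:

```python
s = 'a=b=c'

left = s.partition('=')             # ('a', '=', 'b=c')
right = s.rpartition('=')           # ('a=b', '=', 'c')

# When the separator is missing, the whole string lands at one end.
miss_left = 'abc'.partition('=')    # ('abc', '', '')
miss_right = 'abc'.rpartition('=')  # ('', '', 'abc')
```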


 |  replace(...)
 |      S.replace(old, new[, count]) -> str

>>> 'abc'.replace('b','2')
'a2c'
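
The optional count argument limits how many occurrences are replaced, counting from the left. A short sketch:

```python
s = 'a-b-c-d'

all_replaced = s.replace('-', '+')  # 'a+b+c+d'
first_two = s.replace('-', '+', 2)  # 'a+b+c-d'
print(s)                            # 'a-b-c-d', the original is unchanged
```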

 |  swapcase(...)    ''' Convert uppercase to lowercase and lowercase to uppercase '''
 |      S.swapcase() -> str
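
For example (a one-line sketch):

```python
swapped = 'Test String'.swapcase()
print(swapped)  # tEST sTRING
```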

 |  translate(...)
 |      S.translate(table) -> str

 |  maketrans(x, y=None, z=None, /)
 |      Return a translation table usable for str.translate().

>>> intab='abcd'
>>> outtab='1234'
>>> table=str.maketrans(intab,outtab)
>>> 'abcdefg'.translate(table)
'1234efg'

help(str)

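Lists: list

Creating a list:

ls=['a','b','c'] or list(['a','b','c']) or list(('a','b','c'))

Common operations:

Indexing
Slicing
Appending: obj.append(), obj.extend()
Inserting: obj.insert()
Deleting: obj.__delitem__() (the del statement), obj.remove(), obj.pop()
Length: len(obj)
Finding an index: obj.index()
Looping: for, while
Membership: in

Note: append(), extend(), insert(), remove(), and pop() all modify the list in place and do not create a new list. Do not assign the result of such a call to another variable expecting a list: for example, ls2=ls1.append('x') sets ls2 to None. To get a new list in another variable, use concatenation with + (__add__()) instead.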
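
The pitfall described in the note can be sketched as follows; the variable names are only for illustration.

```python
ls1 = ['a', 'b']

ls2 = ls1.append('c')  # append() changes ls1 and returns None
print(ls1, ls2)        # ['a', 'b', 'c'] None

ls3 = ls1 + ['d']      # same as ls1.__add__(['d']); builds a new list
print(ls1, ls3)        # ['a', 'b', 'c'] ['a', 'b', 'c', 'd']

popped = ls1.pop()     # pop() also changes ls1, but returns the removed element
print(popped, ls1)     # c ['a', 'b']
```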

 |  __contains__(self, key, /)  
 |      Return key in self.

 |  __delitem__(self, key, /)   ''' Delete the element at the given position '''
 |      Delete self[key].
>>> ls=['a','b','c']
>>> ls.__delitem__(1)
>>> print(ls)
['a', 'c']


 |  __len__(self, /)    ''' Return the length of the list (the number of elements) '''
 |      Return len(self).

 |  append(...)   ''' Append a single element to the end '''
 |      L.append(object) -> None -- append object to end

 |  clear(...)  
 |      L.clear() -> None -- remove all items from L

 |  copy(...)    ''' Shallow copy '''
 |      L.copy() -> list -- a shallow copy of L
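
"Shallow" means the new list is a separate object, but nested objects are shared with the original. A minimal sketch, with copy.deepcopy() from the standard library shown for contrast:

```python
import copy

original = [1, ['x']]
shallow = original.copy()

shallow.append(2)       # affects only the copy
shallow[1].append('y')  # the nested list is shared, so original sees it too
print(original)         # [1, ['x', 'y']]
print(shallow)          # [1, ['x', 'y'], 2]

deep = copy.deepcopy(original)
deep[1].append('z')     # the nested list was copied as well
print(original)         # [1, ['x', 'y']]
print(deep)             # [1, ['x', 'y', 'z']]
```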


 |  count(...)  ''' Count the occurrences of an element '''
 |      L.count(value) -> integer -- return number of occurrences of value
>>> ls=['h','e','l','l','o']
>>> ls.count('l')
2

 |  extend(...)   ''' Extend the list with the elements of an iterable '''
 |      L.extend(iterable) -> None -- extend list by appending elements from the iterable
>>> ls=['h','e','l','l','o']
>>> ls.extend('world')
>>> print(ls)
['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
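
The difference between append() and extend() is easiest to see side by side. A small sketch with sample values:

```python
a = ['h', 'i']
a.append('yo')        # adds the string as one element
print(a)              # ['h', 'i', 'yo']

b = ['h', 'i']
b.extend('yo')        # iterates over the string and adds each character
print(b)              # ['h', 'i', 'y', 'o']

c = ['h', 'i']
c.append(['x', 'y'])  # adds the list itself as a nested element
print(c)              # ['h', 'i', ['x', 'y']]

d = ['h', 'i']
d.extend(['x', 'y'])  # adds each element of the list
print(d)              # ['h', 'i', 'x', 'y']
```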

 |  index(...)   ''' Return the index of an element; if it occurs more than once, only the first is returned '''
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value.
 |      Raises ValueError if the value is not present
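
A short sketch of index() with and without a start position. Unlike str, list has no find() method, so one way to avoid the exception is to test membership first; the fallback value -1 below is an arbitrary choice for this illustration.

```python
ls = ['h', 'e', 'l', 'l', 'o']

first_l = ls.index('l')     # 2
next_l = ls.index('l', 3)   # 3, search starts at index 3

try:
    ls.index('z')
    raised = False
except ValueError:
    raised = True

# Guarding with "in" avoids the exception.
pos = ls.index('z') if 'z' in ls else -1
```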

 |  insert(...)     ''' Insert before the given index '''
 |      L.insert(index, object) -- insert object before index
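
A minimal sketch of insert(), including an index past the end and a negative index:

```python
ls = ['a', 'c']

ls.insert(1, 'b')        # ['a', 'b', 'c']
ls.insert(0, 'start')    # ['start', 'a', 'b', 'c']
ls.insert(100, 'end')    # an index past the end appends
ls.insert(-1, 'x')       # a negative index inserts before the last element
print(ls)                # ['start', 'a', 'b', 'c', 'x', 'end']
```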

 |  pop(...)   ''' Remove the element at the given position and return it; the last element by default '''
 |      L.pop([index]) -> item -- remove and return item at index (default last).
 |      Raises IndexError if list is empty or index is out of range.
>>> ls=['a','b','c',]
>>> ls.pop(2)
'c'
>>> ls.pop()
'b'


 |  remove(...)  ''' Remove the first occurrence of the given value '''
 |      L.remove(value) -> None -- remove first occurrence of value.
 |      Raises ValueError if the value is not present.
>>> ls=['a','b','c',]
>>> ls.remove('c')


 |  reverse(...)    ''' Reverse the order '''
 |      L.reverse() -- reverse *IN PLACE*
>>> ls=['a','b','c']
>>> ls.reverse()
>>> print(ls)
['c', 'b', 'a']

 |  sort(...)   ''' Sort the elements of the list; a key function can be given '''
 |      L.sort(key=None, reverse=False) -> None -- stable sort *IN PLACE*

>>> ls = ['Chr1-10.txt','Chr1-1.txt','Chr1-2.txt','Chr1-14.txt','Chr1-3.txt','Chr1-20.txt','Chr1-5.txt']
>>> ls.sort(key=lambda d : int(d.split('-')[-1].split('.')[0]))
>>> print(ls)
['Chr1-1.txt', 'Chr1-2.txt', 'Chr1-3.txt', 'Chr1-5.txt', 'Chr1-10.txt', 'Chr1-14.txt', 'Chr1-20.txt']
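
When the original order must be kept, the built-in sorted() takes the same key and reverse arguments and returns a new list. The sketch below uses a shortened list, and the helper name file_number is made up for this illustration.

```python
names = ['Chr1-10.txt', 'Chr1-2.txt', 'Chr1-1.txt']

def file_number(name):
    """Extract the number between '-' and '.', e.g. 'Chr1-10.txt' -> 10."""
    return int(name.split('-')[-1].split('.')[0])

by_text = sorted(names)                      # plain string comparison
by_number = sorted(names, key=file_number)   # numeric order
descending = sorted(names, key=file_number, reverse=True)

print(by_text)     # ['Chr1-1.txt', 'Chr1-10.txt', 'Chr1-2.txt']
print(by_number)   # ['Chr1-1.txt', 'Chr1-2.txt', 'Chr1-10.txt']
print(descending)  # ['Chr1-10.txt', 'Chr1-2.txt', 'Chr1-1.txt']
print(names)       # unchanged: ['Chr1-10.txt', 'Chr1-2.txt', 'Chr1-1.txt']
```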

help(list)

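Tuples: tuple

Creating a tuple:

tp=('a','b','c') or tp=tuple((1,2,3)) or tp=tuple(['x','y','z'])

Common operations:

Indexing
Slicing
Iteration: for / while
Length: len(obj)
Membership: in / not in

Note: a tuple is an immutable object with a fixed length, which makes code that uses it safer. "Immutable" means that the object each slot of the tuple refers to never changes. If the tuple contains an element of a mutable type, that element can still be modified in place, and modifying it does not create a new tuple.

That is:

The elements of a tuple are read-only and cannot be reassigned.
A mutable element inside a tuple can itself be modified.

For example:

>>> tp=(1,'a',['x'])
>>> tp[2].append('y')
>>> print(tp)
(1, 'a', ['x', 'y'])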
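
Both halves of that rule can be checked in a few lines. This is only a sketch; id() is used here to show that the tuple still refers to the same list object after the list is modified.

```python
tp = (1, 'a', ['x'])
inner_id = id(tp[2])

try:
    tp[0] = 2              # reassigning a slot is not allowed
    reassigned = True
except TypeError:
    reassigned = False

tp[2].append('y')          # modifying the nested list in place is allowed
same_inner = id(tp[2]) == inner_id

print(reassigned)          # False
print(same_inner)          # True, the slot still refers to the same list
print(tp)                  # (1, 'a', ['x', 'y'])
```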

 |  __add__(self, value, /)
 |      Return self+value. 
>>> tp=(1,2,3)
>>> tp.__add__(('a','b','c'))
(1, 2, 3, 'a', 'b', 'c')

 |  __len__(self, /)
 |      Return len(self).

 |  count(...)
 |      T.count(value) -> integer -- return number of occurrences of value
 

 |  index(...)
 |      T.index(value, [start, [stop]]) -> integer -- return first index of value.
 |      Raises ValueError if the value is not present.
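
count() and index() work on tuples the same way they do on lists. A brief sketch with sample values:

```python
tp = ('a', 'b', 'a', 'c')

a_count = tp.count('a')    # 2
z_count = tp.count('z')    # 0, a missing value is not an error for count()
first_a = tp.index('a')    # 0
second_a = tp.index('a', 1)  # 2, search starts at index 1
```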

help(tuple)

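Tip: when creating a tuple or list, it is good practice to add a trailing comma, e.g. ls=[1,2,3,] or tp=(1,2,3,). For a single-element tuple the comma is required: (1,) is a tuple, while (1) is just the integer 1.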
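
The single-element case is where the trailing comma matters most. A short sketch:

```python
one = (1,)          # a tuple with one element
not_tuple = (1)     # just the integer 1 in parentheses

print(type(one).__name__)        # tuple
print(type(not_tuple).__name__)  # int

# With more than one element the trailing comma changes nothing.
same = (1, 2, 3,) == (1, 2, 3)
print(same)                      # True
```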

Posted 2017-05-13 18:07 by bobo0609
