Python built-in data structures --- strings

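A string is a piece of text: an ordered sequence of characters, each of which is represented by a Unicode code point.

Strings are immutable objects. A string literal is a sequence of characters enclosed in single quotes, double quotes, or triple quotes.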

Defining strings

In [1]: s1='hello world'

In [2]: s1
Out[2]: 'hello world'

In [3]: s1="hello world"

In [4]: s1
Out[4]: 'hello world'

In [5]: s1='''let's go'''

In [6]: s1
Out[6]: "let's go"

In [7]: s1="let's go"

In [8]: s1
Out[8]: "let's go"

Escaping

In [11]: s1='let\'s go'

In [12]: s1
Out[12]: "let's go"

In [17]: s1='hello \n world'

In [18]: print(s1)
hello
 world

In [19]: s1=r'hello \n world'

In [20]: print(s1)
hello \n world

In [21]: s1=R'hello \n world'

In [22]: print(s1)
hello \n world
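
The difference between an escaped literal and a raw literal can also be checked by length. The following is a minimal sketch; the variable names escaped and raw are only illustrative.

```python
# In a normal literal, \n is a single newline character.
escaped = 'hello \n world'

# With the r (or R) prefix, the backslash and the letter n stay as two characters.
raw = r'hello \n world'

print(len(escaped))              # 13
print(len(raw))                  # 14
print(raw == 'hello \\n world')  # True
```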

Accessing string elements

Because a string is ordered, its elements can be accessed by index.

In [24]: s1="select * from mysql.user where user='root';"

In [25]: s1
Out[25]: "select * from mysql.user where user='root';"

In [26]: s1[2]
Out[26]: 'l'

In [27]: s1[1]
Out[27]: 'e'


Because a string is immutable, its elements cannot be modified.

In [28]: s1[7]='user'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-4697b4dcb7df> in <module>
----> 1 s1[7]='user'

TypeError: 'str' object does not support item assignment
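
Since item assignment fails, the usual workaround is to build a new string. The sketch below assumes the intent was to replace the character at index 7 (the *) with the text user; that intent is a guess based on the failing statement above.

```python
s1 = "select * from mysql.user where user='root';"

# Slicing produces new strings; concatenating them yields the "modified" copy.
s2 = s1[:7] + 'user' + s1[8:]

print(s2)  # select user from mysql.user where user='root';
print(s1)  # the original string is unchanged
```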

Iterating over a string

In [29]: for i in s1:
    ...:     print(i)
    ...:     print(type(i))
    ...:

Using strings with other type constructors

s1="select * from mysql.user where user='root';"

lst=list(s1)

print(lst)
['s', 'e', 'l', 'e', 'c', 't', ' ', '*', ' ', 'f', 'r', 'o', 'm', ' ', 'm', 'y', 's', 'q', 'l', '.', 'u', 's', 'e', 'r', ' ', 'w', 'h', 'e', 'r', 'e', ' ', 'u', 's', 'e', 'r', '=', "'", 'r', 'o', 'o', 't', "'", ';']

t1=tuple(s1)

print(t1)
('s', 'e', 'l', 'e', 'c', 't', ' ', '*', ' ', 'f', 'r', 'o', 'm', ' ', 'm', 'y', 's', 'q', 'l', '.', 'u', 's', 'e', 'r', ' ', 'w', 'h', 'e', 'r', 'e', ' ', 'u', 's', 'e', 'r', '=', "'", 'r', 'o', 'o', 't', "'", ';')

String concatenation

In [32]: "hello " + "world"
Out[32]: 'hello world'

In [33]: a="hello "

In [34]: b="world"
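
The session defines a and b but does not show them being combined. The following sketch shows what such a concatenation might look like; the names c and d are not from the original.

```python
a = "hello "
b = "world"

# + builds a new string; a and b themselves are left unchanged.
c = a + b
print(c)  # hello world

# Mixing str and int with + raises TypeError; convert explicitly with str().
try:
    a + 1
except TypeError as exc:
    print("TypeError:", exc)

d = a + str(1)
print(d)  # hello 1
```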

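String methods

join(iterable): joining string elements

Syntax: 'string'.join(iterable) -> str

Uses string as the separator to join the elements of the iterable and returns a new string. Every element of the iterable must be a string.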

In [39]: lst=['1','2','3']

In [40]: sep='+'

In [41]: sep.join(lst)
Out[41]: '1+2+3'

In [42]: print('\n'.join(lst))
1
2
3

join requires every element of the iterable to be a string

In [43]: lst=list(range(3))

In [44]: print('\n'.join(lst))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-44-9c134bf59c3c> in <module>
----> 1 print('\n'.join(lst))

TypeError: sequence item 0: expected str instance, int found

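join does not flatten nested containers: an element that is itself a list (a reference type rather than a string) also raises TypeError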

In [45]: lst=['1',['1','2'],'3']

In [46]: print('+'.join(lst))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-46-cd667ec468bd> in <module>
----> 1 print('+'.join(lst))

TypeError: sequence item 1: expected str instance, list found
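
Both errors above can be avoided by converting or flattening the elements before joining. This is one possible approach, not the only one; the names joined and flat are illustrative.

```python
# Convert non-string elements with str() before joining.
lst = list(range(3))
joined = '\n'.join(str(n) for n in lst)
print(joined)

# For one level of nesting, join the inner list first, then the outer one.
nested = ['1', ['1', '2'], '3']
flat = '+'.join(
    '+'.join(item) if isinstance(item, list) else item
    for item in nested
)
print(flat)  # 1+1+2+3
```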

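Splitting strings ---- split

Syntax: split(sep=None, maxsplit=-1) -> list

Splits the string on the separator sep (by default, runs of whitespace), scanning from left to right. maxsplit is the maximum number of splits; the default, -1, means no limit, so the whole string is scanned.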

In [1]: path="/usr/bin/env"

In [2]: path="/usr/bin/env python"

In [3]: path.split()
Out[3]: ['/usr/bin/env', 'python']

In [4]: path.split('/')
Out[4]: ['', 'usr', 'bin', 'env python']

In [5]: str1="I'm a\t good student. "

In [6]: str1.split()
Out[6]: ["I'm", 'a', 'good', 'student.']

In [7]: str1.split(maxsplit=1)
Out[7]: ["I'm", 'a\t good student. ']

In [8]: str1.split(maxsplit=2)
Out[8]: ["I'm", 'a', 'good student. ']

In [9]: str1.split('\t',maxsplit=2)
Out[9]: ["I'm a", ' good student. ']

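Syntax: rsplit(sep=None, maxsplit=-1) -> list

Splits the string on the separator sep (by default, runs of whitespace), scanning from right to left. maxsplit is the maximum number of splits; the default, -1, means no limit, so the whole string is scanned.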

In [11]: str1.rsplit(' ',maxsplit=2)
Out[11]: ["I'm a\t good", 'student.', '']

In [12]: str1.rsplit(' a',maxsplit=2)
Out[12]: ["I'm", '\t good student. ']

Note: the separator passed to split or rsplit cannot be an empty string

In [10]: str1.rsplit('',maxsplit=2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-0d6b0425ee91> in <module>
----> 1 str1.rsplit('',maxsplit=2)

ValueError: empty separator
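
Two details of split and rsplit are easy to miss: the default separator collapses runs of whitespace while an explicit ' ' does not, and rsplit with maxsplit=1 is a convenient way to cut off the last component. A short sketch, with illustrative variable names:

```python
text = 'a  b '

# sep=None: runs of whitespace count as one separator, ends are trimmed.
default_split = text.split()
# sep=' ': every single space is a separator, so empty strings appear.
space_split = text.split(' ')

print(default_split)  # ['a', 'b']
print(space_split)    # ['a', '', 'b', '']

# rsplit starts from the right, so maxsplit=1 isolates the last component.
path = '/usr/bin/env'
head, name = path.rsplit('/', maxsplit=1)
print(head, name)     # /usr/bin env
```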

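Syntax: splitlines([keepends]) -> list

Splits the string at line boundaries (\r, \r\n, \n, among others). keepends controls whether the line terminators are kept in the result.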

In [22]: 'ab c\n\nde fg\rkl\r\n'.splitlines()
Out[22]: ['ab c', '', 'de fg', 'kl']

In [18]: str1
Out[18]: "I'm a\t good student. \nI'm good study."

In [19]: str1.splitlines()
Out[19]: ["I'm a\t good student. ", "I'm good study."]

In [20]: str1.splitlines(True)
Out[20]: ["I'm a\t good student. \n", "I'm good study."]
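
To see why splitlines is usually preferable to split('\n') for text with mixed line endings, compare the two on the same input. This is a small sketch; the variable names are illustrative.

```python
text = 'ab c\n\nde fg\rkl\r\n'

lines = text.splitlines()
naive = text.split('\n')
kept = text.splitlines(keepends=True)

print(lines)  # ['ab c', '', 'de fg', 'kl']
print(naive)  # ['ab c', '', 'de fg\rkl\r', '']
print(kept)   # ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

# With keepends=True nothing is lost, so joining restores the original text.
print(''.join(kept) == text)  # True
```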

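Splitting strings ---- partition

Syntax: partition(sep) -> (head, sep, tail)

Splits the string at the first occurrence of the separator sep, scanning from left to right, and returns a 3-tuple of head, separator, and tail. If the separator is not found, it returns a 3-tuple of the original string followed by two empty strings.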

In [23]: str1="I'm a\t good student. "

In [24]: str1.partition('d')
Out[24]: ("I'm a\t goo", 'd', ' student. ')

In [25]: str1.partition('ood')
Out[25]: ("I'm a\t g", 'ood', ' student. ')

In [27]: str1.partition('hehe')
Out[27]: ("I'm a\t good student. ", '', '')

Note: the separator passed to partition cannot be an empty string

In [26]: str1.partition('')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-e93649150055> in <module>
----> 1 str1.partition('')

ValueError: empty separator
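
partition is handy for key/value parsing because it always returns exactly three items. The sketch below also uses rpartition, the right-to-left counterpart, which the text above does not cover; the sample string is made up for illustration.

```python
line = 'user=root=admin'

# partition cuts at the first separator.
key, sep, value = line.partition('=')
print((key, sep, value))    # ('user', '=', 'root=admin')

# rpartition cuts at the last separator.
head, sep2, tail = line.rpartition('=')
print((head, sep2, tail))   # ('user=root', '=', 'admin')

# When the separator is missing, rpartition puts the whole string last.
missing = 'no separator'.rpartition('=')
print(missing)              # ('', '', 'no separator')
```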

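Modifying strings ---- replace

Syntax: replace(old, new[, count]) -> str

Finds the substring old in the string, replaces it with the substring new, and returns a new string. count is the maximum number of replacements; if it is omitted, all occurrences are replaced.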

In [28]: 'www.abc.bbc.com'.replace('bc','')
Out[28]: 'www.a.b.com'

In [30]: 'www.abc.bbc.com'.replace('bc','',1)
Out[30]: 'www.a.bbc.com'
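
Because strings are immutable, replace never changes the string it is called on. A minimal sketch with illustrative variable names:

```python
url = 'www.abc.bbc.com'

replaced = url.replace('bc', 'xy')
print(replaced)  # www.axy.bxy.com
print(url)       # www.abc.bbc.com, the original is untouched

# A count of 0 performs no replacement at all.
unchanged = url.replace('bc', 'xy', 0)
print(unchanged == url)  # True
```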

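Modifying strings ---- translate

translate differs from replace in that it works on single characters only, but it can replace several different characters in a single pass, which is usually more efficient than chaining several replace calls.

translate needs a translation table that describes how Unicode code points map to one another. To create the table, call the maketrans method on the str type. It accepts two strings of equal length and maps each character of the first string to the character at the same position in the second string. For example: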

In [33]: str.maketrans('ab','py')
Out[33]: {97: 112, 98: 121}

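The return value is a mapping between Unicode code points. The calls below assume s1 = 'www.abc.bbc.com'; that assignment is not shown in the session.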

In [34]: table=str.maketrans('ab','py')

In [35]: s1.translate(table)
Out[35]: 'www.pyc.yyc.com'

maketrans also accepts a third argument that specifies which characters to delete, for example:

In [37]: table=str.maketrans('ab','py','c')

In [38]: table
Out[38]: {97: 112, 98: 121, 99: None}

In [39]: s1.translate(table)
Out[39]: 'www.py.yy.om'
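
str.maketrans can also be called with a single dict, which makes the mapping explicit and allows a character to map to a longer string. The sketch below assumes the same input as the session above; the names table, result, chained, and escape are illustrative.

```python
s1 = 'www.abc.bbc.com'

# One-argument form: keys are single characters, values are strings or None.
table = str.maketrans({'a': 'p', 'b': 'y', 'c': None})
result = s1.translate(table)
print(result)  # www.py.yy.om

# The same result with chained replace calls, one pass per character.
chained = s1.replace('a', 'p').replace('b', 'y').replace('c', '')
print(chained == result)  # True

# A dict value may be longer than one character.
escape = str.maketrans({'&': '&amp;'})
print('a&b'.translate(escape))  # a&amp;b
```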

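Modifying strings ---- strip

Syntax: strip([chars]) -> str

Removes characters that belong to the set chars from both ends of the string. If chars is not given, whitespace is removed from both ends.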

In [41]: s1='\r \n \t hello world \n \t'

In [42]: s1
Out[42]: '\r \n \t hello world \n \t'

In [43]: s1.strip()
Out[43]: 'hello world'

In [44]: s2="I'm fine."

In [45]: s2.strip('I.')
Out[45]: "'m fine"

lstrip([chars]) strips from the left end only

rstrip([chars]) strips from the right end only
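
The chars argument is treated as a set of characters, not as a prefix or suffix. A short sketch of that behavior and of the one-sided variants, using illustrative names:

```python
s2 = "I'm fine."

# Only the left end or only the right end is stripped.
left = s2.lstrip('I.')
right = s2.rstrip('I.')
print(left)   # 'm fine.
print(right)  # I'm fine

# Every leading and trailing character found in the set is removed,
# in any order, until a character outside the set is reached.
host = 'www.example.com'.strip('cmowz.')
print(host)   # example
```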

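Searching strings ---- find

Syntax: find(sub[, start[, end]]) -> int

Searches for the substring sub within the interval [start, end), from left to right. Returns the index of the first match, or -1 if sub is not found.

Syntax: rfind(sub[, start[, end]]) -> int

Searches for the substring sub within the interval [start, end), from right to left. Returns the index of the rightmost match, or -1 if sub is not found.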

In [46]: s1="I'm very very happy!"

In [47]: s1.find('very')
Out[47]: 4

In [48]: s1.find('very',5)
Out[48]: 9

In [49]: s1.find('very',5,10)
Out[49]: -1

In [50]: s1.find('very',-1)
Out[50]: -1

In [53]: s1.find('very',-11,-1)
Out[53]: 9

In [54]: s1.find('very',-1,-11)
Out[54]: -1
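
The start argument makes it possible to collect every match by restarting the search after each hit. find_all below is a hypothetical helper written for this illustration, not a built-in; it reports non-overlapping matches only.

```python
def find_all(text, sub):
    """Return the start indexes of all non-overlapping matches of sub."""
    if not sub:
        raise ValueError('empty substring')
    positions = []
    start = 0
    while True:
        idx = text.find(sub, start)
        if idx == -1:          # no further match
            break
        positions.append(idx)
        start = idx + len(sub)  # continue after this match
    return positions


s1 = "I'm very very happy!"
print(find_all(s1, 'very'))  # [4, 9]
print(s1.rfind('very'))      # 9
```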

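Searching strings ---- index

Syntax: index(sub[, start[, end]]) -> int

Searches for the substring sub within the interval [start, end), from left to right. Returns the index of the first match, or raises ValueError if sub is not found.

Syntax: rindex(sub[, start[, end]]) -> int

Searches for the substring sub within the interval [start, end), from right to left. Returns the index of the rightmost match, or raises ValueError if sub is not found.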

In [56]: s1.index('very',5)
Out[56]: 9

In [57]: s1.index('very')
Out[57]: 4
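
Unlike find, index signals a missing substring with an exception, so callers need a try/except. safe_index below is a hypothetical wrapper used only to illustrate the pattern.

```python
def safe_index(text, sub):
    """Return the index of sub in text, or None when it is absent."""
    try:
        return text.index(sub)
    except ValueError:
        return None


s1 = "I'm very very happy!"
print(safe_index(s1, 'very'))  # 4
print(safe_index(s1, 'sad'))   # None
print(s1.rindex('very'))       # 9
```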

Searching strings ---- count

Syntax: count(sub[, start[, end]]) -> int

Counts the non-overlapping occurrences of the substring sub within the interval [start, end), from left to right.

In [58]: s1
Out[58]: "I'm very very happy!"

In [59]: s1.count('very')
Out[59]: 2

In [60]: s1.count('very',5)
Out[60]: 1

In [61]: s1.count('very',4)
Out[61]: 2
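
Two properties of count are worth checking directly: matches do not overlap, and a match must fit entirely inside [start, end). A minimal sketch with illustrative names:

```python
# Matches are non-overlapping: 'aaaa' holds two 'aa', not three.
pairs = 'aaaa'.count('aa')
print(pairs)  # 2

s1 = "I'm very very happy!"
# Only s1[5:13] is searched, which contains the second 'very' in full.
inside = s1.count('very', 5, 13)
# With end=12 the second 'very' is cut off, so nothing matches.
cut_off = s1.count('very', 5, 12)
print(inside, cut_off)  # 1 0
```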

String tests

Syntax: endswith(suffix[, start[, end]]) -> bool

Tests whether the string, within the interval [start, end), ends with suffix.

In [80]: s1.endswith('very',4,14)
Out[80]: False

In [81]: s1.endswith('very',4,13)
Out[81]: True

In [82]: s1.endswith('very',4)
Out[82]: False

In [83]: s1.endswith('y',4,-1)
Out[83]: True

Syntax: startswith(prefix[, start[, end]]) -> bool

Tests whether the string, within the interval [start, end), starts with prefix.

In [72]: s1
Out[72]: "I'm very very happy!"

In [73]: s1.startswith('very',4,10)
Out[73]: True

In [75]: s1.startswith('')
Out[75]: True
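
Both startswith and endswith also accept a tuple of candidates, which avoids chaining several calls with or. The file names and URL below are made up for illustration.

```python
names = ['a.py', 'b.txt', 'c.pyc', 'd.md']

# True if the name ends with any suffix in the tuple.
selected = [n for n in names if n.endswith(('.py', '.md'))]
print(selected)  # ['a.py', 'd.md']

# The same works for prefixes.
is_web = 'https://example.com'.startswith(('http://', 'https://'))
print(is_web)    # True
```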

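String tests ---- is* methods

isalnum() -> bool: whether the string consists only of letters and digits

isalpha(): whether the string consists only of letters

isdecimal(): whether the string contains only decimal digits

isdigit(): whether every character is a digit (0~9, as well as other Unicode digit characters such as superscript digits)

isidentifier(): whether the string is a valid identifier, that is, it starts with a letter or an underscore and the remaining characters are letters, digits, or underscores

islower(): whether all cased characters are lowercase

isupper(): whether all cased characters are uppercase

isspace(): whether the string contains only whitespace characters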
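
A few quick checks illustrate where these tests differ from what their names suggest. This is a sketch only; the sample values are chosen for illustration.

```python
# isdecimal is stricter than isdigit: superscript two is a digit
# but not a decimal digit.
superscript = '\u00b2'
print('123'.isdecimal(), '123'.isdigit())          # True True
print(superscript.isdecimal(), superscript.isdigit())  # False True

# An empty string fails these tests.
print(''.isdigit(), ''.isalnum(), ''.isspace())    # False False False

# isidentifier only checks the spelling, so keywords also pass.
print('_name1'.isidentifier())  # True
print('1name'.isidentifier())   # False
print('for'.isidentifier())     # True

print('abc123'.isalnum(), ' \t\n'.isspace())       # True True
```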

Posted on 2020-08-02 15:37 by hopeless-dream