[PY3] Built-in Data Structures (3): Strings and Their Common Operations

[Figure: XMind mind map of strings and their common operations]

Defining strings

1. Single quotes / double quotes

In [1]: s1='hello world'
In [2]: s1="hello world"

2. Triple single quotes / triple double quotes

In [8]: s1='''hello
   ...: world''';  print(s1)    # triple quotes let a string span several lines
hello
world

In [9]: s1="""hello world"""

Escaping

In [24]: s='I\'m jelly';print(s)
I'm jelly

In [25]: s='file path:c:\\windows\\';print(s)
file path:c:\windows\

# Inside triple quotes, single and double quotes can be used freely without escaping them
In [20]: sql='''select * from table where name='jelly' ''';print(sql)
select * from table where name='jelly'

In [21]: sql='''select * from table where name="jelly" ''';print(sql)
select * from table where name="jelly"


# r'...' is a raw string: backslashes are kept as typed and not treated as escapes
In [27]: path=r'c:\windows\system32\???';print(path)
c:\windows\system32\???
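To make the difference between escaped and raw literals concrete, here is a minimal sketch comparing the two spellings of the same path. The variable names are illustrative only.

```python
# Two spellings of the same Windows-style path (names are illustrative).
escaped = 'c:\\windows\\system32'   # each \\ is a single backslash
raw = r'c:\windows\system32'        # raw literal: backslashes kept as typed

print(escaped == raw)               # both literals produce the same string

# An escape sequence is one character; its raw form is two.
newline_len = len('\n')
raw_len = len(r'\n')
print(newline_len, raw_len)
```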

Accessing characters; strings are immutable

In [1]: print(s)
I'm jelly

In [2]: print(s[0])
I

In [3]: print(s[3])    # s[3] is a space, so the output looks empty

In [4]: type(s[3])
Out[4]: str

In [5]: s[3]='B'
TypeError: 'str' object does not support item assignment
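Since item assignment fails, the usual workaround is to build a new string. A small sketch, reusing the same sample string:

```python
s = "I'm jelly"

# Strings cannot be changed in place, so build a new one from slices.
t = s[:3] + 'B' + s[4:]

print(t)   # the new string
print(s)   # the original is untouched
```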

Common string operations

###Concatenation###

1. join()

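# join is a string method: the argument is an iterable whose items are strings, and the receiver (the string it is called on) is the separator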

In [7]: s=str()
In [8]: help(s.join)
join(...) method of builtins.str instance
    S.join(iterable) -> str
    Return a string which is the concatenation of the strings in the
    iterable.  The separator between elements is S.

In [1]: lst=['I','\'m','jelly']

In [2]: print(lst)
['I', "'m", 'jelly']

In [3]: ' '.join(lst)
Out[3]: "I 'm jelly"

In [4]: '/'.join(lst)
Out[4]: "I/'m/jelly"

In [5]: lst   # here lst holds ints rather than strings, so join() below raises TypeError
Out[5]: [1, 2, 3, 4, 5, 6, 7, 8]

In [6]: '.'.join(lst)   
TypeError: sequence item 0: expected str instance, int found
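One common way around this TypeError is to convert the items to strings before joining. A minimal sketch (the variable names are illustrative):

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8]

# join() only accepts strings, so convert each item first.
dotted = '.'.join(map(str, numbers))
print(dotted)

# A generator expression works the same way.
dashed = '-'.join(str(n) for n in numbers)
print(dashed)
```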

###Splitting###

1. split()

In [13]: help(s.split)
split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings
    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.

# By default split() separates on whitespace, and a run of whitespace counts as a single separator
In [17]: s1="I  love  python";s1.split()
Out[17]: ['I', 'love', 'python']

# Note: if a space is passed explicitly as the separator, every single space is treated as a separator
In [18]: s1="I  love  python";s1.split(' ')
Out[18]: ['I', '', 'love', '', 'python']

# The sep parameter specifies the separator, which can be any non-empty string
In [21]: s2="I love python";s2.split('o')
Out[21]: ['I l', 've pyth', 'n']

# split works from left to right; maxsplit limits the number of splits, and the default of -1 means no limit
In [25]: s3="i i i i i i"; s3.split(maxsplit=-1)
Out[25]: ['i', 'i', 'i', 'i', 'i', 'i']

In [26]: s3="i i i i i i"; s3.split(maxsplit=1)
Out[26]: ['i', 'i i i i i']

In [27]: s3="i i i i i i"; s3.split(maxsplit=2)
Out[27]: ['i', 'i', 'i i i i']

# When the string does not contain the separator, and when the string consists only of the separator
In [11]: ''.split('=')
Out[11]: ['']

In [12]: '='.split('=')
Out[12]: ['', '']
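The two behaviours above (whitespace splitting and maxsplit) can be sketched together. The sample strings are made up for illustration.

```python
# With no sep, any run of whitespace (spaces, tabs, newlines) is one separator.
words = 'I \t love\n python'.split()
print(words)

# maxsplit=1 keeps everything after the first '=' intact.
key, value = 'env=PATH=/usr/bin:$PATH'.split('=', 1)
print(key, value)
```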

2. rsplit()

In [14]: help(s.rsplit)
rsplit(...) method of builtins.str instance
    S.rsplit(sep=None, maxsplit=-1) -> list of strings
    Return a list of the words in S, using sep as the
    delimiter string, starting at the end of the string and
    working to the front.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified, any whitespace string
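    is a separator.

# rsplit works in the opposite direction to split, from right to left; when maxsplit is not involved, its result is identical to that of split
# split() is said to be somewhat more efficient than rsplit(), so prefer split() when the direction does not matter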
In [29]: s1="I  love python";s1.rsplit()
Out[29]: ['I', 'love', 'python']

In [30]: s1="I  love python";s1.rsplit(' ')
Out[30]: ['I', '', 'love', 'python']

In [31]: s2="I love python";s2.rsplit('o')
Out[31]: ['I l', 've pyth', 'n']

# Once maxsplit is given, rsplit and split differ: rsplit uses up its splits starting from the right end
In [32]: s3="i i i i i i";s3.rsplit(maxsplit=-1)
Out[32]: ['i', 'i', 'i', 'i', 'i', 'i']

In [33]: s3="i i i i i i";s3.rsplit(maxsplit=1)
Out[33]: ['i i i i i', 'i']

In [34]: s3="i i i i i i";s3.rsplit(maxsplit=2)
Out[34]: ['i i i i', 'i', 'i']
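A typical use of rsplit with maxsplit is peeling off the last component of a string. The sketch below uses a hypothetical file name.

```python
filename = 'archive.tar.gz'   # hypothetical file name

# Split once from the right: only the last '.' is used.
stem, ext = filename.rsplit('.', 1)
print(stem, ext)

# Splitting once from the left gives a different result.
first, rest = filename.split('.', 1)
print(first, rest)
```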

3. splitlines()

In [15]: help(s.splitlines)
splitlines(...) method of builtins.str instance
    S.splitlines([keepends]) -> list of strings
    Return a list of the lines in S, breaking at line boundaries.
    Line breaks are not included in the resulting list unless keepends
    is given and true.

In [1]: s="""first line
   ...: second line"""

# splitlines splits on line boundaries; by default the results do not include the line breaks
In [2]: s.splitlines()
Out[2]: ['first line', 'second line']

# Passing True (keepends) keeps the line breaks in the results
In [3]: s.splitlines(True)
Out[3]: ['first line\n', 'second line']
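It may help to see how splitlines() differs from split('\n') on text with Windows line endings and a trailing newline. A small sketch with made-up text:

```python
text = 'first line\r\nsecond line\n'

# splitlines() recognises \n and \r\n and drops the trailing empty piece.
lines = text.splitlines()
print(lines)

# split('\n') leaves the '\r' in place and yields a trailing empty string.
pieces = text.split('\n')
print(pieces)
```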

4. partition()/rpartition()

In [4]: help(s.partition)
partition(...) method of builtins.str instance
    S.partition(sep) -> (head, sep, tail)
    Search for the separator sep in S, and return the part before it,
    the separator itself, and the part after it.  If the separator is not
    found, return S and two empty strings.

# Splits the string once at the separator sep; the result is always a 3-tuple (head, sep, tail)

In [5]: s="first/second/third";s.partition('/')
Out[5]: ('first', '/', 'second/third')

In [6]: s="first/second/third";s.rpartition('/')
Out[6]: ('first/second', '/', 'third')

# partition is often used to split key=value lines in config files
In [8]: cfg='env=PATH=/usr/bin:$PATH'; cfg.partition('=')
Out[8]: ('env', '=', 'PATH=/usr/bin:$PATH')

# When the string does not contain the separator, and when the string consists only of the separator
In [9]: ''.partition('=')
Out[9]: ('', '', '')

In [10]: '='.partition('=')
Out[10]: ('', '=', '')         # either way, the result is always a 3-tuple
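Because the middle element is empty when the separator is missing, partition makes it easy to detect malformed lines. The sketch below assumes a simple key=value format; parse_setting is a hypothetical helper, not a library function.

```python
def parse_setting(line):
    """Return (key, value), or None if the line has no '='.

    parse_setting is a hypothetical helper, not a library function.
    """
    key, sep, value = line.partition('=')
    if not sep:            # separator missing: partition returned (line, '', '')
        return None
    return key.strip(), value.strip()

print(parse_setting('env = PATH=/usr/bin:$PATH'))
print(parse_setting('no separator here'))
```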

###Case conversion / layout###

1. Case conversion

# upper() converts to upper case
In [1]: s='test';s.upper()
Out[1]: 'TEST'

# lower() converts to lower case
In [3]: s='Test';s.lower()
Out[3]: 'test'

# title() capitalizes the first letter of every word
In [6]: s='i love python';s.title()
Out[6]: 'I Love Python'

# capitalize() capitalizes only the first letter of the string and lower-cases the rest
In [7]: s='i love python';s.capitalize()
Out[7]: 'I love python'

# casefold() is typically used for case-insensitive comparison
In [24]: s='Test TTest';s.casefold()
Out[24]: 'test ttest'

# swapcase() swaps upper and lower case
In [27]: s='TEst';s.swapcase()
Out[27]: 'teST'
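For ASCII text lower() and casefold() give the same result; the difference shows up with characters such as the German 'ß'. A minimal sketch of a case-insensitive comparison:

```python
a = 'Straße'
b = 'STRASSE'

# lower() leaves 'ß' alone, so the two spellings still differ.
lower_equal = a.lower() == b.lower()

# casefold() maps 'ß' to 'ss', so they compare equal.
casefold_equal = a.casefold() == b.casefold()

print(lower_equal, casefold_equal)
```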

2. Layout (good to know, rarely used)

# center()
In [9]: s='test';s.center(20)
Out[9]: '        test        '

# zfill()
In [22]: s="700";s.zfill(20)
Out[22]: '00000000000000000700'

# expandtabs(n) expands each tab to the next tab stop, with stops every n columns (so a lone leading tab becomes n spaces)
In [29]: '\t'.expandtabs(6)
Out[29]: '      '
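Two details of these layout methods are easy to miss: a tab is expanded relative to the current column, and zfill() is aware of a leading sign. A short sketch:

```python
# A tab advances to the next multiple of the tab size, not by a fixed width.
expanded = 'ab\tc'.expandtabs(4)
print(repr(expanded))

# zfill() keeps a leading sign in front of the zeros.
padded = '-42'.zfill(5)
print(padded)
```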

###Modification###

1. replace()

In [2]: help(s.replace)
replace(...) method of builtins.str instance
    S.replace(old, new[, count]) -> str
    Return a copy of S with all occurrences of substring
    old replaced by new.  If the optional argument count is
    given, only the first count occurrences are replaced.

# replace('old','new') returns a copy of the string with every occurrence of old replaced by new
In [3]: s='red red green';s.replace('red','yellow')
Out[3]: 'yellow yellow green'

# replace('old','new',count): count limits the number of replacements
In [4]: s='red red green';s.replace('red','yellow',1)
Out[4]: 'yellow red green'

2. strip()/lstrip()/rstrip()

In [5]: help(s.strip)
strip(...) method of builtins.str instance
    S.strip([chars]) -> str
    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.

In [6]: help(s.lstrip)
lstrip(...) method of builtins.str instance
    S.lstrip([chars]) -> str
    Return a copy of the string S with leading whitespace removed.
    If chars is given and not None, remove characters in chars instead.

In [7]: help(s.rstrip)
rstrip(...) method of builtins.str instance
    S.rstrip([chars]) -> str
    Return a copy of the string S with trailing whitespace removed.
    If chars is given and not None, remove characters in chars instead.

# strip() removes leading and trailing whitespace
In [1]: s=' hhh hhh ';s.strip()
Out[1]: 'hhh hhh'

In [3]: s='\n \r \t hhh hhh \t \n \r';s.strip()
Out[3]: 'hhh hhh'

# strip() can also remove a given set of characters (every character in the argument is stripped, in any order)
In [4]: s='###hhh###kkk###';s.strip('#')
Out[4]: 'hhh###kkk'

In [5]: s='{{ hhh }}';s.strip('{}')
Out[5]: ' hhh '

In [6]: s='{{ hhh }}';s.strip('{} ')
Out[6]: 'hhh'

# lstrip() and rstrip() strip only the left end or only the right end, respectively
In [8]: s='{{ hhh }}';s.lstrip('{} ')
Out[8]: 'hhh }}'

In [9]: s='{{ hhh }}';s.rstrip('{} ')
Out[9]: '{{ hhh'
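Because the argument to strip() is a set of characters rather than a literal prefix or suffix, it can remove more than expected. The sketch below shows the pitfall and one possible workaround; the file name is made up.

```python
# The argument is a set of characters, not a prefix or suffix.
host = 'www.example.com'.strip('cmowz.')
print(host)

# Pitfall: rstrip('.txt') also eats the final 't' of 'test'.
name = 'test.txt'
stripped = name.rstrip('.txt')
print(stripped)

# To drop an exact suffix, check for it and slice.
suffix = '.txt'
base = name[:-len(suffix)] if name.endswith(suffix) else name
print(base)
```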

3. center()/ljust()/rjust()

In [16]: help(s.center)
center(...) method of builtins.str instance
    S.center(width[, fillchar]) -> str
    Return S centered in a string of length width. Padding is
    done using the specified fill character (default is a space)

In [18]: help(s.ljust)
ljust(...) method of builtins.str instance
    S.ljust(width[, fillchar]) -> str
    Return S left-justified in a Unicode string of length width. Padding is
    done using the specified fill character (default is a space).

In [19]: help(s.rjust)
rjust(...) method of builtins.str instance
    S.rjust(width[, fillchar]) -> str
    Return S right-justified in a string of length width. Padding is
    done using the specified fill character (default is a space).

In [9]: s='test';len(s)
Out[9]: 4

# rjust() pads the string, keeping the original on the right (the fill defaults to a space and can be specified)
In [10]: s1=s.rjust(10,'#');print(s1);len(s1)
######test
Out[10]: 10

# ljust() pads the string, keeping the original on the left
In [11]: s1=s.ljust(10,'#');print(s1);len(s1)
test######
Out[11]: 10

# center() pads the string, keeping the original in the middle
In [13]: s1=s.center(11,'#');print(s1);len(s1)
####test###
Out[13]: 11

# If the requested width is smaller than the length of the original string, the string is returned unchanged
In [14]: s1=s.center(3,'#');print(s1);len(s1)
test
Out[14]: 4
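One place these padding methods are useful is lining up columns in plain-text output. A minimal sketch with made-up sample data:

```python
rows = [('apple', 3), ('kiwi', 12)]   # made-up sample data

lines = []
for name, qty in rows:
    # Name padded on the right with dots, quantity right-aligned in 4 columns.
    lines.append(name.ljust(8, '.') + str(qty).rjust(4))

print('\n'.join(lines))
```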

###Searching###

1. find()

In [17]: help(s.find)
find(...) method of builtins.str instance
    S.find(sub[, start[, end]]) -> int
    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end].  Optional
    arguments start and end are interpreted as in slice notation.
    Return -1 on failure.

# find() searches from left to right and returns the index of the first character of the first match
In [22]: s='holler hello love';s.find('hello')
Out[22]: 7

In [23]: s='holler hello love';s.find('h')
Out[23]: 0

# rfind() searches from right to left
In [30]: s='holler hello love';s.rfind('h')
Out[30]: 7

# find() also accepts an index range (start and, optionally, end)
In [25]: s='holler hello love';s.find('h',3)
Out[25]: 7

# Returns -1 when the substring is not found
In [24]: s='holler hello love';s.find('hhh')
Out[24]: -1
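The start argument makes it possible to collect every match rather than only the first. The sketch below is one way to do it; find_all is a hypothetical helper built on str.find.

```python
def find_all(s, sub):
    """Return the start index of every non-overlapping match of sub in s.

    find_all is a hypothetical helper built on str.find.
    """
    if not sub:
        raise ValueError('sub must be a non-empty string')
    positions = []
    start = s.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match that was found.
        start = s.find(sub, start + len(sub))
    return positions

print(find_all('holler hello love', 'h'))
print(find_all('holler hello love', 'll'))
```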

2. index()/rindex()

In [18]: help(s.index)
index(...) method of builtins.str instance
    S.index(sub[, start[, end]]) -> int
    Like S.find() but raise ValueError when the substring is not found.

# index() is used the same way as find()
In [31]: s='holler hello love';s.index('hello')
Out[31]: 7

# The only difference: index() raises ValueError when the substring is not found
In [32]: s='holler hello love';s.index('hhh')
ValueError: substring not found
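The difference in failure behaviour leads to different calling styles. A small sketch of the three common patterns, reusing the sample string:

```python
s = 'holler hello love'

# find(): test the return value.
pos = s.find('hhh')
found_with_find = pos != -1

# index(): catch the exception.
try:
    s.index('hhh')
    found_with_index = True
except ValueError:
    found_with_index = False

# When only membership matters, the in operator is simpler.
found_with_in = 'hhh' in s

print(found_with_find, found_with_index, found_with_in)
```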

3. count()

In [20]: help(s.count)
count(...) method of builtins.str instance
    S.count(sub[, start[, end]]) -> int
    Return the number of non-overlapping occurrences of substring sub in
    string S[start:end].  Optional arguments start and end are
    interpreted as in slice notation.

# count() counts how many times a substring occurs
In [33]: s='holler hello love';s.count('hello')
Out[33]: 1

In [35]: s='holler hello love';s.count('o')
Out[35]: 3

In [37]: s='holler hello love';s.count('o',0,6)
Out[37]: 1

# count() returns 0 when the substring is not found (unlike index(), it does not raise)
In [36]: s='holler hello love';s.count('hhh')
Out[36]: 0
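The help text says count() counts non-overlapping occurrences. The sketch below contrasts that with a manual overlapping count; the one-character-at-a-time scan is just one possible approach.

```python
text = 'aaaa'

# 'aa' starts at indexes 0, 1 and 2, but count() never reuses a character.
non_overlapping = text.count('aa')

# Counting overlapping matches needs a manual scan that advances one
# character at a time.
overlapping = sum(1 for i in range(len(text)) if text.startswith('aa', i))

print(non_overlapping, overlapping)
```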

###Checks###

1. startswith()
In [39]: help(s.startswith)
startswith(...) method of builtins.str instance
    S.startswith(prefix[, start[, end]]) -> bool
    Return True if S starts with the specified prefix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    prefix can also be a tuple of strings to try.

# startswith() checks whether the string starts with a given string; the result is a bool
In [41]: s='holler hello love';s.startswith('h')
Out[41]: True

In [42]: s='holler hello love';s.startswith('holler')
Out[42]: True

In [43]: s='holler hello love';s.startswith('hhh')
Out[43]: False

# start and end can also be given to set where the comparison begins and ends
In [44]: s='holler hello love';s.startswith('h',2,8)
Out[44]: False

2. endswith()

In [45]: help(s.endswith)
endswith(...) method of builtins.str instance
    S.endswith(suffix[, start[, end]]) -> bool
    Return True if S ends with the specified suffix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    suffix can also be a tuple of strings to try.

# endswith() checks whether the string ends with a given string; the result is a bool
In [1]: s='holler hello love';s.endswith('e')
Out[1]: True

In [2]: s='holler hello love';s.endswith('love')
Out[2]: True

In [3]: s='holler hello love';s.endswith('o',2,12)
Out[3]: True
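As the help text notes, both methods also accept a tuple of candidates. A minimal sketch with made-up file names and a made-up URL:

```python
filenames = ['report.csv', 'notes.txt', 'data.tsv']   # made-up names

# A tuple means "ends with any of these".
tabular = [f for f in filenames if f.endswith(('.csv', '.tsv'))]
print(tabular)

# The same works for prefixes.
is_web = 'https://example.com'.startswith(('http://', 'https://'))
print(is_web)
```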

3. is*(): str methods whose names start with is are used for checks

# isalnum() checks whether the string consists only of letters and digits
# It can be used to detect spaces or other characters
In [8]: s='holler';s.isalnum()
Out[8]: True

In [9]: s='holler\t';s.isalnum()
Out[9]: False

In [10]: s='holler hello love';s.isalnum()
Out[10]: False

In [11]: s='holler123';s.isalnum()
Out[11]: True

# isalpha() checks whether the string consists only of letters
In [13]: s='123';s.isalpha()
Out[13]: False

In [14]: s='abc';s.isalpha()
Out[14]: True

In [15]: s='abc123';s.isalpha()
Out[15]: False

# isdecimal() checks whether the string consists only of decimal digits
In [19]: s='123';s.isdecimal()
Out[19]: True

In [20]: s='abc';s.isdecimal()
Out[20]: False

In [21]: s='abc123';s.isdecimal()
Out[21]: False
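isdecimal() has two close relatives, isdigit() and isnumeric(), which accept progressively more characters. The sketch below compares the three on a few hand-picked samples; note that none of them accepts a sign or an empty string.

```python
samples = ['123', '²', '½', '-1', '']

# Map each sample to (isdecimal, isdigit, isnumeric).
results = {t: (t.isdecimal(), t.isdigit(), t.isnumeric()) for t in samples}

for text, flags in results.items():
    print(repr(text), flags)
```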

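# isidentifier() checks whether the string is a valid identifier
# Valid identifier: starts with a letter or underscore and contains only letters, digits and underscores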

In [22]: s='_abc';s.isidentifier()
Out[22]: True

In [23]: s='1abc';s.isidentifier()
Out[23]: False

In [24]: s='abc_123';s.isidentifier()
Out[24]: True
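One caveat: isidentifier() also returns True for reserved words such as class. The standard keyword module can be combined with it, as in this sketch; is_usable_name is a hypothetical helper.

```python
import keyword

def is_usable_name(name):
    """True if name is a valid identifier and not a reserved keyword.

    is_usable_name is a hypothetical helper.
    """
    return name.isidentifier() and not keyword.iskeyword(name)

print('class'.isidentifier())      # keywords still pass isidentifier()
print(is_usable_name('class'))
print(is_usable_name('abc_123'))
```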

posted @ 2017-03-18 11:10 Jelly_lyj
