Benchmarking multiple string replacements in Python

Conclusion

Result first: plain replacement is the best option. Chaining replace calls one after another is a bit clumsy, but it performs well.
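As a rough illustration of the two approaches this post compares, here is a minimal sketch that strips tab, newline and carriage return from the sample string used in benchmark 1. The function names `strip_with_replace` and `strip_with_translate` and the constant `DELETE_TABLE` are made up for this example; they do not appear in the benchmark code.

```python
# Sample input taken from benchmark 1.
s = '1 a  2 \n \t \r e34234'


def strip_with_replace(text):
    # Chained str.replace: one pass over the string per character removed.
    return text.replace('\t', '').replace('\n', '').replace('\r', '')


# Build the translation table once; a value of None deletes the character.
DELETE_TABLE = str.maketrans({'\t': None, '\n': None, '\r': None})


def strip_with_translate(text):
    # A single str.translate call handles all three characters.
    return text.translate(DELETE_TABLE)


# Both approaches should produce the same string.
assert strip_with_replace(s) == strip_with_translate(s)
print(repr(strip_with_replace(s)))
```

Both functions return the same text, so the choice between them comes down to speed and readability.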

I'm too lazy to type much, so I'll just paste the code.

Benchmark 1

from cProfile import run

s = '1 a  2 \n \t \r e34234'


def _replace():
    for x in range(5000000):
        old_value2 = s.replace('\t', '')
        old_value3 = old_value2.replace('\n', '')
        old_value3.replace('\r', '')


def _replace3():
    for x in range(5000000):
        old_value2 = s.replace('\t', '\\t')
        old_value3 = old_value2.replace('\n', '\\n')
        old_value3.replace('\r', '\\r')


def _translate1():
    for x in range(5000000):
        s.translate(str.maketrans({'\t': '', '\n': '', '\r': ''}))


t2 = str.maketrans({'\t': '', '\n': '', '\r': ''})
t3 = str.maketrans({'\t': None, '\n': None, '\r': None})
t4 = str.maketrans({'\t': '\\t', '\n': '\\n', '\r': '\\r'})


def _translate2():
    for x in range(5000000):
        s.translate(t2)


def _translate3():
    for x in range(5000000):
        s.translate(t3)


def _translate4():
    for x in range(5000000):
        s.translate(t4)


print('### replace')
run("_replace()")
print('### replace3')
run("_replace3()")
print('### translate1')
run("_translate1()")
print('### translate2')
run("_translate2()")
print('### translate3')
run("_translate3()")
print('### translate4')
run("_translate4()")

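Speed, fastest first, ranked by the tottime of the str method row (replace or translate) in the output below: _replace (2.730 s) > _translate3 (2.828 s) > _replace3 (2.956 s) > _translate2 (4.582 s) > _translate1 (4.819 s) > _translate4 (5.029 s). By total run time the order differs: _translate3 (3.548 s) finishes ahead of _replace (4.451 s), and _translate1 is the slowest (7.741 s) because it also calls maketrans on every iteration.
Conclusion: translate is rubbish~~ (the None-valued table in _translate3 is the one exception worth a look).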
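cProfile instruments every function call, and the Python documentation recommends timeit rather than the profiler for benchmarking. One way to cross-check the ranking is a timeit run like the sketch below. The function names `chained_replace` and `translate_none` and the repeat count `N` are chosen for illustration (far fewer iterations than the 5,000,000 used above), and the timings will differ by machine and Python version.

```python
import timeit

s = '1 a  2 \n \t \r e34234'

# Table built once, with None values so the characters are deleted.
table = str.maketrans({'\t': None, '\n': None, '\r': None})


def chained_replace():
    return s.replace('\t', '').replace('\n', '').replace('\r', '')


def translate_none():
    return s.translate(table)


N = 100_000  # illustrative count, not the one used in the post

# Take the best of three runs for each candidate to reduce noise.
timings = {
    f.__name__: min(timeit.repeat(f, number=N, repeat=3))
    for f in (chained_replace, translate_none)
}

# Print candidates from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f'{name}: {seconds:.3f} s for {N} calls')
```

Taking the minimum of several repeats is the usual way to read timeit results, since slower runs mostly reflect interference from other processes.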

Output:

### replace
         15000004 function calls in 4.451 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.451    4.451 <string>:1(<module>)
        1    1.721    1.721    4.451    4.451 demo.py:7(_replace)
        1    0.000    0.000    4.451    4.451 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 15000000    2.730    0.000    2.730    0.000 {method 'replace' of 'str' objects}


### replace3
         15000004 function calls in 4.785 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.785    4.785 <string>:1(<module>)
        1    1.830    1.830    4.785    4.785 demo.py:14(_replace3)
        1    0.000    0.000    4.785    4.785 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 15000000    2.956    0.000    2.956    0.000 {method 'replace' of 'str' objects}


### translate1
         10000004 function calls in 7.741 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    7.741    7.741 <string>:1(<module>)
        1    1.870    1.870    7.741    7.741 demo.py:21(_translate1)
        1    0.000    0.000    7.741    7.741 {built-in method builtins.exec}
  5000000    1.052    0.000    1.052    0.000 {built-in method maketrans}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  5000000    4.819    0.000    4.819    0.000 {method 'translate' of 'str' objects}


### translate2
         5000004 function calls in 5.284 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.284    5.284 <string>:1(<module>)
        1    0.702    0.702    5.284    5.284 demo.py:31(_translate2)
        1    0.000    0.000    5.284    5.284 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  5000000    4.582    0.000    4.582    0.000 {method 'translate' of 'str' objects}


### translate3
         5000004 function calls in 3.548 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.548    3.548 <string>:1(<module>)
        1    0.720    0.720    3.548    3.548 demo.py:36(_translate3)
        1    0.000    0.000    3.548    3.548 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  5000000    2.828    0.000    2.828    0.000 {method 'translate' of 'str' objects}


### translate4
         5000004 function calls in 5.751 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.751    5.751 <string>:1(<module>)
        1    0.722    0.722    5.751    5.751 demo.py:41(_translate4)
        1    0.000    0.000    5.751    5.751 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  5000000    5.029    0.000    5.029    0.000 {method 'translate' of 'str' objects}

Benchmark 2

Time taken (less is faster):

  • tx2 < tx3 < tx1 < tx4
  • t2 < t3 < t1 < t4

a = '你好的\r\n打分a\r\tdeadaaes\r\n\tttttrb'

k = ('\r', '\n', '\t')


def t1(text):
    for ch in k:
        if ch in text:
            text = text.replace(ch, ' ')
    return text


def t2(old_value1):
    # data reformat
    old_value2 = old_value1.replace('\t', ' ')
    old_value3 = old_value2.replace('\n', ' ')
    return old_value3.replace('\r', ' ')


def t3(old_value):
    # data reformat
    old_value = old_value.replace('\t', ' ')
    old_value = old_value.replace('\n', ' ')
    return old_value.replace('\r', ' ')


def t3_1(old_value):
    # data reformat
    return old_value.replace('\r', ' ').replace('\t', ' ').replace('\n', ' ')


def t4(s):
    # The table is rebuilt on every call, so this variant also pays for maketrans.
    t = str.maketrans("\n\t\r", "   ")
    return s.translate(t)


def tx1(x):
    for i in range(0, 100000):
        t1(x)


def tx2(x):
    for i in range(0, 100000):
        t2(x)


def tx3(x):
    for i in range(0, 100000):
        t3(x)


def tx3_1(x):
    for i in range(0, 100000):
        t3_1(x)


def tx4(x):
    for i in range(0, 100000):
        t4(x)


tx1(a)
tx2(a)
tx3(a)
tx3_1(a)
tx4(a)
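Benchmark 1 shows that calling maketrans inside the loop (_translate1) costs more than reusing a prebuilt table (_translate2). t4 above rebuilds its table on every call, so a fairer translate candidate for this benchmark might look like the sketch below. The names `WHITESPACE_TABLE` and `t4_cached` are hypothetical and not part of the original benchmark, and this variant was not timed in the post.

```python
a = '你好的\r\n打分a\r\tdeadaaes\r\n\tttttrb'

# Built once at import time instead of on every call.
WHITESPACE_TABLE = str.maketrans('\n\t\r', '   ')


def t4_cached(s):
    # Same mapping as t4, but without the per-call maketrans cost.
    return s.translate(WHITESPACE_TABLE)


def t3_1(old_value):
    # Chained replace from the benchmark, kept here for comparison.
    return old_value.replace('\r', ' ').replace('\t', ' ').replace('\n', ' ')


# Both variants should produce the same output.
assert t4_cached(a) == t3_1(a)
```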

Profile:

https://stackoverflow.com/questions/3411771/best-way-to-replace-multiple-characters-in-a-string

posted @ 2020-04-07 23:12  Eureka912