Using eval in Python to speed up DataFrame operations

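The eval method gives direct access to C-speed operations without allocating intermediate arrays, so no memory is spent on intermediate results. (pandas hands the expression to the numexpr library when it is installed.)

With ordinary pandas arithmetic, by contrast, an expression made of several steps allocates a separate block of memory for the result of each step.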
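To make the point about intermediate memory concrete, here is a minimal sketch using plain NumPy arrays. The array size of 1000 and the names tmp1, tmp2 and temp_bytes are illustrative assumptions, not part of the original post. It writes the same expression once as a single line and once step by step, to show the temporaries that the one-line form creates behind the scenes.

```python
import numpy as np

# Small illustrative arrays (size chosen arbitrarily for this sketch).
rng = np.random.default_rng(0)
a = rng.standard_normal(1000)
b = rng.standard_normal(1000)
c = rng.standard_normal(1000)

# One-line form: NumPy still evaluates it one operation at a time.
direct = a / (b + 0.1) - c

# Equivalent explicit form: each step materializes a full-size temporary.
tmp1 = b + 0.1        # first temporary array
tmp2 = a / tmp1       # second temporary array
stepwise = tmp2 - c   # final result

# Memory held by the two temporaries (float64, 8 bytes per element).
temp_bytes = tmp1.nbytes + tmp2.nbytes
print(temp_bytes)
```

With 10,000,000 rows, as in the benchmark in this post, each such temporary would take about 80 MB.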

import numpy as np
import pandas as pd
import timeit


df = pd.DataFrame({'a': np.random.randn(10000000),
                   'b': np.random.randn(10000000),
                   'c': np.random.randn(10000000),
                   'x': 'x'})
# print(df)

# Plain pandas arithmetic: every step allocates a temporary array
start_time = timeit.default_timer()
df['a'] / (df['b'] + 0.1) - df['c']
end_time = timeit.default_timer()
print(end_time - start_time)
print("___________________")

# The same expression passed to pd.eval as a string
start_time = timeit.default_timer()
pd.eval("df['a'] / (df['b'] + 0.1) - df['c']")
end_time = timeit.default_timer()
print(end_time - start_time)
Timing comparison (in seconds; plain expression first, then pd.eval):

0.136633455546
___________________
0.087637596342
As of version 0.13 (released January 2014), Pandas includes some experimental tools that allow you to directly access C-speed operations without costly allocation of intermediate arrays.
---------------------

posted on 2019-07-17 04:53 by 激流勇进1