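PyPy Still Has Work to Do: Benchmarking the json Module
As is well known, PyPy speeds up code through its JIT, so I was curious how well the json module would perform under PyPy. I took benchmark.py from ujson, modified it slightly (removed the simplejson and cjson benchmarks and added the json module), and ran the tests.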
First, my own laptop: Windows 7 32-bit, Intel T4400 processor.
CPython: Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)]. ujson was built with MinGW, version 1.9.
PyPy: [PyPy 1.6.0 with MSC v.1500 32 bit] on win32
ujson results under CPython:
Array with 256 utf-8 strings:
ujson encode : 915.33173 calls/sec
ujson decode : 705.71633 calls/sec
Medium complex object:
ujson encode : 7598.78428 calls/sec
ujson decode : 4621.07202 calls/sec
Array with 256 strings:
ujson encode : 18691.59412 calls/sec
ujson decode : 15772.86910 calls/sec
Array with 256 doubles:
ujson encode : 18018.01585 calls/sec
ujson decode : 19011.40602 calls/sec
Array with 256 True values:
ujson encode : 68587.11944 calls/sec
ujson decode : 51921.08963 calls/sec
Array with 256 dict{string, int} pairs:
ujson encode : 10204.08143 calls/sec
ujson decode : 8417.51036 calls/sec
Dict with 256 arrays with 256 dict{string, int} pairs:
ujson encode : 32.57329 calls/sec
ujson decode : 20.99076 calls/sec
json module results under CPython:
Array with 256 utf-8 strings:
json encode : 414.85170 calls/sec
json decode : 106.67236 calls/sec
Medium complex object:
json encode : 209.64361 calls/sec
json decode : 89.14245 calls/sec
Array with 256 strings:
json encode : 1503.30723 calls/sec
json decode : 325.17153 calls/sec
Array with 256 doubles:
json encode : 604.04711 calls/sec
json decode : 245.23628 calls/sec
Array with 256 True values:
json encode : 2169.29151 calls/sec
json decode : 435.21783 calls/sec
Array with 256 dict{string, int} pairs:
json encode : 312.65633 calls/sec
json decode : 122.17172 calls/sec
Dict with 256 arrays with 256 dict{string, int} pairs:
json encode : 1.08326 calls/sec
json decode : 0.40534 calls/sec
PyPy results (json module):
Array with 256 utf-8 strings:
json encode : 8.39542 calls/sec
json decode : 40.09141 calls/sec
Medium complex object:
json encode : 268.13965 calls/sec
json decode : 676.77314 calls/sec
Array with 256 strings:
json encode : 2218.27867 calls/sec
json decode : 2494.38759 calls/sec
Array with 256 doubles:
json encode : 1076.19460 calls/sec
json decode : 1962.32341 calls/sec
Array with 256 True values:
json encode : 5399.56804 calls/sec
json decode : 22291.57348 calls/sec
Array with 256 dict{string, int} pairs:
json encode : 536.71102 calls/sec
json decode : 882.92423 calls/sec
Dict with 256 arrays with 256 dict{string, int} pairs:
json encode : 1.66834 calls/sec
json decode : 2.95386 calls/sec
Next, the results on a VPS with an Intel(R) Core(TM)2 Quad CPU Q8200 @ 2.33GHz.
CPython: Python 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24) [GCC 4.5.2] on linux2
PyPy: [PyPy 1.6.0 with GCC 4.4.3] on linux2
ujson results under CPython:
Array with 256 utf-8 strings:
ujson encode : 1239.34698 calls/sec
ujson decode : 832.05649 calls/sec
Medium complex object:
ujson encode : 3290.51270 calls/sec
ujson decode : 2282.98392 calls/sec
Array with 256 strings:
ujson encode : 6493.67957 calls/sec
ujson decode : 6329.66630 calls/sec
Array with 256 doubles:
ujson encode : 9788.18311 calls/sec
ujson decode : 9516.38747 calls/sec
Array with 256 True values:
ujson encode : 34412.84246 calls/sec
ujson decode : 33757.27813 calls/sec
Array with 256 dict{string, int} pairs:
ujson encode : 5010.94467 calls/sec
ujson decode : 4106.74893 calls/sec
Dict with 256 arrays with 256 dict{string, int} pairs:
ujson encode : 30.00179 calls/sec
ujson decode : 26.05744 calls/sec
json module results under CPython:
Array with 256 utf-8 strings:
json encode : 337.61263 calls/sec
json decode : 100.46091 calls/sec
Medium complex object:
json encode : 1169.62839 calls/sec
json decode : 1004.29777 calls/sec
Array with 256 strings:
json encode : 7054.24055 calls/sec
json decode : 3200.81735 calls/sec
Array with 256 doubles:
json encode : 1343.81730 calls/sec
json decode : 4907.52720 calls/sec
Array with 256 True values:
json encode : 35550.27567 calls/sec
json decode : 34206.37262 calls/sec
Array with 256 dict{string, int} pairs:
json encode : 2166.32626 calls/sec
json decode : 1060.32263 calls/sec
Dict with 256 arrays with 256 dict{string, int} pairs:
json encode : 9.45260 calls/sec
json decode : 3.78308 calls/sec
PyPy results (json module):
Array with 256 utf-8 strings:
json encode : 7.68852 calls/sec
json decode : 30.25831 calls/sec
Medium complex object:
json encode : 168.16691 calls/sec
json decode : 564.45883 calls/sec
Array with 256 strings:
json encode : 1830.60901 calls/sec
json decode : 2375.47971 calls/sec
Array with 256 doubles:
json encode : 863.26288 calls/sec
json decode : 1251.16513 calls/sec
Array with 256 True values:
json encode : 3071.44699 calls/sec
json decode : 10906.26240 calls/sec
Array with 256 dict{string, int} pairs:
json encode : 319.31524 calls/sec
json decode : 527.84091 calls/sec
Dict with 256 arrays with 256 dict{string, int} pairs:
json encode : 1.26895 calls/sec
json decode : 1.74544 calls/sec
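To make the json figures easier to compare, they can be turned into PyPy/CPython ratios. The sketch below is only an illustration: the dictionary layout, the shortened test names, and the function name pypy_speedup are my own, and the numbers are copied from the json listings above.

```python
# json module throughput in calls/sec (higher is better), copied from the
# results listed above. Each value is:
# (CPython encode, CPython decode, PyPy encode, PyPy decode)
LAPTOP = {
    "256 utf-8 strings": (414.85170, 106.67236, 8.39542, 40.09141),
    "medium complex object": (209.64361, 89.14245, 268.13965, 676.77314),
    "256 strings": (1503.30723, 325.17153, 2218.27867, 2494.38759),
    "256 doubles": (604.04711, 245.23628, 1076.19460, 1962.32341),
    "256 True values": (2169.29151, 435.21783, 5399.56804, 22291.57348),
    "256 dict pairs": (312.65633, 122.17172, 536.71102, 882.92423),
    "256 arrays of 256 dict pairs": (1.08326, 0.40534, 1.66834, 2.95386),
}
VPS = {
    "256 utf-8 strings": (337.61263, 100.46091, 7.68852, 30.25831),
    "medium complex object": (1169.62839, 1004.29777, 168.16691, 564.45883),
    "256 strings": (7054.24055, 3200.81735, 1830.60901, 2375.47971),
    "256 doubles": (1343.81730, 4907.52720, 863.26288, 1251.16513),
    "256 True values": (35550.27567, 34206.37262, 3071.44699, 10906.26240),
    "256 dict pairs": (2166.32626, 1060.32263, 319.31524, 527.84091),
    "256 arrays of 256 dict pairs": (9.45260, 3.78308, 1.26895, 1.74544),
}


def pypy_speedup(results):
    """Return {test: (encode ratio, decode ratio)}; a ratio above 1 means PyPy is faster."""
    return dict(
        (name, (pypy_enc / cpy_enc, pypy_dec / cpy_dec))
        for name, (cpy_enc, cpy_dec, pypy_enc, pypy_dec) in results.items()
    )


for label, results in (("laptop", LAPTOP), ("VPS", VPS)):
    print(label)
    for name, (enc, dec) in sorted(pypy_speedup(results).items()):
        print("  %-30s encode x%.2f  decode x%.2f" % (name, enc, dec))
```

With these figures, every ratio on the VPS (CPython 2.7.1) comes out below 1, while on the laptop (CPython 2.6.6) only the utf-8 string test does.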
Finally, here is the modified benchmark.py:
# -*- encoding=UTF-8 -*-
# ujson is optional: when it cannot be imported, the json module is measured instead.
try:
    import ujson
except ImportError:
    ujson_enabled = False
else:
    ujson_enabled = True
import json
import sys
from time import time as gettime
import time
import random

user = {
    "userId": 3381293,
    "age": 213,
    "username": "johndoe",
    "fullname": u"John Doe the Second",
    "isAuthorized": True,
    "liked": 31231.31231202,
    "approval": 31.1471,
    "jobs": [ 1, 2 ],
    "currJob": None
}
friends = [ user, user, user, user, user, user, user, user ]
decodeData = ""
"""=========================================================================="""
def ujsonEnc():
    x = ujson.encode(testObject)
    #print "ujsonEnc", x

def jsonEnc():
    x = json.dumps(testObject)
    #print "jsonEnc", x

"""=========================================================================="""
def ujsonDec():
    x = ujson.decode(decodeData)
    #print "ujsonDec: ", x

def jsonDec():
    x = json.loads(decodeData)
    #print "jsonDec: ", x

"""=========================================================================="""
def timeit_compat_fix(timeit):
    if sys.version_info[:2] >= (2,6):
        return
    default_number = 1000000
    default_repeat = 3
    if sys.platform == "win32":
        # On Windows, the best timer is time.clock()
        default_timer = time.clock
    else:
        # On most other platforms the best timer is time.time()
        default_timer = time.time
    def repeat(stmt="pass", setup="pass", timer=default_timer,
               repeat=default_repeat, number=default_number):
        """Convenience function to create Timer object and call repeat method."""
        return timeit.Timer(stmt, setup, timer).repeat(repeat, number)
    timeit.repeat = repeat

if __name__ == "__main__":
    import timeit
    timeit_compat_fix(timeit)

    # Each run measures one library only: ujson when it imported, otherwise json.
    print "Array with 256 utf-8 strings:"
    testObject = []
    for x in xrange(256):
        testObject.append("نظام الحكم سلطاني وراثي في الذكور من ذرية السيد تركي بن سعيد بن سلطان ويشترط فيمن يختار لولاية الحكم من بينهم ان يكون مسلما رشيدا عاقلا ًوابنا شرعيا لابوين عمانيين ")
    COUNT = 2000
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )

    print "Medium complex object:"
    testObject = [ [user, friends], [user, friends], [user, friends], [user, friends], [user, friends], [user, friends]]
    COUNT = 5000
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )

    print "Array with 256 strings:"
    testObject = []
    for x in xrange(256):
        testObject.append("A pretty long string which is in a list")
    COUNT = 10000
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )

    print "Array with 256 doubles:"
    testObject = []
    for x in xrange(256):
        testObject.append(sys.maxint * random.random())
    COUNT = 10000
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )

    print "Array with 256 True values:"
    testObject = []
    for x in xrange(256):
        testObject.append(True)
    COUNT = 50000
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )

    print "Array with 256 dict{string, int} pairs:"
    testObject = []
    for x in xrange(256):
        testObject.append({str(random.random()*20): int(random.random()*1000000)})
    COUNT = 5000
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )

    print "Dict with 256 arrays with 256 dict{string, int} pairs:"
    testObject = {}
    for y in xrange(256):
        arrays = []
        for x in xrange(256):
            arrays.append({str(random.random()*20): int(random.random()*1000000)})
        testObject[str(random.random()*20)] = arrays
    COUNT = 50
    if ujson_enabled:
        print "ujson encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonEnc()", "from __main__ import ujsonEnc", gettime, 10, COUNT)), )
    else:
        print "json encode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonEnc()", "from __main__ import jsonEnc", gettime, 3, COUNT)), )
    decodeData = json.dumps(testObject)
    if ujson_enabled:
        print "ujson decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("ujsonDec()", "from __main__ import ujsonDec", gettime, 10, COUNT)), )
    else:
        print "json decode : %.05f calls/sec" % (COUNT / min(timeit.repeat("jsonDec()", "from __main__ import jsonDec", gettime, 3, COUNT)), )
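
The script repeats the same encode/decode timing statements for every data set. As a rough sketch of how that repetition could be folded into a helper, the code below times only the json module. The names calls_per_sec and bench_json are hypothetical and not part of ujson's benchmark.py; the sketch passes callables to timeit.repeat and uses timeit's default timer rather than the gettime timer the script passes in.

```python
import json
import timeit


def calls_per_sec(func, count, repeat=3):
    """Best-of-`repeat` throughput of func(), in calls per second."""
    # timeit.repeat returns one total time per round; the fastest round is kept,
    # as the original script does with min(...).
    best = min(timeit.repeat(func, repeat=repeat, number=count))
    return count / best


def bench_json(name, test_object, count):
    """Time json.dumps and json.loads on one data set; return both rates."""
    encoded = json.dumps(test_object)
    enc = calls_per_sec(lambda: json.dumps(test_object), count)
    dec = calls_per_sec(lambda: json.loads(encoded), count)
    print("%s:" % name)
    print("json encode : %.05f calls/sec" % enc)
    print("json decode : %.05f calls/sec" % dec)
    return enc, dec


if __name__ == "__main__":
    bench_json("Array with 256 True values", [True] * 256, 200)
```

The lambda wrappers add a little call overhead to each measurement, much as the jsonEnc() and jsonDec() wrapper functions do in the script above.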