Comparison Profiles for Different Implementations
In all cases the same file of 100,000 Higgs to Z Z events were used.
>>> import hotshot
>>> p= hotshot.Profile("pyroot_100000.profile")
>>> p.run('execfile("test_100000.py")')
>>> stats.print_stats(40)
5371878 function calls (5371819 primitive calls) in 191.914 CPU seconds
Ordered by: internal time, call count
List reduced from 123 to 40 due to restriction <40>
ncalls tottime percall cumtime percall filename:lineno(function)
100000 65.385 0.001 65.385 0.001 roottools.py:55(__readData)
2719749 56.256 0.000 56.256 0.000 roottools.py:63(range)
900000 37.053 0.000 37.053 0.000 test_100000.py:14(<generator expression>)
1 21.373 21.373 191.913 191.913 test_100000.py:1(?)
751048 5.330 0.000 20.676 0.000 test_100000.py:12(<generator expression>)
500000 3.816 0.000 3.816 0.000 test_100000.py:16(<generator expression>)
1 1.204 1.204 1.360 1.360 roottools.py:1(?)
100001 0.431 0.000 0.941 0.000 roottools.py:40(__iter__)
100000 0.395 0.000 65.780 0.001 roottools.py:58(__call__)
100000 0.339 0.000 0.510 0.000 roottools.py:32(__setBranchIndicies)
100000 0.171 0.000 0.171 0.000 roottools.py:52(setIndex)
1 0.053 0.053 0.122 0.122 ROOT.py:1(?)
1 0.023 0.023 0.067 0.067 string.py:20(?)
1 0.023 0.023 0.023 0.023 threading.py:1(?)
1 0.020 0.020 0.031 0.031 sre.py:94(?)
1 0.010 0.010 0.033 0.033 ROOT.py:358(__doGUIThread)
1 0.010 0.010 0.010 0.010 sre_compile.py:11(?)
1 0.010 0.010 0.041 0.041 re.py:1(?)
1 0.002 0.002 0.004 0.004 roottools.py:47(__init__)
25/1 0.001 0.000 191.914 191.914 <string>:1(?)
13/5 0.001 0.000 0.002 0.000 sre_parse.py:374(_parse)
1 0.001 0.001 0.001 0.001 ROOT.py:265(__init__)
1 0.001 0.001 0.001 0.001 ROOT.py:140(std)
293 0.001 0.000 0.001 0.000 sre_parse.py:201(get)
1 0.001 0.001 0.002 0.002 roottools.py:5(createBranchBuffer)
1 0.001 0.001 0.001 0.001 <string>:0(?)
12 0.000 0.000 0.000 0.000 ROOT.py:283(__setattr__)
326 0.000 0.000 0.000 0.000 sre_parse.py:182(__next)
1 0.000 0.000 0.000 0.000 sre_parse.py:11(?)
16/5 0.000 0.000 0.000 0.000 sre_compile.py:27(_compile)
56 0.000 0.000 0.000 0.000 sre_parse.py:195(match)
18/7 0.000 0.000 0.000 0.000 sre_parse.py:140(getwidth)
10/5 0.000 0.000 0.002 0.000 sre_parse.py:301(_parse_sub)
5 0.000 0.000 0.002 0.000 sre_parse.py:658(parse)
5 0.000 0.000 0.003 0.001 sre_compile.py:490(compile)
2 0.000 0.000 0.000 0.000 threading.py:152(__init__)
1 0.000 0.000 0.000 0.000 sre_constants.py:12(?)
34 0.000 0.000 0.000 0.000 sre_parse.py:130(__getitem__)
4 0.000 0.000 0.000 0.000 sre_compile.py:202(_optimize_charset)
6 0.000 0.000 0.000 0.000 ROOT.py:340(__getattr2)
Signal based engine
>>> stats.print_stats(40)
93876431 function calls (56762953 primitive calls) in 643.233 CPU seconds
Ordered by: internal time, call count
List reduced from 243 to 40 due to restriction <40>
ncalls tottime percall cumtime percall filename:lineno(function)
7168825/100002 111.167 0.000 637.478 0.006 signalslot.py:37(__call__)
10137542/100002 90.078 0.000 636.336 0.006 signalslot.py:98(__call__)
100000 66.941 0.001 66.941 0.001 roottools.py:55(__readData)
1968701 42.864 0.000 42.864 0.000 roottools.py:63(range)
9537516/100002 40.397 0.000 635.479 0.006 signalslot.py:90(func)
6868801/100002 39.544 0.000 638.024 0.006 engine.py:158(send)
5337410/4137406 36.529 0.000 362.809 0.000 engine.py:186(__call__)
200010/100002 28.617 0.000 558.867 0.006 engine.py:198(__call__)
13537610 18.764 0.000 18.764 0.000 engine.py:109(stackDepth)
1900030/1200006 17.980 0.000 180.267 0.000 engine.py:224(__call__)
1400000 16.279 0.000 35.980 0.000 engine.py:142(copyWithExtend)
1800054/100002 16.017 0.000 635.096 0.006 engine.py:171(__call__)
4468708 14.896 0.000 14.896 0.000 engine.py:104(__init__)
7637402 11.263 0.000 11.263 0.000 engine.py:115(data)
1868701 10.185 0.000 10.185 0.000 root_engine_test_100000.py:56(<lambda>)
2000006 7.514 0.000 7.514 0.000 engine.py:10(__init__)
1700004 6.203 0.000 12.622 0.000 engine.py:20(copyWithExtend)
2268701 5.513 0.000 5.513 0.000 engine.py:14(increment)
100000 4.107 0.000 4.107 0.000 root_engine_test_100000.py:88(<lambda>)
3100002 4.052 0.000 4.052 0.000 engine.py:111(breadcrumbs)
300029/200014 3.851 0.000 146.628 0.001 engine.py:298(__call__)
500000 3.526 0.000 3.526 0.000 root_engine_test_100000.py:13(accumulate)
1868701 3.316 0.000 3.316 0.000 root_engine_test_100000.py:53(<lambda>)
400000 2.945 0.000 5.946 0.000 engine.py:254(__readyToProcess)
200010/200005 2.414 0.000 3.614 0.000 engine.py:281(if_)
400000 2.288 0.000 2.288 0.000 root_engine_test_100000.py:69(<lambda>)
400000 2.241 0.000 2.241 0.000 root_engine_test_100000.py:67(<lambda>)
400000 2.202 0.000 2.202 0.000 root_engine_test_100000.py:93(<lambda>)
200008 2.078 0.000 130.434 0.001 engine.py:264(right)
399998 2.016 0.000 2.016 0.000 engine.py:55(synchedTo)
100000 1.932 0.000 1.932 0.000 root_engine_test_100000.py:51(<lambda>)
200000 1.898 0.000 125.020 0.001 engine.py:251(__sendSignal)
200000 1.834 0.000 1.834 0.000 root_engine_test_100000.py:71(<lambda>)
200000 1.823 0.000 1.823 0.000 root_engine_test_100000.py:74(<lambda>)
200008 1.653 0.000 6.145 0.000 engine.py:257(left)
200001 1.641 0.000 2.727 0.000 engine.py:137(copyForEnd)
200001 1.618 0.000 2.720 0.000 engine.py:132(copyForBegin)
1 1.383 1.383 1.463 1.463 roottools.py:1(?)
4 1.329 0.332 1.329 0.332 root_engine_test_100000.py:16(value)
100000 1.264 0.000 68.581 0.001 root_engine_test_100000.py:49(<lambda>)
Code writing version of Engine
>>> import hotshot
>>> p = hotshot.Profile("h_100000.profile")
>>> p.run('execfile("root_engine_test_100000.py")')
>>> stats.print_stats(40)
10314162 function calls (10313314 primitive calls) in 166.684 CPU seconds
Ordered by: internal time, call count
List reduced from 239 to 40 due to restriction <40>
ncalls tottime percall cumtime percall filename:lineno(function)
100000 61.659 0.001 61.659 0.001 roottools.py:55(__readData)
1968701 41.051 0.000 41.051 0.000 roottools.py:63(range)
1 23.360 23.360 164.451 164.451 <string>:2(?)
1868701 9.538 0.000 9.538 0.000 root_engine_test_100000.py:56(<lambda>)
100000 3.955 0.000 3.955 0.000 root_engine_test_100000.py:88(<lambda>)
500000 3.328 0.000 3.328 0.000 root_engine_test_100000.py:13(accumulate)
1868701 3.259 0.000 3.259 0.000 root_engine_test_100000.py:53(<lambda>)
400000 2.095 0.000 2.095 0.000 root_engine_test_100000.py:69(<lambda>)
400000 2.068 0.000 2.068 0.000 root_engine_test_100000.py:93(<lambda>)
400000 2.046 0.000 2.046 0.000 root_engine_test_100000.py:67(<lambda>)
100000 1.854 0.000 1.854 0.000 root_engine_test_100000.py:51(<lambda>)
200000 1.750 0.000 1.750 0.000 root_engine_test_100000.py:71(<lambda>)
200000 1.711 0.000 1.711 0.000 root_engine_test_100000.py:74(<lambda>)
2 1.246 0.623 1.246 0.623 root_engine_test_100000.py:16(value)
100000 1.205 0.000 63.248 0.001 root_engine_test_100000.py:49(<lambda>)
1 1.203 1.203 1.280 1.280 roottools.py:1(?)
1 0.774 0.774 166.680 166.680 root_engine_test_100000.py:1(?)
100000 0.624 0.000 0.624 0.000 root_engine_test_100000.py:112(<lambda>)
200000 0.541 0.000 0.541 0.000 root_engine_test_100000.py:38(accumulate)
100001 0.418 0.000 0.931 0.000 roottools.py:40(__iter__)
200000 0.392 0.000 0.392 0.000 root_engine_test_100000.py:23(reset)
100000 0.384 0.000 62.043 0.001 roottools.py:58(__call__)
100000 0.339 0.000 0.513 0.000 roottools.py:32(__setBranchIndicies)
400000 0.336 0.000 0.336 0.000 root_engine_test_100000.py:26(accumulate)
200000 0.262 0.000 0.262 0.000 root_engine_test_100000.py:29(value)
100000 0.241 0.000 0.241 0.000 root_engine_test_100000.py:99(<lambda>)
100000 0.191 0.000 0.191 0.000 root_engine_test_100000.py:109(<lambda>)
100000 0.173 0.000 0.173 0.000 roottools.py:52(setIndex)
100000 0.120 0.000 0.120 0.000 root_engine_test_100000.py:91(<lambda>)
100000 0.116 0.000 0.116 0.000 root_engine_test_100000.py:35(reset)
100000 0.102 0.000 0.102 0.000 root_engine_test_100000.py:41(value)
100000 0.087 0.000 0.087 0.000 root_engine_test_100000.py:103(<lambda>)
1 0.042 0.042 0.044 0.044 ROOT.py:1(?)
1 0.029 0.029 0.136 0.136 inspect.py:24(?)
1 0.023 0.023 0.023 0.023 threading.py:1(?)
1 0.022 0.022 0.063 0.063 string.py:20(?)
1 0.019 0.019 0.029 0.029 sre.py:94(?)
1 0.018 0.018 0.166 0.166 signalslot.py:7(?)
1 0.011 0.011 0.011 0.011 weakref.py:6(?)
1 0.010 0.010 0.033 0.033 ROOT.py:358(__doGUIThread)
Conclusion
The 'python code writing' version is nearly 4 times faster than the signal version and 15% faster than the pyroot version. The = __readData= method is where the TBranch is actually being read (and converted to python by
PyROOT). The
range
is where the C++ iterators are being converted into a python iterator and used to do a loop. Therefore both the
PyROOT and the 'python code writing' version are spending all their time in the Python to C++ based code.
--
ChrisDJones - 29 Oct 2006