2008-12-12 06:21:53 CPLB handler improvements (re-benchmarked)
Michael McTernan (UNITED KINGDOM)
Message: 66661
Hi,
I've reprofiled Bernd's latest patches and the results are awesome:
cplb-c-5.diff
CONFIG_CPLB_SWITCH_TAB_L1=y

           cplb-c-5.diff   Original ASM   % of ASM
Min                  434           1321        33%
Average              462           2960        16%
Max                 1045           4113        25%

cplb-c-5.diff - CPLB not in SRAM
CONFIG_CPLB_SWITCH_TAB_L1 is not set

           cplb-c-5.diff   Original ASM   % of ASM
Min                  425           1755        24%
Average              477          12252         4%
Max                 1823          18553        10%

(All times in nanoseconds.)
Attached are some graphs of this, although I think it's pretty clear that the C implementation blows the socks off the ASM version.
Bernd - it's also pretty clear that you've improved on my implementation (both from the code and the stats), so I didn't do a run of my patch vs yours. From my original data, though, I was getting the average case down to 25% of the ASM implementation, whereas you get down to 16% in the comparable version.
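For anyone checking the arithmetic, the percentages quoted above are just the new handler time divided by the original ASM time. A minimal sketch in C follows; the function names are made up for illustration and are not part of the patch or the kernel.

```c
#include <assert.h>

/* Hypothetical helper: new handler time as a rounded percentage
 * of the original ASM handler time (both in nanoseconds). */
static unsigned percent_of_original(unsigned new_ns, unsigned orig_ns)
{
    assert(orig_ns > 0);
    /* Adding orig_ns / 2 before dividing rounds to the nearest integer. */
    return (new_ns * 100u + orig_ns / 2u) / orig_ns;
}

/* Hypothetical helper: speedup factor, i.e. original time over new time. */
static double speedup(unsigned new_ns, unsigned orig_ns)
{
    assert(new_ns > 0);
    return (double)orig_ns / (double)new_ns;
}
```

With the average-case figures from the first table, percent_of_original(462, 2960) gives 16, and speedup(462, 2960) comes out at a little over 6.4.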
What else needs to be done before the patch can get into trunk? I'm going to run it on my 2008 branch over here for a while to ensure there's nothing unexpected, but so far it's flying - many thanks!
Mike
data.pdf
comparison-SRAM.pdf
comparison-noSRAM.pdf
2008-12-12 08:12:35 Re: CPLB handler improvements (re-benchmarked)
Bernd Schmidt (GERMANY)
Message: 66663
The improvement over the old version should in fact be even greater, since the register-saving entry/exit code in entry.S has been reduced. That happens outside of your profiling points (which is as it should be, to allow comparison with your earlier patch).
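To illustrate what "profiling points" means here: the min/average/max figures come from timing only the region between a start point and an end point, so anything outside that region (such as the entry.S code) is not counted. Below is a rough sketch of an accumulator that could collect such figures. All names are hypothetical, and this is not the instrumentation actually used for these measurements.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical accumulator for samples taken between two profiling points. */
struct prof_stats {
    unsigned min_ns;
    unsigned max_ns;
    unsigned long long sum_ns;
    unsigned count;
};

static void prof_init(struct prof_stats *s)
{
    s->min_ns = UINT_MAX;   /* any real sample will be <= this */
    s->max_ns = 0;
    s->sum_ns = 0;
    s->count = 0;
}

/* Record one elapsed time (end point minus start point), in nanoseconds. */
static void prof_record(struct prof_stats *s, unsigned elapsed_ns)
{
    if (elapsed_ns < s->min_ns)
        s->min_ns = elapsed_ns;
    if (elapsed_ns > s->max_ns)
        s->max_ns = elapsed_ns;
    s->sum_ns += elapsed_ns;
    s->count++;
}

/* Integer average of all recorded samples (truncates toward zero). */
static unsigned prof_average(const struct prof_stats *s)
{
    assert(s->count > 0);
    return (unsigned)(s->sum_ns / s->count);
}
```

Code that runs before the start point or after the end point never reaches prof_record(), which is why savings there do not show up in the tables.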
Since the results are good, I'll port it to trunk and check it in today or early next week.
2008-12-15 07:53:08 Re: CPLB handler improvements (re-benchmarked)
Michael McTernan (UNITED KINGDOM)
Message: 66713
> Since the results are good
They are really awesome! It must be rare for hand-coded assembly to be converted to C and end up with a 6x speedup.
Many thanks for your work on this.
Regards,
Mike