2008-12-08 12:58:51     Cycle counts for the basic arithmetic operations are not consistent

Document created by Aaronwu Employee on Aug 9, 2013

Kiran Kumar B (INDIA)

Message: 66517   

 

Hi all,

 

I was trying to profile the basic arithmetic operations of the Blackfin BF527 DSP under uClinux. For testing purposes I have only a single-threaded application that just multiplies variables of the same data type (float, long, long long, ushort, short, int, uint). I have included the floating-point library options in the Makefile.

 

I observed that each time I run the application, the cycle count for float multiplication is just 2 cycles, while the cycle count for the other data types varies from 18 to 34+ (each run of the application gives a different cycle count).

 

Why is there no consistency in the cycle count? Are there any system calls made in between? Is there any way to run a bare metal app (but with the Linux thread scheduler)? Can anyone share previously profiled results for basic arithmetic operations compiled under gcc / VDSP?

 

Thank you

 

Kiran

 

 


 

 

2008-12-08 13:42:05     Re: Cycle counts for the basic arithmetic operations are not consistent

Robin Getz (UNITED STATES)

Message: 66518   

 

Kiran:

 

It's a multi-tasking operating system - of course other things can go on. Your application can be swapped out, and not come back until something else (higher priority) is complete.
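One way to reduce (not eliminate) that preemption is to ask the kernel for a real-time scheduling class before measuring. The following is a minimal sketch using the POSIX `sched_setscheduler` call; the helper names `clamp_priority` and `request_fifo_priority` are hypothetical, the call normally needs root privileges, and interrupt handlers will still run and still add cycles.

```c
#include <assert.h>
#include <sched.h>
#include <string.h>

/* Clamp a requested priority into the range [lo, hi]. */
static int clamp_priority(int priority, int lo, int hi)
{
    assert(lo <= hi);
    if (priority < lo)
        return lo;
    if (priority > hi)
        return hi;
    return priority;
}

/* Ask for SCHED_FIFO scheduling for the calling process.
 * Returns 0 on success, -1 on failure (errno is set, typically
 * EPERM when the process lacks the needed privileges). */
static int request_fifo_priority(int priority)
{
    struct sched_param param;
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return -1;

    memset(&param, 0, sizeof param);
    param.sched_priority = clamp_priority(priority, lo, hi);
    return sched_setscheduler(0, SCHED_FIFO, &param);
}
```

Calling `request_fifo_priority(1)` once at the start of the test program, and checking the return value, would be the intended use. A low real-time priority is chosen deliberately so the test does not starve kernel threads.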

 

-Robin


 

 

2008-12-08 15:36:11     Re: Cycle counts for the basic arithmetic operations are not consistent

Mike Frysinger (UNITED STATES)

Message: 66523   

 

there is no way to run bare metal code under Linux.  those two terms simply don't make sense together.


 

 

2008-12-09 07:38:28     Re: Cycle counts for the basic arithmetic operations are not consistent

Kiran Kumar B (INDIA)

Message: 66556   

 

Robin:  Thank you for the reply...

 

We have the text section in L1 memory. The I-cache is configured for 16K and the D-cache for 32K.

 

Does this mean that the cache is flushed when the application is swapped out? Would this add to the cycle count for the arithmetic operations that I am interested in? Does a bare metal app (compiled with VDSP) run faster than the same code under uClinux?

 

Is there a method to target/profile cache hits & misses?

 

What percentage improvement can we get by running a bare metal app ?

 

Thanks

 

Kiran


 

 

2008-12-09 07:45:30     Re: Cycle counts for the basic arithmetic operations are not consistent

Mike Frysinger (UNITED STATES)

Message: 66558   

 

caches do not get flushed on context switches, but they'll certainly get polluted.

 

if you're using cycles, then every instruction executed gets added up ... the cycles registers do not differentiate between user/supervisor mode, nor do they know anything about processes.
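since anything that interrupts the measured code can only inflate the count, one common workaround is to repeat the measurement many times and keep the minimum. the following is a minimal sketch under stated assumptions, not something measured on a BF527: the names `read_ticks`, `sample_int_multiply` and `min_sample` are hypothetical, and `clock_gettime` is only a portable stand-in for the tick source; on Blackfin one would read the CYCLES/CYCLES2 registers instead.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Stand-in tick source (nanoseconds) so the sketch builds on any
 * POSIX system. On Blackfin, replace this with a read of the
 * CYCLES/CYCLES2 registers. */
static uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time one int multiplication n times, storing each elapsed count.
 * The operands are volatile so the compiler cannot fold the
 * multiplication into a constant or drop it entirely. */
static void sample_int_multiply(uint64_t *samples, size_t n)
{
    volatile int a = 1234, b = 5678, result = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t start = read_ticks();
        result = a * b;
        samples[i] = read_ticks() - start;
    }
    (void)result;
}

/* Smallest sample: the run least disturbed by interrupts,
 * context switches and cache misses. */
static uint64_t min_sample(const uint64_t *samples, size_t n)
{
    assert(n > 0);
    uint64_t best = samples[0];
    for (size_t i = 1; i < n; i++)
        if (samples[i] < best)
            best = samples[i];
    return best;
}
```

fill a buffer with `sample_int_multiply(buf, 1000)` and report `min_sample(buf, 1000)`. the minimum still includes the overhead of the two counter reads, so timing an empty region the same way and subtracting it gives a fairer number.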

 

Blackfin code will take exactly the same number of cycles whenever it is active, whether under Linux or on bare metal.  the only difference is that obviously bare metal won't have scheduling issues (unless you add scheduling of course).

 

there is no method atm to track cache hits/misses.

 

we've done no benchmarks for bare-metal vs Linux userspace nor do we plan to.  it just doesn't make any sense to.  there's very little overlap between people who use Linux and people who use bare metal.


 

 

2008-12-09 07:50:18     Re: Cycle counts for the basic arithmetic operations are not consistent

Robin Getz (UNITED STATES)

Message: 66561   

 

Kiran:

 

The only thing I would add to Mike's comments is that comparing Linux to bare metal is independent of the toolchain, since both VDSP and bfin-elf-gcc can compile things without an OS.

 

-Robin
