2,176 Views 5 Replies Last post: Sep 22, 2009 5:08 AM by CraigG
MikeSmithCanada Regular Contributor 141 posts since
Mar 27, 2009

Sep 20, 2009 5:55 PM

TigerSHARC simulator accuracy

I am running a simple rectify program on a TigerSHARC TS201S:

 

// C++ code
int *HalfWaveRectifyReleaseMode(int initial_array[], int final_array[], int N) {
    int *return_pt = final_array;
   
    if (N <= 0) return NULL;
   
    for (int count = 0; count < N; count++) {
        if (initial_array[count] > 0)
            final_array[count] = initial_array[count];
        else final_array[count] = 0;
    }
    return return_pt;
}

 

I am using the code to demonstrate speed differences between programming modes (debug C, release C, and custom assembly code), with float and int versions of rectify.

 

We then go in and try to identify areas where stalls might occur, to understand the behaviour of the architecture.

 

The board is an ADDS-TS201S-EZLITE Rev 1.1; the back of the board says 1-D-1.2.

 

There is nearly a factor of two between the timings measured with the cycle counter on the board and those reported by the simulator. Any idea why?

The behaviour is consistent across all forms of the program (C, debug, release, asm).

 

Since I am counting cycles, "board speed" should not matter: cycles / us (power save mode) is irrelevant, and I did not think that TigerSHARCs had a power save mode anyway.

 

Board results

 

uS / point  Integer Debug C 0.152225,  Release C 0.022700, First ASM 0.016825
    uS / point  Float Debug C 0.157625,  Release C 0.047850, First ASM 0.017125
    us -- averageTime 0.003194, precision (maxTime - minTime) / 2 0.000088, acceptable 0.000512
    Succesful link to test file CodeTimingComparison_Test_cpp.
    Cycles / point  Integer Debug C 75,  Release C 11, First ASM 8
    Total cycles  Integer Debug C 12044,  Release C 1816, First ASM 1358
    Cycles / point  Float Debug C 77,  Release C 23, First ASM 8
    Total Cycles  Float Debug C 12474,  Release C 3828, First ASM 1352
    Cycles averageTime 0, precision (maxTime - minTime) / 2 0, acceptable 5
    Succesful link to test file CycleCounter_CodeTimingComparison_Test_cpp.
    Succesful link to test file ExploreTigerSHARCASM_Test_cpp.
    Succesful link to test file Rectify_cpp.
    Success: 17 blackbox tests passed.
    Blackbox Assert statistics: 0 Failures, 0 Expected Failures, 28 Successes.
    Whitebox Assert statistics: 0 Failures, 0 Expected Failures, 0 Successes. (Includes C Test statistics)
    Test time: 0.00104717 seconds.

Simulator results

 

uS / point  Integer Debug C 0.072787,  Release C 0.010525, First ASM 0.010450
    uS / point  Float Debug C 0.086925,  Release C 0.031625, First ASM 0.010525
    us -- averageTime 0.001835, precision (maxTime - minTime) / 2 0.000044, acceptable 0.000512
    Succesful link to test file CodeTimingComparison_Test_cpp.
    Cycles / point  Integer Debug C 36,  Release C 5, First ASM 5
    Total cycles  Integer Debug C 5791,  Release C 826, First ASM 831
    .\CycleCounterCodeTimingComparison_Test.cpp(64): Error: Failure in CycleCounter_CodeTimingComparison_Int: integerRelease > firstIntegerAssembly
    Cycles / point  Float Debug C 43,  Release C 15, First ASM 5
    Total Cycles  Float Debug C 6945,  Release C 2531, First ASM 838
    Cycles averageTime 0, precision (maxTime - minTime) / 2 0, acceptable 5
    Succesful link to test file CycleCounter_CodeTimingComparison_Test_cpp.
    Succesful link to test file ExploreTigerSHARCASM_Test_cpp.
    Succesful link to test file Rectify_cpp.
    FAILURE: 1 out of 17 blackbox tests failed.
    Blackbox Assert statistics: 1 Failures, 0 Expected Failures, 27 Successes.
    Whitebox Assert statistics: 0 Failures, 0 Expected Failures, 0 Successes. (Includes C Test statistics)
    Test time: 0.00064387 seconds.

 

 

In case I am doing something obviously wrong, an example test looks like this.

This is a TigerSHARC variant of the UnitTest++ testing framework found at SourceForge.

 

#define NUMPOINTS 160                         

 

TEST(CycleCounter_CodeTimingComparison_Int)
{
   
    int initialArray[NUMPOINTS] = {
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,
                    1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6,   1, 2, 3, -4, 6               
    };
    int finalArray[NUMPOINTS] = {0, 0, 0, 0, 0};

 

    __int64 measuredTimes[5];
   
    measuredTimes[0] = __ReadCycleCounter64( );
        HalfWaveRectifyDebugMode(initialArray, finalArray, NUMPOINTS);   
    measuredTimes[1] = __ReadCycleCounter64( );
        HalfWaveRectifyReleaseMode(initialArray, finalArray, NUMPOINTS);
    measuredTimes[2] = __ReadCycleCounter64( );
        HalfWaveRectifyASM_Int(initialArray, finalArray, NUMPOINTS);
    measuredTimes[3] = __ReadCycleCounter64( );
    measuredTimes[4] = __ReadCycleCounter64( );
   
    __int64 timerOverHead =  (measuredTimes[4] - measuredTimes[3]);
    __int64 integerDebug =         (measuredTimes[1] - measuredTimes[0] - timerOverHead) / NUMPOINTS;
    __int64 integerRelease =     (measuredTimes[2] - measuredTimes[1] - timerOverHead) / NUMPOINTS;
    __int64 firstIntegerAssembly =
                                 (measuredTimes[3] - measuredTimes[2] - timerOverHead) / NUMPOINTS;   
    printf("Cycles / point  Integer Debug C %d,  Release C %d, First ASM %d\n",
        (int) integerDebug, (int) integerRelease, (int) firstIntegerAssembly);   

 

    timerOverHead =  (measuredTimes[4] - measuredTimes[3]);
    integerDebug =         (measuredTimes[1] - measuredTimes[0] - timerOverHead);
    integerRelease =     (measuredTimes[2] - measuredTimes[1] - timerOverHead);   
    firstIntegerAssembly = (measuredTimes[3] - measuredTimes[2] - timerOverHead);
    printf("Total cycles  Integer Debug C %d,  Release C %d, First ASM %d\n",
        (int) integerDebug, (int) integerRelease, (int) firstIntegerAssembly);    
   
    CHECK( integerDebug > integerRelease);                   // RELEASE FASTER                    
    CHECK( integerDebug > firstIntegerAssembly);   // OUR ASM FASTER
    CHECK( integerRelease > firstIntegerAssembly); // OUR ASM FASTER
}

 

Thanks

 

Mike

AndyC Analog Employee 31 posts since
Jun 2, 2009
2. Sep 21, 2009 3:51 AM in response to: Michael Smith
Re: TigerSHARC simulator accuracy

Hi Mike,

 

Have you enabled the Branch Target Buffer (BTB) and the cache on the hardware platform?

 

For the TigerSHARC, the simulation environment runs a boot kernel to bring the processor into a state similar to the one it would be in on hardware; this includes enabling the BTB. When you connect with an emulation session, the emulator may reset the core, which leaves the BTB disabled in the hardware session.

 

There is a header file "cache_macros.h" included in some of the ADSP-TS201 example projects in the VisualDSP++ installation (fft_flp32_C, for example). The cache and BTB can be enabled as shown below. Please see the comments in the cache_macros.h file for further details.

 

/*in the case of TS201, at the beginning of the program the
cache must be enabled. The procedure is contained in the
cache_enable macro that uses the refresh rate as input parameter
      -if CCLK=500MHz, refresh_rate=750
      -if CCLK=400MHz, refresh_rate=600
      -if CCLK=300MHz, refresh_rate=450
      -if CCLK=250MHz, refresh_rate=375
*/

#ifdef __ADSPTS201__
  asm("#include <defts201.h>");
  asm("#include <cache_macros.h>" );
  asm("cache_enable(750);");

#endif

 

Regards

Andy

CraigG Analog Employee 444 posts since
Jan 29, 2009
5. Sep 22, 2009 5:08 AM in response to: Michael Smith
Re: TigerSHARC simulator accuracy

MikeSmithCanada wrote:

 

*****************

There does not appear to be any  <cache_macros.h> in any include directory under my VisualDSP V5.0 install

 

There are 23 individual copies in directories such as  TS/Examples, but none in the include directory

 

If you had asked whether I removed it from that directory last year by accidentally doing a move rather than the intended copy, the answer is possibly.

 

However, I would have thought the updates would have fixed that.

 

**************************

 

I included a local copy of cache_macros.h in the test build, activated the cache, and the results now match between board and simulator. Thanks for the suggestion.

 

Hi Mike,

 

These header files are not part of the libraries, so do not appear in the include directory. They are provided by the applications group as part of the examples for TigerSHARC, so only exist in the examples directory. Sorry for any confusion, but thanks for letting us know that including this file evened out your results.
