
Fast Table lookup

Question asked by dmphelps on Jun 23, 2009
Latest reply on Oct 1, 2009 by dmphelps

I need to perform a lookup translation of 5M 10-bit values (stored in 16-bit words) as quickly as possible.  That, of course, calls for an assembly language implementation of the C code below.  This is my first project using the Blackfin assembler, so some assistance would be appreciated.  Is this the fastest way for me to do this?  Later, I'll ask about how to alternate chunks through L2 cache while this munches on the other chunk.  Thanks!

 

extern unsigned short LookupTable[1024];

void Translate( unsigned short *pIObits, int iLength )
{
    int i;

    /* Replace each value in place with its table entry */
    for( i = 0; i < iLength; i++ )
    {
        pIObits[i] = LookupTable[ pIObits[i] ];
    }
}

 

Below is my assembly version, but the assembler complains "Preg read after write which requires 4 extra cycles" at the assignment into R0.H.

 

.extern _LookupTable;


_Translate_asm:
    P2 = R1;    // Length

    I0 = R0;    // pIObits

    P3.L = _LookupTable;

    P3.H = _LookupTable;

    R2.H = 0;     // Clear high word

    M0 = 2;

 

    // Translating two words per iteration
    LSETUP( LoopTop, LoopBottom) LC0 = P2 >> 1;
LoopTop:

    R0 = [I0];  // Load two 10-bit values in separate words
    R0 <<= 1;   // Use each as an index into a word table


    R1 = PACK( R2.H, R0.L);     // Sample N+1

    R0 = PACK( R2.H, R0.H);     // Sample N


    P1 = R1;
    P0 = R0;
   
    R0.H = W[P0 ++ P3]; // Preg read after write which requires 4 extra cycles
    R0.L = W[P1 ++ P3];
   
    [I0++M0] = R0;  // Store translated words
LoopBottom:

 

    RTS;

_Translate_asm.end:
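
For comparison, the two-words-per-iteration scheme that the assembly above is aiming for can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the name TranslatePairs is hypothetical, the table is defined locally as a stand-in for the real external one, every input value is assumed to be below 1024, and memcpy is used to express the single 32-bit load and store without alignment or aliasing concerns.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the real table, which is defined elsewhere (assumption). */
uint16_t LookupTable[1024];

/* Hypothetical helper: translate two 16-bit samples per iteration,
   mirroring the structure of the assembly loop. Assumes every input
   value is a valid 10-bit index (0..1023). */
void TranslatePairs(uint16_t *pIObits, int iLength)
{
    int i;

    for (i = 0; i + 1 < iLength; i += 2) {
        uint32_t pair;
        uint16_t lo, hi;

        memcpy(&pair, &pIObits[i], sizeof pair);   /* one 32-bit load */
        lo = (uint16_t)(pair & 0xFFFFu);           /* low half-word   */
        hi = (uint16_t)(pair >> 16);               /* high half-word  */

        /* Each half is translated and put back into the same half,
           so the result does not depend on byte order. */
        pair = ((uint32_t)LookupTable[hi] << 16) | LookupTable[lo];
        memcpy(&pIObits[i], &pair, sizeof pair);   /* one 32-bit store */
    }

    /* Handle a trailing sample when iLength is odd. */
    if (i < iLength)
        pIObits[i] = LookupTable[pIObits[i]];
}
```

Note that this sketch advances by two samples (four bytes) per iteration and handles an odd length explicitly; the assembly version would need equivalent handling.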

 

Is this the best, fastest way to do this?  What can I do to get rid of the warning?

Advice and suggestions will be appreciated!

 

Bruising my nose on the learning curve...

 

dmp
